KLUM: An Urban VNIR and SWIR Spectral Library Consisting of Building Materials

Knowledge about the existing materials in urban areas has, in recent times, increased in importance. With the use of imaging spectroscopy and hyperspectral remote sensing techniques, it is possible to measure and collect the spectra of urban materials. Most spectral libraries consist of either spectra acquired indoors in a controlled lab environment or of spectra from afar using airborne systems accompanied with in situ measurements. Furthermore, most publicly available spectral libraries have, so far, not focused on facade materials but on roofing materials, roads, and pavements. In this study, we present an urban spectral library consisting of collected in situ material spectra with imaging spectroscopy techniques in the visible and near-infrared (VNIR) and short-wave infrared (SWIR) spectral range, with particular focus on facade materials and material variation. The spectral library consists of building materials, such as facade and roofing materials, in addition to surrounding ground material, but with a focus on facades. This novelty is beneficial to the community as there is a shift to oblique-viewed Unmanned Aerial Vehicle (UAV)-based remote sensing and thus, there is a need for new types of spectral libraries. The post-processing consists partly of an intra-set solar irradiance correction and recalculation of reference spectra caused by signal clipping. Furthermore, the clustering of the acquired spectra was performed and evaluated using spectral measures, including Spectral Angle and a modified Spectral Gradient Angle. To confirm and compare the material classes, we used samples from publicly available spectral libraries. The final material classification scheme is based on a hierarchy with subclasses, which enables a spectral library with a larger material variation and offers the possibility to perform a more refined material analysis. 
The analysis reveals that the color and the surface structure, texture or coating of a material play a significantly larger role than what has been presented so far. The samples and their corresponding detailed metadata can be found in the Karlsruhe Library of Urban Materials (KLUM) archive.


Introduction
Assessment of materials in urban areas has in recent times increased in importance for several reasons. For one, this knowledge is useful for city planners and researchers while working with city models or simulations where the need for a high level of detail about the buildings, which can include the materials, is important. This can include additional information for 3D building models for formats such as CityGML and its applications [1][2][3] in addition to thermal city simulations [4][5][6]. Secondly, as the urban heat island effect [7] is an increasing occurrence in cities [8], further knowledge about the materials in urban areas can be an indicator on how to tackle and handle the effect [9][10][11][12]. Lastly, with the information about the material, it can be possible to assess the heating and cooling demand in combination with thermal infrared data [13][14][15].
To assess surface materials or land cover classes in urban areas, it is a common and efficient procedure in the remote sensing community to perform classification using hyperspectral or multispectral imagery. To carry out a classification, it is necessary for supervised classifiers, such as Random Forests [16] or Support Vector Machine [17], to have available data for training the classifier and for evaluating the performance using testing data. Hyperspectral data is often used for classification due to the broad wavelength range (e.g., 350-2500 nm). Furthermore, it is also suitable for urban material classification [18,19] as hyperspectral data can ease the distinction of characteristic spectral features due to the large spectral range. Another advantage of hyperspectral data as opposed to multispectral, which consists of spaced spectral ranges, is that hyperspectral data allows the use of gradient calculations.
Spectral libraries containing urban materials can be used as training data for material classification in areas where no prior knowledge is available or where it is not possible to collect ground truth data. Spectral libraries can be based on spectra acquired either in situ [20], in the laboratory [21,22] or by a combination of airborne and in situ data [23][24][25][26][27]. As most hyperspectral data used for material classification are acquired from airborne systems, the materials available in spectral libraries reflect those needs, i.e., the urban materials available in spectral libraries are often materials that can be seen from above, such as ground and roof materials. With the increased usage of Unmanned Aerial Vehicles (UAVs), there is also a desire to classify materials on building facades [28,29] using hyperspectral sensors on UAVs [13]. However, facade materials are often not well represented in spectral libraries since the material assessment of facades has, until now, not been common practice. Thus, there is an emerging need for such spectral libraries as there is a shift to oblique UAV-based remote sensing.
Furthermore, as one material can have different surface structures and textures in addition to various conditions and colors, the material's characteristic spectral features can vary. This can be seen in the study by Kotthaus et al. [25]. This variation depends on the aforementioned factors and is often not well represented in spectral libraries. Additionally, some existing spectral libraries do not provide a photo of the material, which makes it challenging to validate, compare or match acquired samples with existing spectral libraries [30,31].
Thus, motivated by the under-representation of spectral libraries with detailed descriptions of their spectra and metadata, in addition to libraries with a focus on facade materials, we present in this study the Karlsruhe Library of Urban Materials (KLUM), which fills those gaps. This spectral library consists of building material spectra collected in situ in the visible and near-infrared (VNIR) and short-wave infrared (SWIR) spectral range, specifically in the wavelength range of 350-2500 nm. The samples were acquired in the southwestern German city of Karlsruhe, with a focus on facade materials and a large material variation. Furthermore, the material samples are labeled and clustered into classes and subclasses for easier access to similar material samples. Section 2 briefly gives an overview of current urban spectral libraries. This is followed by the methodology in Section 3, which includes the measurement setup, the data processing (intra-set solar irradiance correction and recalculation of the reference spectrum) and the material categorization (clustering and spectra validation). The results are presented and discussed in Section 4 and, finally, the study concludes with Section 5.

Existing Urban Spectral Libraries
Urban spectral libraries can serve as a tool for comprehensive overview or as a database for material labeling since they contain the characteristic spectral features of various building materials. Such libraries are often used as training data for material classification, as they provide the important spectral features. Furthermore, libraries that focus on urban areas, in particular urban materials, have been compiled in several countries across the world since building materials can vary regionally due to different available construction materials. Thus, spectral libraries are often generated to represent a particular region and/or with a particular purpose in mind to suit the needs.
Earlier examples of urban spectral libraries with a focus on materials have been made by Price [32], Ben-Dor et al. [27] and Heiden et al. [33] (extended through Heiden et al. [23]). The spectral library made by Ben-Dor et al. [27] used spectra acquired in Tel-Aviv, Israel that had been collected by Price [32]. In their study, they acquired spectra using in situ measurements and airborne images. The study by Heiden et al. [33] and the extension by Heiden et al. [23] introduced a spectral library consisting of urban materials located in Dresden and Potsdam, Germany. The spectra were acquired and assessed by combining in situ measurements with airborne hyperspectral data. The spectral library of Santa Barbara, US, made by Herold et al. [24], is also based on spectra from both in situ and airborne measurements. The Santa Barbara spectral library focuses mainly on roads and roofs, and a further study regarding the effects of aging on road materials was later compiled by Herold and Roberts [34]. Another example of a spectral library that used the same acquisition combination is DESIREX [26]. Sobrino et al. [35] conducted a study using ground truth data from the spectral library ASTER [21] in combination with DESIREX, and were able to classify urban materials in the city of Madrid, Spain. On the other hand, there are spectral libraries that are purely based on spectra acquired in the laboratory, such as ASTER [21] and the USGS spectral library [22]. Both ASTER and USGS cover a large variation of mainly natural materials but also construction materials. A recent follow-up to the USGS spectral library [22] is the USGS Spectral Library Version 7 [36]. Here, the spectra have been acquired in situ, from airborne systems and in the laboratory, and cover a large variation of mainly natural materials, but also artificial materials. The combination of laboratory and in situ acquisitions was performed to create the spectral library of LUMA-SLUM [25]. 
Here, material samples were acquired in the city of London, UK, to extend the accessibility to material spectra in the long-wave infrared (LWIR) spectral range. They assessed spectral features from construction materials and provided photos of each sample. Furthermore, LUMA-SLUM contained several samples of the same material, and it could be observed that the spectral features can vary significantly in the VNIR and SWIR spectral range. In situ-based spectral libraries are few, such as the one produced by Nasarudin et al. [20], since the surrounding environment cannot be controlled (e.g., the solar irradiance and the water vapor absorption). This spectral library contains spectra acquired at a university campus in Serdang, Malaysia, and contains mainly roof and ground materials. Most of the aforementioned urban spectral libraries cover only the VNIR and SWIR spectral range but with some exceptions, which can be seen in Table 1.
The publicly available spectral library LUMA-SLUM [25] is a fine example of how material spectra should be presented. This library contains photos of the accessible material samples, which is helpful since the library also contains several examples of the same material in various conditions and with varied characteristic spectral features. The recent USGS Spectral Library Version 7 [36] is another example of how to present a spectral library online. Here, an interactive search engine allows users to search for and filter out specific materials. However, photos of the samples are not always provided.
Although the currently existing urban spectral libraries contain a large variation of natural and man-made materials, the main focus has so far been on urban material visible from a nadir point of view (horizontal surfaces facing the sky, such as asphalt, soil and grass) and this can be seen in Table 1. The under-representation of facade material samples in existing spectral libraries can be seen in column Content, where most contain few facade material samples. Spectral libraries often aim to provide an overview of building materials but they do not always provide a detailed description about the surface color nor the structure for all samples, with LUMA-SLUM [25] being one of the few exceptions.

Methods
We first describe the measurement setup in Section 3.1, where we explain the equipment we used and the acquisition procedure. This is followed by Section 3.2 with a focus on the post-processing. Here, we first explain the less work-intense processing steps, such as the detection of outliers and noise in addition to the handling of the spectral ranges where water absorption is present. Lastly, we describe in detail the solar irradiance intra-set correction and the recalculation of the reference spectrum due to signal clipping. We dedicate the last Section 3.3 to material categorization, where we describe the clustering of material of the same composition and we compare those clusters with samples from existing spectral libraries.

Measurement Setup
The in situ spectra are acquired with the high-resolution spectroradiometer ASD FieldSpec-4 Hi-Res (www.malvernpanalytical.com). The FieldSpec spectroradiometer has a spectral range of 350-2500 nm and consists of three sensors: one VNIR sensor (350-1000 nm), one SWIR1 sensor (1001-1800 nm) and one SWIR2 sensor (1801-2500 nm). FieldSpec has a spectral sampling of 1.4 nm and a spectral resolution of 3 nm in the spectral range of the VNIR sensor and a spectral sampling of 1.1 nm and spectral resolution of 8 nm in the spectral range of the SWIR1 and SWIR2 sensors. In total, the spectroradiometer has 2151 channels and a wavelength accuracy of 0.5 nm. The optical fiber probe has a field of view of 25°.
We collect the spectra from a distance of 20 cm, with a coverage area of around 9 cm in diameter. By acquiring the spectra using a smaller coverage area, we reduce the chance of acquiring spectra from different materials. The measurement setup is always the same: the reference spectrum and the measured spectrum are acquired from the same direction by placing the Spectralon reference plate on top of the material surface. Thus, the solar incident angle is the same for the reference spectrum and the measured spectrum. One sample consists of a set of 10 spectra, which will be referred to as an intra-set, acquired within 60 s. A reference spectrum is acquired before each intra-set acquisition by using a 95% Spectralon reflectance plate for correction of incoming solar irradiance. Field spectra should be collected in cloud-free conditions to ensure the quality of the spectra. However, the solar irradiance can change during acquisition due to occasionally passing clouds. Thus, to account for any potential alterations, the solar irradiance was recorded throughout the acquisition by using the Qmini spectrometer (www.rgb-photonics.com) as an upward-looking spectral reference facing the sun. Qmini uses a charge-coupled device (CCD) detector with a spectral resolution of 1.5 nm and a spectral range of 200-1025 nm. Hence, this spectrometer enables processing and correction of intra-sets.
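The stated coverage area can be checked from the acquisition distance and the probe's field of view. The following is a minimal sketch, assuming the probe is held perpendicular to the surface and the field of view is a simple cone with its apex at the probe tip; the function name is illustrative and not part of the original processing.

```python
import math


def footprint_diameter(distance_cm: float, fov_deg: float) -> float:
    """Diameter of the circular area seen by a conical field of view.

    Assumes the probe axis is perpendicular to the measured surface.
    """
    half_angle = math.radians(fov_deg / 2.0)
    return 2.0 * distance_cm * math.tan(half_angle)


# Values taken from the measurement setup: 20 cm distance, 25 degree field of view.
diameter_cm = footprint_diameter(20.0, 25.0)
print(f"Coverage diameter: {diameter_cm:.1f} cm")
```

Under these assumptions the result is slightly below 9 cm, which agrees with the coverage area given above.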
Furthermore, a photo is acquired of each material sample for usage as a visual reference and control during the post-processing. Additionally, a Global Navigation Satellite System (GNSS) receiver is used throughout the acquisition to record the geolocation of the material samples for possible revisits.
In total, 181 material samples (1810 spectra) are acquired in the area of Karlsruhe, Germany between 2 July and 8 August 2018. The average effective solar incident angle for the 1810 spectra is 22.91° (standard deviation 3.85°) for the horizontal surfaces and 69.81° (standard deviation 7.26°) for the vertical surfaces. The field survey is carried out on days with sunny weather and occasionally a few passing clouds. Most of the material samples are acquired directly in situ from buildings, roads, and pavements in the city of Karlsruhe. However, some of the material samples are acquired outdoors at a local building supplier to increase the number of roofing samples. All samples are acquired in the sun. As the samples are located within a city, the locations are selected to reduce the impact from opposite facades and windows. The material samples are mainly but not only man-made materials, such as ceramic, concrete, and plaster, with additional samples from natural materials such as sandstone, limestone, and granite. Furthermore, various roofing and road materials are additionally acquired to complete the spectral library. The acquired material samples are in various states of weathering and range broadly in age.

Data Post-Processing
As the spectra are acquired in situ, the surrounding environment cannot be regulated in a controlled manner, such as the solar irradiance and absorption of water vapor. Hence, to improve and refine the spectra, we implement a post-processing routine. This routine covers removal of outliers and noise, intra-set correction using the solar irradiance and recalculation of the reference spectrum (due to signal clipping). The processing is done either for just one or for all three FieldSpec sensors. The data processing is implemented in MATLAB.
The proposed processing flow can be seen in Figure 1. We explain each step in the order of processing. Because the spectra are acquired in situ, the first processing step deals with the water vapor absorption. The spectral ranges of 1340-1450 nm, 1780-1970 nm, and 2300-2500 nm are therefore removed. The processing flow then focuses on the measured spectra, denoted by L. First, as an intra-set is acquired in sets of 10 spectra, any apparent outliers caused by unexpected movement of the carrier can easily be detected and removed manually. Around 2% of the 1810 acquired spectra are deemed to be outliers as their spectral features are significantly different from those of the corresponding intra-set spectra. Secondly, this is followed by the intra-set solar irradiance correction, which is explained in detail in Section 3.2.1. Detection and removal of noise is the following processing step. A sample is flagged as noisy if the maximum intra-set variation is above the threshold of 10%, and this is only observed for the SWIR2 sensor. This spectral range is thus removed for around 5% of the samples due to noise. After performing adjustments to the measured spectra L, we can calculate the spectral reflectance R with the reference spectra $E_0$ and the adjusted measured spectra L.
This is followed by the calculation of the average spectral reflectance R for each intra-set. An additional processing step must be added for the reference spectrum $E_0$ since we discovered that the signal had been cut off for several samples due to signal clipping. This processing step is explained in Section 3.2.2. The recalculation of the reference spectrum $E_0$ for these samples introduces additional noise, which can be detected in the spectral range of 950-1020 nm, and thus this spectral range is excluded for all samples. The noise is caused by the usage of Qmini since it experiences noise in this spectral range. In total, about 20% of the spectral range is removed because of noise. This concludes the processing flow, and we obtain a material sample that can be further used for sample clustering. The following subsections explain in detail the two major processing steps, the intra-set solar irradiance correction and the recalculation of the reference spectrum.
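The band removal and the noise flag described in this processing flow can be sketched as follows. This is an illustration only: the study's implementation is in MATLAB and is not reproduced here, and the text does not define the "maximum intra-set variation" precisely. The sketch assumes it is the largest per-band relative standard deviation within an intra-set; all function names are hypothetical.

```python
import statistics

# Spectral ranges removed in the text (nm): water vapor absorption and Qmini noise.
WATER_VAPOR_BANDS_NM = [(1340, 1450), (1780, 1970), (2300, 2500)]
QMINI_NOISE_BAND_NM = (950, 1020)


def keep_mask(wavelengths_nm):
    """True for wavelengths that survive the band removal."""
    bands = WATER_VAPOR_BANDS_NM + [QMINI_NOISE_BAND_NM]
    return [not any(lo <= w <= hi for lo, hi in bands) for w in wavelengths_nm]


def max_intra_set_variation(intra_set):
    """Largest per-band standard deviation relative to the band mean.

    ASSUMPTION: this is one plausible reading of "maximum intra-set variation".
    intra_set is a list of spectra, each a list of floats of equal length.
    """
    worst = 0.0
    for band_values in zip(*intra_set):
        mean = statistics.fmean(band_values)
        if mean > 0:
            worst = max(worst, statistics.pstdev(band_values) / mean)
    return worst


def is_noisy(intra_set, threshold=0.10):
    """Flag an intra-set whose variation exceeds the 10% threshold."""
    return max_intra_set_variation(intra_set) > threshold


wavelengths = list(range(350, 2501))  # 2151 channels at 1 nm spacing
mask = keep_mask(wavelengths)
```

In the study the noise flag is evaluated per sensor range, so a function like `is_noisy` would be applied to the SWIR2 part of an intra-set rather than to the full spectrum.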

Solar Irradiance Intra-Set Correction
A reference spectrum $E_0$ is acquired before each acquisition of an intra-set, but the measured spectra L can change during the acquisition time. Thus, by using the solar irradiance $S(\lambda)$ that Qmini collected throughout the acquisition, we can adjust for this alteration and perform an intra-set correction to make L more homogeneous. We appoint the first measured spectrum in the intra-set as the initial measured spectrum, denoted $L_{t_r}$, and extract the solar irradiance acquired at that time point, $S_{t_r}(\lambda)$. We can then examine whether the solar irradiance $S_t(\lambda)$ has changed for the other spectra in the intra-set, $L_t$. First, we examine whether the solar irradiance has altered significantly during the acquisition and whether an adjustment is necessary. Here, we calculate the standard deviation and determine whether the maximum standard deviation indicates an alteration of more than 2%. If so, we extract the solar irradiance $S_t(\lambda)$ collected throughout the acquisition by finding the corresponding synchronized GNSS time stamps for Qmini and FieldSpec. From $S_{t_r}(\lambda)$ and $S_t(\lambda)$ we calculate a solar irradiance factor, and the original intra-set $L_t$ is multiplied with this factor to obtain the corrected intra-set $L^*_t$. We then check whether the corrected intra-set $L^*_t$ is more homogeneous than the original intra-set $L_t$ by comparing their maximum standard deviations. If the maximum standard deviation of the corrected intra-set is larger than that of the original intra-set, we reject the corrected intra-set and keep the original intra-set. In short, this routine makes it possible to correct the intra-set by adjusting it with a calculated solar irradiance factor. See Figure 2 for a visual presentation of this processing step.
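The accept-or-reject logic of this correction can be illustrated with a short sketch. The exact form of the solar irradiance factor is not recoverable from the text here, so the sketch assumes a single scalar per spectrum, the mean ratio between the irradiance at the reference time and at the acquisition time over the Qmini bands. The 2% pre-check is left out, and all names are hypothetical.

```python
import statistics


def irradiance_factor(s_ref, s_t):
    """ASSUMED form of the factor: mean of S_tr(lambda) / S_t(lambda)."""
    return statistics.fmean(r / s for r, s in zip(s_ref, s_t))


def max_std(intra_set):
    """Maximum per-band standard deviation across the spectra of an intra-set."""
    return max(statistics.pstdev(band) for band in zip(*intra_set))


def correct_intra_set(intra_set, irradiance):
    """Scale each spectrum to the irradiance level of the first spectrum.

    intra_set[i] is a measured spectrum L_t and irradiance[i] is the Qmini
    spectrum recorded at the same time; index 0 is the initial spectrum.
    The corrected set is kept only if it is not less homogeneous.
    """
    s_ref = irradiance[0]
    corrected = []
    for spectrum, s_t in zip(intra_set, irradiance):
        factor = irradiance_factor(s_ref, s_t)
        corrected.append([value * factor for value in spectrum])
    if max_std(corrected) > max_std(intra_set):
        return intra_set  # correction made things worse: reject it
    return corrected


# A passing cloud halves both the irradiance and the measured signal.
dimmed = correct_intra_set([[10.0, 20.0], [5.0, 10.0]],
                           [[100.0, 100.0], [50.0, 50.0]])
# A stable signal paired with a misleading irradiance drop is left untouched.
stable = [[10.0, 20.0], [10.0, 20.0]]
unchanged = correct_intra_set(stable, [[100.0, 100.0], [50.0, 50.0]])
```

The second call shows why the homogeneity check matters: without it, an irradiance reading that does not match the measured signal would introduce variation instead of removing it.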

Recalculation of Reference Spectrum Due to Signal Clipping
Signal clipping occurs when an acquired signal is restricted to a certain data range: if the limit is reached, the signal is cut off at this threshold [37]. This occurred for several samples in limited parts of the reference spectrum and appears to be caused by a technical malfunction. The reference spectrum is cut off at different thresholds. To recover those reference spectra and the accompanying measured spectra, we implement a routine for signal recalculation.
We first implement a routine that detects local flat signal peaks in the reference spectra. We are thus able to determine which parts of the reference spectrum $E_{t^*}$ have been cut off. Once detected, we determine which reference spectrum was acquired consecutively and did not suffer from signal clipping in the same part of the spectrum, $E_{t_r}$. This works under the assumption that the solar irradiance has not altered significantly, that the location of the consecutive acquisition is similar and that the surrounding environment has not changed much. If one of these conditions is not satisfied, the spectrum cannot be properly recalculated. The intensity values of the corresponding wavelengths are extracted and the part that has been cut off is ignored. The average ratio $c$ between the two reference spectra is then determined from $E_{nc,t^*}$ and $E_{nc,t_r}$, the parts that were not cut off in the two corresponding reference spectra. This ratio is multiplied with the consecutively acquired reference spectrum $E_{t_r}$, as seen in

$$E^*_0(\lambda) = \begin{cases} E_{t_r}(\lambda) \cdot c \cdot s(\lambda), & \text{for } \lambda \text{ in the spectral range of sensor VNIR} \\ E_{t_r}(\lambda) \cdot c, & \text{for } \lambda \text{ in the spectral range of sensors SWIR1 and SWIR2} \end{cases} \qquad (5)$$

However, as seen in the equation, another factor needs to be considered when recalculating the reference spectrum $E^*_0$ in the spectral range of sensor VNIR. The signal clipping in sensor VNIR covers 53% of the spectral range, and thus using the same formula as in the SWIR spectral range generates reference spectra of poor quality. Here we can use additional information acquired with Qmini since it covers the same spectral range, and the formula can therefore include an additional factor. This factor, $s(\lambda)$, is the average wavelength-wise solar irradiance alteration between the two acquisitions. It is calculated from $s_{nc,t^*}$, the solar irradiance during the acquisition of the cut-off reference spectrum $E_{nc,t^*}$, and $s_{nc,t_r}$, the solar irradiance during the acquisition of the consecutive non-cut-off reference spectrum $E_{nc,t_r}$.
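The detection of flat peaks and the rescaling of a consecutive reference spectrum can be sketched as below. This is a simplified illustration and not the study's MATLAB routine: it covers only the SWIR case (scaling by the ratio c), it assumes c is the mean of the per-band ratios over the non-clipped bands, and it assumes a clipped region appears as a run of at least `min_run` identical values bounded by lower values. All names and the `min_run` parameter are hypothetical.

```python
import statistics


def flat_peak_indices(spectrum, min_run=3):
    """Indices of local flat peaks: runs of identical values with lower neighbors."""
    clipped = set()
    n = len(spectrum)
    start = 0
    while start < n:
        end = start
        while end + 1 < n and spectrum[end + 1] == spectrum[start]:
            end += 1
        left_lower = start == 0 or spectrum[start - 1] < spectrum[start]
        right_lower = end == n - 1 or spectrum[end + 1] < spectrum[start]
        if end - start + 1 >= min_run and left_lower and right_lower:
            clipped.update(range(start, end + 1))
        start = end + 1
    return clipped


def average_ratio(clipped_ref, clean_ref, clipped):
    """ASSUMED form of c: mean ratio over the bands that were not cut off."""
    ratios = [a / b
              for i, (a, b) in enumerate(zip(clipped_ref, clean_ref))
              if i not in clipped]
    return statistics.fmean(ratios)


def recalculate_reference(clipped_ref, clean_ref, min_run=3):
    """Rescale the consecutive clean reference spectrum by the ratio c."""
    clipped = flat_peak_indices(clipped_ref, min_run)
    c = average_ratio(clipped_ref, clean_ref, clipped)
    return [value * c for value in clean_ref]


clean = [2.0, 4.0, 6.0, 8.0, 6.0, 4.0, 2.0]       # consecutive, not clipped
cut_off = [1.0, 2.0, 2.5, 2.5, 2.5, 2.0, 1.0]     # clipped at 2.5
recovered = recalculate_reference(cut_off, clean)
```

In this toy case the clipped spectrum is exactly half the clean one outside the flat peak, so the recovered spectrum restores the peak that the clipping removed.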
To evaluate the quality of the recalculated reference spectra, $E^*_0$, we use the part of the reference spectra that was not cut off to calculate a quality measure, Q. Q is calculated from the two vectors that represent the part of the reference spectrum that was not cut off, $E_{nc,t_r}$ and $E_{nc,t^*}$, in addition to the calculated ratio $c$. It takes a value between 0 and 1, where 0 represents a perfectly recalculated reference spectrum. We set a threshold of 0.05 to eliminate recalculated reference spectra of poor quality. An example can be seen in Figure 3, where the recalculated reference spectrum $E^*_0$ in the spectral range of sensor SWIR1 passes the quality control while the recalculated reference spectrum for the spectral range of sensor SWIR2 does not. In the end, the recalculated reference spectrum passed the quality measure for 26% of the samples that suffered from signal clipping.

Figure 3. Blue shows the cut-off reference spectrum $E_{t^*}$, green the consecutive non-cut-off reference spectrum $E_{t_r}$ and red the recalculated reference spectrum $E^*_0$. The recalculated reference spectrum in the spectral range of sensor SWIR2 does not pass the quality control since it does not meet the quality criterion of Q < 0.05.

Material Categorization
We decide to cluster the material samples to obtain material classes and subclasses that consist of at least one material sample each. This is done to provide several samples with similar characteristic spectral features that can potentially be used as training data for material classification. Thus, to categorize the materials, we determine which material clusters we have and compare each material cluster with samples from existing spectral libraries. Lastly, we evaluate the intra-class spectral similarity.

Sample Clustering Based on Spectral Features
To determine the material clusters (samples with the same material composition) and the corresponding subclasses (samples in the same material cluster but with different surface characteristics), we use suitable spectral measures. The most common spectral measure is Spectral Angle (SA) [38], which is also commonly used for image classification, where it is known as Spectral Angle Mapper (SAM) [39]. It is suitable for continuous data such as hyperspectral spectra, as SA calculates the angle between two vectors (spectra) and thereby determines the spectral similarity between the two. The smaller the angle, the more similar the two spectra are. This measure is suitable for spectra acquired during different downwelling irradiance conditions, as it is relatively robust to such alterations since it accounts for the vector direction and not the vector length. With $\langle x, y \rangle$ representing the dot product of the two vectors and $\|\cdot\|_2$ the Euclidean norm, SA can be described as

$$SA(x, y) = \arccos\left(\frac{\langle x, y \rangle}{\|x\|_2 \, \|y\|_2}\right) \qquad (8)$$

However, in order not to rely on only one measure, we decide to use Spectral Information Divergence (SID) [40] and Spectral Gradient Angle (SGA) [41] as well. SID determines a divergence measure between two vectors. Again, the smaller the divergence measure, the more similar the two spectra are. Using SID, the quality of spectral similarity has been shown to be better than with the usage of SA [42]. Given two n-dimensional vectors x and y, SID is defined as

$$SID(x, y) = \sum_{i=1}^{n} p_i \log\frac{p_i}{q_i} + \sum_{i=1}^{n} q_i \log\frac{q_i}{p_i}, \qquad p_i = \frac{x_i}{\sum_{j=1}^{n} x_j}, \quad q_i = \frac{y_i}{\sum_{j=1}^{n} y_j}$$

SGA calculates the angle between two spectral gradients by using SA. In comparison to SA, SGA uses the vector gradient instead of the vector direction. The slope change is thus considered, which increases robustness against static offsets, and the measure is therefore invariant to geometry and incident illumination. We first determine the gradient of two n-dimensional vectors x and y,

$$SG(x) = (x_2 - x_1, \; x_3 - x_2, \; \ldots, \; x_n - x_{n-1})$$

This is then followed by using SA, as defined in Equation (8). SGA is thus defined by

$$SGA(x, y) = SA(\mathrm{abs}(SG(x)), \mathrm{abs}(SG(y)))$$

However, as SGA is calculated using the absolute gradient, two spectral gradients will be the same even if one has a negative and the other a positive slope, as the absolute derivative will be the same. Thus, we propose a modified version of SGA, denoted as SGA*, which does not calculate the absolute gradient but adds 1 instead. This allows us to distinguish spectra with the same absolute negative and positive derivative, as seen in

$$SGA^*(x, y) = SA(SG(x) + 1, \; SG(y) + 1)$$

The samples can now be assigned to clusters in an iterative procedure using these measures. We initiate the iteration with a first guess based on the observations made in the field. By iterating the sample clustering, we can split the larger material classes into more refined subclasses and generate, if appropriate, new material classes. The three spectral measures SA, SID and modified SGA* are calculated for every sample pair using the full spectral range and for each of the three FieldSpec sensors. This allows us to study the spectrum in detail in the different spectral ranges. Furthermore, if one part of the spectrum has been removed during the post-processing, the corresponding sample pair will also have this part of the spectrum removed. We set different threshold values for the three spectral measures to determine suitable pairings. Thus, we can generate 12 matrices, three for the full spectral range and nine for the individual FieldSpec sensors (three measures for each of the three sensors), that display the calculated spectral similarity between each sample pair. An example can be seen in Figure 4, where the 12 matrices are displayed and visualized for the subclass Dark reflective ceramic. The example in Figure 4 visualizes how the three measures determine the spectral similarity between the four samples. As we receive three calculated values from the measures, we use different thresholds that represent three levels of similarity (very, rather, and not similar). 
These thresholds resemble those used by Robila [42]. We reject pairings if all three measures indicate a poor similarity in several matrices. However, an indication of a poor similarity in the spectral range of sensor VNIR is often ignored since that only represents the material color. Each material cluster consists of at least two samples.
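The spectral measures used for the clustering can be written compactly. The sketch below follows the standard definitions of SA and SID and the description of SGA and SGA* given above; it is an illustration in plain Python rather than the study's MATLAB code. The function names are hypothetical, the gradient is taken as the first difference between neighboring bands, and SID assumes strictly positive spectra.

```python
import math


def spectral_angle(x, y):
    """Angle (radians) between two spectra; insensitive to vector length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def spectral_information_divergence(x, y):
    """Symmetric relative entropy of the normalized spectra (values must be > 0)."""
    sum_x, sum_y = sum(x), sum(y)
    p = [a / sum_x for a in x]
    q = [b / sum_y for b in y]
    return sum(pi * math.log(pi / qi) + qi * math.log(qi / pi)
               for pi, qi in zip(p, q))


def spectral_gradient(x):
    """First difference between neighboring bands."""
    return [b - a for a, b in zip(x, x[1:])]


def sga(x, y):
    """SGA: spectral angle between the absolute gradients."""
    return spectral_angle([abs(g) for g in spectral_gradient(x)],
                          [abs(g) for g in spectral_gradient(y)])


def sga_modified(x, y):
    """SGA*: spectral angle between the gradients shifted by +1."""
    return spectral_angle([g + 1.0 for g in spectral_gradient(x)],
                          [g + 1.0 for g in spectral_gradient(y)])


# Two toy spectra whose slopes have equal magnitude but opposite sign.
rising = [0.1, 0.2, 0.4, 0.4]
falling = [0.4, 0.3, 0.1, 0.1]
print(sga(rising, falling), sga_modified(rising, falling))
```

For this pair, SGA is numerically zero because the absolute gradients coincide, while SGA* returns a clearly non-zero angle, which is the behavior that motivates the modification.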

Comparison of Spectra
To determine and assign suitable labels to the different material clusters, we use two publicly available spectral libraries that contain samples similar to those in KLUM: ASTER [21] and LUMA-SLUM [25]. The ASTER spectral library contains 3420 samples while the LUMA-SLUM library contains 72 samples. To reduce the processing time, we extract only those material samples from these libraries that we believed were likely to exist in our library, namely construction materials. Thus, we used 61 samples from LUMA-SLUM (82%) and only 134 samples from ASTER (4%), as the latter contains few construction material samples. Additionally, ASTER's samples do not cover the same spectral range as KLUM and sometimes cover only a limited part of the spectral range. Furthermore, we also extracted samples from ASTER's class Rock as it contained natural materials that are often used for construction, such as sandstone. Other publicly available spectral libraries, such as Santa Barbara [24] and DESIREX [26], contain mainly ground material samples and thus we did not use them for validation. In general, the under-representation of facade material samples in publicly available spectral libraries makes it challenging to determine and compare the material samples and the corresponding labels.
We compare our clustered material samples by using the same spectral measures we use for the spectral clustering: SA, SID and the modified SGA*. We determine the average spectral reflectance for each subclass cluster and calculate the three spectral measures using material samples from ASTER and LUMA-SLUM as reference spectra. We then extract the 10 samples from ASTER and LUMA-SLUM that resemble the KLUM subclass the most according to the three measures. This is done exclusively in the spectral range of SWIR1 and SWIR2 since we want to ignore the material color. Thus, we can determine the best sample matches based on the spectral similarity. Furthermore, as some samples are present in neither ASTER nor LUMA-SLUM (e.g., neither contains a class named Plaster), we must also rely on our field observations and the photos we took.
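Extracting the closest library samples for a subclass amounts to ranking the reference spectra by a similarity measure. The sketch below uses the spectral angle alone for brevity; how the three measures are combined into one ranking is not specified in the text, so this is an assumption. The sample names and spectra are invented for illustration only.

```python
import math


def spectral_angle(x, y):
    """Angle (radians) between two spectra."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def rank_matches(subclass_mean, library, measure, top_n=10):
    """Names of the top_n library samples most similar to the subclass mean."""
    scored = sorted((measure(subclass_mean, spectrum), name)
                    for name, spectrum in library.items())
    return [name for _, name in scored[:top_n]]


# Hypothetical reference spectra restricted to a few SWIR bands.
library = {
    "brick_a": [0.2, 0.4, 0.6],
    "concrete_b": [0.5, 0.5, 0.5],
    "tile_c": [0.6, 0.4, 0.2],
}
subclass_mean = [0.1, 0.2, 0.3]
best = rank_matches(subclass_mean, library, spectral_angle, top_n=2)
print(best)
```

Passing the measure as an argument makes it straightforward to repeat the ranking with SID or SGA* and compare the resulting candidate lists.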

Intra-Class Evaluation
Finally, we assess and evaluate the quality of the sample clustering on both class and subclass level. This is done by first comparing KLUM's material classes and subclasses with clusters generated by unsupervised algorithms. We use the unsupervised clustering algorithm k-means [43] since we can then compare the same number of generated clusters and the sample distributions with KLUM's material classes and subclasses. Accordingly, we set k to 12 and 33, respectively. Additionally, as we are working with high-dimensional data, we employ Principal Component Analysis (PCA) [44] for dimensionality reduction. Here, we use the first few principal components that cover 99.9% of the variability of the data. We also employ t-distributed stochastic neighbor embedding (t-SNE) [45] as it is suitable for visualizing high-dimensional data. We assess the clusters in different spectral ranges: the full spectral range, the spectral range of sensors SWIR1 and SWIR2, and the spectral range of sensor SWIR2. The last assessment of the sample clustering consists of analyzing and visualizing the intra-class standard deviation of each class and subclass.
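As a rough sketch of this evaluation setup, the snippet below keeps the fewest principal components that cover 99.9% of the variance and then clusters the reduced data. It is an assumption-laden illustration: PCA is done with a plain SVD, and `kmeans` is a small Lloyd's-algorithm stand-in with deterministic farthest-point initialisation rather than whichever k-means implementation was actually used. With real data, `X` would hold one spectrum per row and `k` would be 12 or 33.

```python
import numpy as np

def pca_scores(X, variance=0.999):
    """Project rows of X onto the fewest principal components covering `variance`."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    n = int(np.searchsorted(explained, variance) + 1)
    return U[:, :n] * s[:n]

def kmeans(X, k, n_iter=100):
    """Minimal Lloyd's algorithm; returns one cluster index per row of X."""
    # Deterministic farthest-point initialisation (an illustrative choice).
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels
```

Running the clustering once on the raw spectra and once on `pca_scores(X)` gives the "with and without PCA" comparison; restricting the columns of `X` to a wavelength range before either step gives the per-sensor variants.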

Results and Discussion
First, we present the material classification scheme that we created to suit a more refined categorization and an overview of the material classes and subclasses that are available in our spectral library, KLUM (Section 4.1). Then, we discuss the spectral comparison made between KLUM samples and samples from ASTER and LUMA-SLUM (Section 4.2). This is followed by the intra-class evaluation (Section 4.3). Lastly, the signal clipping, the used spectral range and the used spectral measures are discussed (Section 4.4).

Material Samples
181 material samples were successfully processed and clustered into classes and subclasses, thus creating KLUM. We were able to distinguish and cluster 12 common urban materials and 33 subclasses from the 181 material samples, presented in Table 2. KLUM consists of 97 facade, 46 ground, and 38 roof material samples. Some of the collected material classes had enough samples to generate several subclasses. 23% of the samples suffered from signal clipping in one or two FieldSpec sensors. In the end, 17% of the samples had parts of the spectrum removed due to not passing the quality measure. Furthermore, 5% of the samples had their spectra removed in the spectral range of sensor SWIR2 due to noise.
As seen in Table 2, most of the material classes consist of several subclasses. The three largest material classes are Ceramic with 45 samples, followed by Concrete with 38 and Granite with 16. We chose descriptive subclass names based on the hierarchical material classification scheme, as seen in Table 3. The scheme we chose is based on the schemes of Kotthaus et al. [25] and Herold et al. [24], where the scheme consists of several levels of descriptive information. This material classification scheme offers a more refined and detailed description of each material class as it defines the usage (facade, ground or roof), the color, the surface structure, texture and coating (e.g., reflective or matte) in addition to the status of the material (new or weathered). This makes it possible to split off and select a refined material subclass, such as Painted concrete, and to perform material classification for specific cases, which was the main purpose of the clustering. To keep the color description simple, we decided to exclude the description of hue, saturation and brightness and to assign only one color per material sample (the most dominant color).
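To make the levels of the scheme concrete, the following sketch models one sample record and a simple selection by metadata. The field names, the sample identifiers and the example values are hypothetical and do not reproduce the actual KLUM metadata files; only the levels themselves (usage, color, surface, status) follow the description above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MaterialSample:
    sample_id: str       # hypothetical identifier, not a real KLUM sample
    material_class: str  # e.g. "Concrete"
    subclass: str        # e.g. "Painted concrete"
    usage: str           # "facade", "ground" or "roof"
    color: str           # single, most dominant color
    surface: str         # structure/texture/coating, e.g. "matte"
    status: str          # "new" or "weathered"

def select(samples, **criteria):
    """Return the samples whose fields match every given criterion."""
    return [s for s in samples
            if all(getattr(s, key) == value for key, value in criteria.items())]

# Illustrative records only.
samples = [
    MaterialSample("X001", "Concrete", "Painted concrete", "facade", "white", "painted", "new"),
    MaterialSample("X002", "Concrete", "Grey concrete", "ground", "grey", "matte", "weathered"),
    MaterialSample("X003", "Wood", "Painted wood", "facade", "red", "painted", "weathered"),
]
painted_facades = select(samples, usage="facade", surface="painted")
```

Selecting on `surface` rather than on `material_class` is what allows painted samples of different materials to be grouped for a specific classification task.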
A brief explanation about the materials in Table 2 is presented in the following subsections to provide essential context. The material classes Ceramic, Concrete and Wood are presented with some examples from their corresponding subclasses. The remaining material classes and the detailed descriptions are provided in Appendix A and the full list of all material samples and the metadata are given in Appendix B.

Ceramic
Ceramic is the material class with the most subclasses and samples: seven subclasses and 45 samples. The majority of these samples were acquired at a local building supplier, which made it possible to study the impact of the material color and surface structure/texture/coating. This can be seen in Figure 5, where four samples with different colors from three subclasses are visualized: Dark reflective ceramic (sample D104), Glazed ceramic (roof tiles) (samples D211 and D212) and Grey ceramic (sample D604). By analyzing the visualized spectral reflectance, it is noticeable that the color difference can be seen not only in the spectral range of sensor VNIR, but also in the spectral range of sensor SWIR1. The spectral reflectance of the same material differs significantly here in the studied spectral range but displays a spectral similarity in the spectral range of sensor SWIR2. Thus, this example showcases the importance of proper metadata descriptions in spectral libraries, including a description of the material surface by color. Furthermore, this demonstrates that it is crucial to be cautious while using spectral libraries as there can be a significant spectral difference for one material in this spectral range.

Concrete
The material class Concrete is the second largest class with six subclasses consisting of 38 samples. We present three subclasses in Figure 6: Bright concrete (sample E301), Grey concrete (sample E407) and Painted concrete (sample E508). The two samples E301 and E407 have similar characteristic features but with a slightly different spectral feature in the spectral range of 1400 nm and onward. Sample E508 from the subclass Painted concrete has a distinguishing feature in the spectral range of sensors SWIR1 and SWIR2, where its reflectance decreases. This phenomenon is also noticeable for the subclasses with the same surface structure/texture/coating (e.g., Painted metal and Painted wood). This highlights the fact that painted surfaces can be distinguished due to their spectral features in this spectral range.

Wood
We have categorized the material class Wood into two subclasses: Varnished wood and Painted wood. The material class consists of seven samples. One sample from each subclass and the corresponding spectral reflectance can be seen in Figure 7. As one sample has been painted (sample L102) and one has been varnished (sample L002), the spectral reflectance is noticeably different due to the surface coatings. As discussed, the distinguishing feature that appears for painted surfaces can once again be seen here: a decreasing reflectance in the spectral range of sensors SWIR1 and SWIR2. Since these two subclasses have completely different characteristic features in this spectral range, it would be impossible to classify them both as Wood from their spectra alone. However, as the material composition is the same and the surface coating is the only difference, they are categorized as the same material.

Spectral Similarity with Existing Spectral Libraries
With the use of the two spectral libraries, ASTER and LUMA-SLUM, we could in some cases confirm our sample label assignments and in other cases receive a hint that could guide us. However, one material can have different colors and different surface structures, coatings, and textures. This proved a challenge since some of the samples could not be successfully matched with a material label from the two spectral libraries because those samples fit neither the color description nor the structure description. This makes it clear that spectral libraries in this spectral range should include additional information in the metadata files. Furthermore, some label assignments could not be confirmed as neither ASTER nor LUMA-SLUM contained similar samples (such as Plaster). For those cases, we had to rely on our field observations and the photos.
As we calculated the spectral measures for our material spectra using samples from ASTER and LUMA-SLUM as reference spectra, we noticed that the spectra from various materials are often very similar in the studied spectral range. This can be seen in Figure 8, where we visualize the five samples from ASTER and LUMA-SLUM that are the most similar to our class Grey asphalt according to the modified SGA* measure in the spectral range of sensors SWIR1 and SWIR2. Here, we received a similar modified SGA* score for four different materials: Limestone, Ceramic, Asphalt and Cement. As the measures give us similar values, it complicates the procedure to compare and assign proper material labels to KLUM's samples. This indicates that different materials have similar characteristic features in this spectral range. Thus, longer wavelengths should preferably be used for material classification as it would be possible to determine the material with specific surface coatings (e.g., paint). However, it is not always possible to access such equipment and it is, therefore, important that spectral libraries that contain samples in this spectral range are handled with care in applications such as material classification and label assignment.

Intra-Class Assessment
To evaluate the spectral intra-class similarity on both material class and subclass level, we employed k-means and t-SNE. We carried out each assessment with and without PCA for the different spectral ranges. The final assessment consisted of determining the average intra-class standard deviation within each class and subclass.
By first analyzing the spectral similarity between the 12 generated clusters using k-means and t-SNE, we find that neither can completely distinguish and separate the classes in the same clustering formation as KLUM. Figure 9 displays the k-means clusters for the different spectral ranges. The color displays the percentage frequency distribution of the assigned material labels for the k-means clusters. It is apparent that there are two clusters that contain samples from almost every class. Furthermore, it appears that the spectral range of sensor SWIR2 can distinguish unique material clusters (the material classes Plaster and Granite). This highlights that the spectral range of sensor SWIR2 does provide unique spectral features. The spectral similarity between the classes is also supported by the t-SNE distribution, as seen in Figure 10, which displays the distribution in the spectral range of sensor SWIR2 using PCA. Once again, there are few clearly distinguishable clusters. Instead, most of the samples are clustered together.
Figure 9. Visualization of k-means clustering and its distribution among the 12 material classes without using PCA. The x-axis represents our assigned material labels and the y-axis the generated k-means clusters. The color displays the distribution of the assigned material labels in the k-means clusters. Subfigure (a): Full spectral range. Subfigure (b): The spectral range of sensors SWIR1 and SWIR2. Subfigure (c): The spectral range of sensor SWIR2. It is here possible to observe that we do not have 12 distinguishable material clusters since the k-means clusters consist of samples from several classes. We can receive more material-specific clusters using the spectral range of sensor SWIR2.
Figure 10. Visualization of the t-SNE distribution in the spectral range of sensor SWIR2 among the 12 material classes using PCA. The material classes are not easily distinguishable as we do not receive any prominent clusters. There are a few smaller clusters within some of the classes, such as the small cluster to the right (the material class Ceramic), which can indicate that there are distinguishable subclasses.
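The kind of label-versus-cluster distribution shown in the k-means figures can be tabulated as in the sketch below. The normalisation is an assumption: here each column (one assigned material label) sums to 100%, so a cell gives the share of that label's samples that fell into a given k-means cluster. The function name and the toy labels are illustrative.

```python
import numpy as np

def label_distribution(class_labels, cluster_ids):
    """Percentage of each assigned label's samples per k-means cluster.

    Rows are clusters, columns are assigned labels (both sorted); each
    column sums to 100. The per-label normalisation is an assumption.
    """
    classes = sorted(set(class_labels))
    clusters = sorted(set(cluster_ids))
    table = np.zeros((len(clusters), len(classes)))
    for label, cluster in zip(class_labels, cluster_ids):
        table[clusters.index(cluster), classes.index(label)] += 1
    return classes, clusters, 100 * table / table.sum(axis=0, keepdims=True)
```

A class that is well separated shows a single cell near 100% in its column, whereas a row with many non-zero cells corresponds to a cluster that mixes samples from several classes.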
Secondly, by analyzing the spectral similarity between the 33 subclasses, it appears that the 33 generated k-means clusters are more similar to KLUM's subclass formation than the 12 generated clusters are to the main classes. The visualizations in Figure 11 display the 33 k-means clusters with and without PCA in the spectral range of sensor SWIR2, and the color represents the percentage frequency distribution of the assigned material labels in the k-means clusters, as in Figure 9. We can here observe that the subclasses are more distinct as we receive several clusters representing only one subclass. By comparing the two figures, it appears that there is not a significant difference between using PCA or not. In both cases, we receive some larger k-means clusters that include samples from several subclasses.
Figure 11. Visualization of k-means clustering and its distribution among the 33 material subclasses with and without PCA in the spectral range of sensor SWIR2. The x-axis represents our assigned material labels and the y-axis the generated k-means clusters. The color represents the distribution of the assigned material labels in the k-means clusters.
The final assessment, which consisted of determining the intra-class and intra-subclass similarity by using the average standard deviation, can be seen in Figure 12. We can here observe that most classes and subclasses have similar spectral features since the spectral variation is, for most classes and subclasses, less than 5%. The main classes Plaster and Wood appear to have the largest spectral variation, which is also observed in the corresponding subclasses. Bright plaster contains five samples with various colors and the difference can be observed here (larger spectral variation). Varnished wood, on the other hand, contains only two samples, and even though they have similar spectral features, the color difference is apparent. Overall, this assessment indicates that the spectral features within the classes and subclasses are similar.
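One way to compute such an average intra-class standard deviation is sketched below. It assumes reflectance is stored as a fraction (so 0.05 corresponds to 5%), that the standard deviation is taken per wavelength band across the samples of a class and then averaged over bands, and that the population form (`ddof=0`) is used. None of these choices is confirmed by the text, and the function name is illustrative.

```python
import numpy as np

def intra_class_std(spectra, labels):
    """Average over bands of the per-band standard deviation, per class.

    `spectra` has one sample per row; `labels` gives the class or subclass
    of each row. Uses the population standard deviation (an assumption).
    """
    labels = np.asarray(labels)
    result = {}
    for name in sorted(set(labels.tolist())):
        members = spectra[labels == name]
        result[name] = float(members.std(axis=0).mean())
    return result
```

A class with a single sample always yields 0, and classes with very few samples, such as a subclass with only two members, give an estimate that is sensitive to each individual spectrum.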

Discussion
The reference spectrum was clipped for around 23% of the samples. While analyzing the data, we discovered that the most common reason for the signal clipping was a bright surrounding environment, e.g., when the color of the material sample was white. The cause appears to be an instrumental malfunction of FieldSpec. As the signal clipping appeared at different data ranges for the three sensors, it is challenging to regulate this malfunction during acquisition. Signal clipping occurred mostly in the spectral range of sensor SWIR2. Furthermore, the consecutively acquired samples that were used for recalculating the reference spectra were often acquired at locations with surroundings that were not similar enough to the original location and thus, only 26% of the recalculated reference spectra passed the quality assessment.
As we analyzed and clustered the material samples, it became clear that the material color impacts the clustering outcome when we relied on the full spectral range. The material color impacts the clustering since the spectral range of 350-1400 nm covers about 48% of the total observed wavelength range. We therefore decided to ignore this spectral range and to rely more on the spectral range of sensor SWIR2. This was also used when we compared and matched our material samples with the spectral libraries ASTER and LUMA-SLUM. There are studies that suggest that it is more suitable to work with the SWIR spectral range for material classification [30,46]. However, our analysis suggests that it is more feasible to exclude the spectral range corresponding to sensor SWIR1.
For our dataset, we preferred the modified SGA* measure as it provided us with the most reliable label assessments, which we discovered while comparing KLUM's spectra with spectra from ASTER and LUMA-SLUM. The modified SGA* considers the spectral gradient and distinguishes positive from negative derivatives, and thus, SGA* is suitable for identifying spectra with similar spectral features. SA and SID, on the other hand, focus on the angular difference, which in our experience leads to worse material label assignments since the angular difference does not differ significantly for building materials of similar composition (such as asphalt and concrete). Therefore, spectral measures should be chosen carefully, based on the type of hyperspectral data that will be classified.

Conclusions
This work presents a spectral library of building materials with a focus on facade materials, covering the VNIR-SWIR spectral range. The spectral library contains spectra from 181 samples consisting of 12 clustered material classes and 33 clustered material subclasses that were collected in situ in the southwestern German city of Karlsruhe. KLUM consists of 97 facade, 46 ground and 38 roof material samples. KLUM is, at the time of its publication, the publicly available spectral library with the most facade material samples. The samples, their metadata (based on a hierarchical classification scheme), and photos are all available in the publicly available spectral library KLUM (https://github.com/rebeccailehag/KLUM_library).
A processing flow for the acquired samples was developed, which included intra-set solar irradiance correction and recalculation of clipped reference spectra. The material samples were clustered using the spectral measures SA, SID and the modified version of SGA* to provide classes and subclasses with more than one material sample. The material clusters were then labeled and compared to samples from the spectral libraries LUMA-SLUM and ASTER, using the same measures in addition to our expert knowledge and photos. However, as spectral libraries have not, until now, had a focus on building facades, there is an under-representation of facade samples.
Our spectral library is one of the first that has clustered material samples into subclasses with different surface conditions (e.g., color and coating) and studied their impact. As discussed and seen in some examples (e.g., Figures 5 and 7), the spectral characteristic features for one material can differ significantly in this spectral range due to color or surface structure/texture/coating. Because of the varied spectral reflectance, it can be challenging to classify the samples into the same material class. Thus, we can conclude that spectral libraries with building materials should provide additional metadata about the acquired samples to address this challenge properly. Additionally, this implies that this spectral range is limited for urban material classification when dealing with different material colors and surface structures/textures/coatings, and that longer wavelengths should be preferred.
Since urban materials are diverse and come in different colors and surface conditions, it is not possible to cover the wide range of material samples in this spectral library. Furthermore, our spectral library only covers commonly used building materials found in southern Germany (central Europe), which is just a small fraction of all existing urban materials. Thus, more studies are needed to comprehend the complex diversity of building materials. This could include studies that focus either on how the characteristic features alter throughout a day depending on the solar angle or on the different features of one particular material and its various surface coatings and colors. Furthermore, a study focusing on using surface texture analysis (generated from a high-resolution photo) to distinguish the material classes could be of interest since it is not always possible to have access to hyperspectral data.
We acknowledge support by the KIT-Publication Fund of the Karlsruhe Institute of Technology.

Conflicts of Interest:
The authors declare no conflict of interest.
exemplified throughout all samples. Here, we had to rely more on the photos we had taken since the spectral features often resembled the material classes Asphalt and Concrete (due to the sediment).
Figure A3. Visualization of the average subclass spectrum of the main class Conglomerate.

Appendix B. KLUM Material Samples
The 181 material samples and their metadata that are included in the spectral library KLUM are here presented in Tables A1, A2 and A3. There are 12 common urban materials and 33 subclasses.