Cluster Analysis of IR Thermography Data for Differentiating Glass Types in Historical Leaded-Glass Windows

Abstract: Infrared thermography is a fast, non-destructive and contactless testing technique which is increasingly used in heritage science. The aim of this study was to assess the ability of infrared thermography, in combination with a data clustering approach, to differentiate between the different types of historical glass that were included in a colorless leaded-glass window during previous restoration interventions. Inspection of the thermograms and the application of two data mining techniques to the thermal data, i.e., k-means clustering and hierarchical clustering, allowed the identification of different groups of window panes that show a different thermal behavior. Both clustering approaches arrive at similar groupings of the glass with a clear separation of three types. However, the lead cames that hold the glass panes appear to have a substantial impact on the thermal behavior of the surrounding glass, thus preventing classification of the smallest glass panes. For the larger panes, this was not a critical issue as the center of the glass remained unaffected. Subtle visual color differences between panes, implying a variation in coloring metal ions, were not always distinguished by IRT. Nevertheless, data clustering assisted infrared thermography shows potential as an efficient and swift method for documenting the material intervention history of leaded-glass windows during or in preparation of conservation treatments.


Introduction
Infrared thermography (IRT) studies the infrared photothermal emission (0.9-14 µm) of a sample with an infrared sensor. This signal depends on the optical and thermal properties of the sample. This is different from other IR techniques such as IR reflectometry and hyperspectral imaging, where the measured radiation is the emission from a secondary light source that is being reflected or transmitted by the sample. A further distinction can be made between passive and active thermography. Passive thermography studies the natural photothermal emission of a sample in thermal equilibrium. This type of measurement is mainly used to pinpoint thermal anomalies and is limited in its application range [1]. Active thermography, on the other hand, studies the photothermal response of a sample after it has been thermally excited (during heating or cooling) by an external influence. This allows for a more quantitative evaluation of the sample by processing the recorded image data with various techniques, such as Fourier transforms [1].
IRT was initially developed as an inspection method for industrial applications [1], but was soon introduced in the conservation science field as it is a non-destructive and contactless inspection technique. In this area, IRT has distinguished itself as a reliable and sensitive standoff method for defectoscopy and structure studies in works of art [2][3][4]. More particularly, IRT has been used for detecting various types of defects in wood and panel paintings [5,6], frescoes and mosaics [7], historical buildings [8], and ceramics [4]. In addition, it is employed to visualize hidden aspects of artworks such as painted representations [2,3,9], the wood grain in panel painting [3,9] and cusping of canvas [9,10].
The advantage of IRT in comparison with other non-destructive diagnostic techniques commonly employed in heritage science, such as macro X-ray fluorescence (MA-XRF) [11,12], hyperspectral imaging [13,14] and Raman spectroscopy [15,16], lies in its ability to image a large area in a short amount of time. However, a notable limitation is the fact that different materials can exhibit similar thermal behavior and thus will be indistinguishable from each other. In addition, since heat spreads in a diffusive manner, the thermal behavior of each element is influenced by its surroundings, so the same material can show different thermal behaviors in different surroundings. Thermal diffusion also results in blurry images at longer measurement times. Lastly, in practice, the heating of fragile heritage objects can be prohibited, or it can prove challenging to heat heritage objects uniformly in order to reduce noise, as these are often heterogeneous in composition and/or irregular in volume.
Until now, relatively few studies reported the use of IRT for inspecting stained glass windows. Existing publications focus on the adhesion of glass paints [4,17], manufacturing defects [18] or discuss the comparative thermal behavior of the various materials present, such as enamels, grisailles, lead cames, and welds [18][19][20]. The reported methods for data analysis are visual inspections of the thermograms [4,17-20], temperature plots [17,19,20] and principal component analysis (PCA) [18]. Higher order statistical analysis of the thermograms has also been used successfully on other historical glasses to detect degradation due to ageing [21,22]. Nevertheless, IRT holds two particular limitations for glass characterization. First, thermal excitation of glass is relatively inefficient as radiation sources commonly applied for IRT emit radiation in the range for which glass is largely transparent. Silica glass is mostly transparent from 0.2 µm up to 3.5-4.0 µm and can be considered opaque for wavelengths longer than 4.0 µm [23]. An alternative would be to use hot air but in this case heating will be less uniform and this is impractical for measuring larger objects. Second, care should be taken with additional sources of radiation, external to the setup, such as the sun or other (moving) warm objects or bodies in the room. Intensity fluctuations of these unintentional sources during the measurements can readily introduce noise in the thermography measurements as glasses exhibit high reflectance at incident angles above 45°. This would mainly be an issue for in-situ measurements [24,25].
The goal of this study was to assess, by means of a case study (see Experimental section), the potential of IRT to differentiate compositional types of glass panes in historical leaded-glass windows, which were typically subject to several conservation campaigns. Discriminating these glass types allows recognizing non-original glass panes and estimating the number of interventions that have occurred in the past, which is of key importance for defining the appropriate conservation strategy. Traditionally, the most prevalent methods for this type of investigation are XRF and scanning electron microscopy equipped with an energy dispersive X-ray detector (SEM-EDS) [26][27][28]. These techniques require destructive sampling and are labor-intensive, and are therefore not carried out on a routine basis. In order to find different groupings of glass, the thermography data was processed using k-means clustering and hierarchical clustering algorithms. Both are unsupervised learning techniques that are used to partition data into clusters. K-means clustering groups all observations into a predetermined number of clusters, k. Hierarchical clustering, on the other hand, groups the observations hierarchically into successively larger clusters until they all belong to the same all-encompassing cluster. Hierarchical clustering was applied on Raman data to classify the glass panes in colored, stained glass windows, based on their chemical properties [16], but not yet on thermography data. In the broader field of heritage science, only Yousefi et al. reported k-means clustering on thermography data to detect defects on a number of heritage artifacts (i.e., a statue, a fresco, and a painting on canvas) [29].

The Window
The investigated window panel (Figure 1) is a historical leaded-glass panel composed of colorless panes and preserved in its original wooden casing. The window recently underwent a conservation-restoration treatment during which missing panes and gaps were filled with historical, recuperated, similar-looking glass while broken pieces were bonded with an epoxy resin. The result is a homogeneous-looking window containing only historical glass, but supposedly from different periods. As illustrated in Figure 1, the interior side of the glass panes shows scratches and is spotted with a relatively uniform pattern of minute, opaque, white spots, typical for glass corrosion. Some panes exhibit a slightly more pronounced greenish hue than others.

Experimental Set-Up and Instrumentation
As illustrated by Figure 2, the IRT experiments were performed with the interior side of the window facing the camera. Heating was done either in transmission or reflection mode by moving the light source while the rest of the setup stayed in place. In transmission mode, the exterior side of the window is illuminated, while the interior side is illuminated in reflection mode. In particular, the sample was heated by two 500 W halogen lamps, positioned at equal distances and under the same angle with respect to the camera-object axis. The lamps were placed at a distance of 0.80 m, measured from the sides of the window frame, and the camera was positioned at 0.80 m from the center of the window, as shown in Figure 2. Typical emission from a halogen light source ranges from 0.4 µm to 3.5 µm with the peak wavelength around 1 µm. These measurements were performed in laboratory conditions using a FLIR X6540sc actively cooled thermal camera, which has a thermal sensitivity of <25 mK (typically 18 mK), a resolution of 640 × 512 pixels and spectral sensitivity in the 1.5-5.5 µm range. Measurements were carried out without filter and using an L1009 25 mm f = 2 lens, which has a spectral range of 2.5-5 µm and an Angular Field of View (AFOV) of 22° × 17°. The window was heated for 2 min, while the cooling-down process was recorded for 3 min at a frame rate of 10 Hz. Increasing the heating time or measuring for longer during cooling did not result in a significant improvement of the results. While optimizing the parameters, we strove to minimize the temperature variation of the glass. Because the coefficient of expansion of glass is different from that of the other window materials, varying the temperature can potentially generate stress in the window pane. Generally, glass can endure heating up quite well, but cooling down too fast could lead to thermal fractures. The epoxy resin employed for mending broken glass is particularly vulnerable to this.
Ideally, leaded-glass windows are kept at 18-20°C with a maximum temperature variation of 3°C in 24 h [30]. During our measurements the apparent surface temperature of the glass varied by 4°C on average. The lead cames appear to cause a temperature gradient in the glass panes, which introduces mechanical stress and bending.
In order to obtain images with a higher spatial resolution, the upper and lower part of the window panel were imaged separately. The resulting images have one row of glass panes in common by way of overlap. The upper and lower image sets were processed separately. An area of roughly 0.5 mm × 0.5 mm is covered by each pixel.
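The quoted pixel footprint can be cross-checked from the camera geometry given above. The following is a minimal sketch, assuming a simple pinhole model, a flat window perpendicular to the optical axis, and that the stated 0.80 m is the working distance for both image sets; the function name `pixel_footprint_mm` is hypothetical and not part of the original workflow.

```python
import math

def pixel_footprint_mm(distance_m, afov_deg, n_pixels):
    """Approximate size of one pixel on the object plane, in millimetres.

    Assumes a pinhole camera looking perpendicularly at a flat object.
    """
    # Extent of the scene covered by the full field of view at this distance.
    scene_m = 2.0 * distance_m * math.tan(math.radians(afov_deg) / 2.0)
    return 1000.0 * scene_m / n_pixels

# Values taken from the set-up description: 0.80 m, 22° × 17°, 640 × 512 pixels.
horizontal_mm = pixel_footprint_mm(0.80, 22.0, 640)
vertical_mm = pixel_footprint_mm(0.80, 17.0, 512)
print(f"{horizontal_mm:.2f} mm x {vertical_mm:.2f} mm")
```

Under these assumptions both values come out slightly below 0.5 mm, in line with the rough figure stated in the text.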

Processing
The measurement output is a three-dimensional (nx × ny × N) data cube, with nx and ny the number of vertical and horizontal pixel locations and N = 1800 the number of time steps (3 min recorded at 10 Hz). The relative temperatures are recorded as 14-bit grayscale values. Before thermal excitation of the sample, several cold frames are recorded. These frames are averaged in order to reduce the time-dependent noise, and this average is subtracted from each frame in the data cube. As a result, the data being evaluated are temperature differences with respect to the average cold frame. For display purposes, these values are normalized from 0 to 1 in each image or over the entire data cube. The measurement data is processed with clustering algorithms in order to obtain easy-to-interpret results that can assist with identifying the different glass types. Two different clustering methods are used, k-means clustering and hierarchical clustering, and their results are compared.
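The cold-frame correction and display normalization described above can be sketched as follows. This is an illustrative NumPy version, not the processing code used in the study; the function names, the number of cold frames (`n_cold`) and the small synthetic cube are assumptions made for the example.

```python
import numpy as np

def subtract_cold_frame(cube, n_cold):
    """Return temperature differences relative to the averaged cold frames.

    cube   : array of shape (nx, ny, N) with raw camera values
    n_cold : number of leading frames recorded before heating (assumed)
    """
    cube = cube.astype(np.float64)
    # Averaging the cold frames suppresses time-dependent noise.
    cold_mean = cube[:, :, :n_cold].mean(axis=2, keepdims=True)
    return cube - cold_mean

def normalize_01(values):
    """Rescale values to [0, 1] for display, ignoring NaNs."""
    lo, hi = np.nanmin(values), np.nanmax(values)
    if hi == lo:
        return np.zeros_like(values, dtype=np.float64)
    return (values - lo) / (hi - lo)

# Synthetic stand-in for a 14-bit recording (8 × 6 pixels, 50 frames).
rng = np.random.default_rng(0)
raw = rng.integers(0, 2**14, size=(8, 6, 50))
diff = subtract_cold_frame(raw, n_cold=5)

frame_display = normalize_01(diff[:, :, 10])  # per-image normalization
cube_display = normalize_01(diff)             # normalization over the whole cube
```

Per-image normalization maximizes contrast within a single thermogram, whereas normalizing over the whole cube keeps grey values comparable between time steps.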
K-means clustering is a popular unsupervised learning technique for finding clusters in data. It aims to partition all observations/image pixels into a predetermined number of clusters, k. In the present case, each pixel is characterized by a feature vector consisting of the temperature difference reading as a function of (cool-down) time. In Figure 8 some typical feature vectors (cluster averages) are plotted. Each observation/pixel is assigned to the cluster whose mean is closest to that observation, according to some distance metric between their feature vectors. K-means clustering can be used on (hyperspectral) image data for segmentation and can therefore also be applied to thermographic datasets. Before clustering, a mask was manually defined that eliminates the pixels corresponding to the wooden frame, the lead cames and the areas near them. These pixels are replaced with NaN (not-a-number) values. This enabled clustering solely on the different glass types. Prior to the actual clustering, the data cube was reshaped into a two-dimensional ((nx · ny) × N) matrix, so that each pixel is an observation and the features are the temperature differences at different times during the cooling-down period. The MATLAB ® implementation of the k-means clustering algorithm with the squared Euclidean distance metric was employed, as Euclidean distance is often used for clustering time-series data [31]. In order to avoid local minima, which can occur as a result of the random initialization step, the clustering algorithm is repeated a few times and the result with the smallest loss is selected. The clustering results are visualized by color-coding each pixel according to the cluster it belongs to. The objective is to obtain segmented images that clearly place several glass panes or pieces of the glass panes in the same group, which likely indicates they are the same glass type.
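The masking, reshaping and repeated k-means steps can be illustrated with a short sketch. The study used the MATLAB implementation; the plain NumPy version of Lloyd's algorithm below is a stand-in written for illustration only, and the function names, the number of restarts and the synthetic two-pane cube are assumptions. It is meant for small arrays, since it builds a full distance table in memory.

```python
import numpy as np

def kmeans(X, k, n_restarts=5, n_iter=100, seed=0):
    """Lloyd's k-means with squared Euclidean distance and random restarts."""
    rng = np.random.default_rng(seed)
    best_labels, best_loss = None, np.inf
    for _ in range(n_restarts):
        # Random initialization: pick k distinct observations as centers.
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        # Final assignment and loss (sum of squared distances) for this run.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        loss = d2.min(axis=1).sum()
        if loss < best_loss:  # keep the run with the smallest loss
            best_labels, best_loss = labels, loss
    return best_labels, best_loss

def segment_cube(diff_cube, glass_mask, k):
    """Cluster the unmasked pixels; masked pixels become NaN in the label image."""
    nx, ny, n_steps = diff_cube.shape
    flat = diff_cube.reshape(nx * ny, n_steps)  # one row (observation) per pixel
    keep = glass_mask.reshape(nx * ny)
    labels, _ = kmeans(flat[keep], k)
    label_img = np.full(nx * ny, np.nan)
    label_img[keep] = labels
    return label_img.reshape(nx, ny)

# Synthetic example: two "panes" with different cool-down amplitudes.
t = np.linspace(0.0, 180.0, 60)
nx, ny = 10, 12
cube = np.empty((nx, ny, t.size))
cube[:, :6, :] = 4.0 * np.exp(-t / 60.0)
cube[:, 6:, :] = 2.0 * np.exp(-t / 60.0)
cube += np.random.default_rng(1).normal(0.0, 0.02, cube.shape)
mask = np.ones((nx, ny), dtype=bool)
mask[:, 5:7] = False  # pretend these columns are a lead came
label_img = segment_cube(cube, mask, k=2)
```

The resulting `label_img` can be displayed with any false-color map; the NaN entries correspond to the masked-out areas.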
While for k-means clustering it is necessary to make an assumption about the number of clusters in the data before running the algorithm, this is not the case for agglomerative hierarchical clustering. Here, individual observations or clusters are hierarchically linked into larger clusters, based on the distance between their feature vectors. The closest unlinked observations/clusters are linked to each other into a larger cluster, then the next ones and so on until all observations belong to a single all-encompassing cluster. The linkage method determines how the distance between a cluster and another cluster/observation is measured. We manually draw multiple regions of interest (ROIs) in the image, one for each small glass pane and two for the larger ones, and take the average temperature difference of each ROI for each time step. These ROI averages are the observations on which we perform hierarchical clustering. We use the MATLAB ® implementation of hierarchical clustering with the Euclidean distance metric to cluster the ROIs. Because our main goal is to discriminate between groups and the cluster hierarchy is of less importance, we use the complete linkage method. In practice, using a different linkage method did not have a large impact on our results. The history of the hierarchical clustering process is shown in the dendrograms in Figure 7. These dendrograms are used to identify the clusters.
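The ROI-based procedure can be sketched with SciPy's hierarchical clustering routines as a stand-in for the MATLAB implementation used in the study. The rectangular ROI format, the function names and the synthetic cube are assumptions for this example; the ROIs in the study were drawn manually and need not be rectangular.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def roi_mean_curves(diff_cube, rois):
    """Average the temperature difference over each ROI for every time step.

    rois : list of (row_start, row_stop, col_start, col_stop) tuples
           (hypothetical rectangular format)
    """
    return np.array([
        np.nanmean(diff_cube[r0:r1, c0:c1, :], axis=(0, 1))
        for (r0, r1, c0, c1) in rois
    ])

def cluster_rois(curves, n_clusters):
    """Complete-linkage clustering with Euclidean distance, cut into n_clusters."""
    Z = linkage(curves, method="complete", metric="euclidean")
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    return Z, labels

# Synthetic cube: the left half cools from a higher amplitude than the right half.
t = np.linspace(0.0, 180.0, 60)
cube = np.empty((12, 12, t.size))
cube[:, :6, :] = 4.0 * np.exp(-t / 60.0)
cube[:, 6:, :] = 2.0 * np.exp(-t / 60.0)
cube += np.random.default_rng(2).normal(0.0, 0.02, cube.shape)

rois = [(1, 5, 1, 5), (7, 11, 1, 5), (1, 5, 7, 11), (7, 11, 7, 11)]
curves = roi_mean_curves(cube, rois)
Z, labels = cluster_rois(curves, n_clusters=2)
```

The linkage matrix `Z` holds the merge history and can be passed to `scipy.cluster.hierarchy.dendrogram` for plotting; cutting at a different number of clusters only requires another call to `fcluster`.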
The two clustering algorithms group the observations in different ways, but in both cases it is at the discretion of the user to determine the number of clusters and to identify them.

Thermograms
By examining the thermograms at the start of the cool-down process (Figure 3, left column), we can clearly distinguish two different glass types. In particular, there are three glass panes that display a lighter grey value because they heat up faster than the other fragments. These panes have a slightly greener hue than the other panes (Figure 1), indicating the presence of chromophores. The added absorption by these chromophores might be the cause of the higher temperature of these glass panes. Palomar et al. also observe higher temperatures for colored glasses compared to colorless glass due to the presence of chromophores [20]. There is also a small triangular glass fragment in the top row that behaves differently from the surrounding glass, while it has the same visual color. After two minutes of cooling-down (Figure 3, right column) the same glass panes stand out in the thermograms but an additional type of glass appears to emerge. These four panes, indicated by red frames in Figure 3, appear to cool down faster and thus exhibit darker grey values than the other panes. In general, the thermal behavior of the glass panes is similar in transmission and reflection mode. Damage on the surface of the glass panes from scratches and glass corrosion is much more visible in the transmission mode measurement, as can be seen in Figure 4. Glass areas subjected to degradation seem to heat up more, as reported earlier by Melada et al. [22]. Although the impact is limited here, this can become problematic for windows with a higher degree of weathering. Therefore, it seems more favorable to employ a reflection mode measurement when IRT experiments are carried out with the specific purpose of distinguishing different types of glass. In addition, the contrast between the lead cames and the surrounding glass is less pronounced in transmission mode.
The lead cames appear to heat up much faster than the glass and they heavily impact the thermal behavior of the surrounding glass. We examine this effect for two line scans in Figure 5: the scan on the left covers a row of the larger glass panes, while the scan on the right covers the row of smaller panes near the top of the window. The larger glass panes in the lower half of the window have an area in the center that remains largely unaffected by the influence of the lead cames, but this is not the case for the smaller glass panes near the top of the window. As a result, it will be difficult to determine whether differences and/or similarities are actually caused by a difference in chemical composition of the glass or rather by the influence of the lead cames. The lead cames appear to cool down much faster than the other elements, and over time their influence on the surrounding glass diminishes. A narrow area next to the wooden frame also displays a noticeably lower temperature difference. This can be seen in the outer edges of the temperature plots (Figure 5) and the thermograms (Figure 3). We suspect this is caused by non-uniform heating due to the window frame casting a shadow on this area. Since the effects of glass degradation are more apparent in the transmission mode measurements, this data is deemed less reliable. Therefore, the reflection mode measurement data is used for clustering. While the lead cames appear to heavily influence the thermal behavior of the surrounding glass during the first minutes of cooling, it is not advisable to eliminate this part of the data because the thermal behavior at different times contains different information. Instead, during clustering certain areas are masked or ROIs are defined, which partially excludes these problematic areas from the data.
Figure 6 shows the resulting segmented images after applying k-means clustering to the thermal data cube of the reflection mode measurement, with, from left to right, a gradual increase in the number of clusters. When k = 2, the glass panes are distinctly separated into two groups, in a similar way as can be seen visually in the thermogram at the start of the cool-down process (see Figure 3). Aside from the smaller panes near the top, which are heavily influenced by the lead cames, no additional panes or distinct parts of the panes are segmented when k = 3. An additional group of glass panes seems to emerge at the right side of the window when k = 4 (indicated in cyan in Figure 6) and is distinctly separated when k = 5. This group corresponds to the glass panes that could be visually discerned as fragments with a faster cooling in the thermogram recorded two minutes after heating (Figure 3, 2 min of cooling). Since these panes are only segmented at higher values of k, this group differs less from the other panes than the group indicated in red does. The top left corner of one of these panes (indicated with an orange frame in Figure 6 at k = 5) appears to behave differently, as also seen in the thermograms. In the images of the lower region, the glass pane in the top left corner is separated from all other panes for k = 4 and k = 5 clusters (indicated in yellow). This pane does not get separated in the images of the upper region. The small piece of glass in the top row that behaves differently in the thermograms (Figure 3, start of cooling) does not seem to get segmented in the cluster analysis. These results look similar to what we found while examining the individual thermograms, with some exceptions for smaller pieces.
Figure 6. Results for k-means clustering on the reflection mode thermography dataset. The number of clusters increases from left to right. While the false colors in the upper and lower images match, in order to make the results easier to interpret, they do not represent the same clusters because the upper region and lower region datasets were processed separately. The grey areas were masked out.
Figure 7 shows the dendrograms as a result of hierarchical clustering on the averaged ROIs. Starting from each individual observation, the observations are linked together into larger clusters. These links are depicted by the horizontal lines in the dendrogram. The position of this line on the y-axis indicates the distance between the linked groups. Starting from the top of the dendrogram, and working towards the bottom, we can split up the formed clusters into successively smaller clusters. Determining the correct number of clusters is a somewhat arbitrary process. The corresponding ROIs are indicated by the red frames on the images on the right side of their respective dendrogram. When evaluating the dendrogram of the upper region, the first three clusters (red, magenta + green and red) seem to correspond to the same groupings of glass pieces we found with the other methods. Cutting the dendrogram into four clusters, i.e., splitting the magenta and green branches, places ROI 2 and ROI 3 in different clusters, but otherwise does not yield any additional insights. This is because the red cluster encompasses the smaller glass panes, which are heavily influenced by the surrounding lead cames. If we cut the dendrogram of the lower region into four clusters, we arrive at the same groupings of glass panes we found with the previous methods. ROI 34 is placed into a different cluster from the other ROIs that are part of this group. This is in accordance with what we see in the thermograms and k-means results. The ROIs from the glass pane in the top left corner (ROI 26 and ROI 27) are placed in a completely separate cluster from the other window panes in the bottom image, while in the top image this glass pane does seem to belong to a larger cluster of glass pieces (ROI 16 and ROI 17). Again, this looks similar to the segmentation results from k-means clustering. It should be noted that in both the upper and lower regions the distance between the red cluster and the other clusters is comparatively large. The distances between the other clusters are much smaller, indicating that they do not differ as much from each other as the red cluster does.
Figure 7. Dendrograms resulting from hierarchical clustering on the averaged ROIs. The corresponding ROIs are marked with a red frame in the images next to each dendrogram. Each dendrogram was split into multiple clusters; each cluster is indicated by a different color.

Clustering
In Figure 8 the glass pieces are color-coded according to the groups they likely belong to, based on our interpretation of the IRT measurements and clustering results. The average temperature difference over time of the numbered areas is shown in the graphs. While we could see more differences between the remaining non-coded glass pieces, we could not determine with a sufficient degree of certainty whether there are more groups. There are other differences which can be distinguished visually, but which cannot be detected by using thermography. In particular, the middle glass pane in the bottom row is clearly composed of two different glass types (Figure 1). Even when we performed k-means clustering solely on this specific glass pane, we could not segment the two pieces properly. A spectral-based chemical inspection was performed on this window with other non-invasive, mobile techniques such as MA-XRF, UV-Vis-NIR spectroscopy, etc. The results of this inspection are discussed in another paper [X-Ray spectrometry, submitted]. These more elaborate techniques come to a similar grouping as presented in this manuscript, though with more effort.

Conclusions
Infrared thermography was used to study a colorless leaded-glass window which was subjected to a conservation-restoration treatment. The window looks visually homogeneous and only contains historical glasses, but these glass panes are supposedly from different time periods and are therefore likely to have a different composition and different thermal properties. Our goal was to assess to what extent IRT in combination with a clustering approach can differentiate the different glass types. The window panel was measured both in reflection and transmission heating mode. The thermograms of both measurement modes showed similar results, but the transmission mode measurement was subject to more interference from corrosion deposits on the glass surface. The lead cames that hold the glass panes appear to have a significant impact on the thermal behavior of the surrounding glass, which can be problematic in the case of small glass panes (typically smallest dimension < 40 mm) because their entire surface is affected. There was also some non-uniformity in the heating of the glass panes at the edges of the window, caused by the wooden frame. By inspecting the thermograms and applying k-means and hierarchical clustering to the thermal data, we were able to differentiate between different groups of glass panes. Although the clustering approaches did not allow more types to be distinguished than a careful manual inspection of the individual thermograms, we feel that performing clustering on the thermal data cube provides us with clearer and more reliable results. This also automates part of the inspection process, which reduces the number of man-hours required and delivers faster results. Both clustering methods arrive at the same groups of glasses and clearly separate them. The strength of IRT lies in its compact set-up, fast measurement time, ability to measure a large area, and ease of use.
However, it cannot be ruled out that glasses with different chemical compositions are not distinguished by IRT due to similar thermal properties. This might explain the fact that some visual color differences between glass panes could not be distinguished by means of IRT. Differences in thermal behavior are also not always caused by a different chemical composition. A systematic experiment on glasses with known chemical composition, representative of the established historical groups, would be necessary to elucidate this matter. Nevertheless, IRT shows potential as a swift and easy-to-use method for documenting the material intervention history of leaded-glass windows during or in preparation of conservation treatments. The potential weaknesses of IRT could be covered by a multimodal (imaging) system that combines IRT with chemical imaging methods. In this way, IRT could be used for a first complete scan, while a more specific chemical technique is used to inspect the areas where IRT does not give conclusive results.