SPAER: Sparse Deep Convolutional Autoencoder Model to Extract Low Dimensional Imaging Biomarkers for Early Detection of Breast Cancer Using Dynamic Thermography

: Early diagnosis of breast cancer unequivocally improves the survival rate of patients and is crucial for disease treatment. With the current developments in infrared imaging, breast screening using dynamic thermography seems to be a great complementary method for clinical breast examination (CBE) prior to mammography. In this study, we propose a sparse deep convolutional autoencoder model named SPAER to extract low-dimensional deep thermomics to aid breast cancer diagnosis. The model receives multichannel, low-rank, approximated thermal bases as input images. SPAER provides a solution for high-dimensional deep learning features and selects the predominant basis matrix using matrix factorization techniques. The model has been evaluated using ﬁve state-of-the-art matrix factorization methods and 208 thermal breast cancer screening cases. The best accuracy was for non-negative matrix factorization (NMF)-SPAER + Clinical and NMF-SPAER for maintaining thermal heterogeneity, leading to ﬁnding symptomatic cases with accuracies of 78.2% (74.3–82.5%) and 77.7% (70.9–82.1%), respectively. SPAER showed signiﬁcant robustness when tested for additive Gaussian noise cases (3–20% noise), evaluated by the signal-to-noise ratio (SNR). The results suggest high performance of SPAER for preserveing thermal heterogeneity, and it can be used as a noninvasive in vivo tool aiding CBE in the early detection of breast cancer.


Introduction
Cancer is the foremost cause of death worldwide and in the United States. Based on the American Cancer Society and World Health Organization (WHO) reports, newly diagnosed cancer cases in 2021 accounted for an overall estimation of 608,570 cancer deaths, where breast cancer alone was 30% of the overall cases [1,2]. Despite better survival rates and developments in different imaging modalities for screening and therapy, breast cancer is the most fatal cancer among women (second-most common cancer) [1]. Early detection of breast cancer has a crucial role in disease treatment planning and patients' survival. This study proposes a deep learning-based model for dynamic thermography imaging for a clinical breast exam (CBE) before performing mammography. We hypothesize that deep learning features can track thermal heterogeneity in breast tissue, which may associate with the angiogenesis and vasodilation caused by cancer metabolism. learning features can track thermal heterogeneity in breast tissue, which may associate with the angiogenesis and vasodilation caused by cancer metabolism.
Since the 1960s, mammography has been a gold standard method for breast cancer screening. Studies reported some variabilities in the screening results with this modality, facing various clinical factors such as age, breast density, type of malignancy, and family history [3][4][5][6][7]. Fibrocystic breasts, hormone replacement therapy, and dense breasts undermine the strength of mammography imaging for accurate diagnosis (and screening) [8][9][10][11]. As an alternative, magnetic resonance imaging (MRI) is often used as it is relatively expensive and has lower specificity than mammography [12,13]. A CBE is performed by clinicians, which helps for detecting breast cancer with reasonable accuracy, but it is not usually employed alone [14,15]. The proposed system can be used in combination with CBE to increase the detection rate of breast cancer as a noninvasive low-cost tool.
Infrared imaging captures thermal radiation emitted from tissue within an 8-10 μm bandwidth. Skin's emissivity is close to black-body emissivity (0.98) [16,17], which can transfer thermal fluctuations in such a way as to be observed with an infrared camera. The main source of thermal radiation is blood circulation, which becomes more heterogeneous due to the vasodilation and angiogenesis of tumor-adjacent tissues in abnormal tissue [18,19]. Additionally, metabolic activity in malignant tissue correlating with nitric oxide and estrogen expression causes increased temperatures, which can be detected through the skin [6,[19][20][21][22]. In malignant neoplasia, the tumors' growth is dependent on angiogenic sprouts, such that without this vascular support, tumors cannot receive vital nutrients [23,24]. Angiogenesis also promotes breast cancer progression and metastasis [25] through the cessation of C-C chemokine ligand 2 (CCL2) inhibitors [26]. The vascular endothelial growth factor (VEGF) is an important angiogenic factor for growing vascularity and providing nutrients to cancer cells [27]. Several studies have tried to detect such metabolic activities through different imaging tools, such as Raman spectroscopy [28,29] and infrared imaging [30][31][32]. In this study, we focus on the application of infrared imaging as an indicator of cancer's presence (see Figure 1).  The application of infrared thermography in breast cancer screening involves computational methods to extract thermal heterogeneity. There are many methods for thermography that capture variance and reduce the dimensionality of thermal images [33][34][35][36][37][38][39][40][41][42][43][44][45][46][47][48][49][50][51]. In this study, we propose an infrared imaging system equipped with deep learning-based ther-Appl. Sci. 2021, 11, 3248 3 of 15 momics to assist the clinician in the early diagnosis of breast cancer combined with CBE. Figure 2 shows the workflow of the proposed method. We designed a sparse deep convolutional autoencoder (SPAER) to not only compress the dimensionality of the data but also perform embedding for low-rank matrix approximations for the thermal matrix. The proposed approach significantly aided in early diagnosis of breast cancer. The system showed significant robustness in preserving thermal patterns while facing additive Gaussian noise increasing from 3% to 20%. Moreover, this study performed a comparative assessment for the state-of-the-art matrix factorization techniques to generate multichannel inputs. In the next section, the proposed methodology is presented by describing details about the applied matrix factorization approaches and the SPAER and random forest models to perform the early diagnosis of breast cancer. The application of infrared thermography in breast cancer screening involves computational methods to extract thermal heterogeneity. There are many methods for thermography that capture variance and reduce the dimensionality of thermal images [33][34][35][36][37][38][39][40][41][42][43][44][45][46][47][48][49][50][51]. In this study, we propose an infrared imaging system equipped with deep learningbased thermomics to assist the clinician in the early diagnosis of breast cancer combined with CBE. Figure 2 shows the workflow of the proposed method. We designed a sparse deep convolutional autoencoder (SPAER) to not only compress the dimensionality of the data but also perform embedding for low-rank matrix approximations for the thermal matrix. The proposed approach significantly aided in early diagnosis of breast cancer. The system showed significant robustness in preserving thermal patterns while facing additive Gaussian noise increasing from 3% to 20%. Moreover, this study performed a comparative assessment for the state-of-the-art matrix factorization techniques to generate multichannel inputs. In the next section, the proposed methodology is presented by describing details about the applied matrix factorization approaches and the SPAER and random forest models to perform the early diagnosis of breast cancer. Figure 2. Scheme of the presented thermographic system to be used prior to mammography for the initial diagnosis of breast cancer along with clinical breast examination (CBE).

Low-Rank Matrix Approximation
Thermal variance during the short acquisition intervals provides an indication of thermal heterogeneity, which can be detected by different matrix factorization (MF) methods that are widely used in noninvasive evaluation [33][34][35][36][37][38][39][40][41][42][43][44][45][46][47], such as principal component analysis (PCA) in thermography, named as PCT [33,34], and non-negative matrix factorization (NMF) [36][37][38][39]. High dimensionality of thermography in various applications is still an open research challenge in the field. PCT decomposes the vectorized thermal images that are stacked in a matrix, called a heat matrix. It sorts the thermal base vectors Figure 2. Scheme of the presented thermographic system to be used prior to mammography for the initial diagnosis of breast cancer along with clinical breast examination (CBE).

Low-Rank Matrix Approximation
Thermal variance during the short acquisition intervals provides an indication of thermal heterogeneity, which can be detected by different matrix factorization (MF) methods that are widely used in noninvasive evaluation [33][34][35][36][37][38][39][40][41][42][43][44][45][46][47], such as principal component analysis (PCA) in thermography, named as PCT [33,34], and non-negative matrix factorization (NMF) [36][37][38][39]. High dimensionality of thermography in various applications is still an open research challenge in the field. PCT decomposes the vectorized thermal images that are stacked in a matrix, called a heat matrix. It sorts the thermal base vectors based on their variance. Some methods, such as fixed eigenvector analysis [36], incremental PCT (IPCT) [37], and candid covariance-free incremental principal component thermography (CCIPCT) [40], attempt to overcome the computational load by a fixed set of previously generated eigenvectors and a covariance-free approach, respectively. SparsePCT [41,42] (also known as sparse dictionary matrix decomposition [38]) adds regularization terms to the PCT to not only restrict the domain of solution, but to also make it nonlinear, which strengthens the important basis of improving the detection of defective patterns. NMF [35,39] adds constraints to the basis and coefficients and converts MF to a clustering problem, which helps in thermography to better evaluate the thermal abnormalities with no overlapping bases [43,44]. Semi-NMF and sparse NMF are used for infrared, noninvasive evaluation and extraction of the thermal patterns as well as for preliminary breast cancer screening, along with all the aforementioned methods for comparison [45][46][47]. Imaging features extracted from thermal images, called thermomics [47], are used to provide diagnostic solutions along with deep learning-based thermomics [48]. Deep learning approaches are also used to diagnose breast cancer using thermographic imaging [49,50].
Input data X is the heat matrix obtained by stacking the vectorized thermal images X ∈ R MN × τ . PCT decomposes the input heat matrix X to eigenvalue and eigenvector matrices, which are also known as coefficients and bases. The bases matrix represents the highest variance of thermal patterns, sorted in descending variance order corresponding to the coefficients. SparsePCT [41,42] adds penalty terms to the original PCT to limit the domain of the solutions, which leads to nonlinear behavior and a sparse outcome. NMF modifies the PCA's decomposition using non-negative constraints, whereas the bases and coefficients in PCA are allowed to be negative. X can be presented as a linear relation of τ bases B = β 1 , β 2 , . . . , β p , B ∈ R + MN × τ and A coefficient matrix A ∈ R + τ × τ , and X = BA s.t. A ≥ 0, B ≥ 0. The 1 norm penalty term, along with a regularization parameter, is added to NMF for compensating the bases' distinctiveness and improving their presentation, which is shown by the following equation: where B p represents the p norm of B, defined by p = ∑ d,m B d,m 1/p p , which is similar to the 1 penalty term [51,52] calculating B for convex A. Like SparsePCT, low-rank SparseNMF is obtained by selecting k bases, corresponding to the highest coefficients and representing the highest variance of the targeted thermal images (here, k = 3).

Sparse Latent Space Deep Thermomics
The high-dimensional throughput of thermal features, called thermomics [44][45][46][47][48], inspired from radiomics [53][54][55], are often referred to as quantitative feature extraction from medical images. Thermomics contain quantitative measurements of distinct thermal patterns that may aid in the diagnosis of diseases or assist in the course of treatment. Like radiomics (well known in radiographic medical imaging), thermomics have proven to have the capability to find the characteristics of some diseases that cannot be revealed by the naked eye [44][45][46][47][48]. These features are usually complimentary with clinical covariates to increase analytical power. Deep neural networks have been recently explored to improve conventional radiomics [48][49][50]. Computer-aided diagnosis systems often leverage having a sufficient number of features to capture the imaging characteristics, whereas the abundance of thermomics impedes the performance of the system and creates an overfitting problem, known as the curse of dimensionality [48].
Deep features are commonly extracted from the hidden weight layers of pre-trained models such as ResNet [56], ImageNet [57], and VGG [58]. This often leads to highdimensional throughput information and challenges in reducing the dimensionality, which often overfits the model. Conventional feature selection methods, which are efficient for traditional radiomics, might not be the best solution to reduce the dimensionality in such features due to a tight connection among the hidden weights, which can be interpreted as collinearity. The application of a sparse autoencoder for reducing the dimensionality of the pre-trained hidden layers of pre-trained deep learning models can be an acceptable solution for such problems due to training such dimensionality reduction models to be sensitive to slowly varying high dimensionality [48].
In this study, we propose a sparse autoencoder directly for the low-rank approximation of thermal sequences to extract the low-dimensional thermomics and use them to diagnose breast cancer. This surpasses the use of pre-trained models. The autoencoder comprises encoding and decoding pathways to compress the input data to lower dimensions hierarchically and expand the compressed data to regenerate the input image and strive to get as similar an output as possible. This measures the loss function and distance between the original input and the regenerated output, and the parameters of such a model are optimized through a stochastic gradient descent algorithm and strive to minimalize the differences between the input and the reconstructed image (otherwise known as reconstruction loss).
Let x ∈ R n × m be the input with the spatial dimensions of 512 × 512 × 3. The latent space dimension of η, η = F e (x) = a e (Wx + b e ) is the compressed representation of the input image, a e is the encoder activation, and W and b e are the weight and encoder bias matrices, respectively. y = G d (η) = a d W T η expands the latent features back to the original input spatial dimensions. y is the equivalent of x, and a d is the decoder's activation. A deep autoencoder composed of a multilayer model with several activations a (.) i , biases b (.) i , and weights W i and matrices with the goal of minimization {W i , b e i }: Following this optimization, the predominant noise invariant representative patterns of the thermal data are captured. (.) is our loss function for training our model, for which we use binary cross-entropy (BCE) as presented below: where p(y) presents the likelihood of predicting the label y for each point (form i = 1, . . . , F). The presenting value of the background versus the targeted class binarizes with 1 1+e −y , a Sigmoid function. To fit a training set through learning a dictionary through the sparse latent representation, an additional 1 of the latent space as an additional penalty term with a regularization parameter is added to the model as follows [59,60]: where each W i and η i are convex in the objective of this optimization. The aforementioned objective applies for W 1 and η 1 . Applying the following procedure, we extracted the low-dimensional deep thermomics and used them to train a random forest model, then classified the participants into symptomatic and healthy groups. Figure 3 shows the configuration of the SPAER network.
This study used 208 participants from the database for mastology research (DMR), acquired from patients of the Hospital Universitário Antônio Pedro (HUAP) of the Federal University Fluminense [30]. This study was approved by the ethical committee of the HUAP under registration number CAAE: 01042812.0.0000.5243 by the Brazilian Ministry of Health and met the criteria defined by the Declaration of Helsinki. The participants cohort was divided into healthy (without symptoms) and symptomatic (healthy with symptoms or cancer patients) groups with a median age of 60 years, while the youngest and the oldest participants were 20 and 120 years old, respectively. There were 57 African (27.4%), 72 Pardo (34.6%), 77 Caucasian (37%), 1 Mulatto (0.5%), and 1 indigenous (0.5%) woman in this dataset. Among the participants, 38 went through hormone replacement (18.3%), and 52 had diabetes in their family histories (25%). A forward-looking infrared (FLIR) SC620 model infrared camera (Wilsonville, OR) with a sensitivity range of <0.04 • C was used to perform infrared imaging acquisition with a spatial resolution of 640 × 480 pixels. It captured a range of [−40 • C, 500 • C] [15,30]. Table 1 summarizes the demography and clinical information of our study set.  This study used 208 participants from the database for mastology research (DMR), acquired from patients of the Hospital Universitário Antônio Pedro (HUAP) of the Federal University Fluminense [30]. This study was approved by the ethical committee of the HUAP under registration number CAAE: 01042812.0.0000.5243 by the Brazilian Ministry of Health and met the criteria defined by the Declaration of Helsinki. The participants cohort was divided into healthy (without symptoms) and symptomatic (healthy with symptoms or cancer patients) groups with a median age of 60 years, while the youngest and the oldest participants were 20 and 120 years old, respectively. There were 57 African (27.4%), 72 Pardo (34.6%), 77 Caucasian (37%), 1 Mulatto (0.5%), and 1 indigenous (0.5%) woman in this dataset. Among the participants, 38 went through hormone replacement (18.3%), and 52 had diabetes in their family histories (25%). A forward-looking infrared (FLIR) SC620 model infrared camera (Wilsonville, OR) with a sensitivity range of <0.04 °C was used to perform infrared imaging acquisition with a spatial resolution of 640 × 480 pixels. It captured a range of [−40 °C, 500 °C] [15,30]. Table 1 summarizes the demography and clinical information of our study set.

Results
Our proposed models were tested on 208 participants. The results of the system were compared to the ground truth to measure the accuracy of the system.

Results of Low-Rank Thermal Matrix Approximation
We calculated three predominant low-rank matrices from 23 baseline thermal imaging sequences using matrix factorization. Figure 4 shows the low-rank representation of thermal patterns captured by PCT, IPCT, NMF, sparse PCT, and sparse NMF for six examples of healthy and symptomatic participants. With this procedure, a region of interest (ROI) was identified for each participant for three bases. For every case in Figure 4, the targeted ROI was magnified to highlight the thermal patterns for the screening cases. These thermal patterns revealed the potential angiogenesis in thermal images, as there was more heterogeneity in the ROI of cancerous or healthy with symptoms cases (Figure 4iii-vi) than the healthy ROI (Figure 4i,ii). Figure 4iii,iv presents healthy participants with pain and changes in the nipple, but the mammography and biopsy results did not conclude the presence of cancer. In this analysis, thermal heterogeneity was detected for symptomatic patients using various low-rank matrix factorizations (see Figure 4a-e), whereas there were more homogeneous thermal patterns observed for asymptomatic participants (Figure 4a-e, rows i,ii).

Sparse Autoencoder and Deep Thermomics
Sixteen (16) deep thermomics were extracted from the proposed SPAER model from three-channel images with a spatial resolution of 512 × 512 as the input. The input channels were three predominant, low-rank thermal matrix approximations. The proposed deep autoencoder used in this study contained five convolutional layers (Figure 3) to reduce the dimensionality of the input from 786,432 to 32,768, where another five sparse, dense layers reduced the dimensionality of the imaging markers from 16,384 to 2048, 256, 64, and 16 deep thermomics. The SPAER model has 637,507 trainable parameters, which are trained using the Adam optimizer for 100 epochs for 3000 raw thermal images and validated for 788 baseline thermal images, with a sparsity 1 regularization value of 10 −5 in the dense layers and 16 batch sizes. Consequently, the SPAER model mitigated the dimensionality of the embedded low-rank thermal matrices slightly less than 50,000 times while striving to keep the characteristics of the thermal images intact. Appl. Sci. 2021, 11, x FOR PEER REVIEW 8 of 16

Classification Outcome
The extracted deep thermomics generated by the SPAER model were used to train a random forest model and distinguish symptomatic patients from asymptomatic participants. The accuracy of this model was evaluated by comparing the model's suggested cohort versus the ground truth, which was obtained by mammography and breast biopsy. We benchmarked the model using 16 extracted deep thermomics biomarkers and other clinical and demographic information (e.g., age, marital status, and family history)

Classification Outcome
The extracted deep thermomics generated by the SPAER model were used to train a random forest model and distinguish symptomatic patients from asymptomatic participants. The accuracy of this model was evaluated by comparing the model's suggested cohort versus the ground truth, which was obtained by mammography and breast biopsy. We benchmarked the model using 16 extracted deep thermomics biomarkers and other clinical and demographic information (e.g., age, marital status, and family history) through leave-one-out cross-validation. The accuracy of the maximal multivariate model yielded 78.2% (74.3-82.5%) for NMF deep thermomics and clinical information, which was challenged by NMF-SPAER, PCT-SPAER and sparse PCT-SPAER with clinical information, having accuracies of 77.7% (70.9-82.1%), 77.7% (73.3-82.1%), and 77.7% (72.3-81.6%), respectively. IPCT-SPAER, PCT-SPAER, and sparse PCT-SPAER showed considerably similar accuracies of diagnosis, yielding 85.4% (79.6-88.8%) and 86.4% (81.1-88.8%) with and without clinical information, respectively. IPCT-SPAER showed the lowest diagnostic accuracy of 74.3% (69.4-80.1%), 74.3% (70.4-79.1%), and 74.3% (69.9-79.6%), respectively. IPCT-SPAER's accuracy slightly increased to 76.2% (71.4-79.6%) with demographics. The minimal full multivariate model, which was for sparse NMF-SPAER with clinical information, showed an accuracy of 74.8% (70.9-77.2%), indicating lower performance, probably due to dual sparsity and non-negative matrix factorization with the models (see Table 2). Table 2. Results of diagnosis based on the classification of symptomatic versus asymptomatic patients with leave-one-out cross-validation. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated using the optimum point in the receiver operating characteristic (ROC) for the trained model (Table 2). Sparse NMF-SPAER + Clinical showed the maximum sensitivity (87.5%) among the other methods, where IPCT-SPAER + Clinical and Sparse PCT-SPAER at 86.3% were the second-best sensitivities among the methods. Clinical covariates alone showed the minimum sensitivity (73.8%) compared with the other approaches. The sparse NMF-SPAER (91.4%) and NMF-SPAER + Clinical (87.5%) showed the maximum specificities, while clinical alone showed the worst specificity of 70.3%. The sparsity in the model might be the key to having higher specificity and indicating which cases are not symptomatic. Similarly, sparse NMF-SPAER showed the maximum PPV (85.5%), while sparse NMF-SPAER + Clinical had the highest NPV among the rest of the algorithms (91.2%). The clinical covariate was the least significant for the PPV and NPV.

Classification Accuracy 2 (%) Sensitivity (%) Specificity (%) PPV (%) NPV (%) T-Test 3 t-Statistic,
A multivariate model consisting of clinical information and demographics alone provided an accuracy of 72.8% (70.4-75.3%). We also calculated the statistical difference of the maximal accuracy (NMF-SPAER + Clinical), with other approaches using a twotailed t-test (see Table 2). NMF-SPAER + Clinical showed considerable similarity to NMF-SPAER with clinical covariates (t-statistic = 1.1, p-value = 0.27) as well as sparse PCT-SPAER + Clinical (t-statistic = 0.2, p-value = 0.8) and PCT-SPAER+ Clinical (t-statistic = 0.6, p-value = 0.5), whereas the rest of the methods showed significant statistical differences. Figure 5 presents the ROC graphs of the baseline models with different embedding SPAER methods. The total computation of this study was performed in the Python programming language using the TensorFlow Python library [61,62].
rest of the algorithms (91.2%). The clinical covariate was the least significant for the PPV and NPV.
A multivariate model consisting of clinical information and demographics alone provided an accuracy of 72.8% (70.4-75.3%). We also calculated the statistical difference of the maximal accuracy (NMF-SPAER + Clinical), with other approaches using a two-tailed t-test (see Table 2). NMF-SPAER + Clinical showed considerable similarity to NMF-SPAER with clinical covariates (t-statistic = 1.1, p-value = 0.27) as well as sparse PCT-SPAER + Clinical (t-statistic = 0.2, p-value = 0.8) and PCT-SPAER+ Clinical (t-statistic = 0.6, p-value = 0.5), whereas the rest of the methods showed significant statistical differences. Figure 5 presents the ROC graphs of the baseline models with different embedding SPAER methods. The total computation of this study was performed in the Python programming language using the TensorFlow Python library [61,62]. All parameters, including the number of trees, depth, and random state, were optimized in a cross-validation loop through a greed search. This greed search not only guaranteed that we optimized the parameters, but it also confirmed that we did not overfit the parameters as we did it after leaving one subject out of the optimization. To improve the model's accuracy, we suggest increasing the number of patients to increase the learning and statistical power.

System's Robustness
Thermography is usually affected by noise, as one of the undeniable facts of using infrared imagery induced by environmental conditions surrounding the breast cancer screening exam. Robustness of the model versus noise can be vital for the system, particularly on the imaging acquisition and decision-making system. This problem is highlighted especially for our study cohort with different ages, races, family histories, and body shapes. The proposed SPAER model is presumably robust against noise because of the nature of the low-rank matrix approximation and latent space representation of the All parameters, including the number of trees, depth, and random state, were optimized in a cross-validation loop through a greed search. This greed search not only guaranteed that we optimized the parameters, but it also confirmed that we did not overfit the parameters as we did it after leaving one subject out of the optimization. To improve the model's accuracy, we suggest increasing the number of patients to increase the learning and statistical power.

System's Robustness
Thermography is usually affected by noise, as one of the undeniable facts of using infrared imagery induced by environmental conditions surrounding the breast cancer screening exam. Robustness of the model versus noise can be vital for the system, particularly on the imaging acquisition and decision-making system. This problem is highlighted especially for our study cohort with different ages, races, family histories, and body shapes. The proposed SPAER model is presumably robust against noise because of the nature of the low-rank matrix approximation and latent space representation of the input image. To show such stability in the SPAER model, we measured the signal-to-noise ratio (SNR) of the model with additive Gaussian noise in the multiple channels of input of the SPEAR model. We used the following SNR equation to measure the stability of the SPAER model (from [63]): where µ S and µ N are the signal and noise averages in the ROI, respectively, and σ N denotes the standard deviation of the noise, obtained from the thermal referencing point in the images. The low-rank matrix approximation techniques significantly eliminated the noise of the input data (see [37,46]). The reason we evaluated the SNR only for the multichannel input of the SPAER model was to have a separate assessment of this model. We included the additive Gaussian noise, incrementally increasing it from 3% to a 20% noise level for the multichannel input of the trained SPAER model and extracting deep thermomics, then reconstructing the image where the SNR was measured to evaluate the stability of the SPAER model. With that, we also measured the SNRs for the thermal matrix factorization techniques separately. Figure 6 presents the robustness of the model, divided into two parts: low-rank matrix approximation methods and the SPAER model. The results showed that sparse PCT and the SPAER model demonstrated the highest robustness due to additive penalty terms, inducing nonlinear behavior and limiting the domain of the solution, which increased the robustness of the model. This was also seen in sparse NMF, where nonnegative constraints may have weakened the robustness occurring in NMF, with the lowest robustness among the matrix factorization algorithms. IPCT showed more robustness than PCT. This might be because of the incremental adjusting of the bases in the algorithm. We also performed statistical analysis among matrix factorization methods using a t-test between sparse PCT and other approaches. The results showed a significant difference between sparse PCT and sparse NMF, IPCT, and PCT. It may be due to the dissimilarity between PCT and IPCT and the non-negative and sparsity constraints in sparse NMF. NMF and sparse PCT showed more similarity in SNR while the noise increased, as non-negative constraints also limited the domain of the solution, similar to the sparsity penalty term in sparse PCT, despite the lower stability of NMF.
where and are the signal and noise averages in the ROI, respectively, and denotes the standard deviation of the noise, obtained from the thermal referencing point in the images. The low-rank matrix approximation techniques significantly eliminated the noise of the input data (see [37,46]). The reason we evaluated the SNR only for the multichannel input of the SPAER model was to have a separate assessment of this model. We included the additive Gaussian noise, incrementally increasing it from 3% to a 20% noise level for the multichannel input of the trained SPAER model and extracting deep thermomics, then reconstructing the image where the SNR was measured to evaluate the stability of the SPAER model. With that, we also measured the SNRs for the thermal matrix factorization techniques separately. Figure 6 presents the robustness of the model, divided into two parts: low-rank matrix approximation methods and the SPAER model. The results showed that sparse PCT and the SPAER model demonstrated the highest robustness due to additive penalty terms, inducing nonlinear behavior and limiting the domain of the solution, which increased the robustness of the model. This was also seen in sparse NMF, where non-negative constraints may have weakened the robustness occurring in NMF, with the lowest robustness among the matrix factorization algorithms. IPCT showed more robustness than PCT. This might be because of the incremental adjusting of the bases in the algorithm. We also performed statistical analysis among matrix factorization methods using a t-test between sparse PCT and other approaches. The results showed a significant difference between sparse PCT and sparse NMF, IPCT, and PCT. It may be due to the dissimilarity between PCT and IPCT and the non-negative and sparsity constraints in sparse NMF. NMF and sparse PCT showed more similarity in SNR while the noise increased, as non-negative constraints also limited the domain of the solution, similar to the sparsity penalty term in sparse PCT, despite the lower stability of NMF.

Discussion
In this study, we introduced a sparse deep convolutional autoencoder called SPAER to generate deep thermomic features. It embedded three predominant thermal bases, obtained by low-rank matrix approximation in the form of a multichannel input matrix. We designed the SPAER model to be applicable to any imaging modality, but we tested it for dynamic thermography. The results indicated that SPAER could identify the potential symptomatic patients and aid in the early diagnosis of breast abnormality as a noninvasive, efficient, and fast technique to be used with CBE and before mammography.
SPAER potentially surpasses the use of pre-trained deep neural networks to extract high-dimensional imaging throughput, which leads to high collinearity among the features and overfitting the system, known as the curse of dimensionality. The results indicate that SPAER significantly reduced the noise throughout the process by having sparse regularizations in dense layers, which were tested with additive noise (Figure 6). Autoencoders are often used for noise elimination due to their nonlinearity. SPAER also behaved the same and showed a significant SNR when facing additive noise [64][65][66]. Low-rank matrix approximation used for the extraction of high variance bases also enhanced the stability of the system against noise [37,46]. This had the opposite effect on the accuracy of sparse NMF, and relatively Sparse PCT, as two-folds sparsity with non-negative constraints mitigating the outcome performance of the overall system. This may be due to reducing the domain of the solution having two penalty terms, which led to a slight decrease in the overall accuracy for sparse NMF-SPAER and even for sparse PCT-SPAER (Table 2).
The application of infrared thermography to establish early breast abnormality has been suggested previously [30,49]. This can be expanded for several nonneoplastic diseases and normal homeostatic processes [21,22]. Despite some discussions about its reliability as a single applied system for finding breast cancer [20], dynamic thermography showed promising outcomes [46][47][48], and this makes it a valuable complementary mechanism along with CBE for increasing the possibility of finding abnormalities at an initial stage (not as a mammography substitute [67,68]). Moreover, the association of heterogeneous thermal patterns with early breast cancer detection has not been discussed in the literature and significantly increases the novelty of the proposed model. Besides that, the SPAER model was tested for dynamic thermography, yet its applications are not limited to this application, and it can be used for any other imaging modality, which makes SPAER a generalized model for a variety of applications.
One of the limitations of this study is the number of patients. Despite having a relatively large cohort of patients, having access to an even larger cohort would increase the validation of the statistical analysis and improve the accuracy of the system. The maximum efficiency of the method would be revealed by a higher number of patients. DMR was the only available dataset with such analyses that we could access, and this limited the possibility of an independent validation set to confirm the performance of the system. Another limitation of this study is related to the lack of comprehensive clinical information. Even with the great effort of the DMR team [30], there could be more potential clinical information, such as pathology information. This could help to find the correlation of different factors with thermal heterogeneity. Moreover, there is a limitation regarding the minimum number of thermal images required for the process, which should be more than four images, as there is a matrix factorization method involved that selects three bases, which requires a minimum of four thermal images.
The proposed model provides several advantages. (1) SPAER provides low-dimensional deep thermomics (imaging biomarkers), which exceeds the use of pre-trained models for feature extraction with high dimensionality and their complications. (2) The SPAER model can be trained for the specific data and generate relevant imaging features, rather than diverse training datasets (e.g., natural images versus medical imaging sets), which provides another advantage for this application. (3) SPAER uses the multichannel low-rank thermal matrix approximation that provides embedding for the use of the low-rank matrix factorization method. This mitigates the use of manual selection of predominant bases.
(4) This model significantly mitigates the noise and motion artifacts during imaging, which often happens for infrared thermographic imaging. To the best of our knowledge, the model and the way of embedding multichannels for extracting deep thermomics has not been proposed before, which makes this research a novel study.

Conclusions
The proposed study tackled one of the major challenges of using deep learning imaging biomarkers by developing a sparse deep convolutional autoencoder, called the SPAER model, which generates low-dimensional deep thermomics. It used the low-rank multichannel thermal matrix approximation as an input for the SPAER model. The SPAER model provided significant dimensionality reduction without lessening the diagnostic performance, decreasing from 786,432 to 16 imaging biomarkers. Five state-of-the-art matrix factorizations were employed to extract three initial predominant bases and embed them, which provided a solution for selecting manual bases. A total of 208 breast cancer screening cases with dynamic thermography were used for testing the proposed model.
The best accuracy was for NMF-incorporated SPAER with demographics for preserving thermal heterogeneity to classify between symptomatic and asymptomatic cases (yielded to 78.2% (74.3-82.5%)). In addition, SPAER showed significant robustness against additive Gaussian noise, increasing from 3% to 20%, while the highest SNR was obtained by sparse PCT, generating multichannel low-rank thermal matrix approximation. Future work would involve the integration of other available clinical factors to enhance the ability to assess the thermal characteristics of tissues. This could be better evaluated by increasing the study cohort and multimodal analysis.