Proceeding Paper

Machine Learning and Infrared Thermography for Breast Cancer Detection †

by Caroline B. Gonçalves 1, Amanda C. Q. Leles 2, Lucimara E. Oliveira 2, Gilmar Guimaraes 2, Juliano R. Cunha 3 and Henrique Fernandes 1,4,*

1 School of Computer Science, Federal University of Uberlandia, Uberlandia 38400–902, Brazil
2 School of Mechanical Engineering, Federal University of Uberlandia, Uberlandia 38400–902, Brazil
3 Cancer Hospital of Uberlandia, Federal University of Uberlandia, Uberlandia 38405–302, Brazil
4 Fraunhofer IZFP Institute for Nondestructive Testing, Campus E3 1, 66123 Saarbrücken, Germany
* Author to whom correspondence should be addressed.
† Presented at the 15th International Workshop on Advanced Infrared Technology and Applications (AITA 2019), Florence, Italy, 17–19 September 2019.
Proceedings 2019, 27(1), 45; https://doi.org/10.3390/proceedings2019027045
Published: 11 October 2019

Abstract

Breast cancer kills a large number of women around the world. Infrared thermography is a promising screening technique that does not expose the patient to harmful radiation and has a relatively low cost. This work proposes an approach for classifying patients into three classes using infrared images: healthy patients, patients with benign changes and patients with cancer (malignant changes). A set of features is extracted from each image and two approaches are used for classification. The first is based on Artificial Neural Networks and the second on Support Vector Machines. The proposed approach shows great potential as a screening technique for early breast cancer detection.

1. Introduction

Cancer is a major public health problem worldwide and is the second leading cause of death in the United States. Breast cancer is the second most common type of cancer in the world and the most common among women. Mammography is the main test used to diagnose breast cancer in its early stages. However, mammography has some limitations, such as difficulty detecting tumors in young patients and cancers without masses, such as Paget’s carcinoma [1]. The difficulty with young patients is related to breast density: the young breast is composed mainly of glandular tissue, which makes it denser. High breast density interferes with the identification of masses and micro-calcifications by X-rays [2].
The chance of curing breast cancer drops dramatically if the disease is not discovered in its early stages [3]. Early detection is therefore critical: the earlier the disease is discovered, the better the treatment options and the patient’s chances of cure, which would reduce the mortality rate of this type of cancer.
Infrared thermography (IRT) is a technique that has been used in combination with other screening techniques to aid breast cancer detection. IRT is a low-cost technique that involves neither harmful radiation nor invasive procedures. It is based on measuring the infrared radiation emitted by an object or surface, for example with an infrared camera, to determine its temperature [4]. In addition to the low cost of the examination, thermography may provide better results than mammography for breast cancer detection in young women, who usually have denser breasts [5]. Research also indicates that IRT can detect breast cancer earlier than other techniques, potentially even years before mammography [5].

2. Methodology

The approach proposed in this work involves the acquisition of infrared images of healthy patients and patients with cancer, followed by the extraction of features to describe these images. This set of features is then used to build an automatic classifier: one part of the images (patients) is used to train the classifier and the other part is used to test it. Once trained, the classifier can be used to classify any new patient (image).

2.1. Database

The database used contains data from 70 patients with or without breast tumors. The images were collected over a five-month period with a mid-wave infrared camera. Images were acquired using the static protocol: the patient waited 15 minutes for body temperature acclimatization and then the image acquisition was performed. Images of each patient were acquired in four different poses: front with arms raised, front with arms down, left side (left breast only) and right side (right breast only). For the analyses performed in this work, only the frontal images with raised arms were considered, since the images in the other positions were not standardized (patients in different poses).
The patients were classified into three groups: normal (without changes), benign changes (benign nodules) and malignant changes (tumors). A prior classification of each patient was obtained from the diagnosis of the specialist physician, based on the following exams: mammography, ultrasonography, magnetic resonance imaging and biopsy. Not every patient underwent all exams; each patient underwent only the examinations the specialist needed to reach a conclusive, final diagnosis. The 70 patients considered are divided as follows: 21 patients without alterations in the breast (normal), 23 with benign changes and 26 with malignant changes.
Each image in the database was then preprocessed to extract the region of interest (ROI), i.e., the region containing only the patient’s breasts. The ROI is the only region of the image that contains information about the presence or absence of breast cancer. This segmentation was performed with a thresholding approach based on Otsu’s method [6]. Fine adjustments were made to correct any outliers. The other parts of the image, such as the background, arms and head of the patient, were not considered in the feature extraction stage.
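As an illustration of how a database with these class counts could be divided into training and test patients, the following is a minimal sketch of a stratified split. The 30% test fraction, the fixed seed and the function name `stratified_split` are assumptions made for this example; the split actually used is not stated in the text.

```python
import random


def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx), sampling each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        # Number of test patients for this class (rounded to nearest integer).
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Class counts as reported for the 70 patients.
labels = ["normal"] * 21 + ["benign"] * 23 + ["malignant"] * 26

# Assumed 30% test fraction (hypothetical, not taken from the paper).
train_idx, test_idx = stratified_split(labels, test_fraction=0.3)
```

Sampling each class separately keeps the three groups represented in both subsets, which matters for a database of this size.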
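The thresholding step can be sketched as follows. This is a minimal pure-Python version of Otsu's method on integer gray levels, assuming the thermal image has already been quantized to 256 levels and flattened into a list. It does not include the fine adjustments mentioned above, and it is not the authors' implementation.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with level <= t
    sum0 = 0  # sum of gray levels of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Toy bimodal data: a cold background and a warm body region.
pixels = [10] * 50 + [12] * 50 + [200] * 40 + [205] * 60
threshold = otsu_threshold(pixels)
body_mask = [p > threshold for p in pixels]
```

On the toy data the threshold falls between the two clusters, so `body_mask` marks exactly the 100 warm pixels.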

2.2. Feature Extraction

After ROI extraction, a set of features was extracted from each ROI. Feature extraction seeks to obtain information that describes the images effectively and that an expert would not perceive just by looking at them.
In this work, a compilation of features found in the literature was used to describe each breast. The features used are: (1) Average, (2) Standard deviation, (3) Median, (4) Minimum temperature, (5) Maximum temperature, (6) Thermal Amplitude, (7) Asymmetry, (8) Kurtosis, (9) Entropy, (10) Contrast, (11) Correlation, (12) Homogeneity, (13) Energy, (14) Moment 2, (15) Moment 3, (16) Moment 4, (17) Box-Counting Dimension. These features were chosen based on works available in the literature [3,7,8]. All of these features were calculated for each breast (right and left). Details of each feature can be found in the works cited above.
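To make the first-order features concrete, the sketch below computes several of them from a flat list of ROI temperatures. It rests on assumptions that the text does not confirm: "Asymmetry" is read as statistical skewness, the moments are central moments, the standard deviation is the population form, and entropy is taken over a 16-bin temperature histogram. The texture features (10 to 13) and the box-counting dimension are not covered here.

```python
import math
import statistics


def first_order_features(temps, bins=16):
    """Compute simple statistical descriptors of a list of temperatures."""
    n = len(temps)
    mean = statistics.fmean(temps)
    centered = [t - mean for t in temps]
    # Central moments of order 2, 3 and 4 (assumed definition).
    m2 = sum(c ** 2 for c in centered) / n
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    std = math.sqrt(m2)
    t_min, t_max = min(temps), max(temps)

    # Shannon entropy of the temperature histogram (assumed 16 bins).
    width = (t_max - t_min) / bins or 1.0
    counts = [0] * bins
    for t in temps:
        k = min(int((t - t_min) / width), bins - 1)
        counts[k] += 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)

    return {
        "average": mean,
        "standard_deviation": std,
        "median": statistics.median(temps),
        "minimum": t_min,
        "maximum": t_max,
        "thermal_amplitude": t_max - t_min,
        "asymmetry": m3 / std ** 3 if std > 0 else 0.0,  # skewness
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "entropy": entropy,
        "moment_2": m2,
        "moment_3": m3,
        "moment_4": m4,
    }


# Hypothetical ROI temperatures in degrees Celsius.
features = first_order_features([30.0, 31.0, 32.0, 33.0, 34.0])
```

For this symmetric toy input the skewness is zero and the thermal amplitude is 4 degrees; a real ROI would contain thousands of pixels per breast.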
Several combinations of different features were obtained from the features mentioned in the previous paragraph. Each combination was used as the input of the classifiers to find the best set of features appropriate to the database used.
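Selecting a feature combination amounts to picking a subset of the 17 numbered features. The sketch below shows this for the one combination whose content is given in the text (combination 5, features 1 to 11 and 14 to 16, as reported in Section 4). The helper name `select_features` is hypothetical, and the other six combinations are not reproduced because the text does not list them.

```python
FEATURE_NAMES = [
    "Average", "Standard deviation", "Median", "Minimum temperature",
    "Maximum temperature", "Thermal Amplitude", "Asymmetry", "Kurtosis",
    "Entropy", "Contrast", "Correlation", "Homogeneity", "Energy",
    "Moment 2", "Moment 3", "Moment 4", "Box-Counting Dimension",
]

# Feature numbers are 1-based, following the numbering in the text.
COMBINATION_5 = list(range(1, 12)) + list(range(14, 17))


def select_features(values, numbers):
    """Pick the entries of a 17-element feature vector by 1-based number."""
    return [values[i - 1] for i in numbers]


selected_names = select_features(FEATURE_NAMES, COMBINATION_5)
```

The same helper can be applied to the numeric feature vector of each breast, so the classifier input size follows directly from the chosen combination.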

2.3. Classifiers

In this work, two classifiers were considered: Artificial Neural Networks (ANN) and the Support Vector Machine (SVM). SVM was chosen because it is the most common classifier used in the literature for thermographic images. Both were implemented using the toolboxes provided by Matlab©: the Neural Network Toolbox (for the ANN implementation) and the Statistics and Machine Learning Toolbox (for the SVM implementation). The input of the classifiers is an array with the features extracted from a ROI. As several feature combinations were considered, the input size of the classifiers varies, and a single classifier specification was trained several times.
The ANNs considered are of type Pattern Recognition Network which contains one hidden layer. The training function used was Levenberg-Marquardt backpropagation which was chosen empirically. The ANN stop criteria were 1000 iterations or a minimum gradient less than 1 10 . The number of neurons in the hidden layer was one of the parameters considered in the tests and will be discussed in Section 4. The number of input neurons depends on the combination of characteristics (features) being considered.
For the SVM, the multi-class error-correcting output codes (ECOC) model was used, which allows classification into more than two classes; it was created and fitted with the Matlab© fitcecoc function. The kernel functions considered in the SVM were: Linear, Quadratic, Cubic, Gaussian and Sigmoid. The parameters of the kernel functions were adjusted depending on the test and will be presented in Section 4. Since the classifier has more than two output classes, the coding approach (one-vs-one or one-vs-all) is a configurable parameter. Both approaches were considered in the tests.
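The difference between the one-vs-one and one-vs-all approaches can be illustrated with their ECOC coding matrices. In the sketch below each row is a class, each column is a binary learner, and the entries +1, -1 and 0 mean positive class, negative class and ignored. The decoding rule (smallest average hinge loss) is a simplified assumption for illustration and is not claimed to match the internals of fitcecoc; the learner scores are made up.

```python
from itertools import combinations


def one_vs_all(n_classes):
    """One binary learner per class: that class against all others."""
    return [[1 if c == learner else -1 for learner in range(n_classes)]
            for c in range(n_classes)]


def one_vs_one(n_classes):
    """One binary learner per pair of classes; other classes are ignored."""
    pairs = list(combinations(range(n_classes), 2))
    return [[1 if c == a else -1 if c == b else 0 for a, b in pairs]
            for c in range(n_classes)]


def decode(coding_matrix, scores):
    """Return the class whose code row best agrees with the learner scores."""
    def average_hinge_loss(row):
        used = [(m, s) for m, s in zip(row, scores) if m != 0]
        return sum(max(0.0, 1.0 - m * s) for m, s in used) / len(used)

    losses = [average_hinge_loss(row) for row in coding_matrix]
    return losses.index(min(losses))


CLASSES = ["normal", "benign", "malignant"]
ovo = one_vs_one(len(CLASSES))   # learners: (0,1), (0,2), (1,2)
ova = one_vs_all(len(CLASSES))

# Hypothetical scores of the three one-vs-one learners for one patient.
predicted = CLASSES[decode(ovo, [-0.8, -1.2, -0.5])]
```

With three classes both designs happen to need three binary learners; the designs differ in which patients each learner sees during training.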

3. Experiments

Experiments were performed with three different input images. In the first, no spatial filter was applied to the images. In the second, a median filter was applied, while in the third a Gaussian filter with σ = 2 was applied. Based on the features presented in Section 2, seven different feature combinations were tested. These combinations were chosen based on what was presented in [3,7,8]. The feature combinations will be presented in detail in the extended version of the paper.
Initially, the ANN was tested with a number of neurons in the hidden layer equal to the square root of the number of input neurons. As the results were not satisfactory, several hidden layer sizes were tested: 4 to 10, 15, 20, 25, 30 and 50 neurons.
For the SVM, the kernel functions used were Linear; Quadratic; Cubic; Gaussian, with kernel scale values of 1.5, 5.8 and 23; and Sigmoid, with a slope of 1. The one-vs-one and one-vs-all approaches were both considered.
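The two spatial filters can be sketched in pure Python as follows. The 3 × 3 median window, the kernel truncation at three standard deviations and the edge handling by replication are assumptions chosen for this example; only σ = 2 comes from the text.

```python
import math
import statistics


def _clamp(i, n):
    """Replicate border pixels by clamping the index to [0, n - 1]."""
    return max(0, min(i, n - 1))


def gaussian_kernel_1d(sigma, radius=None):
    """Normalized 1D Gaussian weights, truncated at 3 sigma by default."""
    if radius is None:
        radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _filter_rows(image, kernel):
    radius = len(kernel) // 2
    width = len(image[0])
    return [[sum(k * row[_clamp(x + d, width)]
                 for d, k in zip(range(-radius, radius + 1), kernel))
             for x in range(width)]
            for row in image]


def _transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_filter(image, sigma):
    """Separable Gaussian smoothing: filter rows, then columns."""
    kernel = gaussian_kernel_1d(sigma)
    rows_done = _filter_rows(image, kernel)
    return _transpose(_filter_rows(_transpose(rows_done), kernel))


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighborhood."""
    r = size // 2
    h, w = len(image), len(image[0])
    return [[statistics.median(image[_clamp(y + dy, h)][_clamp(x + dx, w)]
                               for dy in range(-r, r + 1)
                               for dx in range(-r, r + 1))
             for x in range(w)]
            for y in range(h)]


# Toy 5 x 5 thermal image with a single hot pixel in the center.
image = [[30.0] * 5 for _ in range(5)]
image[2][2] = 40.0
despeckled = median_filter(image)
smoothed = gaussian_filter(image, sigma=2)
```

The median filter removes the isolated hot pixel entirely, while the Gaussian filter spreads it over its neighborhood, which is why the two filters can lead to different feature values.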
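For reference, the kernel functions named above can be written out explicitly. The sketch below uses common textbook forms: quadratic and cubic kernels as polynomial kernels of degree 2 and 3 with an offset of 1, a Gaussian kernel in which the kernel scale divides the inputs, and a sigmoid kernel as a hyperbolic tangent. The zero intercept of the sigmoid kernel is an assumption, since only the slope is stated.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree):
    """degree=2 gives the quadratic kernel, degree=3 the cubic kernel."""
    return (1 + dot(x, y)) ** degree


def gaussian_kernel(x, y, scale=1.0):
    """Gaussian kernel where both inputs are divided by the kernel scale."""
    squared_distance = sum(((a - b) / scale) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance)


def sigmoid_kernel(x, y, slope=1.0, intercept=0.0):
    """Sigmoid kernel; the intercept value is an assumption."""
    return math.tanh(slope * dot(x, y) + intercept)


# Two hypothetical feature vectors.
x, y = [1.0, 2.0], [3.0, 4.0]
kernel_values = {
    "linear": linear_kernel(x, y),
    "quadratic": polynomial_kernel(x, y, 2),
    "cubic": polynomial_kernel(x, y, 3),
    "gaussian_1.5": gaussian_kernel(x, y, scale=1.5),
    "gaussian_5.8": gaussian_kernel(x, y, scale=5.8),
    "gaussian_23": gaussian_kernel(x, y, scale=23),
}
```

A larger kernel scale makes the Gaussian kernel vary more slowly with distance, so the three scale values tested correspond to progressively smoother decision boundaries.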

4. Results and Discussion

Each training process was repeated 50 times, changing only the point of the solution space where the classifier begins its search. Only the best result of each test was considered. The best result obtained with the ANN was achieved with 15 neurons in the hidden layer and feature combination number 5 (14 features: numbers 1 to 11 and 14 to 16). This setup had an accuracy of 76.19%, a normal specificity of 57.1%, a benign specificity of 83.3%, and a malignant sensitivity of 87.5%. For the SVM, the best result was also achieved with feature combination number 5, using a one-vs-all approach and a cubic kernel function. This setup had an accuracy of 80.95%, a normal specificity of 83.33%, a benign specificity of 85.71%, and a malignant sensitivity of 75%.
A factor that probably had a negative impact on the results is that the database did not have information (classification) on each breast separately. The classification (benign change, malignant change, or no change) is given per patient. Thus, if there is a change, there is no information on whether it is located in the left or the right breast.
Additional experiments were performed considering only two classes: patients without malignant changes and patients with malignant changes. The first class includes patients without changes and with benign changes, while the second class includes only patients with malignant changes. All previous experiments were repeated, changing only the number of classes (two instead of three). The classification results obtained were better. For combination number 5, which gave the best results in the three-class approach, the ANN classifier achieved an accuracy of 90.5% and the SVM achieved an accuracy of 85.7%.
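The relation between the per-class rates and the overall accuracy can be checked with a short calculation. The counts below are hypothetical: they are simply one set of integers on a 21-patient test set that reproduces the reported percentages, and they are not given in the text. The per-class "specificity" and "sensitivity" values are treated here as the fraction of patients of each class that were classified correctly, which is also an assumption.

```python
def summarize(correct_by_class, total_by_class):
    """Return per-class correct rates and overall accuracy."""
    rates = {name: correct_by_class[name] / total_by_class[name]
             for name in total_by_class}
    accuracy = sum(correct_by_class.values()) / sum(total_by_class.values())
    return rates, accuracy


# Hypothetical counts consistent with the reported SVM percentages.
svm_total = {"normal": 6, "benign": 7, "malignant": 8}
svm_correct = {"normal": 5, "benign": 6, "malignant": 6}
svm_rates, svm_accuracy = summarize(svm_correct, svm_total)

# Hypothetical counts consistent with the reported ANN percentages.
ann_total = {"normal": 7, "benign": 6, "malignant": 8}
ann_correct = {"normal": 4, "benign": 5, "malignant": 7}
ann_rates, ann_accuracy = summarize(ann_correct, ann_total)
```

Under these assumed counts, 17 of 21 correct gives 80.95% for the SVM and 16 of 21 gives 76.19% for the ANN, so a single patient changes the accuracy by almost five percentage points.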
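The regrouping into two classes can be expressed as a simple label mapping. The sketch below applies it to the class counts of the database; the label strings and the helper name `to_binary` are chosen for this example only.

```python
from collections import Counter


def to_binary(label):
    """Merge normal and benign patients into a single non-malignant class."""
    return "malignant" if label == "malignant" else "non-malignant"


# Three-class labels with the counts reported for the 70 patients.
labels = ["normal"] * 21 + ["benign"] * 23 + ["malignant"] * 26

binary_counts = Counter(to_binary(label) for label in labels)

# Accuracy of always predicting the larger class, for comparison.
majority_baseline = max(binary_counts.values()) / len(labels)
```

After merging, the database contains 44 non-malignant and 26 malignant patients, so a classifier that always answers "non-malignant" would already reach about 62.9% accuracy on the full database; the reported two-class accuracies should be read against that reference.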

5. Conclusions

The use of infrared images for the detection of breast cancer is a promising screening technique that can aid in the diagnosis of the disease: it is pain-free, can identify changes in dense breasts (which is hard for conventional mammography), involves no harmful radiation, can identify early changes, and has a low cost.
In this work, a database with 70 patients was analyzed. First, the ROI of each image was identified. Second, a set of features was extracted from each ROI. Last, a set of different feature combinations was tested with two different classifiers (ANN and SVM) in order to classify patients into three classes: patients with no change, patients with benign changes, and patients with malignant changes. The great majority of works available in the literature classify patients into only two classes (no change and with change). Thus, this work is an effort to develop an approach that can determine, using only infrared images, whether a change in a breast is benign or malignant.
Our results are promising and confirm that infrared images can be used for breast cancer detection. The best result obtained had an accuracy of 80.95%, a normal specificity of 83.33%, a benign specificity of 85.71%, and a malignant sensitivity of 75%. Future work, already in progress, includes (1) consideration of other features, such as the thermal asymmetry between the patient’s breasts, (2) use of genetic algorithms (GA) to automatically select the best set of features to describe the image, and (3) use of deep learning approaches to classify the patients.

Author Contributions

Conceptualization and software development, C.B.G.; Data acquisition, A.C.Q.L. and L.E.O.; Validation, J.R.C.; Project administration and co-supervision, G.G.; Supervision, H.F.

Funding

This study was financed in part by the Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior - Brasil (CAPES) - Finance Code 001.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wishart, G.; Campisi, M.; Boswell, M.; Chapman, D.; Shackleton, V.; Iddles, S.; Hallett, A.; Britton, P. The accuracy of digital infrared imaging for breast cancer detection in women undergoing breast biopsy. Eur. J. Surgical Oncol. 2010, 36, 535–540.
  2. Freer, P.E. Mammographic Breast Density: Impact on Breast Cancer Risk and Implications for Screening. RadioGraphics 2015, 35, 302–315.
  3. Lessa, V.; Marengoni, M. Applying Artificial Neural Network for the Classification of Breast Cancer Using Infrared Thermographic Images. In International Conference on Computer Vision and Graphics; Springer: Berlin, Germany, 2016; pp. 429–438.
  4. Kandlikar, S.G.; Perez-Raya, I.; Raghupathi, P.A.; Gonzalez-Hernandez, J.L.; Dabydeen, D.; Medeiros, L.; Phatak, P. Infrared imaging technology for breast cancer detection–Current status, protocols and new directions. Int. J. Heat Mass Transf. 2017, 108, 2303–2320.
  5. Ng, E.K. A review of thermography as promising non-invasive detection modality for breast tumor. Int. J. Therm. Sci. 2009, 48, 849–859.
  6. Otsu, N. A threshold selection method from gray-level histograms. In Proceedings of the International Conference on Systems, Man, and Cybernetics (SMC), Bavaria, Germany, 8–10 October 1979; Volume 9, pp. 62–66.
  7. Borchartt, T. Análise de imagens termográficas para a classificação de alterações na mama; UFF: Niterói, Brazil, 2013.
  8. Acharya, U.R.; Ng, E.Y.K.; Tan, J.H.; Sree, S.V. Thermography based breast cancer detection using texture features and support vector machine. J. Med. Syst. 2012, 36, 1503–1510.
