Article

Chemical Nose-Based Non-Invasive Detection of Breast Cancer Using Exhaled Breath †

1 Biomedical Engineering, Ben-Gurion University of the Negev, Beersheba 8410501, Israel
2 Breast Health Center Soroka Medical Center, Ben-Gurion University, Beersheba 8410501, Israel
3 OPTIMAL—Industrial Neural Systems, Beersheba 84243, Israel
4 Pulmonary Unit, Soroka University Medical Center and the Faculty of Health Sciences, Ben-Gurion University of the Negev, Beersheba 8410501, Israel
5 Department of Analytical Chemistry, Nuclear Research Center Negev, Be’er-Sheva 84190, Israel
* Authors to whom correspondence should be addressed.
This paper is dedicated to the memory of Yehuda Zeiri.
Deceased author.
Sensors 2025, 25(7), 2210; https://doi.org/10.3390/s25072210
Submission received: 28 February 2025 / Revised: 25 March 2025 / Accepted: 28 March 2025 / Published: 31 March 2025
(This article belongs to the Section Biomedical Sensors)

Highlights

What are the main findings?
  • Exhaled breath was analyzed using a commercial electronic nose for early breast cancer detection.
  • Samples were collected with no prior requirements from the patients.
  • After feature extraction, the processed data was used for model optimization and training.
What is the implication of the main finding?
  • Accuracy, precision, and specificity of 91.0% were achieved by the model.
  • The method could provide a path for a rapid, simple, screening technique for low-income countries.

Abstract

Breast cancer (BC) is the most commonly occurring cancer in women and one of the leading causes of cancer death in women worldwide. BC mortality depends on how early a tumor is detected, highlighting the importance of early detection methods. This work aims to develop a robust, accurate, highly reliable, non-invasive, and low-cost method for early detection of BC in routine screening using exhaled breath (EB) analysis. To this end, exhaled breath samples were collected from 267 women: 131 breast cancer patients and 136 healthy women. After collection, the samples were measured using a commercially available electronic nose. The signals obtained for each sample were first processed and then underwent a feature extraction step. An SVM model was then optimized with respect to the accuracy metric using a validation set by applying Monte Carlo cross-validation with 100 iterations, with 20% of the data held out as the validation set in each iteration. The validation set results were 80, 94, 88, and 95% for recall, precision, accuracy, and specificity, respectively. Once model optimization had concluded, 22 unknown samples were analyzed by the model, and an accuracy, precision, and specificity of 91% were achieved.

1. Introduction

Breast cancer (BC) is a leading cause of death in women, with 670,000 deaths worldwide in 2022 alone [1,2]. Although BC is sometimes thought of as a disease of the developed world [3], about half of BC cases and approximately 60% of BC deaths occur in less-developed countries [4]. Furthermore, BC incidence is increasing in developing countries due to increased life expectancy and changes in lifestyle [5]. In the United States, BC is the most commonly diagnosed cancer in women and the second most common cause of cancer-related mortality among women. It was estimated that over 310,000 women would be diagnosed with invasive BC in 2024 and that approximately 42,000 would die from the disease [6]. Female sex and advancing age are the most significant risk factors for BC; hence, efficient screening for BC is an essential part of women’s healthcare [3]. The survival rate of BC varies among different parts of the world and is lower in low-income countries. This lower survival rate can be attributed to a lack of awareness and of early detection programs, which results in a high proportion of women presenting with advanced disease. Because BC mortality is related to the sensitivity of the tumor detection methods used, the development of new detection methods capable of early tumor identification has been a highly active area of research for several decades [7,8]. Ideally, new schemes should be non-invasive, simple in both usage and result analysis, and inexpensive to implement. Currently, the main approach for the early detection of BC is screening mammography, which has been proven to reduce BC mortality. However, one of the limitations of screening mammography is its limited ability to detect small tumors in dense breast tissue [9]. The overall sensitivity of mammography is 75–85% but decreases to 30–50% in dense breast tissue [10]. One of the new methods that can overcome this limitation is dual-energy digital mammography [11,12]. 
This approach consists of high- and low-energy digital mammograms following the administration of an iodine-based contrast agent. However, the improved resolution of this method is achieved at the cost of exposing the breast to an increased dose of X-ray irradiation and of a contrast material injection. An additional method is magnetic resonance imaging (MRI) [13]. MRI has become increasingly important in the detection and delineation of BC in daily practice. MRI is highly sensitive in dense breasts, where mammography sensitivity is low. However, the major drawback of MRI is its high cost. Another method, based on identifying differences in electric properties between normal breast tissue and carcinoma [14], is electrical impedance tomography (EIT) [15]. This imaging method maps the body’s conductivity from superficial skin current measurements and has been shown to have a good ability to identify tumor abnormalities. Another pathway towards BC detection, which overcomes limitations such as dense breast tissue and high cost, is exhaled breath analysis [8,16,17,18,19,20,21]. It is well established that exhaled breath contains volatile organic compounds (VOCs) produced by biochemical reactions that occur in the human body. Consequently, the compositional analysis of exhaled breath can supply information regarding a person’s medical condition [22]. Various illnesses have been identified by biomarkers in exhaled breath (EB), including chronic obstructive pulmonary disease [23], pulmonary diseases [24], diabetes [25], lung cancer [26], and pancreatic cancer [27]. As a result, exhaled breath analysis has become a very active research and diagnostic field during the last two decades [16,17,18,19,20,21,28,29,30,31,32,33,34,35,36,37]. The growth of EB analysis-based research is partially due to the development of small, easy-to-use, and inexpensive detectors, also known as electronic noses (ENs) [38,39,40,41,42]. 
Similar to the olfactory system, an EN uses an array of non-specific sensors. The EN does not provide precise information on the chemical composition of the exhaled breath gas mixture but rather a pattern of sensor signals that represents the EB composition. The EN data are usually analyzed using machine learning techniques [43,44,45,46], which use different algorithms to identify patterns in the data and apply them for classification, allowing efficient predictions on such data.
In this work, exhaled breath collected from 267 women is measured using the commercially available Cyranose 320 EN. The measurement results then undergo feature extraction and are classified using an SVM model. The developed method exhibits accuracy, precision, and specificity values of 91.0%.

2. Experimental and Computational Methods

2.1. Exhaled Breath Collection

All samples were collected at the Breast Health Center in Soroka Medical Center, Beer Sheva. Exhaled breath was collected using Tedlar 1 L sample collection bags (Cel Scientific Corp., Cerritos, CA, USA). Patients exhaled into a plastic tube (4 mm inner diameter) connected to the sampling bag, filling it to about 90% of the bag’s volume. For measurements, 20 mL of sample was withdrawn from the bag into a disposable plastic syringe through the septum located on the Tedlar bag. Samples were measured within 4 h of sampling.

2.2. Electronic Nose

Exhaled breath analysis was performed using the Cyranose 320 (Sensigent Intelligent Sensing Solutions, Baldwin Park, CA, USA). This EN has 32 polymer-based sensors, each with a different sensitivity to various gases. Measurements were carried out in three steps. First, ambient air flowed through the EN for 25 s as a baseline purge. Next, the sample in the syringe flowed into the EN for 40 s. Finally, the syringe was disconnected, and ambient air was drawn into the EN until the measurement was completed. The total measurement time was two minutes. The EN output signals for each sample were recorded and saved. Of the EN’s 32 sensors, four showed sensitivity towards humidity and were therefore not used in this work.

2.3. Subjects

Samples in the patient group were taken, prior to any surgery, from women diagnosed with breast cancer on the basis of physical or mammography examinations, with the diagnosis established by biopsy. The control group consisted of healthy women who did not present any kind of cancer, pregnancy, or acute inflammation at the time of sample collection.

2.4. Data Analysis Method

A support vector machine (SVM) [47,48] was used for data classification. Hyperopt was used to obtain the hyperparameters [46], which were optimized with respect to the accuracy metric using Monte Carlo cross-validation with 100 iterations, with 20% of the data held out as the validation set in each iteration.
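The resampling scheme described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the function names, the fixed seed, the rounding of the 20% hold-out, and the use of 194 samples (the training-set size reported in Section 3.2) are all assumptions made for the example, and `score_fn` is a hypothetical user-supplied callback that would train and score a model on one split.

```python
import random


def monte_carlo_splits(n_samples, n_iterations=100, val_fraction=0.2, seed=0):
    """Yield (train_idx, val_idx) pairs from repeated random hold-out splits."""
    rng = random.Random(seed)
    n_val = round(n_samples * val_fraction)  # assumed rounding rule
    indices = list(range(n_samples))
    for _ in range(n_iterations):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # The first n_val shuffled indices form the validation set.
        yield sorted(shuffled[n_val:]), sorted(shuffled[:n_val])


def mean_validation_score(score_fn, n_samples, **kwargs):
    """Average a user-supplied score over all random splits."""
    scores = [score_fn(tr, va) for tr, va in monte_carlo_splits(n_samples, **kwargs)]
    return sum(scores) / len(scores)


# 194 is the training-set size reported in the paper; its use here is an assumption.
splits = list(monte_carlo_splits(194))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Unlike k-fold cross-validation, the validation sets drawn in different iterations are independent random draws and may overlap.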

3. Results

3.1. Exhaled Breath Collection

Directly measured exhaled breath is very complex to analyze. The signal obtained by the EN may be influenced by the condition of the patient, and it may be altered and exhibit very high noise levels depending on breathing intensity and rate. To overcome this, patients were instructed to breathe into a gas collection bag, as suggested by the European Respiratory Society guide [49]. The collection bag then served as a reservoir, allowing the injection of a well-defined volume of exhaled breath for all patients (Figure 1). Samples were kept in ambient conditions and measured within 4 h of sampling. Samples were collected from every patient who visited the physician and agreed to participate in the research, without any limitations or requirements from the patient.

3.2. Exhaled Breath Data Processing

It has been previously shown that the EN sensor response data require some degree of processing prior to their classification [36]. Before processing, unnecessary information (response from unused sensors, instrument log information) was deleted. The first step of the data processing consisted of extraction of the effective signal duration and its normalization since only a fraction of the total measurement time is necessary for classification. For each sensor response, a moving difference window was used to detect the beginning and end of the main signal, and the rest of the measurement data were removed.
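One possible reading of the moving difference window is sketched below. The source does not give the window length, the threshold, or the exact rule, so `window`, `threshold`, and the function name are assumptions chosen for illustration: the difference between samples `window` steps apart is computed, and the effective signal is taken to run from the first to the last position where that difference is large.

```python
def effective_window(signal, window=5, threshold=0.5):
    """Return (start, end) indices bounding the region where the signal changes.

    diff[i] = signal[i + window] - signal[i]; the region runs from the first to
    the last position whose absolute difference exceeds `threshold`.
    Returns None when no such position exists (a flat trace).
    """
    diffs = [signal[i + window] - signal[i] for i in range(len(signal) - window)]
    active = [i for i, d in enumerate(diffs) if abs(d) > threshold]
    if not active:
        return None
    return active[0], active[-1] + window


# Synthetic trace: flat baseline, a rise, a plateau, a decay, flat again.
trace = [0.0] * 10 + [1.0, 2.0, 3.0, 4.0] + [4.0] * 6 + [3.0, 2.0, 1.0, 0.0] + [0.0] * 10
start, end = effective_window(trace, window=2, threshold=0.5)
segment = trace[start:end + 1]  # the data outside this slice would be discarded
```

A larger window is less sensitive to point-to-point noise but bounds the signal less tightly, so both parameters would need tuning on real sensor traces.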
Next, outlier samples were manually removed from the data set. To accomplish this, each sensor value was normalized using Equation (1). A representative sample before and after normalization is displayed in Figure 2.
$$S_{\mathrm{normalized}} = \frac{S_i - S_0}{S_0} \qquad (1)$$
where $S_i$ is the signal at time $i$, and $S_0$ is the signal at time 0. It should be noted that the normalized results were only used for manual evaluation and not for classification. During the sample evaluation, 51 samples were found to contain no noticeable peaks (relative to a typical sample) and were removed from the data set. The criterion for sample removal was that no signal higher than three standard deviations of the background signal was observed. The removed samples were all measured more than 4 h after their collection and had low signal intensities, which led to a low signal-to-noise ratio. Notably, none of the samples measured within 4 h of their collection were found to be noisy, indicating the importance of rapid sample measurement. Of the remaining 216 samples, 194 were randomly chosen for the model training set. The test set consisted of the remaining 22 samples. The final part of the data processing was applying a moving average filter with a window size of three samples to filter out noise in the measurements.
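The three operations in this subsection (normalization relative to the initial reading, the three-standard-deviation criterion, and the three-point moving average) can be sketched as follows. The normalization is taken here as the fractional change (S_i - S_0)/S_0. How the background is defined is not stated in the text, so using the first `n_background` points is an assumption, as are the function names and the choice of a non-padded moving average.

```python
import statistics


def normalize(signal):
    """Fractional change relative to the first reading: (S_i - S_0) / S_0."""
    s0 = signal[0]
    return [(s - s0) / s0 for s in signal]


def has_peak(signal, n_background=10, n_sigma=3.0):
    """True if any later point exceeds the background mean by n_sigma std devs."""
    background = signal[:n_background]  # assumed definition of the background
    limit = statistics.mean(background) + n_sigma * statistics.stdev(background)
    return max(signal[n_background:]) > limit


def moving_average(signal, window=3):
    """Mean of each run of `window` consecutive points (no edge padding)."""
    return [sum(signal[i:i + window]) / window for i in range(len(signal) - window + 1)]


# Illustrative resistance-like readings, not measured data.
baseline = [100.0, 100.2, 99.8, 100.1, 99.9, 100.0, 100.1, 99.9, 100.2, 99.8]
with_response = baseline + [103.0, 106.0, 104.0, 101.0]
without_response = baseline + [100.1, 99.9, 100.0, 100.2]

keep = has_peak(normalize(with_response))      # sample would be retained
drop = has_peak(normalize(without_response))   # sample would be removed
smoothed = moving_average([1, 2, 3, 4, 5])
```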

3.3. Feature Extraction

Feature extraction is a process of finding the most relevant variables for a predictive model and is an important step in machine learning [47]. For feature extraction, the measurement data of each of the 28 sensors employed were encapsulated using 33 descriptive features. The features include basic statistics (maximum, minimum, amplitude, mean, median, standard deviation, interquartile range, kurtosis, and skewness), time-related features (time required for signal to rise from 10% to 90% of its maximum value and decrease back from 90% to 10%, time over mean, time over median, time over 90%, time from start to maximum, time of max slope, and time of min slope), area-related features (total area under the curve, area between start and max value, area over the mean, area over the median, and area over 90%), ratio features (area to amplitude ratio, area over 90% to time over 90%, area over mean to time over mean, area over median to time over median, and area start to max to time start to max), and slope-related features (maximum, minimum, mean, median, standard deviation, and interquartile range slopes). Each sample was, therefore, represented as a single line comprising 924 features (33 descriptive features for each of 28 sensors), which was then used for model selection.
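To make the feature definitions concrete, the sketch below computes a handful of the 33 descriptors for a single sensor trace. It is an illustration only: the exact definitions used by the authors are not given, so the trapezoidal area, the use of first threshold crossings for the 10% and 90% levels, the unit sampling interval `dt`, and the assumption of a baseline-corrected trace with a positive, non-flat response are all choices made for this example.

```python
import statistics


def extract_features(signal, dt=1.0):
    """Compute a few descriptive features for one baseline-corrected trace."""
    peak, low = max(signal), min(signal)
    mean = statistics.mean(signal)
    i_max = signal.index(peak)
    # Trapezoidal area under the curve.
    area = sum((a + b) / 2 * dt for a, b in zip(signal, signal[1:]))
    # Rise time: first crossing of 10% of the maximum to first crossing of 90%.
    t10 = next(i for i, s in enumerate(signal) if s >= 0.1 * peak)
    t90 = next(i for i, s in enumerate(signal) if s >= 0.9 * peak)
    return {
        "maximum": peak,
        "minimum": low,
        "amplitude": peak - low,
        "mean": mean,
        "time_to_max": i_max * dt,
        "rise_time_10_90": (t90 - t10) * dt,
        "time_over_mean": sum(1 for s in signal if s > mean) * dt,
        "area_total": area,
        "area_to_amplitude": area / (peak - low),
    }


trace = [0.0, 1.0, 4.0, 8.0, 10.0, 8.0, 4.0, 1.0, 0.0]  # synthetic response
features = extract_features(trace)
```

Applying the full set of 33 descriptors to each of the 28 sensors and concatenating the results gives the 28 × 33 = 924 values per sample described above.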

3.4. Model Selection

Determining whether an exhaled breath sample belongs to a sick or healthy patient requires the use of a binary classification algorithm. SVM, specifically the C-Support Vector Classification (SVC) algorithm, was chosen for this task. This model was chosen as it is often useful when the number of parameters is high while the data set is not very large. It is possible that with a much larger database, a different model would be able to obtain superior results. The environment used to build the model was Python (version 3.10), and the library used was scikit-learn [50]. The kernel was a radial basis function kernel, or Gaussian kernel. This kernel is very versatile and efficient for capturing non-linear relationships.
Several steps were taken before further optimization of the model. First, all features with a variance of zero across all samples were removed (a total of 54 features). Of the remaining 870 features, the top 3% features (according to the ANOVA F-value) were selected, and the rest were discarded. Therefore, once all of the data processing was complete, each sample was represented by 27 features (detailed in the Supplementary Materials).
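The two filtering steps can be illustrated with a standard-library sketch on synthetic data. The source does not name the functions used, so this is not the authors' code: the one-way ANOVA F-statistic is written out by hand, and rounding 3% of 870 up to 27 features is an assumption that happens to reproduce the reported count. scikit-learn provides `VarianceThreshold`, `SelectPercentile`, and `f_classif` for the same purpose, but whether they were used here is not stated.

```python
import math
import random
import statistics


def anova_f(groups):
    """One-way ANOVA F-statistic for a single feature split into groups."""
    means = [statistics.fmean(g) for g in groups]
    n_total = sum(len(g) for g in groups)
    grand = sum(len(g) * m for g, m in zip(groups, means)) / n_total
    between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (between / (len(groups) - 1)) / (within / (n_total - len(groups)))


def select_top_fraction(X, y, fraction=0.03):
    """Drop zero-variance columns, then keep the top `fraction` by F-value."""
    columns = [[row[j] for row in X] for j in range(len(X[0]))]
    varying = [j for j, col in enumerate(columns) if max(col) > min(col)]
    labels = sorted(set(y))
    scores = {
        j: anova_f([[v for v, lab in zip(columns[j], y) if lab == c] for c in labels])
        for j in varying
    }
    n_keep = math.ceil(fraction * len(varying))  # assumed rounding rule
    ranked = sorted(varying, key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])


# Synthetic data with the paper's dimensions: 924 features, 54 of them constant.
rng = random.Random(0)
n_features, n_constant = 924, 54
y = [0] * 15 + [1] * 15
X = []
for label in y:
    row = [rng.gauss(0.0, 1.0) for _ in range(n_features - n_constant)]
    row[0] += 5.0 * label          # one clearly informative column
    row += [1.0] * n_constant      # zero-variance columns
    X.append(row)

selected = select_top_fraction(X, y)
print(len(selected))
```

With two classes, the F-value is the square of the two-sample t-statistic, so this ranking orders features by how well their class means separate relative to the within-class spread.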
The selected features were then standardized, as SVM can be sensitive to changes in the order of magnitude in the data set. For this, each column in the data set was standardized by subtracting the mean and scaling to unit variance. A boxchart of the standardized selected features for sick and healthy subjects is presented in Figure 3. It can be observed that the results for the healthy subjects are narrower and more negative than the sick subjects’ results.
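A minimal sketch of the column-wise standardization is given below, using illustrative values and a hypothetical function name. It divides by the population standard deviation, which is also the convention of scikit-learn's `StandardScaler`; the text does not state which scaler was used.

```python
import statistics


def standardize_columns(X):
    """Column-wise z-score: subtract the mean, divide by the population std."""
    columns = list(zip(*X))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


# Two features on very different scales (illustrative values).
X = [[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]]
Z = standardize_columns(X)
```

After scaling, both columns have zero mean and unit variance, so neither dominates the distance computations inside the RBF kernel.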
PCA plots of the entire dataset (all features) and of the features selected for the model after standardization are presented in Figure 4 (with additional plots in the Supplementary Materials). The plots show the results obtained from both sick and healthy subjects. It can be observed that the standardized selected features display a better separation of the two sample types.

3.5. Model Optimization

The training data for the model were composed of 194 of the samples (80 sick, 114 healthy), and the test data were composed of 22 samples (11 sick, 11 healthy). The optimization process was performed using hyperopt [51], aiming to obtain hyperparameters that yield the best performance for the SVM model. The SVM model was optimized with respect to the accuracy metric using a validation set by applying Monte Carlo cross-validation with 100 iterations, with 20% of the data held out for validation in each iteration. In the optimization process, hyperopt runs the model using random parameters (from a provided range; see Supplementary Materials) and can yield a different model each time; however, on average, the obtained parameters give a model of ca. 90% accuracy, with low variance considering the size of the training set. The final parameters were selected because they achieved relatively small variance and bias scores for the training and validation sets. The chosen parameters for the model are presented in Table 1.
After the optimized parameters were obtained, the tuned model was trained on all of the training samples and scored for recall, precision, accuracy, and specificity. Recall represents the fraction of true positive results out of all actual positives (true positives and false negatives). Precision represents the fraction of true positive results out of all positive predictions (true and false positives). Accuracy is the fraction of true positive and true negative results out of the total number of samples. Specificity is the fraction of true negatives out of all actual negatives (true negatives and false positives). The training and validation scores after parameter tuning are detailed in Table 2. The confusion matrix of the training data, used to visualize the algorithm’s performance, is presented in Figure 5.
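The four scores follow directly from the confusion matrix counts, as the short sketch below shows. The counts used are hypothetical: they were chosen to be consistent with a balanced 22-sample test set (11 sick, 11 healthy) and scores of about 91%, and were not read from the confusion matrix figures.

```python
def classification_scores(tp, fp, tn, fn):
    """Recall, precision, accuracy, and specificity from confusion counts."""
    return {
        "recall": tp / (tp + fn),             # true positives / actual positives
        "precision": tp / (tp + fp),          # true positives / predicted positives
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "specificity": tn / (tn + fp),        # true negatives / actual negatives
    }


# Hypothetical counts for illustration only.
scores = classification_scores(tp=10, fp=1, tn=10, fn=1)
for name, value in scores.items():
    print(f"{name}: {value:.1%}")
```

On a balanced set with one error in each class, all four scores coincide at 10/11, which is why a single percentage can describe several metrics at once.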

3.6. Model Performance

The performance of the model was assessed using the 22 samples of test data. The scores obtained using the model are presented in Table 3, and the confusion matrix is presented in Figure 6.

4. Discussion

This study used 242 exhaled breath samples to produce a model that allows for a facile diagnosis of breast cancer. Gas chromatography has previously been used for BC determination using exhaled breath, with fine results [18,19,20,52,53]. However, that technique is much more costly and requires more expertise relative to the EN. While novel ENs are constantly researched and produced, the use of a commercial EN makes the method readily available for application. In addition to previous work in our group [36], there have been several publications using the Cyranose 320 EN for breast cancer analysis [54,55]. Comparing these works with the research presented here, their main strength is the significantly higher number of samples obtained (443 and 899). This allowed them to examine and evaluate the effect of many factors, such as age, smoking, asthma, and diabetes, on the method. The smaller number of participants in our research limits our ability to perform such analysis in a statistically significant manner. In addition, a larger data pool, such as in the mentioned works, can be expected to improve the model optimization process and the final method parameters. Consequently, the correct prediction value obtained by Lorena Díaz de León-Martínez et al. was 98.7%, while Hsiao-Yu Yang et al. obtained a value of 97%, both higher than the 91% value obtained in this work. Future work building on the foundations presented here should look to significantly increase the number of participants in the study, hopefully obtaining well over a thousand samples. On the other hand, this work has several strengths. The main one is the simplicity and robustness of the method. Lorena Díaz de León-Martínez and colleagues required 5 h of fasting, no smoking, no oral hygiene, and no medication prior to exhaled breath collection. Hsiao-Yu Yang and colleagues required 8 h of fasting and collected alveolar air after anesthetic drugs were employed for surgery. 
While these restrictions can improve the quality of exhaled breath data, especially the use of endotracheal intubation to obtain the purest alveolar air, our method placed no such restrictions on the research participants. As a result, the samples in our data set are not as “neat” and are more varied. This should make our method more robust, allowing it to remain less affected by small variations [56]. Without any requirements from the patient, samples could be collected from anyone visiting the physician or even at the workplace, without requiring the presence of a physician for BC screening. In addition to the method’s robustness, our measurement procedure is very simple: it is performed at ambient temperature, has a total measurement time of 2 min, and requires no equipment other than the Cyranose 320. We found that the collected breath samples were not very stable, with samples measured more than 4 h after collection exhibiting a high degree of noise. This might be improved by storing the samples in a cooler environment prior to measurement. However, given the simplicity and short measurement time of the method, the simplest solution is to perform the measurement shortly after sample collection. Finally, we presented our feature extraction process in detail, which could be useful for future researchers. Previous works extracted features using software that provides few details [54] or omitted feature extraction from their analysis process [55].
Currently, ENs are unlikely to be able to replace very established detection methods such as mammography, MRI, and biopsies. However, the simplicity and low cost of analysis make BC screening by EN a very promising option for developing countries, wherein early detection could significantly lower the BC mortality rate. In addition, the non-invasive nature of the technique makes it simple to implement in many clinics. As the pool of exhaled breath data increases, the use of AI big data analysis [57] should further improve classification models and increase the number of diseases recognized by exhaled breath.

5. Conclusions

This study aimed to develop a robust, accurate, and highly reliable screening method for the early detection of BC in routine screening using exhaled breath (EB) analysis. The method’s advantages include its low cost, simplicity, and non-invasive nature. Samples were collected from 242 women (120 cancer patients, 122 healthy) and analyzed using a commercial EN. The EN is composed of 32 polymer-based sensors that change their electrical conductivity when exposed to gas mixtures. Feature extraction was applied to the results to reduce the data to their most relevant features. An SVM model was adopted for data classification. After training the model, the validation set results were 80, 94, 88, and 95% for recall, precision, accuracy, and specificity, respectively. The analysis of 22 unknown samples gave results of 91% in accuracy, specificity, recall, and precision.
This study demonstrates the potential of EN as a non-invasive, simple, safe, painless, and inexpensive diagnosis method and demonstrates the importance of classification models after features have been extracted from the data. Further research with larger sets of data, larger sample volumes, and optimized sensor arrays could further improve the results obtained in this work.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s25072210/s1, Table S1: Selected features for classification model, Figure S1: Boxchart of all standardized features selected for the model, with legend, Figure S2: PCA plots (PCA-3 and PCA-4) for all features and the selected, standardized features, Table S2: Parameter range used in model optimization.

Author Contributions

Conceptualization, Y.Z. and Y.M.; methodology, Y.Z., Y.M., Z.B., D.L. and S.L.; software, Y.M. and B.A.; formal analysis, Y.M. and B.A.; resources, Y.Z. and S.L.; writing—original draft preparation, Y.Z., O.Z., B.A. and Y.M.; writing—review and editing, O.Z.; supervision, Y.Z., S.L., D.L., Z.B. and Y.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Ben Gurion University, approval number SOR-0314-18.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Materials. Further inquiries can be directed to the corresponding author(s).

Acknowledgments

We thank Eyal Naos for all his help. We thank Itzhak Sedgi for his time and assistance.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
BC: breast cancer
VOCs: volatile organic compounds
EB: exhaled breath
EN: electronic nose
SVM: support vector machine

References

  1. The World Health Organization. In Breast Cancer; The World Health Organization: Geneva, Switzerland, 2024.
  2. The Breast Cancer Research Foundation. In The Breast Cancer Statistics and Resources; Cancer Research Foundation: New York, NY, USA, 2024.
  3. Wilkinson, L.; Gathani, T. Understanding Breast Cancer as a Global Health Concern. Br. J. Radiol. 2022, 95, 20211033. [Google Scholar] [CrossRef]
  4. Da Costa Vieira, R.A.; Biller, G.; Uemura, G.; Ruiz, C.A.; Curado, M.P. Breast Cancer Screening in Developing Countries. Clinics 2017, 72, 244–253. [Google Scholar] [CrossRef] [PubMed]
  5. Shulman, L.N.; Willett, W.; Sievers, A.; Knaul, F.M. Breast Cancer in Developing Countries: Opportunities for Improved Survival. J. Oncol. 2010, 2010, 595167. [Google Scholar] [CrossRef]
  6. Siegel, R.L.; Giaquinto, A.N.; Jemal, A. Cancer Statistics, 2024. CA Cancer J. Clin. 2024, 74, 12–49. [Google Scholar] [CrossRef]
  7. Jaglan, P.; Dass, R.; Duhan, M. Breast Cancer Detection Techniques: Issues and Challenges. J. Inst. Eng. India Ser. B 2019, 100, 379–386. [Google Scholar] [CrossRef]
  8. Li, J.; Guan, X.; Fan, Z.; Ching, L.-M.; Li, Y.; Wang, X.; Cao, W.-M.; Liu, D.-X. Non-Invasive Biomarkers for Early Detection of Breast Cancer. Cancers 2020, 12, 2767. [Google Scholar] [CrossRef]
  9. Shermis, R.B.; Wilson, K.D.; Doyle, M.T.; Martin, T.S.; Merryman, D.; Kudrolli, H.; Brenner, R.J. Supplemental Breast Cancer Screening With Molecular Breast Imaging for Women With Dense Breast Tissue. Am. J. Roentgenol. 2016, 207, 450–457. [Google Scholar] [CrossRef]
  10. Bae, M.S.; Moon, W.K.; Chang, J.M.; Koo, H.R.; Kim, W.H.; Cho, N.; Yi, A.; La Yun, B.; Lee, S.H.; Kim, M.Y.; et al. Breast Cancer Detected with Screening US: Reasons for Nondetection at Mammography. Radiology 2014, 270, 369–377. [Google Scholar] [CrossRef]
  11. Dromain, C.; Thibault, F.; Diekmann, F.; Fallenberg, E.M.; Jong, R.A.; Koomen, M.; Hendrick, R.E.; Tardivon, A.; Toledano, A. Dual-Energy Contrast-Enhanced Digital Mammography: Initial Clinical Results of a Multireader, Multicase Study. Breast Cancer Res. 2012, 14, R94. [Google Scholar] [CrossRef]
  12. Knogler, T.; Homolka, P.; Hoernig, M.; Leithner, R.; Langs, G.; Waitzbauer, M.; Pinker, K.; Leitner, S.; Helbich, T.H. Application of BI-RADS Descriptors in Contrast-Enhanced Dual-Energy Mammography: Comparison with MRI. Breast Care 2017, 12, 212–216. [Google Scholar] [CrossRef]
  13. Menezes, G.L. Magnetic Resonance Imaging in Breast Cancer: A Literature Review and Future Perspectives. WJCO 2014, 5, 61. [Google Scholar] [CrossRef] [PubMed]
  14. Jossinet, J. The Impedivity of Freshly Excised Human Breast Tissue. Physiol. Meas. 1998, 19, 61–75. [Google Scholar] [CrossRef] [PubMed]
  15. Laganà, F.; Prattico, D.; De Carlo, D.; Oliva, G.; Pullano, S.A.; Calcagno, S. Engineering Biomedical Problems to Detect Carcinomas: A Tomographic Impedance Approach. Eng 2024, 5, 1594–1614. [Google Scholar] [CrossRef]
  16. Krilaviciute, A.; Heiss, J.A.; Leja, M.; Kupcinskas, J.; Haick, H.; Brenner, H. Detection of Cancer through Exhaled Breath: A Systematic Review. Oncotarget 2015, 6, 38643–38657. [Google Scholar] [CrossRef]
  17. Oakley-Girvan, I.; Davis, S.W. Breath Based Volatile Organic Compounds in the Detection of Breast, Lung, and Colorectal Cancers: A Systematic Review. CBM 2017, 21, 29–39. [Google Scholar] [CrossRef]
  18. Phillips, M.; Cataneo, R.N.; Ditkoff, B.A.; Fisher, P.; Greenberg, J.; Gunawardena, R.; Kwon, C.S.; Rahbari-Oskoui, F.; Wong, C. Volatile Markers of Breast Cancer in the Breath. Breast J. 2003, 9, 184–191. [Google Scholar] [CrossRef]
  19. Phillips, M.; Cataneo, R.N.; Ditkoff, B.A.; Fisher, P.; Greenberg, J.; Gunawardena, R.; Kwon, C.S.; Tietje, O.; Wong, C. Prediction of Breast Cancer Using Volatile Biomarkers in the Breath. Breast Cancer Res. Treat. 2006, 99, 19–21. [Google Scholar] [CrossRef]
  20. Phillips, M.; Cataneo, R.N.; Saunders, C.; Hope, P.; Schmitt, P.; Wai, J. Volatile Biomarkers in the Breath of Women with Breast Cancer. J. Breath Res. 2010, 4, 026003. [Google Scholar] [CrossRef]
  21. Shuster, G.; Gallimidi, Z.; Reiss, A.H.; Dovgolevsky, E.; Billan, S.; Abdah-Bortnyak, R.; Kuten, A.; Engel, A.; Shiban, A.; Tisch, U.; et al. Classification of Breast Cancer Precursors through Exhaled Breath. Breast Cancer Res. Treat. 2011, 126, 791–796. [Google Scholar] [CrossRef]
  22. Webb, S. Breathing Out. ACS Cent. Sci. 2016, 2, 59–61. [Google Scholar] [CrossRef]
  23. Scarlata, S.; Finamore, P.; Meszaros, M.; Dragonieri, S.; Bikov, A. The Role of Electronic Noses in Phenotyping Patients with Chronic Obstructive Pulmonary Disease. Biosensors 2020, 10, 171. [Google Scholar] [CrossRef] [PubMed]
  24. Binson, V.A.; Subramoniam, M.; Sunny, Y.; Mathew, L. Prediction of Pulmonary Diseases With Electronic Nose Using SVM and XGBoost. IEEE Sens. J. 2021, 21, 20886–20895.
  25. Gudiño-Ochoa, A.; García-Rodríguez, J.A.; Ochoa-Ornelas, R.; Cuevas-Chávez, J.I.; Sánchez-Arias, D.A. Noninvasive Diabetes Detection through Human Breath Using TinyML-Powered E-Nose. Sensors 2024, 24, 1294.
  26. Binson, V.A.; Mathew, P.; Thomas, S.; Mathew, L. Detection of Lung Cancer and Stages via Breath Analysis Using a Self-Made Electronic Nose Device. Expert Rev. Mol. Diagn. 2024, 24, 341–353.
  27. Daulton, E.; Wicaksono, A.N.; Tiele, A.; Kocher, H.M.; Debernardi, S.; Crnogorac-Jurcevic, T.; Covington, J.A. Volatile Organic Compounds (VOCs) for the Non-Invasive Detection of Pancreatic Cancer from Urine. Talanta 2021, 221, 121604.
  28. Schmidt, K.; Podmore, I. Current Challenges in Volatile Organic Compounds Analysis as Potential Biomarkers of Cancer. J. Biomark. 2015, 2015, 981458.
  29. Hanna, G.B.; Boshier, P.R.; Markar, S.R.; Romano, A. Accuracy and Methodologic Challenges of Volatile Organic Compound–Based Exhaled Breath Tests for Cancer Diagnosis: A Systematic Review and Meta-Analysis. JAMA Oncol. 2019, 5, e182815.
  30. Szulejko, J.E.; McCulloch, M.; Jackson, J.; McKee, D.L.; Walker, J.C.; Solouki, T. Evidence for Cancer Biomarkers in Exhaled Breath. IEEE Sens. J. 2010, 10, 185–210.
  31. Markar, S.R.; Wiggins, T.; Antonowicz, S.; Chin, S.-T.; Romano, A.; Nikolic, K.; Evans, B.; Cunningham, D.; Mughal, M.; Lagergren, J.; et al. Assessment of a Noninvasive Exhaled Breath Test for the Diagnosis of Oesophagogastric Cancer. JAMA Oncol. 2018, 4, 970.
  32. Peng, G.; Hakim, M.; Broza, Y.Y.; Billan, S.; Abdah-Bortnyak, R.; Kuten, A.; Tisch, U.; Haick, H. Detection of Lung, Breast, Colorectal, and Prostate Cancers from Exhaled Breath Using a Single Array of Nanosensors. Br. J. Cancer 2010, 103, 542–551.
  33. Kou, L.; Zhang, D.; Liu, D. A Novel Medical E-Nose Signal Analysis System. Sensors 2017, 17, 402.
  34. Elia, P.; Raizelman, S. Biomarkers for the Detection of Pre-Cancerous Stage of Cervical Dysplasia. J. Mol. Biomark. Diagn. 2015, 6, 255.
  35. Jurado, C.; Giménez, M.P.; Soriano, T.; Menéndez, M.; Repetto, M. Rapid Analysis of Amphetamine, Methamphetamine, MDA, and MDMA in Urine Using Solid-Phase Microextraction, Direct On-Fiber Derivatization, and Analysis by GC-MS. J. Anal. Toxicol. 2000, 24, 11–16.
  36. Herman-Saffar, O.; Boger, Z.; Libson, S.; Lieberman, D.; Gonen, R.; Zeiri, Y. Early Non-Invasive Detection of Breast Cancer Using Exhaled Breath and Urine Analysis. Comput. Biol. Med. 2018, 96, 227–232.
  37. Li, J.; Peng, Y.; Duan, Y. Diagnosis of Breast Cancer Based on Breath Analysis: An Emerging Method. Crit. Rev. Oncol./Hematol. 2013, 87, 28–40.
  38. Lourenço, C.; Turner, C. Breath Analysis in Disease Diagnosis: Methodological Considerations and Applications. Metabolites 2014, 4, 465–498.
  39. Röck, F.; Barsan, N.; Weimar, U. Electronic Nose: Current Status and Future Trends. Chem. Rev. 2008, 108, 705–725.
  40. Wojnowski, W.; Dymerski, T.; Gębicki, J.; Namieśnik, J. Electronic Noses in Medical Diagnostics. Curr. Med. Chem. 2019, 26, 197–215.
  41. Wilson, A. Advances in Electronic-Nose Technologies for the Detection of Volatile Biomarker Metabolites in the Human Breath. Metabolites 2015, 5, 140–163.
  42. Li, Z.; Yu, J.; Dong, D.; Yao, G.; Wei, G.; He, A.; Wu, H.; Zhu, H.; Huang, Z.; Tang, Z. E-Nose Based on a High-Integrated and Low-Power Metal Oxide Gas Sensor Array. Sens. Actuators B Chem. 2023, 380, 133289.
  43. Jovel, J.; Greiner, R. An Introduction to Machine Learning Approaches for Biomedical Research. Front. Med. 2021, 8, 771607.
  44. Binson, V.A.; Thomas, S.; Subramoniam, M.; Arun, J.; Naveen, S.; Madhu, S. A Review of Machine Learning Algorithms for Biomedical Applications. Ann. Biomed. Eng. 2024, 52, 1159–1183.
  45. Park, C.; Took, C.C.; Seong, J.-K. Machine Learning in Biomedical Engineering. Biomed. Eng. Lett. 2018, 8, 1–3.
  46. Binson, V.A.; Subramoniam, M.; Mathew, L. Detection of COPD and Lung Cancer with Electronic Nose Using Ensemble Learning Methods. Clin. Chim. Acta 2021, 523, 231–238.
  47. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
  48. Tsujitani, M.; Tanaka, Y. Cross-Validation, Bootstrap, and Support Vector Machines. Adv. Artif. Neural Syst. 2011, 2011, 302572.
  49. Horváth, I.; Barnes, P.J.; Loukides, S.; Sterk, P.J.; Högman, M.; Olin, A.-C.; Amann, A.; Antus, B.; Baraldi, E.; Bikov, A.; et al. A European Respiratory Society Technical Standard: Exhaled Biomarkers in Lung Disease. Eur. Respir. J. 2017, 49, 1600965.
  50. Kramer, O. Scikit-Learn. In Machine Learning for Evolution Strategies; Studies in Big Data; Springer International Publishing: Cham, Switzerland, 2016; Volume 20, pp. 45–53. ISBN 978-3-319-33381-6.
  51. Bergstra, J.; Komer, B.; Eliasmith, C.; Yamins, D.; Cox, D.D. Hyperopt: A Python Library for Model Selection and Hyperparameter Optimization. Comput. Sci. Discov. 2015, 8, 014008.
  52. Phillips, M.; Cataneo, R.N.; Cruz-Ramos, J.A.; Huston, J.; Ornelas, O.; Pappas, N.; Pathak, S. Prediction of Breast Cancer Risk with Volatile Biomarkers in Breath. Breast Cancer Res. Treat. 2018, 170, 343–350.
  53. Barash, O.; Zhang, W.; Halpern, J.M.; Hua, Q.-L.; Pan, Y.-Y.; Kayal, H.; Khoury, K.; Liu, H.; Davies, M.P.A.; Haick, H. Differentiation Between Genetic Mutations of Breast Cancer by Breath Volatolomics. Oncotarget 2015, 6, 44864–44876.
  54. Díaz De León-Martínez, L.; Rodríguez-Aguilar, M.; Gorocica-Rosete, P.; Domínguez-Reyes, C.A.; Martínez-Bustos, V.; Tenorio-Torres, J.A.; Ornelas-Rebolledo, O.; Cruz-Ramos, J.A.; Balderas-Segura, B.; Flores-Ramírez, R. Identification of Profiles of Volatile Organic Compounds in Exhaled Breath by Means of an Electronic Nose as a Proposal for a Screening Method for Breast Cancer: A Case-Control Study. J. Breath Res. 2020, 14, 046009.
  55. Yang, H.-Y.; Wang, Y.-C.; Peng, H.-Y.; Huang, C.-H. Breath Biopsy of Breast Cancer Using Sensor Array Signals and Machine Learning Analysis. Sci. Rep. 2021, 11, 103.
  56. Ferreira, S.L.C.; Caires, A.O.; Borges, T.D.S.; Lima, A.M.D.S.; Silva, L.O.B.; Dos Santos, W.N.L. Robustness Evaluation in Analytical Methods Optimized Using Experimental Designs. Microchem. J. 2017, 131, 163–169.
  57. Wang, C.; He, T.; Zhou, H.; Zhang, Z.; Lee, C. Artificial Intelligence Enhanced Sensors—Enabling Technologies to next-Generation Healthcare and Biomedical Platform. Bioelectron. Med. 2023, 9, 17.
Figure 1. Injection of a sample into the EN using a disposable syringe.
Figure 2. A typical sample before (left) and after (right) normalization. Each line represents a different sensor. The results are shown after extraction of the effective signal.
Figure 3. Box chart of all standardized features selected for the model. Results for healthy subjects are shown in a darker color and are more narrowly spread and more negative than those for sick subjects. The sample legend is provided in the Supplementary Materials; X marks the mean of the dataset.
Figure 4. PCA plots for all features (left) and the selected, standardized features (right). Data from healthy subjects is in red, and from sick subjects in blue. Additional plots (PCA-3 and PCA-4) are presented in the Supplementary Materials.
Figure 5. Confusion matrix for the training data set. Labels 0 and 1 represent healthy and sick subjects, respectively.
Figure 6. Confusion matrix for the test data set. Labels 0 and 1 represent healthy and sick subjects, respectively.
Table 1. The optimized parameters obtained for the SVM model.
Parameter | Value
C | 66
Coef0 | 66
Degree | 10
Gamma | 2.5
Tol | 0.02
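To make the role of the Table 1 values concrete, the following is a minimal sketch that evaluates a polynomial kernel of the form (gamma · ⟨x, y⟩ + coef0)^degree with those values. It assumes a polynomial kernel, since in scikit-learn's SVC the degree setting is used only by that kernel; the kernel choice is not stated in Table 1. The lower-case key names, the helper `poly_kernel`, and the two feature vectors are illustrative assumptions, not taken from the study.

```python
# Values copied from Table 1. The lower-case keys follow scikit-learn's SVC
# argument names, which is an assumption about how the table maps to code.
SVM_PARAMS = {"C": 66, "coef0": 66, "degree": 10, "gamma": 2.5, "tol": 0.02}


def poly_kernel(x, y, gamma, coef0, degree):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


if __name__ == "__main__":
    # Two made-up standardized feature vectors, for illustration only.
    x = [0.5, -1.0, 0.2]
    y = [0.1, 0.3, -0.4]
    k = poly_kernel(
        x, y, SVM_PARAMS["gamma"], SVM_PARAMS["coef0"], SVM_PARAMS["degree"]
    )
    print(f"K(x, y) = {k:.3e}")
    # With scikit-learn installed, the same settings could plausibly be passed
    # as sklearn.svm.SVC(kernel="poly", **SVM_PARAMS); kernel="poly" is assumed.
```

C and Tol do not enter the kernel itself: C sets the regularization strength and Tol the solver's stopping tolerance.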
Table 2. Scores obtained for the training and validation sets.
Metric | Train | Validation
Recall | 0.80 ± 0.02 | 0.80 ± 0.11
Precision | 0.95 ± 0.01 | 0.94 ± 0.06
Accuracy | 0.89 ± 0.01 | 0.88 ± 0.06
Specificity | 0.96 ± 0.01 | 0.95 ± 0.05
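The abstract describes a Monte Carlo cross-validation with 100 iterations, each holding out 20% of the data. A rough sketch of how such repeated random splits can produce "mean ± spread" entries like those in Table 2 is given below. The helper names, the seed, the placeholder labels, and the trivial always-positive "model" are assumptions for illustration; whether the ± values in Table 2 are standard deviations is also an assumption.

```python
import random
import statistics


def monte_carlo_splits(n_samples, n_iter=100, val_fraction=0.2, seed=0):
    """Yield (train_idx, val_idx) pairs from repeated random sub-sampling."""
    rng = random.Random(seed)
    n_val = round(n_samples * val_fraction)
    indices = list(range(n_samples))
    for _ in range(n_iter):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # The first n_val shuffled indices form the validation set.
        yield shuffled[n_val:], shuffled[:n_val]


def summarize(scores):
    """Return (mean, sample standard deviation) of per-iteration scores."""
    return statistics.mean(scores), statistics.stdev(scores)


if __name__ == "__main__":
    # Placeholder labels and a placeholder "model" that always predicts
    # class 1; neither comes from the study.
    labels = [i % 2 for i in range(50)]
    accuracies = []
    for train_idx, val_idx in monte_carlo_splits(len(labels)):
        correct = sum(1 for i in val_idx if labels[i] == 1)
        accuracies.append(correct / len(val_idx))
    mean, std = summarize(accuracies)
    print(f"accuracy = {mean:.2f} ± {std:.2f}")
```

Unlike k-fold cross-validation, the validation sets drawn this way may overlap between iterations, which is why the number of iterations can be chosen independently of the hold-out fraction.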
Table 3. Scores obtained for the test set.
Metric | Test
Recall | 0.91
Precision | 0.91
Accuracy | 0.91
Specificity | 0.91
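For reference, the four scores reported in Tables 2 and 3 can be derived from confusion-matrix counts as sketched below. The function name is hypothetical, and the example counts (10 true positives, 1 false negative, 10 true negatives, 1 false positive) are an assumption: they are one split of 22 samples that is consistent with every Table 3 entry rounding to 0.91, not values read from Figure 6.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute recall, precision, accuracy and specificity from counts."""
    return {
        "recall": tp / (tp + fn),            # sick subjects correctly flagged
        "precision": tp / (tp + fp),         # flagged subjects who are sick
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "specificity": tn / (tn + fp),       # healthy subjects correctly cleared
    }


if __name__ == "__main__":
    # Assumed counts for a 22-sample test set (illustration only).
    scores = binary_metrics(tp=10, fn=1, tn=10, fp=1)
    for name, value in scores.items():
        print(f"{name}: {value:.2f}")
```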
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Matana, Y.; Libson, S.; Amihood, B.; Boger, Z.; Lieberman, D.; Zeiri, O.; Zeiri, Y. Chemical Nose-Based Non-Invasive Detection of Breast Cancer Using Exhaled Breath. Sensors 2025, 25, 2210. https://doi.org/10.3390/s25072210
