Machine Learning Techniques in Cancer

A special issue of Cancers (ISSN 2072-6694). This special issue belongs to the section "Cancer Informatics and Big Data".

Deadline for manuscript submissions: closed (15 July 2021) | Viewed by 108591

Special Issue Editor


Dr. Ognjen Arandjelović
Guest Editor
School of Computer Science, University of St Andrews, St Andrews KY16 9SX, UK
Interests: computer vision; pattern recognition; machine learning; bioinformatics; statistics; mathematical modelling

Special Issue Information

Dear colleagues,

Medicine and healthcare have already experienced major benefits stemming from the use of machine learning. Within this broad realm of application, oncology-related problems have been attracting particular interest from the research community, both because of their practical significance and because of the technical challenges they present. These include inherent challenges such as the heterogeneity of the disease itself, but also the need to deal with the multimodal nature of data acquisition (histopathology, radiography, magnetic resonance imaging, computed tomography, and others), large data (and datum) sizes, etc. While significant progress has already been demonstrated, the field is still in relative infancy and offers a major opportunity for innovation and a paradigm shift; in particular, prognosis, that is, the prediction of disease development on a patient rather than population level, remains challenging.

We welcome high-quality submissions on any topic falling under the broad umbrella of cancer-related machine learning. While contributions with high technical novelty are preferred, we will also consider manuscripts that have a more practical focus but demonstrate particularly convincing and significant clinical results. We would also particularly welcome and encourage submissions which use machine learning as a means of gaining new insights into the underlying molecular mechanisms of cancer.

Dr. Ognjen Arandjelović
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Cancers is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2900 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • oncology
  • digital pathology
  • deep learning
  • multimodal prognostics

Published Papers (26 papers)


Research


0 pages, 26903 KiB  
Article
Morphological Features Extracted by AI Associated with Spatial Transcriptomics in Prostate Cancer
by Eduard Chelebian, Christophe Avenel, Kimmo Kartasalo, Maja Marklund, Anna Tanoglidi, Tuomas Mirtti, Richard Colling, Andrew Erickson, Alastair D. Lamb, Joakim Lundeberg and Carolina Wählby
Cancers 2021, 13(19), 4837; https://doi.org/10.3390/cancers13194837 - 28 Sep 2021
Cited by 14 | Viewed by 4624
Abstract
Prostate cancer is a common cancer type in men, yet some of its traits are still under-explored. One reason for this is high molecular and morphological heterogeneity. The purpose of this study was to develop a method to gain new insights into the connection between morphological changes and underlying molecular patterns. We used artificial intelligence (AI) to analyze the morphology of seven hematoxylin and eosin (H&E)-stained prostatectomy slides from a patient with multi-focal prostate cancer. We also paired the slides with spatially resolved expression for thousands of genes obtained by a novel spatial transcriptomics (ST) technique. As both spaces are highly dimensional, we focused on dimensionality reduction before seeking associations between them. Consequently, we extracted morphological features from H&E images using an ensemble of pre-trained convolutional neural networks and proposed a workflow for dimensionality reduction. To summarize the ST data into genetic profiles, we used a previously proposed factor analysis. We found that the regions automatically defined and outlined by unsupervised clustering were associated with independent manual annotations, in some cases revealing further relevant subdivisions. The morphological patterns were also correlated with molecular profiles and could predict the spatial variation of individual genes. This novel approach enables flexible, unsupervised, AI-based studies relating morphological and genetic heterogeneity.
(This article belongs to the Special Issue Machine Learning Techniques in Cancer)

16 pages, 3672 KiB  
Article
Histological Grade of Endometrioid Endometrial Cancer and Relapse Risk Can Be Predicted with Machine Learning from Gene Expression Data
by Péter Gargya and Bálint László Bálint
Cancers 2021, 13(17), 4348; https://doi.org/10.3390/cancers13174348 - 27 Aug 2021
Cited by 5 | Viewed by 3353
Abstract
The tumor grade of endometrioid endometrial cancer is used as an independent marker of prognosis and a key component in clinical decision making. However, the intermediate grade 2, which lies between grades 1 and 3, is reported to carry limited information; thus, patients with grade 2 tumors are at risk of both under- and overtreatment. We used RNA-sequencing data from the TCGA project and machine learning to develop a model which can correctly classify grade 1 and grade 3 samples. We used the trained model on grade 2 patients to subdivide them into low-risk and high-risk groups. With iterative retraining, we selected the most relevant 12 transcripts to build a simplified model without losing accuracy. Both models had a high AUC of 0.93. In both cases, there was a significant difference in the relapse-free survivals of the newly identified grade 2 subgroups. Both models could identify grade 2 patients that have a higher risk of relapse. Our approach overcomes the subjective components of the histological evaluation. The developed method can be automated to perform a prescreening of the samples before a final decision is made by pathologists. Our translational approach based on machine learning methods could allow for better therapeutic planning for grade 2 endometrial cancer patients.

14 pages, 4087 KiB  
Article
Relevant and Non-Redundant Feature Selection for Cancer Classification and Subtype Detection
by Pratip Rana, Phuc Thai, Thang Dinh and Preetam Ghosh
Cancers 2021, 13(17), 4297; https://doi.org/10.3390/cancers13174297 - 26 Aug 2021
Cited by 7 | Viewed by 2432
Abstract
Biologists seek to identify a small number of significant features that are important, non-redundant, and relevant from diverse omics data. For example, statistical methods such as LIMMA and DESeq distinguish differentially expressed genes between a case and control group from the transcript profile. Researchers also apply various column subset selection algorithms on genomics datasets for a similar purpose. Unfortunately, genes selected by such statistical or machine learning methods are often highly co-regulated, making their performance inconsistent. Here, we introduce a novel feature selection algorithm that selects highly disease-related and non-redundant features from a diverse set of omics datasets. We successfully applied this algorithm to three different biological problems: (a) disease-to-normal sample classification; (b) multiclass classification of different disease samples; and (c) disease subtype detection. In terms of classification ROC-AUC, false-positive rate, and false-negative rate, our algorithm outperformed other gene selection and differential expression (DE) methods for all six types of cancer datasets from TCGA considered here for binary and multiclass classification problems. Moreover, genes picked by our algorithm improved the disease subtyping accuracy for four different cancer types over state-of-the-art methods. Hence, we posit that our proposed feature reduction method can support the community to solve various problems, including the selection of disease-specific biomarkers, precision medicine design, and disease sub-type detection.

21 pages, 9422 KiB  
Article
A Means of Assessing Deep Learning-Based Detection of ICOS Protein Expression in Colon Cancer
by Md Mostafa Kamal Sarker, Yasmine Makhlouf, Stephanie G. Craig, Matthew P. Humphries, Maurice Loughrey, Jacqueline A. James, Manuel Salto-Tellez, Paul O’Reilly and Perry Maxwell
Cancers 2021, 13(15), 3825; https://doi.org/10.3390/cancers13153825 - 29 Jul 2021
Cited by 20 | Viewed by 3088
Abstract
Biomarkers identify patient response to therapy. The potential immune-checkpoint biomarker, Inducible T-cell COStimulator (ICOS), expressed on regulating T-cell activation and involved in adaptive immune responses, is of great interest. We have previously shown that open-source software for digital pathology image analysis can be used to detect and quantify ICOS using cell detection algorithms based on traditional image processing techniques. Currently, artificial intelligence (AI) based on deep learning methods is significantly impacting the domain of digital pathology, including the quantification of biomarkers. In this study, we propose a general AI-based workflow for applying deep learning to the problem of cell segmentation/detection in IHC slides as a basis for quantifying nuclear staining biomarkers, such as ICOS. It consists of two main parts: a simplified but robust annotation process, and cell segmentation/detection models. This results in an optimised annotation process with a new user-friendly tool that can interact with other open-source software and assists pathologists and scientists in creating and exporting data for deep learning. We present a set of architectures for cell-based segmentation/detection to quantify and analyse the trade-offs between them, proving to be more accurate and less time consuming than traditional methods. This approach can identify the best tool to deliver the prognostic significance of ICOS protein expression.

15 pages, 2012 KiB  
Article
Explainable Artificial Intelligence Reveals Novel Insight into Tumor Microenvironment Conditions Linked with Better Prognosis in Patients with Breast Cancer
by Debaditya Chakraborty, Cristina Ivan, Paola Amero, Maliha Khan, Cristian Rodriguez-Aguayo, Hakan Başağaoğlu and Gabriel Lopez-Berestein
Cancers 2021, 13(14), 3450; https://doi.org/10.3390/cancers13143450 - 9 Jul 2021
Cited by 20 | Viewed by 4478
Abstract
We investigated the data-driven relationship between immune cell composition in the tumor microenvironment (TME) and the ≥5-year survival rates of breast cancer patients using explainable artificial intelligence (XAI) models. We acquired TCGA breast invasive carcinoma data from the cBioPortal and retrieved immune cell composition estimates from bulk RNA sequencing data from TIMER2.0 based on EPIC, CIBERSORT, TIMER, and xCell computational methods. Novel insights derived from our XAI model showed that B cells, CD8+ T cells, M0 macrophages, and NK T cells are the most critical TME features for enhanced prognosis of breast cancer patients. Our XAI model also revealed the inflection points of these critical TME features, above or below which ≥5-year survival rates improve. Subsequently, we ascertained the conditional probabilities of ≥5-year survival under specific conditions inferred from the inflection points. In particular, the XAI models revealed that the B cell fraction (relative to all cells in a sample) exceeding 0.025, M0 macrophage fraction (relative to the total immune cell content) below 0.05, and NK T cell and CD8+ T cell fractions (based on cancer type-specific arbitrary units) above 0.075 and 0.25, respectively, in the TME could enhance the ≥5-year survival in breast cancer patients. The findings could lead to accurate clinical predictions and enhanced immunotherapies, and to the design of innovative strategies to reprogram the breast TME.

15 pages, 1423 KiB  
Article
MOUSSE: Multi-Omics Using Subject-Specific SignaturEs
by Giuseppe Fiorentino, Roberto Visintainer, Enrico Domenici, Mario Lauria and Luca Marchetti
Cancers 2021, 13(14), 3423; https://doi.org/10.3390/cancers13143423 - 8 Jul 2021
Cited by 3 | Viewed by 3315
Abstract
High-throughput technologies make it possible to produce a large amount of data representing different biological layers, examples of which are genomics, proteomics, metabolomics and transcriptomics. Omics data have been individually investigated to understand the molecular bases of various diseases, but this may not be sufficient to fully capture the molecular mechanisms and the multilayer regulatory processes underlying complex diseases, especially cancer. To overcome this problem, several multi-omics integration methods have been introduced but a commonly agreed standard of analysis is still lacking. In this paper, we present MOUSSE, a novel normalization-free pipeline for unsupervised multi-omics integration. The main innovations are the use of rank-based subject-specific signatures and the use of such signatures to derive subject similarity networks. A separate similarity network was derived for each omics, and the resulting networks were then carefully merged in a way that considered their informative content. We applied it to analyze survival in ten different types of cancer. We produced a meaningful clusterization of the subjects and obtained a higher average classification score than ten state-of-the-art algorithms tested on the same data. As further validation, we extracted from the subject-specific signatures a list of relevant features used for the clusterization and investigated their biological role in survival. We were able to verify that, according to the literature, these features are highly involved in cancer progression and differential survival.

19 pages, 4139 KiB  
Article
Deep Learning-Based Stage-Wise Risk Stratification for Early Lung Adenocarcinoma in CT Images: A Multi-Center Study
by Jing Gong, Jiyu Liu, Haiming Li, Hui Zhu, Tingting Wang, Tingdan Hu, Menglei Li, Xianwu Xia, Xianfang Hu, Weijun Peng, Shengping Wang, Tong Tong and Yajia Gu
Cancers 2021, 13(13), 3300; https://doi.org/10.3390/cancers13133300 - 30 Jun 2021
Cited by 10 | Viewed by 2655
Abstract
This study aims to develop a deep neural network (DNN)-based two-stage risk stratification model for early lung adenocarcinomas in CT images, and investigate the performance compared with practicing radiologists. A total of 2393 ground-glass nodules (GGNs) were retrospectively collected from 2105 patients in four centers. All the pathologic results of GGNs were obtained from surgically resected specimens. A two-stage deep neural network was developed based on the 3D residual network and atrous convolution module to diagnose benign and malignant GGNs (Task1) and classify between invasive adenocarcinoma (IA) and non-IA for these malignant GGNs (Task2). A multi-reader multi-case observer study involving six board-certified radiologists (average experience 11 years, range 2–28 years) was conducted to evaluate the model capability. The DNN yielded area under the receiver operating characteristic curve (AUC) values of 0.76 ± 0.03 (95% confidence interval (CI): (0.69, 0.82)) and 0.96 ± 0.02 (95% CI: (0.92, 0.98)) for Task1 and Task2, which were equivalent to or higher than radiologists in the senior group with average AUC values of 0.76 and 0.95, respectively (p > 0.05). With the CT image slice thickness increasing from 1.15 mm ± 0.36 to 1.73 mm ± 0.64, DNN performance decreased by 0.08 and 0.22 for the two tasks. The results demonstrated (1) a positive trend between the diagnostic performance and radiologist’s experience, (2) the DNN yielded equivalent or even higher performance in comparison with senior radiologists, and (3) low image resolution decreased model performance in predicting the risks of GGNs. Once tested prospectively in clinical practice, the DNN could have the potential to assist doctors in precision diagnosis and treatment of early lung adenocarcinoma.

21 pages, 595 KiB  
Article
Predicting Postoperative Complications in Cancer Patients: A Survey Bridging Classical and Machine Learning Contributions to Postsurgical Risk Analysis
by Daniel M. Gonçalves, Rui Henriques and Rafael S. Costa
Cancers 2021, 13(13), 3217; https://doi.org/10.3390/cancers13133217 - 28 Jun 2021
Cited by 2 | Viewed by 2339
Abstract
Postoperative complications can impose a significant burden, increasing morbidity, mortality, and the in-hospital length of stay. Today, the number of studies available on the prognostication of postsurgical complications in cancer patients is growing and has already created a considerable set of dispersed contributions. This work provides a comprehensive survey on postoperative risk analysis, integrating principles from classic risk scores and machine-learning approaches within a coherent frame. A qualitative comparison is offered, taking into consideration the available cohort data and the targeted postsurgical outcomes of morbidity (such as the occurrence, nature or severity of postsurgical complications and hospitalization needs) and mortality. This work further establishes a taxonomy to assess the adequacy of cohort studies and guide the development and assessment of new learning approaches for the study and prediction of postoperative complications.

16 pages, 6236 KiB  
Article
SurvCNN: A Discrete Time-to-Event Cancer Survival Estimation Framework Using Image Representations of Omics Data
by Yogesh Kalakoti, Shashank Yadav and Durai Sundar
Cancers 2021, 13(13), 3106; https://doi.org/10.3390/cancers13133106 - 22 Jun 2021
Cited by 2 | Viewed by 2666
Abstract
The utility of multi-omics in personalized therapy and cancer survival analysis has been debated and demonstrated extensively in the recent past. Most of the current methods still suffer from data constraints such as high-dimensionality, unexplained interdependence, and subpar integration methods. Here, we propose SurvCNN, an alternative approach to process multi-omics data with robust computer vision architectures, to predict cancer prognosis for lung adenocarcinoma patients. Numerical multi-omics data were transformed into their image representations and fed into a convolutional neural network with a discrete-time model to predict survival probabilities. The framework also dichotomized patients into risk subgroups based on their survival probabilities over time. SurvCNN was evaluated on multiple performance metrics and outperformed existing methods with a high degree of confidence. Moreover, comprehensive insights into the relative performance of various combinations of omics datasets were probed. Critical biological processes, pathways and cell types identified from downstream processing of differentially expressed genes suggested that the framework could elucidate elements detrimental to a patient’s survival. Such integrative models with high predictive power would have a significant impact and utility in precision oncology.
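As an illustration only, and not the SurvCNN implementation, the discrete-time idea mentioned in this abstract can be sketched as follows. The sketch assumes that some model has already produced one conditional hazard per time interval (the probability of the event in that interval given survival up to it); survival probabilities then follow from S(t_k) = (1 - h_1) × ... × (1 - h_k). The function names, the example hazards, and the 0.5 cut-off used to split patients into two risk groups are all hypothetical.

```python
def survival_curve(hazards):
    """Turn per-interval conditional hazards into cumulative survival probabilities."""
    if not hazards:
        raise ValueError("at least one time interval is required")
    curve, surviving = [], 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        surviving *= 1.0 - h  # probability of surviving this interval as well
        curve.append(surviving)
    return curve


def risk_group(hazards, threshold=0.5):
    """Dichotomize a patient by survival probability at the last interval.

    The threshold is a hypothetical choice made for this sketch.
    """
    return "high-risk" if survival_curve(hazards)[-1] < threshold else "low-risk"


print(survival_curve([0.5, 0.5]))
print(risk_group([0.1, 0.1]))
```

In practice the cut-off would be chosen from the data, for example at the cohort median, rather than fixed in advance.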

22 pages, 13151 KiB  
Article
OmiEmbed: A Unified Multi-Task Deep Learning Framework for Multi-Omics Data
by Xiaoyu Zhang, Yuting Xing, Kai Sun and Yike Guo
Cancers 2021, 13(12), 3047; https://doi.org/10.3390/cancers13123047 - 18 Jun 2021
Cited by 36 | Viewed by 7350
Abstract
High-dimensional omics data contain intrinsic biomedical information that is crucial for personalised medicine. Nevertheless, it is challenging to capture them from the genome-wide data, due to the large number of molecular features and small number of available samples, which is also called “the curse of dimensionality” in machine learning. To tackle this problem and pave the way for machine learning-aided precision medicine, we proposed a unified multi-task deep learning framework named OmiEmbed to capture biomedical information from high-dimensional omics data with the deep embedding and downstream task modules. The deep embedding module learnt an omics embedding that mapped multiple omics data types into a latent space with lower dimensionality. Based on the new representation of multi-omics data, different downstream task modules were trained simultaneously and efficiently with the multi-task strategy to predict the comprehensive phenotype profile of each sample. OmiEmbed supports multiple tasks for omics data including dimensionality reduction, tumour type classification, multi-omics integration, demographic and clinical feature reconstruction, and survival prediction. The framework outperformed other methods on all three types of downstream tasks and achieved better performance with the multi-task strategy compared to training them individually. OmiEmbed is a powerful and unified framework that can be widely adapted to various applications of high-dimensional omics data and has great potential to facilitate more accurate and personalised clinical decision making.

16 pages, 1016 KiB  
Article
Identifying Cancer Drivers Using DRIVE: A Feature-Based Machine Learning Model for a Pan-Cancer Assessment of Somatic Missense Mutations
by Ionut Dragomir, Adnan Akbar, John W. Cassidy, Nirmesh Patel, Harry W. Clifford and Gianmarco Contino
Cancers 2021, 13(11), 2779; https://doi.org/10.3390/cancers13112779 - 3 Jun 2021
Cited by 4 | Viewed by 2909
Abstract
Sporadic cancer develops from the accrual of somatic mutations. Out of all small-scale somatic aberrations in coding regions, 95% are base substitutions, with 90% being missense mutations. While multiple studies have focused on the importance of this mutation type, a machine learning method based on the number of protein–protein interactions (PPIs) has not been fully explored. This study aims to develop an improved computational method for driver identification, validation and evaluation (DRIVE), which is compared to other methods for assessing its performance. DRIVE aims at distinguishing between driver and passenger mutations using a feature-based learning approach comprising two levels of biological classification for a pan-cancer assessment of somatic mutations. Gene-level features include the maximum number of protein–protein interactions, the biological process and the type of post-translational modifications (PTMs), while mutation-level features are based on pathogenicity scores. Multiple supervised classification algorithms were trained on Genomics Evidence Neoplasia Information Exchange (GENIE) project data and then tested on an independent dataset from The Cancer Genome Atlas (TCGA) study. Finally, the most powerful classifier using DRIVE was evaluated on a benchmark dataset, which showed a better overall performance compared to other state-of-the-art methodologies; however, this result should be interpreted with considerable care because of the reduced size of the dataset. DRIVE outlines the outstanding potential that multiple levels of a feature-based learning model will play in the future of oncology-based precision medicine.

13 pages, 1765 KiB  
Article
Prediction of Incident Cancers in the Lifelines Population-Based Cohort
by Francisco O. Cortés-Ibañez, Sunil Belur Nagaraj, Ludo Cornelissen, Gerjan J. Navis, Bert van der Vegt, Grigory Sidorenkov and Geertruida H. de Bock
Cancers 2021, 13(9), 2133; https://doi.org/10.3390/cancers13092133 - 28 Apr 2021
Cited by 2 | Viewed by 1984
Abstract
Cancer incidence is rising, and accurate prediction of incident cancers could be relevant to understanding and reducing cancer incidence. The aim of this study was to develop machine learning (ML) models that could predict an incident diagnosis of cancer. Participants without any history of cancer within the Lifelines population-based cohort were followed for a median of 7 years. Data were available for 116,188 cancer-free participants and 4232 incident cancer cases. At baseline, socioeconomic, lifestyle, and clinical variables were assessed. The main outcome was an incident cancer during follow-up (excluding skin cancer), based on linkage with the national pathology registry. The performance of three ML algorithms was evaluated using supervised binary classification to identify incident cancers among participants. Elastic net regularization and Gini index were used for variable selection. An overall area under the receiver operating characteristic curve (AUC) <0.75 was obtained; the highest AUC values were for prostate cancer (random forest AUC = 0.82 (95% CI 0.77–0.87), logistic regression AUC = 0.81 (95% CI 0.76–0.86), and support vector machines AUC = 0.83 (95% CI 0.78–0.88)), and age was the most important predictor in these models. Linear and non-linear ML algorithms including socioeconomic, lifestyle, and clinical variables produced a moderate predictive performance of incident cancers in the Lifelines cohort.
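For readers unfamiliar with the AUC figures quoted in this abstract, the following minimal sketch shows what the metric measures. It is not the study's code: it computes the AUC for a binary outcome as the fraction of (case, non-case) pairs in which the case receives the higher predicted score, counting ties as one half. The labels and scores in the example are made up for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # correctly ordered pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = incident cancer, 0 = cancer-free.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect separation. The pairwise loop is quadratic in the number of samples, so a rank-based formulation would be preferable for a cohort of this size.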

17 pages, 4712 KiB  
Article
Performance Comparison of Deep Learning Autoencoders for Cancer Subtype Detection Using Multi-Omics Data
by Edian F. Franco, Pratip Rana, Aline Cruz, Víctor V. Calderón, Vasco Azevedo, Rommel T. J. Ramos and Preetam Ghosh
Cancers 2021, 13(9), 2013; https://doi.org/10.3390/cancers13092013 - 22 Apr 2021
Cited by 34 | Viewed by 6690
Abstract
A heterogeneous disease such as cancer is activated through multiple pathways and different perturbations. Depending upon the activated pathway(s), the survival of the patients varies significantly and shows different efficacy to various drugs. Therefore, cancer subtype detection using genomics level data is a significant research problem. Subtype detection is often a complex problem, and in most cases, needs multi-omics data fusion to achieve accurate subtyping. Different data fusion and subtyping approaches have been proposed over the years, such as kernel-based fusion, matrix factorization, and deep learning autoencoders. In this paper, we compared the performance of different deep learning autoencoders for cancer subtype detection. We performed cancer subtype detection on four different cancer types from The Cancer Genome Atlas (TCGA) datasets using four autoencoder implementations. We also predicted the optimal number of subtypes in a cancer type using the silhouette score and found that the detected subtypes exhibit significant differences in survival profiles. Furthermore, we compared the effect of feature selection and similarity measures for subtype detection. For further evaluation, we used the Glioblastoma multiforme (GBM) dataset and identified the differentially expressed genes in each of the subtypes. The results obtained are consistent with other genomic studies and can be corroborated with the involved pathways and biological functions. Thus, it shows that the results from the autoencoders, obtained through the interaction of different datatypes of cancer, can be used for the prediction and characterization of patient subgroups and survival profiles.
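The use of the silhouette score to choose the number of subtypes can be illustrated with a small sketch. This is not the authors' pipeline: it assumes samples are already embedded as numeric points (for example, autoencoder latent vectors), uses Euclidean distance, and takes candidate cluster assignments as given rather than computing them. The function names and the toy data are hypothetical.

```python
import math


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0 by convention."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")
    total = 0.0
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for label, members in clusters.items()
            if label != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_num_clusters(points, labelings):
    """Pick the candidate k whose labeling has the highest mean silhouette."""
    return max(labelings, key=lambda k: silhouette(points, labelings[k]))


# Toy data: two well-separated groups, with a 2-cluster and a 3-cluster labeling.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labelings = {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]}
print(best_num_clusters(points, labelings))
```

Scores near 1 indicate compact, well-separated clusters, which is why the labeling with the highest mean score is taken as the preferred number of subtypes.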
(This article belongs to the Special Issue Machine Learning Techniques in Cancer)

20 pages, 9699 KiB  
Article
Assessment of Immunological Features in Muscle-Invasive Bladder Cancer Prognosis Using Ensemble Learning
by Christos G. Gavriel, Neofytos Dimitriou, Nicolas Brieu, Ines P. Nearchou, Ognjen Arandjelović, Günter Schmidt, David J. Harrison and Peter D. Caie
Cancers 2021, 13(7), 1624; https://doi.org/10.3390/cancers13071624 - 1 Apr 2021
Cited by 17 | Viewed by 3535
Abstract
The clinical staging and prognosis of muscle-invasive bladder cancer (MIBC) routinely include the assessment of patient tissue samples by a pathologist. Recent studies corroborate the importance of image analysis in identifying and quantifying immunological markers from tissue samples that can provide further insight into patient prognosis. In this paper, we apply multiplex immunofluorescence to MIBC tissue sections to capture whole-slide images and quantify potential prognostic markers related to lymphocytes, macrophages, tumour buds, and PD-L1. We propose a machine-learning-based approach for the prediction of 5-year prognosis with different combinations of image, clinical, and spatial features. An ensemble model comprising several functionally different models successfully stratifies MIBC patients into two risk groups with high statistical significance (p value < 1 × 10⁻⁵). Critical to improving MIBC survival rates, our method correctly classifies 71.4% of the patients who succumb to MIBC, which is significantly more than the 28.6% of the current clinical gold standard, the TNM staging system. Full article

18 pages, 4879 KiB  
Article
Convolutional Neural Network-Based Clinical Predictors of Oral Dysplasia: Class Activation Map Analysis of Deep Learning Results
by Seda Camalan, Hanya Mahmood, Hamidullah Binol, Anna Luiza Damaceno Araújo, Alan Roger Santos-Silva, Pablo Agustin Vargas, Marcio Ajudarte Lopes, Syed Ali Khurram and Metin N. Gurcan
Cancers 2021, 13(6), 1291; https://doi.org/10.3390/cancers13061291 - 14 Mar 2021
Cited by 48 | Viewed by 5601
Abstract
Oral cancer/oral squamous cell carcinoma is among the top ten most common cancers globally, with over 500,000 new cases and 350,000 associated deaths every year worldwide. There is a critical need for objective, novel technologies that facilitate early, accurate diagnosis. For this purpose, we have developed a method to classify images as “suspicious” and “normal” by performing transfer learning on Inception-ResNet-V2 and generated automated heat maps to highlight the regions of the images most likely to be involved in decision making. We have tested the developed method’s feasibility on two independent datasets of clinical photographic images of 30 and 24 patients from the UK and Brazil, respectively. Both 10-fold cross-validation and leave-one-patient-out validation methods were performed to test the system, achieving accuracies of 73.6% (±19%) and 90.9% (±12%), F1-scores of 97.9% and 87.2%, and precision values of 95.4% and 99.3% at recall values of 100.0% and 81.1% on these two respective cohorts. This study presents several novel findings and approaches, namely: the development and validation of our methods on two datasets collected in different countries; the finding that using patches instead of the whole lesion image leads to better performance; and an analysis, using class activation maps, of which regions of the images are predictive of the classes. Full article
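The leave-one-patient-out validation named in the abstract above can be sketched as follows. This is a minimal illustration under assumed data, not the authors' code: `patch_patients` is a hypothetical list giving the patient of origin for each image patch, and the point of the scheme is that all patches from one patient are held out together, so no patient contributes to both training and testing in the same round.

```python
from collections import defaultdict

def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices), one round per patient."""
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)
    for held_out in sorted(by_patient):
        test_idx = by_patient[held_out]
        # Training set: every sample that belongs to any other patient.
        train_idx = [
            i
            for pid, idxs in by_patient.items()
            if pid != held_out
            for i in idxs
        ]
        yield held_out, train_idx, test_idx

# Hypothetical patient label for each of six image patches.
patch_patients = ["P1", "P1", "P2", "P3", "P3", "P3"]
splits = list(leave_one_patient_out(patch_patients))
```

Splitting by patient rather than by patch matters when several patches are cut from one lesion image, because patches from the same patient are strongly correlated.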

12 pages, 2814 KiB  
Article
Homology-Based Image Processing for Automatic Classification of Histopathological Images of Lung Tissue
by Mizuho Nishio, Mari Nishio, Naoe Jimbo and Kazuaki Nakane
Cancers 2021, 13(6), 1192; https://doi.org/10.3390/cancers13061192 - 10 Mar 2021
Cited by 47 | Viewed by 3827
Abstract
The purpose of this study was to develop a computer-aided diagnosis (CAD) system for automatic classification of histopathological images of lung tissues. Two datasets (private and public datasets) were obtained and used for developing and validating CAD. The private dataset consists of 94 histopathological images that were obtained for the following five categories: normal, emphysema, atypical adenomatous hyperplasia, lepidic pattern of adenocarcinoma, and invasive adenocarcinoma. The public dataset consists of 15,000 histopathological images that were obtained for the following three categories: lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue. These images were automatically classified using machine learning and two types of image feature extraction: conventional texture analysis (TA) and homology-based image processing (HI). Multiscale analysis was used in the image feature extraction, after which automatic classification was performed using the image features and eight machine learning algorithms. The multicategory accuracy of our CAD system was evaluated in the two datasets. In both the public and private datasets, the CAD system with HI was better than that with TA. It was possible to build an accurate CAD system for lung tissues. HI was more useful for the CAD systems than TA. Full article

15 pages, 2415 KiB  
Article
A 10-Year Probability Deep Neural Network Prediction Model for Lung Cancer
by Hsiu-An Lee, Louis R. Chao and Chien-Yeh Hsu
Cancers 2021, 13(4), 928; https://doi.org/10.3390/cancers13040928 - 23 Feb 2021
Cited by 5 | Viewed by 2428
Abstract
Cancer is the leading cause of death in Taiwan. According to the Cancer Registration Report of Taiwan’s Ministry of Health and Welfare, a total of 13,488 people suffered from lung cancer in 2016, making it the second-most common cancer and the leading cancer in men. Compared with other types of cancer, the incidence of lung cancer is high. In this study, the National Health Insurance Research Database (NHIRDB) was used to determine the diseases and symptoms associated with lung cancer, and a 10-year probability deep neural network prediction model for lung cancer was developed. The proposed model could allow patients with a high risk of lung cancer to receive an earlier diagnosis and support the physicians’ clinical decision-making. The study was designed as a cohort study. The subjects were patients who were diagnosed with lung cancer between 2000 and 2009, and the patients’ disease histories were back-tracked for a period extending to ten years before the diagnosis of lung cancer. As a result, a total of 13 diseases were selected as the predicting factors. A nine-layer deep neural network model was created to predict the probability of lung cancer, depending on the different pre-diagnosed diseases, and to benefit the earlier detection of lung cancer in potential patients. The model was trained 1000 times with a batch size of 100, using the stochastic gradient descent (SGD) optimizer with a learning rate of 0.1 and a momentum of 0.1. The proposed model showed an accuracy of 85.4%, a sensitivity of 72.4%, and a specificity of 85%, as well as an area under the ROC curve (AUROC) of 87.4% (95% CI, 0.8604–0.8885). Based on data analysis and deep learning, our prediction model discovered some features that had not been previously identified by clinical knowledge.
This study tracks a decade of clinical diagnostic records to identify possible symptoms and comorbidities of lung cancer, allows early prediction of the disease, and assists more patients with early diagnosis. Full article
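The optimizer settings quoted in the abstract above (SGD, learning rate 0.1, momentum 0.1) can be made concrete with a small sketch of the classical momentum update. This is an illustration on a toy one-dimensional objective, not the authors' network or training code; the objective, the starting point, and the reading of "1000" as a number of update steps are all assumptions made for the example.

```python
def sgd_momentum(grad, w0, lr=0.1, momentum=0.1, steps=1000):
    """Classical SGD with momentum: v <- momentum * v - lr * grad(w); w <- w + v."""
    w, v = w0, 0.0
    for _ in range(steps):
        v = momentum * v - lr * grad(w)  # velocity keeps a fraction of the last step
        w += v                           # move the parameter along the velocity
    return w

# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
w_final = sgd_momentum(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

With a momentum as low as 0.1, each step carries over only a small share of the previous step, so the behaviour stays close to plain gradient descent.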

11 pages, 2344 KiB  
Article
Deep Learning Based HPV Status Prediction for Oropharyngeal Cancer Patients
by Daniel M. Lang, Jan C. Peeken, Stephanie E. Combs, Jan J. Wilkens and Stefan Bartzsch
Cancers 2021, 13(4), 786; https://doi.org/10.3390/cancers13040786 - 13 Feb 2021
Cited by 23 | Viewed by 2941
Abstract
Infection with the human papillomavirus (HPV) has been identified as a major risk factor for oropharyngeal cancer (OPC). HPV-related OPCs have been shown to be more radiosensitive and to have a reduced risk of cancer-related death. Hence, the histological determination of the HPV status of cancer patients constitutes an essential diagnostic factor. We investigated the ability of deep learning models to detect HPV status from imaging. To overcome the problem of small medical datasets, we used a transfer learning approach. A 3D convolutional network pre-trained on sports video clips was fine-tuned, such that full 3D information in the CT images could be exploited. The video pre-trained model was able to differentiate HPV-positive from HPV-negative cases, with an area under the receiver operating characteristic curve (AUC) of 0.81 for an external test set. In comparison to a 3D convolutional neural network (CNN) trained from scratch and a 2D architecture pre-trained on ImageNet, the video pre-trained model performed best. Deep learning models are capable of CT image-based HPV status determination. Video-based pre-training has the ability to improve training for 3D medical data, but further studies are needed for verification. Full article

11 pages, 1027 KiB  
Article
Deep Learning Prediction of Cancer Prevalence from Satellite Imagery
by Jean-Emmanuel Bibault, Maxime Bassenne, Hongyi Ren and Lei Xing
Cancers 2020, 12(12), 3844; https://doi.org/10.3390/cancers12123844 - 19 Dec 2020
Cited by 5 | Viewed by 3626
Abstract
The worldwide growth of cancer incidence can be explained in part by changes in the prevalence and distribution of risk factors. There are geographical gaps in the estimates of cancer prevalence, which could be filled with innovative methods. We used deep learning (DL) features extracted from satellite images to predict cancer prevalence at the census tract level in seven cities in the United States. We trained the model using detailed cancer prevalence estimates from 2018 available in the CDC (Centers for Disease Control and Prevention) 500 Cities project. Data from 3500 census tracts covering 14,483,366 inhabitants were included. Features were extracted from 170,210 satellite images with deep learning. This method explained up to 64.37% (median = 43.53%) of the variation of cancer prevalence. Satellite features are highly correlated with individual socioeconomic and health measures that are linked to cancer prevalence (age, smoking and drinking status, and obesity). A higher similarity between two environments is associated with better generalization of the model (p = 1 × 10⁻⁶). This method can be used to accurately estimate cancer prevalence at a high spatial resolution without using surveys at a fraction of the cost. Full article

16 pages, 1242 KiB  
Article
Machine Learning Algorithms to Predict Recurrence within 10 Years after Breast Cancer Surgery: A Prospective Cohort Study
by Shi-Jer Lou, Ming-Feng Hou, Hong-Tai Chang, Chong-Chi Chiu, Hao-Hsien Lee, Shu-Chuan Jennifer Yeh and Hon-Yi Shi
Cancers 2020, 12(12), 3817; https://doi.org/10.3390/cancers12123817 - 17 Dec 2020
Cited by 28 | Viewed by 3055
Abstract
No studies have discussed machine learning algorithms to predict recurrence within 10 years after breast cancer surgery. This study aimed to compare the accuracy of forecasting models to predict recurrence within 10 years after breast cancer surgery and to identify significant predictors of recurrence. Registry data for breast cancer surgery patients were allocated to a training dataset (n = 798) for model development, a testing dataset (n = 171) for internal validation, and a validating dataset (n = 171) for external validation. Global sensitivity analysis was then performed to evaluate the significance of the selected predictors. Demographic characteristics, clinical characteristics, quality of care, and preoperative quality of life were significantly associated with recurrence within 10 years after breast cancer surgery (p < 0.05). Artificial neural networks had the highest prediction performance indices. Additionally, the surgeon volume was the best predictor of recurrence within 10 years after breast cancer surgery, followed by hospital volume and tumor stage. Accurate prediction of recurrence within 10 years by machine learning algorithms may improve precision in managing patients after breast cancer surgery and improve understanding of the risk factors for such recurrence. Full article

18 pages, 7565 KiB  
Article
Experimental Assessment of Color Deconvolution and Color Normalization for Automated Classification of Histology Images Stained with Hematoxylin and Eosin
by Francesco Bianconi, Jakob N. Kather and Constantino Carlos Reyes-Aldasoro
Cancers 2020, 12(11), 3337; https://doi.org/10.3390/cancers12113337 - 11 Nov 2020
Cited by 20 | Viewed by 5975
Abstract
Histological evaluation plays a major role in cancer diagnosis and treatment. The appearance of H&E-stained images can vary significantly as a consequence of differences in several factors, such as reagents, staining conditions, preparation procedure and image acquisition system. Such potential sources of noise can all have negative effects on computer-assisted classification. To minimize such artefacts and their potentially negative effects several color pre-processing methods have been proposed in the literature—for instance, color augmentation, color constancy, color deconvolution and color transfer. Still, little work has been done to investigate the efficacy of these methods on a quantitative basis. In this paper, we evaluated the effects of color constancy, deconvolution and transfer on automated classification of H&E-stained images representing different types of cancers—specifically breast, prostate, colorectal cancer and malignant lymphoma. Our results indicate that in most cases color pre-processing does not improve the classification accuracy, especially when coupled with color-based image descriptors. Some pre-processing methods, however, can be beneficial when used with some texture-based methods like Gabor filters and Local Binary Patterns. Full article

15 pages, 2994 KiB  
Article
Machine Learning Model to Predict Pseudoprogression Versus Progression in Glioblastoma Using MRI: A Multi-Institutional Study (KROG 18-07)
by Bum-Sup Jang, Andrew J. Park, Seung Hyuck Jeon, Il Han Kim, Do Hoon Lim, Shin-Hyung Park, Ju Hye Lee, Ji Hyun Chang, Kwan Ho Cho, Jin Hee Kim, Leonard Sunwoo, Seung Hong Choi and In Ah Kim
Cancers 2020, 12(9), 2706; https://doi.org/10.3390/cancers12092706 - 21 Sep 2020
Cited by 19 | Viewed by 3675
Abstract
Some patients with glioblastoma show a worsening presentation in imaging after concurrent chemoradiation, even when they receive gross total resection. Previously, we showed the feasibility of a machine learning model to predict pseudoprogression (PsPD) versus progressive disease (PD) in glioblastoma patients. The previous model was based on the dataset from two institutions (termed as the Seoul National University Hospital (SNUH) dataset, N = 78). To test this model in a larger dataset, we collected cases from multiple institutions that raised the problem of PsPD vs. PD diagnosis in clinics (Korean Radiation Oncology Group (KROG) dataset, N = 104). The dataset was composed of brain MR images and clinical information. We tested the previous model in the KROG dataset; however, that model showed limited performance. After hyperparameter optimization, we developed a deep learning model based on the whole dataset (N = 182). The 10-fold cross validation revealed that the micro-average area under the precision-recall curve (AUPRC) was 0.86. The calibration model was constructed to estimate the interpretable probability directly from the model output. After calibration, the final model offers clinical probability in a web-user interface. Full article

15 pages, 4142 KiB  
Article
Computer-Aided Diagnosis in Multiparametric MRI of the Prostate: An Open-Access Online Tool for Lesion Classification with High Accuracy
by Stephan Ellmann, Michael Schlicht, Matthias Dietzel, Rolf Janka, Matthias Hammon, Marc Saake, Thomas Ganslandt, Arndt Hartmann, Frank Kunath, Bernd Wullich, Michael Uder and Tobias Bäuerle
Cancers 2020, 12(9), 2366; https://doi.org/10.3390/cancers12092366 - 21 Aug 2020
Cited by 9 | Viewed by 5170
Abstract
Computer-aided diagnosis (CADx) approaches could help to objectify reporting on prostate mpMRI, but their use in many cases is hampered due to common-built algorithms that are not publicly available. The aim of this study was to develop an open-access CADx algorithm with high accuracy for classification of suspicious lesions in mpMRI of the prostate. This retrospective study was approved by the local ethics commission, with waiver of informed consent. A total of 124 patients with 195 reported lesions were included. All patients received mpMRI of the prostate between 2014 and 2017, and transrectal ultrasound (TRUS)-guided and targeted biopsy within a time period of 30 days. Histopathology of the biopsy cores served as a standard of reference. Acquired imaging parameters included the size of the lesion, signal intensity (T2w images), diffusion restriction, prostate volume, and several dynamic parameters along with the clinical parameters patient age and serum PSA level. Inter-reader agreement of the imaging parameters was assessed by calculating intraclass correlation coefficients. The dataset was stratified into a train set and test set (156 and 39 lesions in 100 and 24 patients, respectively). Using the above parameters, a CADx based on an Extreme Gradient Boosting algorithm was developed on the train set, and tested on the test set. Performance optimization was focused on maximizing the area under the Receiver Operating Characteristic curve (ROCAUC). The algorithm was made publicly available on the internet. The CADx reached an ROCAUC of 0.908 during training, and 0.913 during testing (p = 0.93). Additionally, established rule-in and rule-out criteria allowed classifying 35.8% of the malignant and 49.4% of the benign lesions with error rates of <2%. All imaging parameters featured excellent inter-reader agreement. This study presents an open-access CADx for classification of suspicious lesions in mpMRI of the prostate with high accuracy. 
Applying the provided rule-in and rule-out criteria might facilitate further stratification of the management of patients at risk. Full article
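The ROCAUC used as the optimization target in the abstract above has a simple probabilistic reading: it is the chance that a randomly chosen malignant lesion receives a higher score than a randomly chosen benign one. The sketch below computes it by direct pairwise comparison. It is an illustration only; the labels and scores are made-up toy values, not data from the study, and the function is not the published tool.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = malignant, 0 = benign, with hypothetical classifier scores.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc_value = roc_auc(toy_labels, toy_scores)
```

In this toy case eight of the nine malignant/benign pairs are ranked correctly, so the value is 8/9. The pairwise form is quadratic in the number of samples, which is acceptable for a test set of a few dozen lesions but not for large datasets.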

Review

24 pages, 12876 KiB  
Review
Global Trends in Cancer Nanotechnology: A Qualitative Scientific Mapping Using Content-Based and Bibliometric Features for Machine Learning Text Classification
by Nuwan Indika Millagaha Gedara, Xuan Xu, Robert DeLong, Santosh Aryal and Majid Jaberi-Douraki
Cancers 2021, 13(17), 4417; https://doi.org/10.3390/cancers13174417 - 1 Sep 2021
Cited by 11 | Viewed by 3939
Abstract
This study presents a new way to investigate comprehensive trends in cancer nanotechnology research in different countries, institutions, and journals, providing critical insights into prevention, diagnosis, and therapy. This paper applied the qualitative method of bibliometric analysis to cancer nanotechnology using the PubMed database during the years 2000–2021. Inspired by hybrid medical models and content-based and bibliometric features for machine learning models, our results show that cancer nanotechnology studies have expanded exponentially since 2010. The highest production of articles in cancer nanotechnology is mainly from US institutions, and several countries, notably the USA, China, the UK, India, and Iran, have been focal points of cancer nanotechnology research, especially in the last five years. The analysis shows the greatest overlap between nanotechnology and DNA, RNA, iron oxide or mesoporous silica, breast cancer, cancer diagnosis, and cancer treatment. Moreover, more than 50% of the information related to the keywords, authors, institutions, journals, and countries comes from publications in the top 100 journals. This study has the potential to provide past and current lines of research that can unmask comprehensive trends in cancer nanotechnology, key research topics, or the most productive countries and authors in the field. Full article

39 pages, 10620 KiB  
Review
A Review of Computer-Aided Expert Systems for Breast Cancer Diagnosis
by Xin Yu Liew, Nazia Hameed and Jeremie Clos
Cancers 2021, 13(11), 2764; https://doi.org/10.3390/cancers13112764 - 2 Jun 2021
Cited by 16 | Viewed by 4781
Abstract
A computer-aided diagnosis (CAD) expert system is a powerful tool to efficiently assist a pathologist in achieving an early diagnosis of breast cancer. This process identifies the presence of cancer in breast tissue samples and the distinct type of cancer stages. In a standard CAD system, the main process involves image pre-processing, segmentation, feature extraction, feature selection, classification, and performance evaluation. In this review paper, we reviewed the existing state-of-the-art machine learning approaches applied at each stage involving conventional methods and deep learning methods, the comparisons within methods, and we provide technical details with advantages and disadvantages. The aims are to investigate the impact of CAD systems using histopathology images, investigate deep learning methods that outperform conventional methods, and provide a summary for future researchers to analyse and improve the existing techniques used. Lastly, we will discuss the research gaps of existing machine learning approaches for implementation and propose future direction guidelines for upcoming researchers. Full article

16 pages, 494 KiB  
Review
Transfer Learning in Breast Cancer Diagnoses via Ultrasound Imaging
by Gelan Ayana, Kokeb Dese and Se-woon Choe
Cancers 2021, 13(4), 738; https://doi.org/10.3390/cancers13040738 - 10 Feb 2021
Cited by 85 | Viewed by 7878
Abstract
Transfer learning is a machine learning approach that reuses a learning method developed for a task as the starting point for a model on a target task. The goal of transfer learning is to improve the performance of target learners by transferring the knowledge contained in other (but related) source domains. As a result, less target-domain data is needed to construct target learners. Owing to this valuable property, transfer learning techniques are frequently used in ultrasound breast cancer image analyses. In this review, we focus on transfer learning methods applied to ultrasound breast image classification and detection from the perspective of transfer learning approaches, pre-processing, pre-training models, and convolutional neural network (CNN) models. Finally, a comparison of different works is carried out, and challenges and outlooks are discussed. Full article
