Special Issue "Data Analysis and Integration in Cancer Research"

A special issue of International Journal of Molecular Sciences (ISSN 1422-0067). This special issue belongs to the section "Molecular Informatics".

Deadline for manuscript submissions: closed (31 October 2019).

Special Issue Editors

Dr. Antonio Federico
Guest Editor
Faculty of Medicine and Health Technology, Tampere University, Finland; BioMediTech Institute, Tampere University, Finland
Interests: systems biology; multi-omics data analysis; data integration; cancer genomics; network analysis; next-generation sequencing; druggability evaluation; predictive pharmacology; precision oncology
Dr. Giovanni Scala
Guest Editor
Faculty of Medicine and Life Sciences, University of Tampere, Finland
Interests: omics data analysis; integration and modelling; multi-omics approaches in the study of adverse outcome pathways; epigenomic therapeutic targets of human cancers; epidemiological epigenomics

Special Issue Information

Dear colleagues,

Recent technological advancements in genomics have paved the way to a deeper understanding of the mechanisms driving the onset and progression of cancer, providing an unprecedented source of information for the investigation of the features determining treatment outcomes.

Recent studies carried out by big consortia, such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), have generated comprehensive catalogues of molecular aberrations linked to tumorigenesis for large cohorts of cancer patients and across a wide range of cancer types.

The amount and diversity of currently available data pose an important challenge: how to meaningfully analyze and integrate information from different biological sources in order to obtain a more detailed and complete picture of the complex dynamics behind cancer-related biological events.

In this regard, advances in bioinformatics and computational biology techniques, along with dedicated tools, are essential for the integrative analysis of high-dimensional data.

This Special Issue, “Data Analysis and Integration in Cancer Research”, will cover a selection of research articles, review articles, and commentaries reporting recent methodological advancements on the analysis and integration of high-dimensional data, with a particular focus on (multi-)omics technologies in the study of molecular dynamics in cancer.

Dr. Antonio Federico
Dr. Giovanni Scala
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. International Journal of Molecular Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. There is an Article Processing Charge (APC) for publication in this open access journal. For details about the APC please see here. Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • biological big data
  • precision oncology
  • cancer therapy
  • biological data integration
  • cancer research
  • multi-omics approaches
  • next-generation sequencing
  • high-throughput technologies
  • systems biology
  • cancer genomics
  • network analysis
  • transcriptome analysis
  • epigenomics analysis
  • non-coding RNAs

Published Papers (11 papers)

Research

Open Access Article
An Integrated Pan-Cancer Analysis and Structure-Based Virtual Screening of GPR15
Int. J. Mol. Sci. 2019, 20(24), 6226; https://doi.org/10.3390/ijms20246226 - 10 Dec 2019
Abstract
G protein-coupled receptor 15 (GPR15, also known as BOB) is an extensively studied orphan G protein-coupled receptor (GPCR) involved in human immunodeficiency virus (HIV) infection, colonic inflammation, and smoking-related diseases. Recently, GPR15 was deorphanized and its corresponding natural ligand demonstrated an ability to inhibit cancer cell growth. However, no study has reported the potential role of GPR15 across cancer types. Using large-scale publicly available data from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) databases, we found that GPR15 expression is significantly lower in colon adenocarcinoma (COAD) and rectal adenocarcinoma (READ) than in normal tissues. Among 33 cancer types, GPR15 expression was significantly positively correlated with the prognoses of COAD, head and neck squamous cell carcinoma (HNSC), and lung adenocarcinoma (LUAD) and significantly negatively correlated with that of stomach adenocarcinoma (STAD). This study also revealed that commonly upregulated gene sets in the high GPR15 expression group (stratified via median) of COAD, HNSC, LUAD, and STAD are enriched in immune systems, indicating that GPR15 might be considered a potential target for cancer immunotherapy. Furthermore, we modelled the 3D structure of GPR15 and conducted structure-based virtual screening. The top eight hit compounds were screened and then subjected to molecular dynamics (MD) simulation for stability analysis. Our study provides novel insights into the role of GPR15 in a pan-cancer manner and identifies a potential hit compound as a GPR15 antagonist.
(This article belongs to the Special Issue Data Analysis and Integration in Cancer Research)

Open Access Article
RankerGUI: A Computational Framework to Compare Differential Gene Expression Profiles Using Rank Based Statistics
Int. J. Mol. Sci. 2019, 20(23), 6098; https://doi.org/10.3390/ijms20236098 - 03 Dec 2019
Abstract
The comparison of high-throughput gene expression datasets obtained from different experimental conditions is a challenging task. It provides an opportunity to explore the cellular response to various biological events such as disease, environmental conditions, and drugs. There is a need for tools that allow the integration and analysis of such data. We developed the “RankerGUI pipeline”, a user-friendly web application for the biological community. It allows users to apply various rank-based statistical approaches to compare full differential gene expression profiles between the same or different biological states obtained from different sources. The pipeline modules integrate various open-source packages, a few of which are modified for extended functionality. The main modules include rank-rank hypergeometric overlap, enriched rank-rank hypergeometric overlap, and distance calculations. Additionally, preprocessing steps such as merging differential expression profiles of multiple independent studies can be added before running the main modules. Output plots show the strength, pattern, and trends among complete differential expression profiles. In this paper, we describe the various modules and functionalities of the developed pipeline. We also present a case study that demonstrates how the pipeline can be used for the comparison of differential expression profiles obtained from multiple platforms’ data of the Gene Expression Omnibus. Using these comparisons, we investigate gene expression patterns in kidney and lung cancers.
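
As a rough illustration of the rank-rank hypergeometric overlap idea named in this abstract, the sketch below tests whether the top-ranked genes of two ranked lists overlap more than expected by chance, and repeats the test over a grid of rank cutoffs. It is not the RankerGUI implementation: the function names, the gene labels, and the one-sided test are assumptions made for illustration only.

```python
from math import comb


def overlap_pvalue(ranked_a, ranked_b, top_a, top_b):
    """Return (overlap size, one-sided hypergeometric p-value).

    ranked_a and ranked_b are two rankings of the same genes.
    The p-value is P(overlap >= observed) when top_b genes are drawn
    at random from a universe in which top_a genes are "marked".
    """
    if set(ranked_a) != set(ranked_b):
        raise ValueError("both rankings must cover the same genes")
    n = len(ranked_a)
    observed = len(set(ranked_a[:top_a]) & set(ranked_b[:top_b]))
    tail = sum(
        comb(top_a, i) * comb(n - top_a, top_b - i)
        for i in range(observed, min(top_a, top_b) + 1)
    )
    return observed, tail / comb(n, top_b)


def overlap_grid(ranked_a, ranked_b, step=1):
    """Map every (top_a, top_b) cutoff pair to its overlap p-value."""
    cuts = range(step, len(ranked_a) + 1, step)
    return {
        (i, j): overlap_pvalue(ranked_a, ranked_b, i, j)[1]
        for i in cuts
        for j in cuts
    }


# Hypothetical gene labels, ranked by differential expression in two studies.
study_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
study_b = ["g2", "g1", "g6", "g5", "g4", "g3"]
grid = overlap_grid(study_a, study_b)
```

Plotting the grid values as a heat map is one common way to visualize where along the two rankings the agreement is strongest.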

Open Access Article
A Gene Signature of Survival Prediction for Kidney Renal Cell Carcinoma by Multi-Omic Data Analysis
Int. J. Mol. Sci. 2019, 20(22), 5720; https://doi.org/10.3390/ijms20225720 - 14 Nov 2019
Abstract
Kidney renal cell carcinoma (KIRC), which is the most common subtype of kidney cancer, has a poor prognosis and a high mortality rate. In this study, a multi-omics analysis is performed to build a multi-gene prognosis signature for KIRC. A combination of a DNA methylation analysis and a gene expression data analysis revealed 863 methylated differentially expressed genes (MDEGs). Seven MDEGs (BID, CCNF, DLX4, FAM72D, PYCR1, RUNX1, and TRIP13) were further screened using LASSO Cox regression and integrated into a prognostic risk score model. Then, KIRC patients were divided into high- and low-risk groups. A univariate Cox regression analysis revealed a significant association between the high-risk group and a poor prognosis. The time-dependent receiver operating characteristic (ROC) curve shows that the risk group performs well in predicting overall survival. Furthermore, the risk group is contained in the best multivariate model that was obtained by a multivariate stepwise analysis, which further confirms that the risk group can be used as a potential prognostic biomarker. In addition, a nomogram was established for the best multivariate model and shown to perform well in predicting the survival of KIRC patients. In summary, the seven-MDEG signature is a powerful prognostic factor for KIRC patients and may provide useful suggestions for their personalized therapy.
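
A prognostic risk score of the kind described in this abstract is typically a weighted sum of gene expression values, with the weights taken from the fitted Cox model. The sketch below shows only that arithmetic. The coefficients and expression values are invented, only two of the seven genes are used to keep it short, and the median cutoff is an assumption, since the abstract does not state how the high- and low-risk groups were separated.

```python
from statistics import median

# Hypothetical coefficients; the published model's values are not given here.
coefficients = {"PYCR1": 0.5, "RUNX1": -0.25}

# Hypothetical expression values per patient.
expression = {
    "p1": {"PYCR1": 1.0, "RUNX1": 2.0},
    "p2": {"PYCR1": 4.0, "RUNX1": 0.0},
    "p3": {"PYCR1": 2.0, "RUNX1": 2.0},
}


def risk_scores(expression, coefficients):
    """Weighted sum of expression values for each patient."""
    return {
        patient: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for patient, values in expression.items()
    }


def split_by_median(scores):
    """Label patients 'high' if their score exceeds the median, else 'low'."""
    cutoff = median(scores.values())
    return {
        patient: "high" if score > cutoff else "low"
        for patient, score in scores.items()
    }


scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
```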

Open Access Article
Molecular Inverse Comorbidity between Alzheimer’s Disease and Lung Cancer: New Insights from Matrix Factorization
Int. J. Mol. Sci. 2019, 20(13), 3114; https://doi.org/10.3390/ijms20133114 - 26 Jun 2019
Cited by 1
Abstract
Matrix factorization (MF) is an established paradigm for large-scale biological data analysis with tremendous potential in computational biology. Here, we challenge MF in depicting the molecular bases of epidemiologically described disease–disease (DD) relationships. As a use case, we focus on the inverse comorbidity association between Alzheimer’s disease (AD) and lung cancer (LC), described as a lower than expected probability of developing LC in AD patients. To this day, the molecular mechanisms underlying DD relationships remain poorly explained and their better characterization might offer unprecedented clinical opportunities. To this end, we extend our previously designed MF-based framework for the molecular characterization of DD relationships. Considering AD–LC inverse comorbidity as a case study, we highlight multiple molecular mechanisms, among which we confirm the involvement of processes related to the immune system and mitochondrial metabolism. We then distinguish mechanisms specific to LC from those shared with other cancers through a pan-cancer analysis. Additionally, new candidate molecular players, such as estrogen receptor (ER), cadherin 1 (CDH1) and histone deacetylase (HDAC), are pinpointed as factors that might underlie the inverse relationship, opening the way to new investigations. Finally, some lung cancer subtype-specific factors are also detected, suggesting the existence of heterogeneity across patients in the context of inverse comorbidity.

Open Access Article
Chromogranin-A Expression as a Novel Biomarker for Early Diagnosis of Colon Cancer Patients
Int. J. Mol. Sci. 2019, 20(12), 2919; https://doi.org/10.3390/ijms20122919 - 14 Jun 2019
Abstract
Colon cancer is one of the major causes of cancer death worldwide. The five-year survival rate is more than 90% for early-stage patients but only around 10% for the later stages. Moreover, half of colon cancer patients are clinically diagnosed at the later stages. It is therefore important to improve the early diagnosis of colon cancer. Our previous studies identified several potential biomarkers associated with the early diagnosis of colon cancer. To investigate these early diagnostic biomarkers, human chromogranin-A (CHGA), one of the most powerful among them, was analyzed further. In this study, we used a logistic regression-based meta-analysis to clarify associations of CHGA expression with colon cancer diagnosis. Both healthy populations and the normal mucosa from colon cancer patients were selected as the double normal controls. The results showed decreased expression of CHGA in the early stages of colon cancer as compared to the normal controls. The decline of CHGA expression in the early stages of colon cancer is probably a new diagnostic biomarker for colon cancer, with high predictive and verification performance. We also compared the diagnostic power of CHGA expression with that of the typical oncogene KRAS, the classic tumor suppressor TP53, and the well-known cellular proliferation index MKI67; CHGA showed a stronger ability to predict early diagnosis of colon cancer than these other cancer biomarkers. In the protein–protein interaction (PPI) network, CHGA was revealed to share some common pathways with KRAS and TP53. CHGA might be considered a novel, promising, and powerful biomarker for early diagnosis of colon cancer.

Open Access Article
A Prediction Model for Preoperative Risk Assessment in Endometrial Cancer Utilizing Clinical and Molecular Variables
Int. J. Mol. Sci. 2019, 20(5), 1205; https://doi.org/10.3390/ijms20051205 - 09 Mar 2019
Cited by 2
Abstract
The utility of comprehensive surgical staging in patients with low-risk disease has been questioned. Thus, a reliable means of determining risk would be quite useful. The aim of our study was to create the best performing prediction model to classify endometrioid endometrial cancer (EEC) patients into low or high risk using a combination of molecular and clinical-pathological variables. We then validated these models with publicly available datasets. Analyses between low- and high-risk EEC were performed using clinical and pathological data, gene and miRNA expression data, gene copy number variation and somatic mutation data. Variables were selected to be included in the prediction model of risk using cross-validation analysis; prediction models were then constructed using these variables. Model performance was assessed by area under the curve (AUC). Prediction models were validated using appropriate datasets in The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. A prediction model with only clinical variables performed at 88%. Integrating clinical and molecular data improved prediction performance up to 97%. The best prediction models included clinical, miRNA expression and/or somatic mutation data, and stratified pre-operative risk in EEC patients. Integrating molecular and clinical data improved the performance of prediction models to over 95%, resulting in potentially useful clinical tests.
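
The area under the ROC curve used here to assess model performance can be read as the probability that a randomly chosen high-risk patient receives a higher predicted score than a randomly chosen low-risk patient. The sketch below computes it directly from that definition. The labels and scores are made up and the function name is hypothetical; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for high risk, 0 for low risk.
    scores: model outputs, higher meaning more likely high risk.
    Tied pairs count as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and predicted scores for four patients.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

The pairwise loop is quadratic in the number of patients, which is fine for a small illustration; larger cohorts would normally use a rank-based formula or a library routine.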

Open Access Communication
Population Substructure Has Implications in Validating Next-Generation Cancer Genomics Studies with TCGA
Int. J. Mol. Sci. 2019, 20(5), 1192; https://doi.org/10.3390/ijms20051192 - 08 Mar 2019
Cited by 1
Abstract
In the era of large genetic and genomic datasets, it has become crucially important to validate results of individual studies using data from publicly available sources, such as The Cancer Genome Atlas (TCGA). However, how generalizable are results from either an independent or a large public dataset to the remainder of the population? The study presented here aims to answer that question. Utilizing next-generation sequencing data from endometrial and ovarian cancer patients from both the University of Iowa and TCGA, genomic admixture of each population was analyzed using STRUCTURE and ADMIXTURE software. In our independent data set, one subpopulation was identified, whereas in TCGA 4–6 subpopulations were identified. Data presented here demonstrate how different the genetic substructures of the TCGA and University of Iowa populations are. Validation of genomic studies between two different population samples must account and correct for background genetic substructure.

Open Access Communication
Molecular Characterization of Non-responders to Chemotherapy in Serous Ovarian Cancer
Int. J. Mol. Sci. 2019, 20(5), 1175; https://doi.org/10.3390/ijms20051175 - 07 Mar 2019
Abstract
Nearly one-third of patients with high-grade serous ovarian cancer (HGSC) do not respond to initial treatment with platinum-based therapy. Genomic and clinical characterization of these patients may lead to potential alternative therapies. Here, the objective is to classify non-responders into subsets using clinical and molecular features. Using patients from The Cancer Genome Atlas (TCGA) dataset with platinum-resistant or platinum-refractory HGSC, we performed a genome-wide unsupervised cluster analysis that integrated clinical data, gene copy number variations, gene somatic mutations, and DNA promoter methylation. Pathway enrichment analysis was performed for each cluster to identify targetable processes. Following the unsupervised cluster analysis, three distinct clusters of non-responders emerged. Cluster 1 had overrepresentation of stage IV disease and suboptimal debulking, under-expression of miRNAs and mRNAs, hypomethylated DNA, “loss of function” TP53 mutations, and overexpression of genes in the PDGFR pathway. Cluster 2 had low miRNA expression, generalized hypermethylation, MUC17 mutations, and significant activation of the HIF-1 signaling pathway. Cluster 3 had more optimally cytoreduced stage III patients, overexpression of miRNAs, mixed methylation patterns, and “gain of function” TP53 mutations. However, survival was similar for all clusters. Integration of genomic and clinical data from patients who do not respond to chemotherapy identified different subgroups or clusters. Pathway analysis further identified potential alternative therapeutic targets for each cluster.

Review

Open Access Review
TCGA-TCIA Impact on Radiogenomics Cancer Research: A Systematic Review
Int. J. Mol. Sci. 2019, 20(23), 6033; https://doi.org/10.3390/ijms20236033 - 29 Nov 2019
Abstract
In the last decade, the development of radiogenomics research has produced a significant number of papers describing relations between imaging features and several molecular ‘omic signatures arising from next-generation sequencing technology and their potential role in the integrated diagnostic field. The most vulnerable point of many of these studies lies in the small number of patients involved. In this scenario, a leading role is played by The Cancer Genome Atlas (TCGA) and The Cancer Imaging Archive (TCIA), which make available, respectively, molecular ‘omic data and linked imaging data. In this review, we systematically collected and analyzed radiogenomic studies based on TCGA-TCIA data. We organized the literature by tumor type and molecular ‘omic data in order to discuss salient imaging genomic associations and limitations of each study. Finally, we outlined the potential clinical impact of radiogenomics to improve the accuracy of diagnosis and the prediction of patient outcomes in oncology.

Open Access Review
Independent Component Analysis for Unraveling the Complexity of Cancer Omics Datasets
Int. J. Mol. Sci. 2019, 20(18), 4414; https://doi.org/10.3390/ijms20184414 - 07 Sep 2019
Abstract
Independent component analysis (ICA) is a matrix factorization approach where the signals captured by the individual matrix factors are optimized to become as mutually independent as possible. Initially suggested for solving blind source separation problems in various fields, ICA was shown to be successful in analyzing functional magnetic resonance imaging (fMRI) and other types of biomedical data. In the last twenty years, ICA became a part of the standard machine learning toolbox, together with other matrix factorization methods such as principal component analysis (PCA) and non-negative matrix factorization (NMF). Here, we review a number of recent works where ICA was shown to be a useful tool for unraveling the complexity of cancer biology from the analysis of different types of omics data, mainly collected for tumoral samples. Such works highlight the use of ICA in dimensionality reduction, deconvolution, data pre-processing, meta-analysis, and other tasks applied to different data types (transcriptome, methylome, proteome, single-cell data). We particularly focus on the technical aspects of ICA application in omics studies such as using different protocols, determining the optimal number of components, assessing and improving reproducibility of the ICA results, and comparison with other popular matrix factorization techniques. We discuss the emerging ICA applications to the integrative analysis of multi-level omics datasets and introduce a conceptual view on ICA as a tool for defining functional subsystems of a complex biological system and their interactions under various conditions. Our review is accompanied by a Jupyter notebook which illustrates the discussed concepts and provides a practical tool for applying ICA to the analysis of cancer omics datasets.

Open Access Review
Immune Checkpoint Blockade for Advanced NSCLC: A New Landscape for Elderly Patients
Int. J. Mol. Sci. 2019, 20(9), 2258; https://doi.org/10.3390/ijms20092258 - 07 May 2019
Cited by 3
Abstract
The therapeutic scenario for elderly patients with advanced NSCLC has been limited to radiotherapy and chemotherapy. Recently, a novel therapeutic approach based on targeting the immune checkpoints has shown noteworthy results in advanced NSCLC. The PD1/PD-L1 pathway is co-opted by tumor cells through the expression of PD-L1 on the tumor cell surface and on cells within the microenvironment, leading to suppression of anti-tumor cytolytic T-cell activity by the tumor. The success of immune checkpoint inhibitors (ICIs) in clinical trials led to rapid approval by the FDA and EMA. Currently, data regarding the efficacy and safety of ICIs in older subjects are limited by the small number of elderly patients recruited in clinical trials. Careful assessment and management of comorbidities is essential to achieve better outcomes and limit immune-related adverse events in elderly NSCLC patients.