Article

Prediction of Composite Clinical Outcomes for Childhood Neuroblastoma Using Multi-Omics Data and Machine Learning

School of Computer Science and Technology, Xidian University, Xi’an 710126, China
*
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2025, 26(1), 136; https://doi.org/10.3390/ijms26010136
Submission received: 22 April 2024 / Revised: 20 May 2024 / Accepted: 22 May 2024 / Published: 27 December 2024

Abstract

Neuroblastoma is a common malignant tumor in childhood that seriously endangers the health and lives of children, making it essential to find effective prognostic markers that accurately predict clinical outcomes. The development of high-throughput technology in the biomedical field has made it possible to obtain multi-omics data, whose integration can compensate for missing or unreliable information in a single data source. In this study, we integrated clinical data and two types of omics data, i.e., gene expression and DNA methylation data, to study the prognosis of neuroblastoma. Since the features in omics data are redundant, it is crucial to conduct feature selection on them. We proposed a two-step feature selection (TSFS) method to quickly and accurately select the optimal features, where the first step selects candidate features and the second step removes redundant features among them using our proposed maximal association coefficient (MAC). Our goal is to predict composite clinical outcomes for neuroblastoma patients, i.e., their survival time and vital status at the last follow-up, which were validated to be two inter-correlated tasks. We conducted a series of experiments and evaluated the results using the accuracy and AUC (area under the ROC curve) metrics. The results indicate that combining the integration of the three types of data, our proposed TSFS method, and a multi-task learning method can synergistically improve the reliability and accuracy of the prediction models.

1. Introduction

Neuroblastoma is a malignant pediatric tumor arising from neuroblasts descended from neural crest cells [1,2] that accounts for roughly 10% of all diagnosed pediatric cancers [2] but 15% of all pediatric cancer deaths [3]. The pathogenesis of neuroblastoma is not yet clear, and there is extreme heterogeneity both clinically and biologically [4,5,6,7]. Owing to this heterogeneity, it is essential to accurately predict the most likely disease outcome for patients at the time of diagnosis [8,9]. Many studies have found that a range of factors are associated with the prognosis of neuroblastoma [10,11,12,13], such as age at diagnosis, stage of disease, amplification of the N-myc proto-oncogene MYCN, chromosome abnormality, and genetic mutation. Accurately predicting the clinical outcomes for neuroblastoma patients not only helps patients understand their life expectancy but also helps clinicians make well-founded decisions and develop appropriate treatments [14,15,16,17].
Some researchers have employed a single type of omics data to study the prognosis of neuroblastoma patients. In the genomics field, Hidalgo et al. [11] proposed a computational model that uses gene expression data to reveal the molecular mechanism of high-risk neuroblastoma and the determinant of survival in neuroblastoma patients. Cangelosi et al. [13] found that hypoxia is a prognostic prediction marker for neuroblastoma by analyzing gene expression profiles. Aberrant patterns of DNA methylation are a common feature of most cancers [18] and have been demonstrated at the single-gene and genome-wide levels in neuroblastoma. Moreover, DNA methylation is related to the occurrence [18,19,20,21,22,23] and prognosis [24,25,26,27,28] of neuroblastoma. Yang et al. [19] discovered that the occurrence of neuroblastoma can be inhibited by using demethylating agents to reverse epigenetic changes. Furthermore, genome-wide analysis of DNA methylation [24] revealed that the CpG island methylator phenotype is a strong determinant of poor prognosis in neuroblastoma. For neuroblastoma, methylation profiles are associated with the clinical outcomes of patients [29,30,31]. In addition, DNA methylation is related to gene expression [32,33,34], which is itself a complex process.
Integrating multiple types of data can compensate for missing or unreliable information in any single data source, and multiple sources of evidence pointing to the same result are unlikely to lead to false positives [35]. This can further deepen the understanding of the occurrence and development of diseases and improve the accuracy of early diagnosis and prognostic prediction [36,37,38]. To our knowledge, some researchers have studied the prognosis of patients by integrating multi-omics data. Mihaylov et al. [39] proposed a data integration method for when the data are heterogeneous and weakly correlated that adopts two inter-related integrative approaches, horizontal integration and vertical integration, to integrate clinical, gene expression, and copy number variation and cancer progression data to predict patient survival. Klim et al. [40] proposed a prediction model framework for small datasets that integrates gene expression and copy number variation data to predict the overall survival of neuroblastoma patients. Tranchevent et al. [41] integrated clinical, gene expression, and transcriptome data and proposed a network analysis method in which the weights of the edges in the network are the normalized and rescaled Pearson coefficients between patient pairs. The integrated data are the input of the network and the topology information of the network is used to train the classification model for predicting the clinical outcomes (‘Death from disease’, ‘Disease progression’ and ‘High-risk’) of patients.
Patients with neuroblastoma have different clinical outcomes [42,43]. In this study, we mainly focused on composite clinical outcomes, i.e., patient survival time and vital status at the last follow-up. To the best of our knowledge, previous papers have studied only one of these two clinical outcomes. The overall survival time of a patient [13] was defined as the time (in years) from disease diagnosis to patient death or to the last follow-up (if the patient is alive). Some researchers [44,45,46] have studied overall survival with death from disease as the event, partitioning patients into two categories, dead or alive. In addition, some researchers [47,48] have divided patients into short- and long-term survivors using a 5-year cutoff and studied the overall survival of patients. In summary, these papers studied either the 5-year survival time or the vital status of patients. The survival time of a patient is not equivalent to his/her vital status, and there are four different combinations. A patient’s survival time may be less than 5 years while he/she is still alive, indicating that he/she may survive longer in the future. Similarly, a patient’s survival time can be more than 5 years although he/she has died, indicating that he/she has an actual and known survival time. In both cases, the survival time and vital status of the patient fall into different categories. Additionally, there are two other cases: a patient’s survival time is less than 5 years and he/she has died, or a patient’s survival time is more than 5 years and he/she is still alive. In these two cases, both clinical outcomes are in the same category.
The event we are interested in is a composite clinical outcome of 5-year survival time and vital status (i.e., death or not) for neuroblastoma patients, which constitute two binary classification tasks. We employed a Pearson coefficient between the labels of these two tasks as a quantitative indicator of task correlation [49,50]. The Pearson coefficient between the labels of the survival time and vital status is 0.85, which indicates that the two tasks are inter-correlated. Consequently, we employed a multi-task learning method to solve these two tasks simultaneously. Multi-task learning is a machine learning method based on shared representation that puts multiple related tasks together for learning where different tasks share all or part of the model parameters, which alleviates the demand for data volume to a certain extent. Multi-task learning is a widely used learning paradigm [51,52,53] that is considered as an algorithm-level integration strategy [54]. Multi-task learning has been applied in the field of cancer prognostic study. Shao et al. [55] proposed a multi-task learning framework to study the joint diagnosis and prognosis of cancers for identifying their related features. Maggio et al. [45] proposed a deep learning framework (Concatenated Diagnostic-Relapse Prognostic, CDRP) based on multi-task learning, which obtained more accurate risk stratification to choose appropriate treatment strategies to improve prognosis.
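As an illustration of the task-correlation indicator described above, the following minimal sketch computes the Pearson coefficient between two binary label vectors. The label values are hypothetical toy data, not the study data, and the helper name `pearson` is our own.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical toy labels (NOT the study data).
# survival_label: 0 = short-term survivor, 1 = long-term survivor
# vital_label:    0 = dead, 1 = alive
survival_label = [0, 0, 0, 1, 1, 1, 1, 0]
vital_label = [0, 0, 0, 1, 1, 1, 0, 0]

r = pearson(survival_label, vital_label)
print(f"task correlation: {r:.4f}")
```

A coefficient close to 1 suggests that the two tasks share much of their label structure, which is the motivation given in the text for solving them jointly.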
The reliability and performance of clinical outcome prediction can still be improved. Although there have been many papers studying clinical outcomes, they aimed at a single clinical outcome, whereas we studied composite clinical outcomes. Because of the high redundancy of omics data and the inter-correlation between the survival time and vital status tasks, combining feature selection with multi-task learning appears to be a good solution for prognostic prediction in neuroblastoma. The overall workflow of this study is shown in Figure 1. It is divided into four steps: data, feature selection, integration, and models. The major contributions of this study are summarized as follows: (1) we built a framework that predicts a composite clinical outcome of 5-year survival time and vital status for neuroblastoma patients by integrating three types of data, i.e., clinical, gene expression, and DNA methylation data, which improves the reliability and accuracy of the prognostic prediction models; (2) we proposed a two-step feature selection (TSFS) method that can quickly and accurately select the available features for developing prediction models; and (3) we identified some reliable markers and employed them to build the survival time and vital status prediction models.

2. Results and Discussion

2.1. Evaluation Metrics

Previously, we have indicated that we studied two binary classification problems. We employed accuracy and AUC (area under the ROC curve) evaluation metrics to evaluate the model performance. It is necessary to understand the confusion matrix before introducing evaluation metrics. For a binary classification problem, the confusion matrix is shown in Table 1.
The confusion matrix summarizes the classification results according to the real category and the predicted category. As shown in Table 1, TP (true positive) denotes the number of positive samples that are predicted to be positive; FN (false negative) denotes the number of positive samples that are predicted to be negative; TN (true negative) denotes the number of negative samples that are predicted to be negative; and FP (false positive) denotes the number of negative samples that are predicted to be positive. The commonly used accuracy (ACC) evaluation metric is shown in Equation (1).
$$ ACC = \frac{TP + TN}{TP + FN + FP + TN} \tag{1} $$
Additionally, the AUC is also a widely used evaluation metric to evaluate the performance of the classification model. The value of the AUC is from 0 to 1. The larger the AUC value, the better the performance of the classification model.
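The two metrics can be sketched in a few lines. The example below is a minimal illustration on hypothetical scores: `accuracy` follows Equation (1), and `auc` uses the standard rank interpretation of the AUC (the probability that a randomly chosen positive sample is scored above a randomly chosen negative one, with ties counted as one half). The 0.5 decision threshold is an assumption made for the example.

```python
def accuracy(y_true, y_pred):
    """ACC = (TP + TN) / (TP + FN + FP + TN), as in Equation (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return (tp + tn) / (tp + fn + fp + tn)

def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for six samples (not from the study).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc_value = accuracy(y_true, y_pred)
auc_value = auc(y_true, scores)
print(f"ACC = {acc_value:.4f}, AUC = {auc_value:.4f}")
```

Note that ACC depends on the chosen decision threshold, whereas the AUC depends only on how the scores rank the samples.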

2.2. Single-Task Learning Method for Building Prediction Models

In this section, we aim to demonstrate the effectiveness of the TSFS method and the reliability gained by integrating three types of data and the features related to survival time and vital status. First, we employed the single-step feature selection method (FS, described in Section 3.2) and the two-step feature selection method (TSFS, described in Section 3.4) to conduct feature selection. We then integrated the three types of data to build the survival time and vital status prediction models using an SVM classifier, which is well suited to small sample sizes. The SVM classifier [56,57,58] has previously been applied to multi-omics data.
For the clinical data, we followed previous research [44] and directly employed three clinical features, namely age at diagnosis in days, INSS stage, and MYCN status, to build the survival time prediction model. We employed one clinical feature, age at diagnosis in days, to build the vital status prediction model. Additionally, we applied min–max standardization to age and one-hot encoding to INSS stage and MYCN status. For the gene expression and DNA methylation data, we employed the FS and TSFS methods, respectively, to select the optimal features for building the survival time and vital status prediction models. We directly used the selected gene expression and DNA methylation data to build these two prediction models. For the two prognostic prediction models, the hyperparameters α (described in Section 3.3, α belongs to [0, 1]) and Rth (described in Section 3.4, Rth belongs to [0, 1]) used in the TSFS method are empirically set as shown in Table 2. The two hyperparameters were determined according to the prediction results obtained by using 5-fold cross-validation.
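The clinical preprocessing described above can be sketched as follows. The patient values are hypothetical, and the helper names `min_max_scale` and `one_hot` are our own; the sketch assumes the age column is not constant (otherwise the min–max denominator would be zero).

```python
def min_max_scale(values):
    """Rescale numeric values to [0, 1]; assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """One-hot encode categorical values; columns follow sorted category order."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories

# Hypothetical clinical records for three patients (not the study data).
age_days = [200, 1295, 2390]
inss_stage = ["4", "3", "4S"]
mycn_status = ["Amplified", "Not Amplified", "Amplified"]

age_scaled = min_max_scale(age_days)
inss_encoded, inss_categories = one_hot(inss_stage)
mycn_encoded, mycn_categories = one_hot(mycn_status)

# One numeric clinical feature vector per patient.
clinical_rows = [[a] + s + m
                 for a, s, m in zip(age_scaled, inss_encoded, mycn_encoded)]
for row in clinical_rows:
    print(row)
```

In a cross-validation setting, the minimum and maximum would normally be estimated on the training split only and then reused for the test split; the sketch omits this for brevity.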
To demonstrate the validity of the TSFS method and the reliability of the integration of three types of data, we conducted eight different experiments to build the two prediction models. We selected a collection of potential prognosis drivers with varying functions and roles in the prognosis of neuroblastoma. We divided the data into five train/test splits and trained and tested the model on each split. The classifier performance on each test split is shown in the form of box plots in Figure 2 in order to observe the performance distribution. The average of the results obtained from the five test splits is adopted to show the model performance, as shown in Figure 3. In the legends of Figure 2 and Figure 3, ‘Cli’ represents the selected clinical data, ‘G_FS’ represents the gene expression data selected by the FS method, ‘G_TSFS’ represents the gene expression data selected by the TSFS method, ‘M_FS’ represents the DNA methylation data selected by the FS method, and ‘M_TSFS’ represents the DNA methylation data selected by the TSFS method. In addition, the symbol ‘+’ represents the early integration method, i.e., the data are integrated by concatenation.
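Early integration by concatenation and the five train/test splits can be sketched as below. This is a minimal illustration: the feature values are hypothetical, the helper names are our own, and the shuffled, non-stratified split is an assumption, since the text does not state how the splits were drawn.

```python
import random

def concatenate_blocks(*blocks):
    """Early integration: join each sample's feature vectors across blocks."""
    return [sum((list(row) for row in rows), []) for rows in zip(*blocks)]

def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs covering every sample once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # assumed: shuffled, not stratified
    folds = [idx[i::k] for i in range(k)]
    return [([j for j in idx if j not in set(fold)], list(fold))
            for fold in folds]

# Hypothetical blocks for two patients: 'Cli' + 'G_TSFS' + 'M_TSFS'.
cli = [[0.1], [0.2]]
gene = [[1.0, 2.0], [3.0, 4.0]]
meth = [[5.0], [6.0]]
integrated = concatenate_blocks(cli, gene, meth)
print(integrated)

# Five splits over 88 samples, the cohort size reported in Section 3.1.
splits = k_fold_indices(88, k=5)
print([len(test) for _, test in splits])
```

Each sample appears in exactly one test split, so averaging the five test results uses every patient once for evaluation.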
It can be seen from Figure 2 that the ACC and AUC on ‘Cli+G_TSFS’ are better than those on ‘Cli+G_FS’ for both prognostic prediction models. The ACC and AUC on ‘Cli+M_TSFS’ are only slightly better than those on ‘Cli+M_FS’, but the number of features used to build the prediction models is reduced, indicating that removing the redundant features is worthwhile. These results verify the validity of the TSFS method in building prediction models. Additionally, the ACC of ‘Cli+G_TSFS’ is better than that of ‘Cli+M_TSFS’ but the AUC of ‘Cli+M_TSFS’ is better than that of ‘Cli+G_TSFS’ for both the survival time and vital status prediction models, indicating that the DNA methylation data provided better model reliability but poorer model performance than the gene expression data. Previously, DNA methylation data have been proven to be useful for the prognosis of neuroblastoma. Moreover, the ACC and AUC on ‘Cli+G_TSFS+M_TSFS’ are only slightly better than those on ‘Cli+G_TSFS’; we argue that simple multi-omics integration (concatenation-based data integration) may not be able to effectively mine hidden or complementary information between different omics data.
The model performance, i.e., the ACC and AUC, as well as the number of features used to build the survival time and vital status prediction models, is shown in Figure 3. In Figure 3A,B, the eight computational models are represented by eight different markers. The list of numbers next to each marker gives, in order, the ACC, the AUC, and the number of features of that model. For example, the list [0.7843, 0.9028, 80] in Figure 3A shows that 0.7843 is the ACC of this model, 0.9028 is the AUC of this model, and 80 is the number of features used to build this model. In Figure 3A,B, a model nearer to the bottom right is better than the other models, i.e., it has larger ACC and AUC values and fewer features.
It should be noted in Figure 3A that, compared with the survival time prediction model built by ‘Cli+G_TSFS+M_FS’, the model built by ‘Cli+G_TSFS+M_TSFS’ has an AUC that is lower by about 0.5% but an ACC that is higher by about 2%, and its number of features is greatly reduced. After comprehensive consideration, we employed ‘Cli+G_TSFS+M_TSFS’ to build the survival time prediction model. It should be noted in Figure 3B that, when building the vital status prediction models by ‘Cli+G_TSFS+M_TSFS’ and ‘Cli+G_TSFS’, the differences in the ACC and AUC between these two models were 0.1%. When the differences in both the ACC and AUC between two models are small, we prefer the model with the larger AUC. Consequently, we employed ‘Cli+G_TSFS+M_TSFS’ to build the vital status prediction model. Finally, the survival time prediction model integrates 3 clinical, 69 gene, and 250 methylation features, and the vital status prediction model integrates 1 clinical, 55 gene, and 237 methylation features.
Figure 3A,B shows the experimental results of the survival time and vital status prediction models built by integrating the clinical data and the multi-omics data selected according to the FS and TSFS methods. From these results, we can conclude that (1) the features selected by the TSFS method are better than those selected by the FS method and (2) the integration of multiple types of data can provide more reliable information than a single data source, which is of great significance in studying cancer prognosis.

2.3. Multi-Task Learning Method for Building Prediction Models

Since the two tasks of predicting survival time and vital status are inter-correlated and the number of samples is small, we employed the multi-task learning method MMoE (Multi-gate Mixture-of-Experts, described in Section 3.5) to solve these two tasks simultaneously. The research framework is shown in Figure 4, and the prediction results are outputted simultaneously during the final modeling.
To select features that are available for both the survival time and vital status prediction models, the intersection of the clinical, gene, and methylation features of these two models identified in Section 2.2 was used as the input for multi-task learning. The numbers of features used in single-task and multi-task learning are shown in Table 3, where ‘SingleOS’ and ‘SingleVS’ represent the survival time and vital status prediction models built by a single-task learning method (SVM classifier) and ‘MultiOSVS’ represents the prediction model built simultaneously by a multi-task learning method (MMoE).
We identified some features that are used in the multi-task learning method, some of which have also been reported in other papers, as shown in Table 4. However, the fact that some biomarkers in this study have not yet been reported does not reflect that they have no effect on prognosis. Various further experiments are needed to verify these unreported biomarkers regarding prognosis.
We employed the MMoE multi-task learning method to build the survival time and vital status prediction models simultaneously and jointly evaluated these two models by the absolute accuracy and AUC. The absolute accuracy is the ratio of the number of patients whose predicted survival time and vital status labels are both consistent with their true labels to the total number of patients. We randomly split the dataset into 80% for training and 20% for testing. We used five-fold cross-validation on the training data to obtain the optimal parameters of the MMoE model and tested the resulting model on the testing data. The experimental results of the survival time and vital status prediction models built by using single-task and multi-task learning methods are shown in Figure 5. As shown in the legend on the right of Figure 5, the five different models are represented by different markers. The meanings of ‘singleOS’, ‘singleVS’, and ‘multiOSVS’ are consistent with those in Table 3. In addition, ‘OS_156’ and ‘VS_156’ represent the survival time and vital status prediction models built with the features related to both the survival time and vital status, which are 1 clinical, 4 gene, and 151 DNA methylation features. The list of numbers next to each marker gives, in order, the ACC, the AUC, and the number of features of that model. For example, the list [0.9444, 0.9769, 156] shows that 0.9444 is the ACC of this model, 0.9769 is the AUC of this model, and 156 is the number of features used to build this model. In Figure 5, the closer a model is to the bottom right, the better its performance, i.e., the model has larger ACC and AUC values and fewer features than the other models.
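The absolute accuracy defined above counts a patient as correct only when both predicted labels match. A minimal sketch on hypothetical labels (the function name `absolute_accuracy` is our own):

```python
def absolute_accuracy(true_os, pred_os, true_vs, pred_vs):
    """Fraction of patients with BOTH labels predicted correctly."""
    hits = sum(1 for to, po, tv, pv in zip(true_os, pred_os, true_vs, pred_vs)
               if to == po and tv == pv)
    return hits / len(true_os)

# Hypothetical labels for five patients (not the study data).
true_os = [1, 0, 1, 0, 1]   # survival time: 0 = short-term, 1 = long-term
pred_os = [1, 0, 0, 0, 1]
true_vs = [1, 0, 1, 1, 0]   # vital status: 0 = dead, 1 = alive
pred_vs = [1, 0, 1, 0, 0]

joint = absolute_accuracy(true_os, pred_os, true_vs, pred_vs)
per_task_os = sum(t == p for t, p in zip(true_os, pred_os)) / len(true_os)
per_task_vs = sum(t == p for t, p in zip(true_vs, pred_vs)) / len(true_vs)
print(joint, per_task_os, per_task_vs)
```

In this toy case each task alone reaches 0.8, while the absolute accuracy is 0.6, which shows that the joint metric is never higher than either per-task accuracy.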
The ACC and AUC of ‘singleOS’ are better than those of ‘OS_156’. Similarly, the ACC and AUC of ‘singleVS’ are better than those of ‘VS_156’. The comparison results show that the features previously described in Section 2.2 are effective for predicting the survival time or vital status of a patient. Compared with single-task learning, the eliminated gene features in multi-task learning are shown to have an impact on survival time or vital status.
Multi-task learning has been reported to improve the performance of each task. With the MMoE multi-task learning method, we used the features related to both the survival time and vital status to build the prognostic prediction model. As shown in Figure 5, the ACC and AUC of the survival time and vital status prediction models built simultaneously with the multi-task learning method are better than those of the two models built with the single-task learning method. Compared with the ‘SingleOS’ and ‘SingleVS’ models, the ‘MultiOSVS’ model not only uses fewer features but also performs better. This shows that multi-task learning can achieve higher prediction accuracy with fewer features than single-task learning.

3. Materials and Methods

3.1. Dataset

The neuroblastoma dataset contains clinical and various omics data from the TARGET database (Therapeutically Applicable Research To Generate Effective Treatments, see https://ocg.cancer.gov/programs/target (accessed on 9 November 2020)). In this study, we focused on integrating the clinical data and two types of omics data, i.e., the gene expression and DNA methylation data, to study the prognosis of neuroblastoma patients. We first downloaded these three types of data, as well as their ground-truth survival time and vital status contained in the clinical dataset. We used a Venn diagram to reveal the overlap between the three types of data, as shown in Figure 6. We only employed sample sets with clinical, gene expression, and DNA methylation data, leaving us with a total of 88 samples.
The features of the clinical dataset include age at diagnosis in days, INSS stage, MYCN status, and so on. The demographic information of the patients in the neuroblastoma dataset is listed in Table 5, from which we can make the following observations: (1) the average age of patients is 3 years; (2) the number of male patients is 1.38 times that of female patients; (3) 78.41% of patients are White; (4) the INSS stage of 85.23% of patients is stage 4; (5) 86.36% of patients are high risk; (6) the number of dead patients is 1.38 times that of alive patients at the last follow-up; and (7) the average survival time is 5.02 years and the median survival time is 4.58 years.
There are 22,985 genes in the gene expression dataset, in which missing values were filled using the mean method. For the DNA methylation dataset, all empty features and all features with more than 5% missing data were deleted, leaving 372,843 features, in which missing values were also filled using the mean method.
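The filtering and imputation rule for the methylation data can be sketched as follows. The probe names and values are hypothetical, `None` is an assumed marker for a missing entry, and the function name `drop_and_impute` is our own.

```python
def drop_and_impute(columns, max_missing=0.05):
    """Drop empty or too-sparse features, then mean-impute the rest.

    columns: dict mapping feature name -> list of values (None = missing).
    """
    cleaned = {}
    for name, values in columns.items():
        observed = [v for v in values if v is not None]
        n_missing = len(values) - len(observed)
        # Drop empty features and features with more than max_missing missing.
        if not observed or n_missing / len(values) > max_missing:
            continue
        mean = sum(observed) / len(observed)
        cleaned[name] = [mean if v is None else v for v in values]
    return cleaned

# Hypothetical methylation probes over 25 samples (not the study data).
columns = {
    "cg_keep": [0.2] * 12 + [None] + [0.4] * 12,   # 4% missing: kept
    "cg_drop": [None] * 2 + [0.5] * 23,            # 8% missing: dropped
    "cg_empty": [None] * 25,                       # empty: dropped
}
cleaned = drop_and_impute(columns)
print(list(cleaned), round(cleaned["cg_keep"][12], 4))
```

For the gene expression data, where the text mentions only mean filling, the same helper could be called with `max_missing=1.0`, which keeps every feature that has at least one observed value.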
In cancer prognostic research [47,71,72], patients are divided into short-term and long-term survivors according to a set survival threshold, which is usually 3 or 5 years. In this study, the threshold was set to 5 years, and the neuroblastoma patients were divided into 46 short-term survivors labeled 0 and 42 long-term survivors labeled 1. There were two types of vital status at the last follow-up, i.e., dead or alive; there were 51 dead samples labeled 0 and 37 alive samples labeled 1.
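The labeling rule above can be sketched as a small helper that returns both binary labels for a patient. The patient records are hypothetical, and treating a survival time of exactly 5 years as long-term is an assumption, since the text does not specify the boundary case.

```python
def composite_labels(survival_years, alive_at_last_follow_up,
                     threshold_years=5.0):
    """Return (survival_time_label, vital_status_label) for one patient.

    Labels follow the text: short-term = 0, long-term = 1; dead = 0, alive = 1.
    Counting exactly `threshold_years` as long-term is an assumption.
    """
    os_label = 1 if survival_years >= threshold_years else 0
    vs_label = 1 if alive_at_last_follow_up else 0
    return os_label, vs_label

# Hypothetical patients covering the four possible combinations.
patients = [(2.1, False), (3.4, True), (6.8, False), (7.5, True)]
labels = [composite_labels(years, alive) for years, alive in patients]
print(labels)
```

The second and third patients are the cases in which the two labels disagree, for example a patient followed for less than 5 years who is still alive.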

3.2. FS Feature Selection Method

The Fisher score (FS) [73,74] is a high-efficiency filter feature selection method. Its main idea is that the distance between samples is as small as possible in the same categories and as far as possible in different categories for the features with a strong discrimination performance. The score Si of the i-th feature obtained by the FS method is shown in Equation (2)
$$ S_i = \frac{\sum_{j=1}^{k} n_j \left( \mu_{ij} - \mu_i \right)^2}{\sum_{j=1}^{k} n_j \rho_{ij}^2} \tag{2} $$
where $\mu_{ij}$ and $\rho_{ij}$ are the mean and the variance of the i-th feature in the j-th class, $n_j$ is the number of instances in the j-th class, and $\mu_i$ is the mean of the i-th feature.
The importance of the features is ranked according to the scores of all features. The higher the score, the higher the ranking. We employed the FS method on each piece of omics data and determined the optimal feature number k based on the prediction results that were obtained by using 5-fold cross validation. However, the FS feature selection method evaluates the significance of features individually, while ignoring the potential correlation information among them [75]. Consequently, we employed our previously proposed maximal association coefficient (MAC) method [76] to detect the association between any two features among the features selected by the FS method to remove redundant features, which is briefly described in Section 3.3.
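The Fisher score ranking can be sketched as below. This is a minimal illustration on hypothetical features: the denominator term of Equation (2) is taken to be the within-class variance computed as a population variance, which is an assumption, and the number of retained features `k` is fixed by hand here rather than chosen by cross-validation.

```python
def fisher_score(feature, labels):
    """Fisher score of one feature: between-class over within-class scatter."""
    n = len(feature)
    overall_mean = sum(feature) / n
    between = within = 0.0
    for c in sorted(set(labels)):
        vals = [x for x, y in zip(feature, labels) if y == c]
        n_j = len(vals)
        mu_ij = sum(vals) / n_j
        # Within-class variance (population form; an assumption of this sketch).
        var_ij = sum((v - mu_ij) ** 2 for v in vals) / n_j
        between += n_j * (mu_ij - overall_mean) ** 2
        within += n_j * var_ij
    return between / within if within > 0 else float("inf")

# Hypothetical data: six samples, two classes, two features.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "gene_a": [1, 2, 3, 7, 8, 9],   # separates the classes well
    "gene_b": [1, 8, 3, 7, 2, 9],   # barely separates them
}
scores = {name: fisher_score(vals, labels) for name, vals in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

k = 1                               # hand-picked here; tuned by CV in the text
selected = ranking[:k]
print(scores, selected)
```

Because each feature is scored on its own, two highly correlated features would both rank highly, which is the limitation that the MAC-based second step addresses.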

3.3. Maximal Association Coefficient

In this section, we will briefly introduce the maximal association coefficient (MAC) that was used to measure the association between two features. The association between them may be nonlinear, which can be partitioned into several piecewise-linear associations. This is a method to use a linear correlation coefficient to measure the nonlinear association between them. The piecewise-linear association can be achieved by a partitioning method. The schematic diagram of the MAC is shown in Figure 7A.
It is necessary to set a maximum number of grids (MG) to avoid infinitely many grids. It is defined as $MG = \max(4, n^{\alpha})$, where $n$ is the data size and $\alpha \in [0, 1]$ is a hyperparameter. Variables $x$ and $y$ are divided into $s$ and $t$ bins, respectively, where $t = MG/s$. The data are thus divided into $s \times t$ grids, where $s \in [2, MG/2]$, so that many different grid partitions between the two variables can be obtained. Under each grid partition, an association coefficient (AC) between $x$ and $y$ is calculated using Equation (3).
$$ AC(x, y)_{s,t} = \sum_{i} w_i \left| p_i \right| \tag{3} $$
In Equation (3), $i$ indexes the grids that contain data, $w_i$ is the weight of the i-th grid, and $|p_i|$ is the absolute value of the Pearson coefficient of the data in the i-th grid. The weight $w_i$ is obtained according to Equation (4).
$$ w_i = \frac{s_i}{\sum_{j} s_j} \tag{4} $$
In Equation (4), $s_i$ is the area of the i-th grid and $\sum_{j} s_j$ is the sum of the areas of all grids that contain data. The weight $w_i$ is therefore the normalized area, so that the weights sum to 1, i.e., $\sum_{i} w_i = 1$.
The association coefficient under each grid partition can be obtained by Equation (3). The maximum of these association coefficients is the maximal association coefficient (MAC), as shown in Equation (5).
$$ MAC(x, y) = \max_{s,t} \left\{ AC(x, y)_{s,t} \mid s \times t = MG \right\} \tag{5} $$
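Equations (3) to (5) can be illustrated with the following minimal sketch. Several choices here are assumptions that the text above does not fix: equal-width bins on both axes (so all grid areas are equal and the weights become uniform over the non-empty grids), rounding $n^{\alpha}$ and $MG/s$ down to integers, and setting the Pearson coefficient of a grid to 0 when it is undefined (fewer than two points or zero variance). It is not the reference implementation from [76].

```python
import math

def pearson_abs(xs, ys):
    """|Pearson r|; 0.0 when undefined (fewer than 2 points or zero variance)."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(cov) / math.sqrt(vx * vy)

def equal_width_edges(values, bins):
    """Bin edges for an equal-width partition (an assumption of this sketch)."""
    lo, hi = min(values), max(values)
    return [lo + (hi - lo) * b / bins for b in range(bins + 1)]

def bin_index(v, edges):
    """Index of the bin containing v; the last bin includes its upper edge."""
    for b in range(len(edges) - 2):
        if v < edges[b + 1]:
            return b
    return len(edges) - 2

def association_coefficient(xs, ys, s, t):
    """Equation (3): area-weighted mean of |Pearson| over non-empty grids."""
    ex, ey = equal_width_edges(xs, s), equal_width_edges(ys, t)
    cells = {}
    for x, y in zip(xs, ys):
        cells.setdefault((bin_index(x, ex), bin_index(y, ey)), []).append((x, y))
    areas = {(a, b): (ex[a + 1] - ex[a]) * (ey[b + 1] - ey[b]) for a, b in cells}
    total = sum(areas.values())
    if total == 0:                      # constant variable: no usable grid
        return 0.0
    return sum(areas[key] / total *     # Equation (4): normalized area weight
               pearson_abs([p[0] for p in pts], [p[1] for p in pts])
               for key, pts in cells.items())

def mac(xs, ys, alpha=0.6):
    """Equation (5): maximum AC over grid partitions with s * t <= MG."""
    mg = max(4, int(len(xs) ** alpha))  # assumed: floor of n**alpha
    best = 0.0
    for s in range(2, mg // 2 + 1):
        t = mg // s                     # assumed: integer division
        best = max(best, association_coefficient(xs, ys, s, t))
    return best

# A nonlinear (quadratic) association that the global Pearson coefficient misses.
xs = [float(v) for v in range(-10, 11)]
ys = [x * x for x in xs]
print(pearson_abs(xs, ys), mac(xs, ys))
```

On this toy example the global Pearson coefficient is essentially zero, whereas the piecewise-linear view yields a value close to 1. The value `alpha=0.6` is only a placeholder; the values actually used are those reported in Table 2.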

3.4. TSFS Method

To both quickly select candidate features for classification and accurately remove the redundant features among them, we propose a novel two-step feature selection (TSFS) method in this section. The framework of the TSFS method is shown in Figure 8.
For removing redundant features, we defined a redundance threshold (Rth) that is a hyperparameter. If the MAC between two features is equal to or greater than Rth, then one of them is a redundant feature. As shown in Figure 9, we revealed the association between features before and after removing redundant features in the form of a schematic diagram. This is an undirected graph, where a node represents a feature, and different letters in the node represent different features. Before removing redundant features, there is an edge between two nodes when the MAC between them is equal to or greater than Rth, as shown in Figure 9A. After removing redundant features, the remaining features are shown in Figure 9B.
The adjacency matrix W represents the connection relationship between nodes in the graph, where Wij is the weight of the edge between node i and node j, as shown in Equation (6)
$$ W_{ij} = \begin{cases} 1, & \text{if } MAC(i, j) \geq R_{th}, \\ 0, & \text{otherwise}, \end{cases} \quad i, j \in R^{n},\ i \neq j \tag{6} $$
where i and j index the i-th and j-th features and MAC(i, j) is the maximal association coefficient (MAC) between feature i and feature j; an edge is present between the two nodes when MAC(i, j) is equal to or greater than the hyperparameter Rth.
The redundant score for each feature can be obtained according to the above adjacency matrix W, as shown in Equation (7), where $W_{ij}$ is obtained according to Equation (6). Afterwards, redundant features are removed according to the redundant scores of the features. The redundant scores of all features were sorted in descending order. The higher the feature ranking, the higher the redundancy of the feature. We removed some redundant features based on the prediction results that were obtained by using 5-fold cross-validation.
$$ Score_i = \sum_{j \neq i} W_{ij} \tag{7} $$
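Equations (6) and (7) can be sketched as follows. The MAC values and the threshold are hypothetical, and the sketch stops at ranking the features by redundancy; how many of the top-ranked features to remove is decided by cross-validation in the text and is not reproduced here.

```python
def redundancy_scores(mac_matrix, rth):
    """Build the adjacency matrix W (Equation (6)) and scores (Equation (7))."""
    n = len(mac_matrix)
    W = [[1 if i != j and mac_matrix[i][j] >= rth else 0 for j in range(n)]
         for i in range(n)]
    scores = [sum(row) for row in W]    # number of redundant neighbours
    return W, scores

# Hypothetical pairwise MAC values between four candidate features A-D.
names = ["A", "B", "C", "D"]
mac_matrix = [
    [1.00, 0.90, 0.85, 0.20],
    [0.90, 1.00, 0.30, 0.10],
    [0.85, 0.30, 1.00, 0.40],
    [0.20, 0.10, 0.40, 1.00],
]
W, scores = redundancy_scores(mac_matrix, rth=0.8)

# Rank features from most to least redundant (ties keep their original order).
ranking = sorted(range(len(names)), key=lambda i: -scores[i])
print([(names[i], scores[i]) for i in ranking])
```

In this toy graph, feature A is linked to both B and C, so it has the highest redundant score, while D has no edge and a score of 0.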

3.5. Multi-Task Learning

Multi-task learning [50,77,78,79] is a machine learning method whose goal is to optimize several tasks simultaneously to improve the performance of each task. In this study, we employed a multi-task learning method [50], Multi-gate Mixture-of-Experts (MMoE) proposed by Google, to simultaneously build the survival time and vital status prediction models for neuroblastoma patients. The MMoE considers the commonalities and characteristics of different tasks, thereby reducing the interference between weakly correlated tasks and improving the sharing between strongly correlated tasks. The MMoE contains input, gate, shared-bottom expert, top tower, and output layers. The MMoE sets up multiple expert networks on the shared-bottom expert layer, each of which is called an expert. Each task corresponds to one gate model. For each task, the output of its gate represents the probabilities with which the different experts are selected. The probability-weighted sum of the expert outputs is used as the input of the task-specific tower model to obtain the final output.
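The forward pass described above can be sketched in plain Python. This is a minimal, untrained illustration with random weights: the layer sizes, the single-layer ReLU experts, the single-layer sigmoid towers, and all helper names are assumptions made for the example and do not reflect the configuration used in the study.

```python
import math
import random

def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(v):
    m = max(v)
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]

def random_layer(n_out, n_in, rng):
    """Small random weights; a real model would learn these by training."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def mmoe_forward(x, experts, gates, towers):
    """Return one probability per task plus each task's gate weights."""
    # Shared experts: every task sees the same expert outputs (ReLU units).
    expert_out = [[max(0.0, h) for h in linear(x, W, b)] for W, b in experts]
    predictions, gate_weights = [], []
    for (Wg, bg), (Wt, bt) in zip(gates, towers):
        g = softmax(linear(x, Wg, bg))          # task-specific expert weights
        mixed = [sum(gk * out[d] for gk, out in zip(g, expert_out))
                 for d in range(len(expert_out[0]))]
        logit = linear(mixed, Wt, bt)[0]        # task-specific tower
        predictions.append(1.0 / (1.0 + math.exp(-logit)))
        gate_weights.append(g)
    return predictions, gate_weights

rng = random.Random(0)
n_features, n_experts, hidden = 6, 3, 4         # illustrative sizes only
experts = [random_layer(hidden, n_features, rng) for _ in range(n_experts)]
gates = [random_layer(n_experts, n_features, rng) for _ in range(2)]
towers = [random_layer(1, hidden, rng) for _ in range(2)]

x = [rng.random() for _ in range(n_features)]   # one hypothetical patient
(p_survival, p_vital), gate_weights = mmoe_forward(x, experts, gates, towers)
print(p_survival, p_vital, gate_weights)
```

Because each task has its own gate, the two tasks can weight the same shared experts differently, which is how the architecture balances sharing against task-specific behavior.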

4. Conclusions

In this study, we proposed a research framework that quickly and accurately selects the optimal features and integrates clinical data with two types of omics data, yielding a prediction model that performs better than one built from a single data source. As high-throughput technology advances, more omics data can be used to test and improve the prediction model. However, integrating more omics data may be limited because (1) data collection is difficult and (2) few patients have multiple types of data. Nevertheless, researchers can still employ our research framework regardless of how many types of omics data they have.
Accurately predicting the clinical outcomes of neuroblastoma patients can help physicians design personalized treatments. In this study, we investigated composite clinical outcomes, i.e., survival time and vital status, by integrating clinical data and two types of omics data, i.e., gene expression and DNA methylation data. Owing to the high redundancy of omics data, we proposed a two-step feature selection (TSFS) method to quickly and accurately select suitable features for downstream tasks. Ultimately, we identified 1 clinical, 4 gene, and 151 DNA methylation features to build a prediction model. These identified biomarkers may need to be verified by biological experiments.
Because this is a small-sample problem with inter-correlated tasks, we employed the Multi-gate Mixture-of-Experts (MMoE) multi-task learning method to simultaneously build the survival time and vital status prediction models for patients by integrating the three types of data. The clinical outcomes derived from the prediction models can help clinicians improve the accuracy of their predictions and can also help doctors choose their words carefully when talking with patients and their families, which may reduce their mental stress and support their mental and physical health. Moreover, the predicted vital status indicates which patients have an indeterminate survival time; these patients may survive longer than predicted.

Author Contributions

P.W.: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Visualization, Writing—original draft, Writing—review and editing. J.Z.: Conceptualization, Methodology, Formal analysis, Writing—review and editing, Supervision, Funding acquisition. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Proof of Concept Foundation of Xidian University Hangzhou Institute of Technology of China under Grant No. GNYZ2023XJ0416, the National Natural Science Foundation of China under Grant No. 62374121, and the Natural Science Basic Research Program of Shaanxi Province of China under Grant No. 2021SF-184.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article or are available from the authors upon request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Bosse, K.R.; Maris, J.M. Advances in the Translational Genomics of Neuroblastoma: From Improving Risk Stratification and Revealing Novel Biology to Identifying Actionable Genomic Alterations. Cancer 2016, 122, 20–33. [CrossRef]
  2. Ponzoni, M.; Bachetti, T.; Corrias, M.V.; Brignole, C.; Pastorino, F.; Calarco, E.; Bensa, V.; Giusto, E.; Ceccherini, I.; Perri, P. Recent Advances in the Developmental Origin of Neuroblastoma: An Overview. J. Exp. Clin. Cancer Res. 2022, 41, 92. [Google Scholar] [CrossRef] [PubMed]
  3. Liu, Y.F.; Jia, Y.X.; Hou, C.Z.; Li, N.; Zhang, N.; Yan, X.S.; Yang, L.; Guo, Y.; Chen, H.T.; Li, J.; et al. Pathological Prognosis Classification of Patients with Neuroblastoma Using Computational Pathology Analysis. Comput. Biol. Med. 2022, 149, 105980. [Google Scholar] [CrossRef] [PubMed]
  4. Jiang, M.R.; Stanke, J.; Lahti, J.M. The Connections between Neural Crest Development and Neuroblastoma. Curr. Top. Dev. Biol. 2011, 94, 77–127. [Google Scholar]
  5. Salazar, B.M.; Balczewski, E.A.; Ung, C.Y.; Zhu, S.Z. Neuroblastoma, a Paradigm for Big Data Science in Pediatric Oncology. Int. J. Mol. Sci. 2016, 18, 37. [Google Scholar] [CrossRef]
  6. Rybinski, B.; Wolinsky, T.; Brohl, A.; Moerdler, S.; Reed, D.R.; Ewart, M.; Weiser, D. Multifocal Primary Neuroblastoma Tumor Heterogeneity in Siblings with Co-Occurring PHOX2B and NF1 Genetic Aberrations. Gene. Chromosome. Canc. 2020, 59, 119–124. [Google Scholar] [CrossRef]
  7. Lundberg, K.I.; Treis, D.; Johnsen, J.I. Neuroblastoma Heterogeneity, Plasticity, and Emerging Therapies. Curr. Oncol. Rep. 2022, 24, 1053–1062. [Google Scholar] [CrossRef]
  8. Esposito, M.R.; Aveic, S.; Seydel, A.; Tonini, G.P. Neuroblastoma Treatment in the Post-Genomic Era. J. Biomed. Sci. 2017, 24, 14. [Google Scholar] [CrossRef]
  9. He, X.Y.; Qin, C.; Zhao, Y.D.; Zou, L.; Zhao, H.; Cheng, C. Gene Signatures Associated with Genomic Aberrations Predict Prognosis in Neuroblastoma. Cancer Commun. 2020, 40, 105–118. [Google Scholar] [CrossRef]
  10. Pugh, T.J.; Morozova, O.; Attiyeh, E.F.; Asgharzadeh, S.; Wei, J.S.; Auclair, D.; Carter, S.L.; Cibulskis, K.; Hanna, M.; Kiezun, A.; et al. The Genetic Landscape of High-Risk Neuroblastoma. Nat. Genet. 2013, 45, 279–284. [Google Scholar] [CrossRef]
  11. Hidalgo, M.R.; Alicia, A.; Çubuk, C.; Carbonell-Caballero, J.; Dopazo, J. Models of Cell Signaling Uncover Molecular Mechanisms of High-Risk Neuroblastoma and Predict Disease Outcome. Biol. Direct. 2018, 13, 16. [Google Scholar] [CrossRef] [PubMed]
  12. Applebaum, M.A.; Barr, E.K.; Karpus, J.; Nie, J.; Zhang, Z.; Armstrong, A.E.; Uppal, S.; Sukhanova, M.; Zhang, W.; Chlenski, A.; et al. 5-Hydroxymethylcytosine Profiles are Prognostic of Outcome in Neuroblastoma and Reveal Transcriptional Networks that Correlate with Tumor Phenotype. JCO Precis. Oncol. 2019, 3, PO.18.00402. [Google Scholar] [CrossRef] [PubMed]
  13. Cangelosi, D.; Morini, M.; Zanardi, N.; Sementa, A.R.; Muselli, M.; Conte, M.; Garaventa, A.; Pfeffer, U.; Bosco, M.C.; Varesio, L.; et al. Hypoxia Predicts Poor Prognosis in Neuroblastoma Patients and Associates with Biological Mechanisms Involved in Telomerase Activation and Tumor Microenvironment Reprogramming. Cancers 2020, 12, 2343. [Google Scholar] [CrossRef] [PubMed]
  14. Sun, Y.J.; Goodison, S.; Li, J.; Liu, L.; Farmerie, W. Improved Breast Cancer Prognosis Through the Combination of Clinical and Genetic Markers. Bioinformatics 2007, 23, 30–37. [Google Scholar] [CrossRef] [PubMed]
  15. Sun, D.D.; Wang, M.H.; Li, A. A Multimodal Deep Neural Network for Human Breast Cancer Prognosis Prediction by Integrating Multi-Dimensional Data. IEEE/ACM Trans. Comput. Biol. Bioinform. 2018, 16, 841–850. [Google Scholar] [CrossRef]
  16. Zafar, A.; Wang, W.; Liu, G.; Wang, X.J.; Xian, W.; McKeon, F.; Foster, J.; Zhou, J.; Zhang, R.W. Molecular Targeting Therapies for Neuroblastoma: Progress and Challenges. Med. Res. Rev. 2021, 41, 961–1021. [Google Scholar] [CrossRef]
  17. Bagatell, R.; DuBois, S.G.; Naranjo, A.; Belle, J.; Coldsmith, K.C.; Park, J.R.; Irwin, M.S.; COG Neuroblastoma Committee. Children’s Oncology Group’s 2023 Blueprint for Research: Neuroblastoma. Pediatr. Blood Cancer. 2023, 70, e30572. [Google Scholar] [CrossRef]
  18. Olsson, M.; Beck, S.; Kogner, P.; Martinsson, T.; Carén, H. Genome-Wide Methylation Profiling Identifies Novel Methylated Genes in Neuroblastoma Tumors. Epigenetics 2016, 11, 74–84. [Google Scholar] [CrossRef] [PubMed]
  19. Yang, Q.W.; Tian, Y.F.; Ostler, K.R.; Chlenski, A.; Guerrero, L.J.; Salwen, H.R.; Godley, L.A.; Cohn, S.L. Epigenetic Alterations Differ in Phenotypically Distinct Human Neuroblastoma Cell Lines. BMC Cancer 2010, 10, 286. [Google Scholar] [CrossRef] [PubMed]
  20. Domingo-Fernandez, R.; Watters, K.; Piskareva, O.; Stallings, R.L.; Bray, I. The Role of Genetic and Epigenetic Alterations in Neuroblastoma Disease Pathogenesis. Pediatr. Surg. Int. 2013, 29, 101–119. [Google Scholar] [CrossRef]
  21. Gómez, S.; Castellano, G.; Mayol, G.; Suñol, M.; Queiros, A.; Bibikova, M.; Nazor, K.L.; Loring, J.F.; Lemos, I.; Rodríguez, E.; et al. DNA Methylation Fingerprint of Neuroblastoma Reveals new Biological and Clinical Insights. Epigenomics 2015, 5, 1137–1153. [Google Scholar] [CrossRef] [PubMed]
  22. Charlet, J.; Tomari, A.; Dallosso, A.R.; Szemes, M.; Kaselova, M.; Curry, T.J.; Almutairi, B.; Etchevers, H.C.; McConville, C.; Malik, K.T.; et al. Genome-Wide DNA Methylation Analysis Identifies MEGF10 as a Novel Epigenetically Repressed Candidate Tumor Suppressor Gene in Neuroblastoma. Mol. Carcinog. 2017, 56, 1290–1301. [Google Scholar] [CrossRef] [PubMed]
  23. Durinck, K.; Speleman, F. Epigenetic Regulation of Neuroblastoma Development. Cell Tissue. Res. 2018, 372, 309–324. [Google Scholar] [CrossRef]
  24. Abe, M.; Ohira, M.; Kaneda, A.; Yagi, Y.; Yamamoto, S.; Kitano, Y.; Takato, T.; Nakagawara, A.; Ushijima, T. CpG Island Methylator Phenotype is a Strong Determinant of Poor Prognosis in Neuroblastomas. Cancer Res. 2005, 65, 828–834. [Google Scholar] [CrossRef] [PubMed]
  25. Gumy-Pause, F.; Pardo, B.; Khoshbeen-Boudal, M.; Ansari, M.; Gayet-Ageron, A.; Sappino, A.P.; Attiyeh, E.F.; Ozsahin, H. GSTP1 Hypermethylation is Associated with Reduced Protein Expression, Aggressive Disease and Prognosis in Neuroblastoma. Gene. Chromosome. Canc. 2012, 51, 174–185. [Google Scholar] [CrossRef]
  26. Fetahu, I.S.; Taschner-Mandl, S. Neuroblastoma and the Epigenome. Cancer Metast. Rev. 2021, 40, 173–189. [Google Scholar] [CrossRef] [PubMed]
  27. Watanabe, K.; Kimura, S.; Seki, M.; Isobe, T.; Kubota, Y.; Sekiguchi, M.; Sato-Otsubo, A.; Hiwatari, M.; Kato, M.; Oka, A.; et al. Identification of the Ultrahigh-Risk Subgroup in Neuroblastoma Cases through DNA Methylation Analysis and Its Treatment Exploiting Cancer Metabolism. Oncogene 2022, 41, 4994–5007. [Google Scholar] [CrossRef] [PubMed]
  28. Lalchungnunga, H.; Hao, W.; Maris, J.M.; Asgharzadeh, S.; Henrich, K.O.; Westermann, F.; Tweddle, D.A.; Schwalbe, E.C.; Strathdee, G. Genome Wide DNA Methylation Analysis Identifies Novel Molecular Subgroups and Predicts Survival in Neuroblastoma. Brit. J. Cancer. 2022, 127, 2006–2015. [Google Scholar] [CrossRef]
  29. Yang, Q.W.; Zage, P.; Kagan, D.; Tian, Y.F.; Seshadri, R.; Salwen, H.R.; Liu, S.Q.; Chlenski, A.; Cohn, S.L. Association of Epigenetic Inactivation of RASSF1A with Poor Outcome in Human Neuroblastoma. Clin. Cancer Res. 2004, 10, 8493–8500. [Google Scholar] [CrossRef] [PubMed]
  30. Banelli, B.; Gelvi, I.; Di Vinci, A.; Scaruffi, P.; Casciano, I.; Allemanni, G.; Bonassi, S.; Tonini, G.P.; Romani, M. Distinct CpG Methylation Profiles Characterize Different Clinical Groups of Neuroblastic Tumors. Oncogene 2005, 24, 5619–5628. [Google Scholar] [CrossRef]
  31. Abe, M.; Westermann, F.; Nakagawara, A.; Takato, T.; Schwab, M.; Ushijima, T. Marked and Independent Prognostic Significance of the CpG Island Methylator Phenotype in Neuroblastomas. Cancer Lett. 2007, 247, 253–258. [Google Scholar] [CrossRef] [PubMed]
  32. Jones, P.A. Functions of DNA Methylation: Islands, Start Sites, Gene Bodies and Beyond. Nat. Rev. Genet. 2012, 13, 484–492. [Google Scholar] [CrossRef] [PubMed]
  33. Moore, L.D.; Le, T.; Fan, G.P. DNA Methylation and Its Basic Function. Neuropsychopharmacol. 2013, 38, 23–38. [Google Scholar] [CrossRef]
  34. Pickles, J.C.; Stone, T.J.; Jacques, T.S. Methylation-Based Algorithms for Diagnosis: Experience from Neuro-Oncology. J. Pathol. 2020, 250, 510–517. [Google Scholar] [CrossRef]
  35. Zitnik, M.; Nguyen, F.; Wang, B.; Leskovec, J.; Goldenberg, A.; Hoffman, M.M. Machine Learning for Integrating Data in Biology and Medicine: Principles, Practice, and Opportunities. Inform. Fusion. 2019, 50, 71–91. [Google Scholar] [CrossRef] [PubMed]
  36. Kasemeier-Kulesa, J.C.; Schnell, S.; Woolley, T.; Spengler, J.A.; Morrison, J.A.; McKinney, M.C.; Pushel, I.; Wolfe, L.A.; Kulesa, P.M. Predicting Neuroblastoma Using Developmental Signals and a Logic-Based Model. Biophys. Chem. 2018, 238, 30–38. [Google Scholar] [CrossRef] [PubMed]
  37. Tranchevent, L.C.; Azuaje, F.; Rajapakse, J.C. A Deep Neural Network Approach to Predicting Clinical Outcomes of Neuroblastoma Patients. BMC Med. Genomics. 2019, 12, 178. [Google Scholar] [CrossRef]
  38. Masecchia, S.; Coco, S.; Barla, A.; Verri, A.; Tonini, G.P. Genome Instability Model of Metastatic Neuroblastoma Tumorigenesis by a Dictionary Learning Algorithm. BMC Med. Genomics. 2015, 8, 57. [Google Scholar] [CrossRef]
  39. Mihaylov, I.; Kańduła, M.; Krachunov, M.; Vassilev, D. A Novel Framework for Horizontal and Vertical Data Integration in Cancer Studies with Application to Survival Time Prediction Models. Biol. Direct. 2019, 14, 22. [Google Scholar] [CrossRef]
  40. Polewko-Klim, A.; Lesiński, W.; Mnich, K.; Piliszek, R.; Rudnicki, W.R. Integration of Multiple Types of Genetic Markers for Neuroblastoma May Contribute to Improved Prediction of the Overall Survival. Biol. Direct. 2018, 13, 17. [Google Scholar] [CrossRef]
  41. Tranchevent, L.C.; Nazarov, P.V.; Kaoma, T.; Schmartz, G.P.; Muller, A.; Kim, S.Y.; Rajapakse, J.C.; Azuaje, F. Predicting Clinical Outcome of Neuroblastoma Patients Using an Integrative Network-Based Approach. Biol. Direct. 2018, 13, 12. [Google Scholar] [CrossRef]
  42. Baali, I.; Acar, D.A.E.; Aderinwale, T.W.; HafezQorani, S.; Kazan, H. Predicting Clinical Outcomes in Neuroblastoma with Genomic Data Integration. Biol. Direct. 2018, 13, 20. [Google Scholar] [CrossRef]
  43. Körber, V.; Stainczyk, S.A.; Kurilov, R.; Henrich, K.O.; Hero, B.; Brors, B.; Westermann, F.; Höfer, H. Neuroblastoma Arises in Early Fetal Development and Its Evolutionary Duration Predicts Outcome. Nat. Genet. 2023, 55, 619–630. [Google Scholar] [CrossRef] [PubMed]
  44. Cangelosi, D.; Muselli, M.; Parodi, S.; Blengio, F.; Becherini, P.; Versteeg, R.; Conte, M.; Varesio, L. Use of Attribute Driven Incremental Discretization and Logic Learning Machine to Build a Prognostic Classifier for Neuroblastoma Patients. BMC Bioinformatics 2014, 15, S4. [Google Scholar] [CrossRef]
  45. Maggio, V.; Chierici, M.; Jurman, G.; Furlanello, C. Distillation of the Clinical Algorithm Improves Prognosis by Multi-Task Deep Learning in High-Risk Neuroblastoma. PLoS ONE 2018, 13, e0208924. [Google Scholar] [CrossRef] [PubMed]
  46. Zhou, Y.; Gao, J. A Novel Online Nomogram Established with Five Features before Surgical Resection for Predicating Prognosis of Neuroblastoma Children: A Population-Based Study. Technol. Cancer Res. Treat. 2023, 22, 15330338221145141. [Google Scholar] [CrossRef]
  47. Oberthuer, A.; Kaderali, L.; Kahlert, Y.; Hero, B.; Westermann, F.; Berthold, F.; Brors, B.; Eils, R.; Fischer, M. Subclassification and Individual Survival Time Prediction from Gene Expression Data of Neuroblastoma Patients by Using CASPAR. Clin. Cancer Res. 2008, 14, 6590–6601. [Google Scholar] [CrossRef]
  48. Stigliani, S.; Coco, S.; Moretti, S.; Oberthuer, A.; Fischer, M.; Theissen, J.; Gallo, F.; Garavent, A.; Berthold, F.; Bonassi, S.; et al. High Genomic Instability Predicts Survival in Metastatic High-Risk Neuroblastoma. Neoplasia 2012, 14, 823–832. [Google Scholar] [CrossRef]
  49. Kang, Z.L.; Grauman, K.; Sha, F. Learning with Whom to Share in Multi-Task Feature Learning. In ICML’11: Proceedings of the 28th International Conference on Machine Learning; Bellevue, WA, USA, 28 June–2 July 2011; Getoor, L., Scheffer, T., Eds.; Omnipress: Madison, WI, USA, 2011; pp. 521–528. [Google Scholar]
  50. Ma, J.Q.; Zhao, Z.; Yi, X.Y.; Chen, J.L.; Hong, L.C.; Chi, E.H. Modeling Task Relationships in Multi-Task Learning with Multi-Gate Mixture-of-Experts. In KDD ‘18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ‘18: The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK, 19–23 August 2018; Association for Computing Machinery: New York, NY, USA, 2018; pp. 1930–1939. [Google Scholar]
  51. Wang, X.; Zhang, Y.; Ren, X.; Zhang, Y.H.; Zitnik, M.; Shang, J.B.; Langlotz, C.; Han, J.W. Cross-Type Biomedical Named Entity Recognition with Deep Multi-Task Learning. Bioinformatics 2019, 35, 1745–1752. [Google Scholar] [CrossRef]
  52. Liang, W.; Zhang, K.; Cao, P.; Liu, X.L.; Yang, J.Z.; Zaiane, O. Rethinking Modeling Alzheimer’s Disease Progression from a Multi-Task Learning Perspective with Deep Recurrent Neural Network. Comput. Biol. Med. 2021, 138, 104935. [Google Scholar] [CrossRef] [PubMed]
  53. Tu, G.; Wen, J.T.; Liu, H.; Chen, S.T.; Zheng, L.; Jiang, D.Z. Exploration Meets Exploitation: Multitask Learning for Emotion Recognition Based on Discrete and Dimensional Models. Knowl.-Based Syst. 2022, 235, 107598. [Google Scholar] [CrossRef]
  54. Castro, D.M.; de Veaux, N.R.; Miraldi, E.R.; Bonneau, R. Multi-Study Inference of Regulatory Networks for More Accurate Models of Gene Regulation. PLoS Comput. Biol. 2019, 15, e1006591. [Google Scholar] [CrossRef] [PubMed]
  55. Shao, W.; Wang, T.X.; Sun, L.; Dong, T.H.; Han, T.; Huang, Z.; Zhang, J.; Zhang, D.Q.; Huang, K. Multi-Task Multi-Modal Learning for Joint Diagnosis and Prognosis of Human Cancers. Med. Image Anal. 2020, 65, 101795. [Google Scholar] [CrossRef] [PubMed]
  56. Stetson, L.C.; Pearl, T.; Chen, Y.W.; Barnholtz-Sloan, J.S. Computational Identification of Multi-Omic Correlates of Anticancer Therapeutic Response. BMC Genom. 2014, 15, S2. [Google Scholar] [CrossRef]
  57. Auslander, N.; Yizhak, K.; Weinstock, A.; Budhu, A.; Tang, W.; Wang, X.W.; Ambs, S.; Ruppin, E. A Joint Analysis of Transcriptomic and Metabolomic Data Uncovers Enhanced Enzyme-Metabolite Coupling in Breast Cancer. Sci. Rep. 2016, 6, 29662. [Google Scholar] [CrossRef]
  58. Giang, T.T.; Nguyen, T.P.; Tran, D.H. Stratifying Patients Using Fast Multiple Kernel Learning Framework: Case Studies of Alzheimer’s Disease and Cancers. BMC Med. Inform. Decis. Mak. 2020, 20, 108. [Google Scholar] [CrossRef] [PubMed]
  59. Crawford, S.E.; Stellmach, V.; Ranalli, M.; Huang, X.; Huang, L.; Volpert, O.; De Vries, G.H.; Abramson, L.P.; Bouck, N. Pigment Epithelium-Derived Factor (PEDF) in Neuroblastoma: A Multifunctional Mediator of Schwann Cell Antitumor Activity. J. Cell. Sci. 2001, 114, 4421–4428. [Google Scholar] [CrossRef]
  60. Cheng, G.; Zhong, M.; Kawaguchi, R.; Kassai, M.; Al-Ubaidi, M.; Deng, J.; Ter-Stepanian, M.; Sun, H. Identification of PLXDC1 and PLXDC2 as the Transmembrane Receptors for the Multifunctional Factor PEDF. eLife 2014, 3, e05401. [Google Scholar] [CrossRef] [PubMed]
  61. Sheikh, A.; Takatori, A.; Hossain, M.S.; Hasan, M.K.; Tagawa, M.; Nagase, H.; Nakagawara, A. Unfavorable Neuroblastoma Prognostic Factor NLRR2 Inhibits Cell Differentiation by Transcriptional Induction through JNK Pathway. Cancer Sci. 2016, 107, 1223–1232. [Google Scholar] [CrossRef] [PubMed]
  62. Banelli, B.; Bonassi, S.; Casciano, I.; Mazzocco, K.; Di Vinic, A.; Scaruffi, P.; Brigati, C.; Allemanni, G.; Borzì, L.; Tonini, G.P.; et al. Outcome Prediction and Risk Assessment by Quantitative Pyrosequencing Methylation Analysis of the SFN Gene in Advanced Stage, High-Risk, Neuroblastic Tumor Patients. Int. J. Cancer 2010, 126, 656–668. [Google Scholar] [CrossRef] [PubMed]
  63. Whittle, S.B.; Reyes, S.; Du, M.; Gireud, M.; Zhang, L.N.; Woodfield, S.E.; Ittmann, M.; Scheurer, M.E.; Bean, A.J.; Zage, P.E. A Polymorphism in the FGFR4 Gene is Associated with Risk of Neuroblastoma and Altered Receptor Degradation. J. Pediatr. Hematol. Oncol. 2016, 38, 131–138. [Google Scholar] [CrossRef] [PubMed]
  64. Wagner, L.M.; McLendon, R.E.; Yoon, K.J.; Weiss, B.D.; Billups, C.A.; Danks, M.K. Targeting Methylguanine-DNA Methyltransferase in the Treatment of Neuroblastoma. Clin. Cancer Res. 2007, 13, 5418–5425. [Google Scholar] [CrossRef] [PubMed]
  65. Zhang, C.Y.; Ding, Z.Z.; Luo, H. The Prognostic Role of m6A-Related Genes in Paediatric Neuroblastoma Patients. Comput. Math. Method. M. 2022, 8354932. [Google Scholar] [CrossRef] [PubMed]
  66. Zhu, K.; Gao, T.Y.; Wang, Z.R.; Zhang, L.R.; Tan, K.Z.; Lv, Z.B. RNA N6-Methyladenosine Reader IGF2BP3 Interacts with MYCN and Facilitates Neuroblastoma Cell Proliferation. Cell Death Discov. 2023, 9, 151. [Google Scholar] [CrossRef]
  67. Romani, M.; Scaruffi, P.; Casciano, I.; Mazzocco, K.; Lo Cunsolo, C.; Cavazzana, A.; Gambini, C.; Boni, L.; De Bernardi, B.; Tonini, G.P. Stage-Independent Expression and Genetic Analysis of TP73 in Neuroblastoma. Int. J. Cancer 1999, 84, 365–369. [Google Scholar] [CrossRef]
  68. Kaghad, M.; Bonnet, H.; Yang, A.; Creancier, L.; Biscan, J.C.; Valent, A.; Minty, A.; Chalon, P.; Lelias, J.M.; Dumont, X.; et al. Monoallelically Expressed Gene Related to p53 at 1p36, a Region Frequently Deleted in Neuroblastoma and Other Human Cancers. Cell 1997, 90, 809–819. [Google Scholar] [CrossRef]
  69. Rossi, M.; Sayan, E.A.; Terrinoni, A.; Melino, G.; Knight, R.A. Mechanism of Induction of Apoptosis by p73 and Its Relevance to Neuroblastoma Biology. Ann. N. Y. Acad. Sci. 2004, 1028, 143–149. [Google Scholar] [CrossRef] [PubMed]
  70. Sobhan, P.K.; Zhai, Q.W.; Green, L.C.; Hansford, L.M.; Funa, K. ASK1 Regulates the Survival of Neuroblastoma Cells by Interacting with TLX and Stabilizing HIF-1α. Cell. Signal. 2017, 30, 104–117. [Google Scholar] [CrossRef]
  71. Khademi, M.; Nedialkov, N.S. Probabilistic Graphical Models and Deep Belief Networks for Prognosis of Breast Cancer. ICMLA 2015: 14th IEEE International Conference on Machine Learning and Applications, Miami, FL, USA, 9–11 December 2015; pp. 727–732. [Google Scholar]
  72. Lundin, M.; Lundin, J.; Burke, H.B.; Toikkanen, S.; Pylkkänen, L.; Joensuu, H. Artificial Neural Networks Applied to Survival Prediction in Breast Cancer. Oncology 1999, 57, 281–286. [Google Scholar] [CrossRef]
  73. Duda, R.O.; Hart, P.E.; Stork, D.G. Pattern Classification, 2nd ed.; Wiley-interscience: Hoboken, NJ, USA, 2000; Volume 4, pp. 44–47. [Google Scholar]
  74. Gu, Q.Q.; Li, H.; Han, J.W. Generalized Fisher Score for Feature Selection. In UAI’11: Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence; Barcelona, Spain, 14–17 July 2011; Cozman, F., Pfeffer, A., Eds.; AUAI Press: Arlington, VA, USA, 2011; pp. 266–273. [Google Scholar]
  75. Dudoit, S.; Fridlyand, J.; Speed, T.P. Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data. J. Am. Stat. Assoc. 2002, 97, 77–87. [Google Scholar] [CrossRef]
  76. Wang, P.R.; Zhang, J.Y. A Novel Piecewise-Linear Method for Detecting Associations between Variables. PLoS ONE 2023, 18, e0290280. [Google Scholar] [CrossRef]
  77. Lu, Y.X.; Kumar, A.; Zhai, S.F.; Cheng, Y.; Javidi, T.; Feris, R. Fully-Adaptive Feature Sharing in Multi-Task Networks with Applications in Person Attribute Classification. CVPR 2017: 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1131–1140. [Google Scholar]
  78. Silva, A.G.D.A.E.; Gomes, H.M.; Batista, L.V. A Collaborative Deep Multitask Learning Network for Face Image Compliance to ISO/IEC19794-5 Standard. Expert Syst. Appl. 2022, 198, 116756. [Google Scholar] [CrossRef]
  79. Zhang, J.W.; Zhang, P.; Guo, D.Y.; Zhou, Y.; Wu, Y.K.; Yang, B.; Lin, L. Automatic Repetition Instruction Generation for Air Traffic Control Training Using Multi-Task Learning with an Improved Copy Network. Knowl.-Based Syst. 2022, 241, 108232. [Google Scholar] [CrossRef]
Figure 1. The overall workflow of our study.
Figure 2. Comparing the classifier performance of each prediction model under eight different types of data. The red circles represent outliers. The green triangles represent the mean of the performance. The red lines represent the median of the performance. (A) ACC for the survival time prediction model; (B) AUC for the survival time prediction model; (C) ACC for the vital status prediction model; (D) AUC for the vital status prediction model. Each panel shows the performance for one of the eight types of data; the box plots show the performance distribution over five test sets.
Figure 3. The experimental results of the two prediction models built by the SVM classifier. (A) The experimental results of the eight survival time prediction models built by the SVM classifier. (B) The experimental results of the eight vital status prediction models built by the SVM classifier. The legend on the right of the figure shows the eight different data types used for building the models.
Figure 4. The research framework for the prognostic prediction in neuroblastoma is divided into 4 steps. I. Data. Clinical data and two types of omics data, gene expression and DNA methylation data, are used to predict composite clinical outcomes for neuroblastoma patients. II. TSFS method. The first step aims to select candidate features. The second step aims to remove the redundant features among them. III. Integrating three data types. The three selected types of data related to survival time and vital status are integrated by using the concatenation method. IV. Multi-task output. The intersection of two sets obtained in step III is used as the input of the multi-task learning method. Then, the results of two tasks are outputted simultaneously.
Figure 5. Performance comparison of the prediction models built by single-task and multi-task learning methods.
Figure 6. The overlap between the three types of data shown by a Venn diagram.
Figure 7. The overview of the proposed maximal association coefficient (MAC). (A) The scheme for the MAC. (B) The schematic diagram of grid partition. A nonlinear association can be regarded as composed of piecewise-linear ones, but the breakpoints connecting the pieces are unknown. Clustering is one way to obtain the partition while avoiding the infinite number of partitions that a random partition would allow. We therefore employed a simple and commonly used clustering method, K-means, to partition each variable space into different bins, so that all data are divided into different grids, as shown schematically in (B). The Pearson coefficient was then used to detect the linear association of the data in each grid. A weighted sum that uses the Pearson coefficient of each grid directly cannot reflect the association between two variables, because the coefficient lies in [−1, 1] and positive and negative values can offset each other. Consequently, we used the absolute value of the Pearson coefficient to detect the linear association of the data in each grid. In summary, we employed partitioning and the Pearson coefficient to measure the association between two variables.
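The partition-and-correlate idea in this caption can be illustrated with a small sketch. It is not the MAC as defined in [76]: it evaluates a single fixed number of bins, weights each grid by its share of the data points, and skips grids with fewer than three points, all of which are assumptions made for this example. The helper names and the simple one-dimensional K-means are likewise hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson coefficient; returns 0.0 when either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def assign(values, centers):
    """Label each value with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: abs(v - centers[c]))
            for v in values]


def kmeans_1d(values, k, iters=50):
    """Plain 1-D K-means (Lloyd's algorithm) with evenly spaced start centers."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (c + 0.5) / k for c in range(k)]
    for _ in range(iters):
        labels = assign(values, centers)
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assign(values, centers)


def grid_association(xs, ys, k):
    """Weighted sum of |Pearson| over the grids induced by binning x and y."""
    bx, by = kmeans_1d(xs, k), kmeans_1d(ys, k)
    cells = {}
    for x, y, i, j in zip(xs, ys, bx, by):
        cells.setdefault((i, j), []).append((x, y))
    total = 0.0
    for pts in cells.values():
        if len(pts) < 3:
            continue  # assumption: too few points for a meaningful coefficient
        cx, cy = zip(*pts)
        total += len(pts) / len(xs) * abs(pearson(cx, cy))
    return total


# V-shaped relationship: strongly associated, yet globally uncorrelated.
xs = [i / 20 for i in range(-20, 21)]
ys = [abs(x) for x in xs]
global_r = pearson(xs, ys)
grid_score = grid_association(xs, ys, k=2)
```

For the V-shaped data, the global Pearson coefficient is close to zero, whereas the grid-wise score is close to one because each grid contains a purely linear piece. Taking absolute values matters here: the two arms have coefficients of opposite sign and would cancel in a signed sum.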
Figure 8. The scheme for the two-step feature selection (TSFS) method. The TSFS method selects the optimal features in two steps: an elementary selection followed by a secondary selection. Combining the two allows feature selection to be conducted quickly and accurately. The elementary selection initially screens the candidate features that can be used to build a model; the secondary selection refines these candidates to obtain the optimal features for modeling. The FS is a high-efficiency feature selection method and is therefore used as the elementary selection. The MAC measures the association between two features, revealing how strongly they are correlated; since it can accurately detect such associations, it is used as the secondary selection. Thus, step 1 selects the candidate features using the FS feature selection method, and step 2 uses the MAC to detect associations among the candidate features obtained in step 1 and removes redundant features according to the strength of these associations. In this way, the TSFS method selects the optimal features for downstream classification tasks.
Figure 9. The association between features and redundance threshold (Rth). (A) The association between features before removing redundant features. (B) The remaining features after removing redundant features.
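The two-step scheme in Figures 8 and 9 can be illustrated with a minimal sketch. It is not the authors' implementation: FS and the MAC are not defined in this part of the text, so the sketch assumes a two-class Fisher-style score as a stand-in for the elementary selection and the absolute Pearson correlation as a stand-in for the MAC. The function names and the `n_candidates` parameter are hypothetical; `rth` plays the role of the redundancy threshold Rth.

```python
import numpy as np


def elementary_selection(X, y, n_candidates):
    """Step 1 (stand-in for FS): keep the n_candidates best-scoring features.

    Assumption: a two-class Fisher-style score (squared mean difference
    divided by the summed class variances) is used for ranking.
    """
    X0, X1 = X[y == 0], X[y == 1]
    between = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    within = X0.var(axis=0) + X1.var(axis=0) + 1e-12  # avoid division by zero
    scores = between / within
    order = np.argsort(scores)[::-1]  # best feature first
    return order[:n_candidates]


def secondary_selection(X, candidates, rth):
    """Step 2 (stand-in for MAC): greedily remove redundant candidates.

    Assumption: |Pearson correlation| replaces the MAC. A candidate is
    dropped when its association with an already kept feature exceeds rth.
    """
    assoc = np.atleast_2d(np.abs(np.corrcoef(X[:, candidates], rowvar=False)))
    kept = []  # positions within `candidates`, visited best-first
    for i in range(len(candidates)):
        redundant = any(assoc[i, j] > rth for j in kept)
        if not redundant:
            kept.append(i)
    return candidates[kept]


def tsfs_sketch(X, y, n_candidates, rth):
    """Run both steps and return the indices of the selected features."""
    candidates = elementary_selection(X, y, n_candidates)
    return secondary_selection(X, candidates, rth)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    y = np.arange(n) % 2                        # balanced binary labels
    a = y + 0.3 * rng.standard_normal(n)        # informative feature
    b = a + 0.01 * rng.standard_normal(n)       # near copy of a (redundant)
    c = y + 0.3 * rng.standard_normal(n)        # informative, separate noise
    noise = rng.standard_normal((n, 3))         # uninformative features
    X = np.column_stack([a, b, c, noise])
    print(tsfs_sketch(X, y, n_candidates=3, rth=0.9))
```

Visiting the candidates in descending score order means that, within a group of strongly associated features, the highest-scoring one is the one retained. Lowering `rth` removes more features, which mirrors the effect of the threshold shown in Figure 9.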
Table 1. A binary classification confusion matrix. TP: True Positive; FN: False Negative; FP: False Positive; TN: True Negative.
| Confusion matrix | Real positive | Real negative |
| --- | --- | --- |
| Predicted positive | TP | FP |
| Predicted negative | FN | TN |
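As a small illustration of how the four cells of Table 1 lead to the accuracy metric, the following sketch counts TP, FN, FP and TN from label vectors and computes accuracy as (TP + TN) / (TP + FN + FP + TN). The function names and the example labels are hypothetical and are not taken from the study's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count the four cells of a binary confusion matrix (TP, FN, FP, TN)."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, really positive
        elif pred == positive:
            fp += 1  # predicted positive, really negative
        elif truth == positive:
            fn += 1  # predicted negative, really positive
        else:
            tn += 1  # predicted negative, really negative
    return tp, fn, fp, tn


def accuracy(tp, fn, fp, tn):
    """Fraction of samples that were classified correctly."""
    total = tp + fn + fp + tn
    if total == 0:
        raise ValueError("accuracy is undefined for an empty confusion matrix")
    return (tp + tn) / total


if __name__ == "__main__":
    # Illustrative labels only.
    y_true = [1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 1, 0]
    counts = confusion_counts(y_true, y_pred)
    print(counts, accuracy(*counts))
```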
Table 2. The hyperparameters of the TSFS method in two prognostic prediction models.
| Hyperparameter | Survival time prediction: gene data | Survival time prediction: methylation data | Vital status prediction: gene data | Vital status prediction: methylation data |
| --- | --- | --- | --- | --- |
| α | 0.7 | 0.55 | 0.35 | 0.6 |
| Rth | 0.65 | 0.9 | 0.2 | 0.9 |
Table 3. Comparison of the number of features for prediction models built by single-task and multi-task learning methods.
FeatureSingleOSSingleVSMultiOSVS
Clinical311
Gene69554
Methylation250237151
Table 4. Reports on features related to the prognosis of neuroblastoma.
| Name | Category | Description | References |
| --- | --- | --- | --- |
| Age | Clinical | Age at diagnosis. | [44] |
| PLXDC2 | Gene expression data | PLXDC2 is the surface receptor of pigment epithelium-derived factor (PEDF); PEDF induces cell differentiation and neurite outgrowth. | [59,60] |
| LRRN2 | DNA methylation data | LRRN2 serves as a prognostic marker of NB. | [61] |
| SFN | DNA methylation data | Methylation of the SFN gene above a defined threshold is a strong and reliable predictor of adverse outcome, independent of other prognostic factors. | [62] |
| FGFR4 | DNA methylation data | FGFR4 is associated with an increased prevalence of neuroblastoma in children. | [63] |
| MGMT | DNA methylation data | MGMT methylation is a relevant therapeutic target in neuroblastoma. | [64] |
| IGF2BP3 | DNA methylation data | A positive IGF2BP3 coefficient was a risk factor for poor prognosis, and the levels of IGF2BP3 and N-myc are positively correlated in NB. | [65,66] |
| TP73 | DNA methylation data | The TP73 gene, also called p73 and located at 1p36.3, is a homologue of the TP53 tumor suppressor. TP73 can inhibit cell proliferation and induce apoptosis; a role for TP73 in the development of neuroblastoma could not be completely ruled out. TP73 has been proposed as a candidate tumor suppressor gene involved in neuroblastoma development. | [67,68,69] |
| NR2E1 | DNA methylation data | Elevated expression of NR2E1, also called TLX, in neuroblastoma (NB) correlates with unfavorable prognosis. | [70] |
|  |  | DNA methylation regulates gene expression. | [32,33,34] |
Table 5. Clinical characteristics of 88 patients with neuroblastoma.
| Variables | Categories | Frequency | Percentage (%) |
| --- | --- | --- | --- |
| Average age at diagnosis in years: Mean (SD *) | Male | 3.01 (2.80) |  |
|  | Female | 3.03 (1.55) |  |
|  | All patients | 3.02 (2.36) |  |
| Gender | Male | 51 | 57.95 |
|  | Female | 37 | 42.05 |
| Race | White | 69 | 78.41 |
|  | Black or African American | 9 | 10.23 |
|  | Unknown | 7 | 7.95 |
|  | Native Hawaiian or other Pacific Islander | 2 | 2.27 |
|  | Asian | 1 | 1.14 |
| INSS stage | Stage 1 | 12 | 13.64 |
|  | Stage 3 | 1 | 1.14 |
|  | Stage 4 | 75 | 85.23 |
| COG risk group | Low risk | 12 | 13.64 |
|  | High risk | 76 | 86.36 |
| Vital status at the last follow-up | Dead | 51 | 57.95 |
|  | Alive | 37 | 42.05 |
| Overall survival time of all patients in years: Mean (SD *) | Median survival time | 4.58 (3.43) |  |
|  | Average survival time | 5.02 (3.43) |  |

* SD: Standard deviation.
Wang, P.; Zhang, J. Prediction of Composite Clinical Outcomes for Childhood Neuroblastoma Using Multi-Omics Data and Machine Learning. Int. J. Mol. Sci. 2025, 26, 136. https://doi.org/10.3390/ijms26010136
