Abstract
Deep learning has demonstrated better performance than traditional regression methods in handling right-censored cancer survival data; however, its application in survival analysis remains limited due to censoring-related data loss and the lack of approaches addressing left-censored or left-truncated cases. The aim of this study is to introduce a novel deep neural network algorithm designed to incorporate left truncation in survival analysis. Building upon prior research, we propose an innovative approach that integrates deep learning with survival analysis by applying complementary and consensus statistical principles to simultaneously account for left truncation and right censoring. The cumulative hazard function was estimated using the Breslow estimator and incorporated as an output variable in the proposed model. Model performance was assessed using the integrated area under the curve (iAUC), which quantifies predictive accuracy across all time points. The proposed method demonstrated robust predictive performance with an iAUC of 0.745 (95% CI: 0.705–0.785) when applied to left-truncated and right-censored survival data. Its performance was comparable to widely recognized statistical and machine learning techniques for survival analysis, including the Cox proportional hazards model (iAUC = 0.779, 95% CI: 0.739–0.812) and random survival forests (iAUC = 0.829, 95% CI: 0.795–0.858). This study introduces a methodological extension of deep neural network models, previously restricted to right-censored survival data, to also accommodate left-truncated survival data. The validation results provide evidence that the proposed model achieves predictive performance comparable to well-established methods, thereby broadening the applicability of deep learning in survival analysis.
1. Introduction
Left truncation arises in many clinical studies when the origin of the time scale, such as conception on the gestational-age scale, precedes the start of observation. Ignoring left truncation can introduce bias because study entry may depend on both outcome and exposure status []. This issue has been documented in several areas of epidemiological research, including the development of AIDS among individuals with HIV infection, the occurrence of spontaneous abortion or birth defects, and cases of cancer survivors who were enrolled after diagnosis []. For instance, the observed inverse association between smoking during pregnancy and preeclampsia, typically showing a 10–40% lower risk, may reflect left-truncation bias due to higher early pregnancy loss among smokers before preeclampsia diagnosis [].
In cancer survival analysis, accounting for left truncation is particularly critical when survival time is measured from the initiation of treatment or the date of surgery. Patients who die before these events are inherently excluded from the analytic cohort, which leads to delayed entry bias and distorts the estimation of treatment effects [,,]. Such bias may cause treatments to appear more effective than they actually are, or conversely, attenuate estimates due to selection effects. Properly modeling left-truncated data ensures that each individual contributes their time only after becoming at risk, thereby yielding unbiased estimates of survival probabilities and hazard ratios.
Survival models are widely used to examine relationships between patient characteristics and treatment outcomes. The Cox proportional hazards (CPH) model, a semi-parametric framework, assumes a log-linear relationship between covariates and the hazard, which may be overly restrictive in complex biomedical settings. Nonlinear approaches, such as neural networks and survival forests, may better capture high-dimensional interactions among variables.
This study introduces a deep neural network-based survival model, referred to as DeepLTRC, designed to appropriately handle left-truncated and right-censored data, and evaluates its performance using real-world clinical datasets. Deep learning-based models have achieved substantial success across various fields, including natural language processing, speech recognition, and image analysis [,,], owing to their multilayer nonlinear structure optimized by loss functions with regularization. However, applying deep learning to censored or truncated outcomes remains challenging because conventional loss functions cannot directly account for incomplete follow-up information.
To overcome this limitation, several studies have proposed loss functions assuming a Weibull-distributed failure time [,], while models such as DeepSurv have extended the proportional hazards framework using partial likelihood-based loss functions [,,,], and several authors have further proposed models for survival time based on discrete distributions [,]. Despite these methodological advances, no deep neural network approach has been specifically developed for left-truncated survival data, representing an important unmet need in modern survival analysis.
2. Materials and Methods
2.1. Theoretical Background
2.1.1. Survival Data
Survival data are statistical observations used to investigate the time until an event of interest occurs. In medical studies, they are often employed to analyze the time to disease remission, progression, or death among patient cohorts, as well as to compare outcomes across different treatment groups within clinical trials []. However, the true event times are generally not known for all patients in real clinical data, as the follow-up time for a patient may not be long enough for the event to happen, or the patient may leave the study before its end. This incomplete observation of event times is called censoring, and the purpose of survival analysis is to model the event distribution as a continuous function of time while reflecting this limitation. Even if we do not observe the true event time, we can still use a right-censored observation by recording the observed time together with a censoring indicator that labels it as an event or a censored observation.
For right-censored survival data, each individual observation can be expressed as a triplet $(T_i, \delta_i, x_i)$, where $T_i$ is the observed event time or censored time for subject $i$; $\delta_i = 1$ if $T_i$ is the event time; $\delta_i = 0$ if $T_i$ is the censored time; and $x_i$ is the covariate vector for subject $i$. In the left-truncated and right-censored (LTRC) setting, the quadruple $(L_i, T_i, \delta_i, x_i)$ denotes subject $i$, where $L_i$ is the left-truncation time, $T_i$ is the observed survival time or censored time, $\delta_i$ is the event indicator, and $x_i$ is the covariate vector [].
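For concreteness, the two representations can be encoded as plain arrays. The following is a minimal sketch with hypothetical toy values:

```python
import numpy as np

# Right-censored triplet (T_i, delta_i, x_i):
time = np.array([5.2, 3.1, 8.7])   # observed event or censored time T_i
event = np.array([1, 0, 1])        # delta_i: 1 = event, 0 = censored
X = np.array([[0.3, 1.2],          # covariate vectors x_i
              [1.1, -0.4],
              [0.0, 0.8]])

# LTRC quadruple (L_i, T_i, delta_i, x_i): adds the left-truncation time L_i,
# i.e., the delayed-entry time before which the subject was not yet at risk.
entry = np.array([1.0, 0.5, 2.3])
assert np.all(entry < time)        # L_i < T_i is required for inclusion
```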
2.1.2. CPH Model
The CPH model is a well-recognized statistical model for exploring the relationship between patient survival and several explanatory variables. The Cox model provides an estimate of the treatment effect on survival after adjustment for other explanatory variables. It allows us to estimate the hazard (or risk) of death, or other events of interest, for individuals given their prognostic variables using a semi-parametric specification of the hazard function as follows:

$\lambda(t \mid x) = \lambda_0(t) \exp(\beta^\top x),$

where $\lambda_0(t)$ is a non-parametric baseline hazard function, and $\exp(\beta^\top x)$ is the relative hazard function [].
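As an illustration, a CPH model with delayed entry can be fitted in a few lines. The authors do not name their CPH implementation; the sketch below assumes the lifelines Python package and hypothetical column names:

```python
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical LTRC data; column names are illustrative.
df = pd.DataFrame({
    "entry": [1.0, 0.5, 2.3, 0.0, 0.7, 1.5],  # left-truncation (delayed entry) time L_i
    "time":  [5.2, 3.1, 8.7, 4.4, 2.9, 6.0],  # observed event or censored time T_i
    "event": [1, 0, 1, 1, 0, 1],              # event indicator delta_i
    "x1":    [0.3, 1.1, 0.0, -0.7, 0.9, -0.2],
    "x2":    [1.2, -0.4, 0.8, 0.5, -1.0, 0.1],
})

cph = CoxPHFitter()
# entry_col lets each subject join the risk set only after L_i, so the partial
# likelihood handles left truncation correctly.
cph.fit(df, duration_col="time", event_col="event", entry_col="entry")
print(cph.summary[["coef", "exp(coef)"]])
```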
2.1.3. Random Survival Forest
Random survival forest is a method embedding the survival tree algorithm with the log-rank test as the splitting criterion []. In this method, a node is split using the candidate variable that maximizes the difference in survival probability between the daughter nodes. The conditional cumulative hazard function is estimated for each terminal node $h$ of a tree using the Nelson–Aalen estimator as follows:

$\hat{H}_h(t) = \sum_{t_{l,h} \le t} \frac{d_{l,h}}{Y_{l,h}},$

where $d_{l,h}$ is the number of events and $Y_{l,h}$ is the number of subjects at risk at time $t_{l,h}$.
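A direct implementation of this estimator is straightforward; the plain-NumPy sketch below is illustrative rather than the authors' code:

```python
import numpy as np

def nelson_aalen(time, event):
    """Nelson-Aalen estimate of the cumulative hazard: H(t) = sum_{t_i <= t} d_i / Y_i."""
    order = np.argsort(time)
    time, event = time[order], event[order]
    uniq = np.unique(time[event == 1])          # distinct event times
    H, cum = np.empty(len(uniq)), 0.0
    for k, t in enumerate(uniq):
        d = np.sum((time == t) & (event == 1))  # events at time t
        Y = np.sum(time >= t)                   # number at risk just before t
        cum += d / Y
        H[k] = cum
    return uniq, H

t_grid, H_hat = nelson_aalen(np.array([2., 3., 3., 5., 8.]),
                             np.array([1, 1, 0, 1, 0]))
```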
The algorithm of the random survival forest is largely composed of two steps: generating a survival tree from each in-bag bootstrap sample and combining the trees to build an ensemble cumulative hazard function. Finally, the fitted model's performance is examined by applying the out-of-bag (OOB) data. The random survival forest (RSF) model was implemented using R software (version 4.3.2; R Core Team, 2023) with the package randomForestSRC (3.1.0; Ishwaran & Kogalur, 2022) [].
To ensure methodological consistency and fairness in the comparative evaluation under LTRC conditions, the RSF model was modified to account for delayed entry by applying a cumulative baseline hazard-based time transformation prior to model training, as sketched below. Specifically, the RSF was trained using the transformed time variable derived from the training subset based on the baseline cumulative hazard estimated via the Breslow method. Although the RSF algorithm can natively handle LTRC data through the interface Surv(entry, time, event), the transformed-time approach was adopted to maintain methodological consistency with the DeepLTRC and CPH models. This unified training protocol ensured a fair comparison of predictive performance under identical conditions and prevented potential data leakage, as the baseline hazard estimation and time transformation were computed exclusively within each training partition at every bootstrap replication.
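The transformed-time protocol can be sketched as follows. This is a minimal illustration, assuming lifelines for the Breslow baseline hazard and scikit-survival's RandomSurvivalForest as a Python stand-in for the R package randomForestSRC actually used; the data are synthetic:

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from sksurv.ensemble import RandomSurvivalForest
from sksurv.util import Surv

# Synthetic LTRC training data (column names are illustrative).
rng = np.random.default_rng(1)
n = 200
train = pd.DataFrame({
    "entry": rng.uniform(0, 1, n),            # delayed-entry time L_i
    "x1": rng.normal(size=n),
    "x2": rng.normal(size=n),
})
train["time"] = train["entry"] + rng.exponential(2.0, n)
train["event"] = rng.integers(0, 2, n)

# Breslow baseline cumulative hazard H0 from a CPH fit on the training data only.
cph = CoxPHFitter().fit(train, duration_col="time", event_col="event", entry_col="entry")
grid = cph.baseline_cumulative_hazard_.index.values
vals = np.concatenate([[0.0], cph.baseline_cumulative_hazard_.values[:, 0]])
H0 = lambda t: vals[np.searchsorted(grid, np.asarray(t), side="right")]

# Transformed time T*_i = H0(T_i) - H0(L_i) reduces LTRC data to (approximately)
# right-censored data on the cumulative-hazard scale.
t_star = np.maximum(H0(train["time"]) - H0(train["entry"]), 1e-8)

rsf = RandomSurvivalForest(n_estimators=200, random_state=1)
rsf.fit(train[["x1", "x2"]].values,
        Surv.from_arrays(train["event"].astype(bool).values, t_star))
```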
2.1.4. Deep Neural Network for Right-Censored Survival Data
Based on previous studies that demonstrated mixed results on neural networks' ability to predict risk, an innovative deep feed-forward neural network algorithm called DeepSurv was proposed []. The model estimates the influence of patient-specific covariates on the hazard function, which is parameterized through the network's learned weights $\theta$. The input to the network is a patient's baseline covariates $x$. The hidden architecture of the network comprises a fully connected layer of nodes, subsequently followed by a dropout layer []. The output of the network is a single node with a linear activation that estimates the log-hazard function $\hat{h}_\theta(x)$ in the CPH model. The model trains the neural network by setting the objective function to the average negative log partial likelihood with regularization, as shown in the following equation:

$l(\theta) = -\frac{1}{N_E} \sum_{i : \delta_i = 1} \left( \hat{h}_\theta(x_i) - \log \sum_{j \in R(T_i)} e^{\hat{h}_\theta(x_j)} \right) + \lambda \lVert \theta \rVert_2^2,$

where $N_E$ is the number of patients with an observed event, $R(T_i)$ is the risk set at time $T_i$, and $\lambda$ is the regularization parameter. Subsequently, gradient descent optimization is employed to estimate the network weights $\theta$ that minimize the objective function $l(\theta)$.
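A sketch of this loss in TensorFlow is shown below. It assumes no tied event times and computes risk sets by sorting, which is one common implementation of the DeepSurv-style objective rather than the authors' exact code:

```python
import tensorflow as tf

def neg_log_partial_likelihood(time, event, log_hazard):
    """Average negative log partial likelihood for a Cox-type network output.

    time: (n,) observed times; event: (n,) 0/1 indicators; log_hazard: (n,)
    network outputs h_theta(x_i). The L2 penalty is added separately through
    the layer regularizers.
    """
    order = tf.argsort(time, direction="DESCENDING")  # sort so risk sets are prefixes
    h = tf.gather(log_hazard, order)
    e = tf.cast(tf.gather(event, order), h.dtype)
    # log sum_{j in R(T_i)} exp(h_j) via a cumulative logsumexp over the sorted order
    log_risk = tf.math.cumulative_logsumexp(h)
    n_events = tf.maximum(tf.reduce_sum(e), 1.0)
    return -tf.reduce_sum((h - log_risk) * e) / n_events
```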
2.2. Methods
2.2.1. Deep Neural Network Model for LTRC Survival Data
We propose a new deep neural network model, referred to as DeepLTRC, which accounts for left truncation in addition to right censoring in survival prediction. Figure 1 illustrates the workflow of DeepLTRC, which builds on DeepSurv, a state-of-the-art deep neural network developed under the Cox proportional hazards assumption.
Figure 1.
Diagram of DeepLTRC; ① estimate the baseline cumulative hazard $\hat{\Lambda}_0(t)$; ② calculate $T_i^{*} = \hat{\Lambda}_0(T_i) - \hat{\Lambda}_0(L_i)$ as a new time; ③ apply the new time $T_i^{*}$ as a survival time for subject $i$; ④ estimate the log-hazard $\hat{h}_\theta(x_i)$ using the negative partial log-likelihood; ⑤ train the deep neural network with the log-hazard value $\hat{h}_\theta(x_i)$.
The structure and training process of the proposed DeepLTRC model are summarized as follows (a minimal Keras sketch of this configuration is given after the list):
- Network structure: two fully connected hidden layers, each containing 32 neurons.
- Activation function: Scaled Exponential Linear Unit (SELU).
- Regularization: L2 regularization (λ = 0.001).
- Dropout: AlphaDropout (rate = 0.5).
- Output layer: single neuron with a linear activation.
- Optimizer: Adam optimizer.
- Training parameters: learning rate of 0.001, a batch size of 32, and up to 1000 epochs.
- Early stopping: patience = 20 epochs.
- Data splitting strategy: training 70%, test 30%.
- Random seed: fixed random seed (1).
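The listed configuration can be reconstructed as a small Keras model. The sketch below mirrors the bullets above; the kernel initializer, the packing of (time, event) into the target tensor, and the validation split used to monitor early stopping are assumptions, and with mini-batch training the risk set in the loss is batch-local, a common approximation:

```python
import tensorflow as tf
from tensorflow import keras

def cox_loss(y_true, y_pred):
    """Negative log partial likelihood (as in Section 2.1.4); y_true packs
    (time, event) as two columns, y_pred is the log-hazard."""
    t, e = y_true[:, 0], y_true[:, 1]
    h = tf.squeeze(y_pred, axis=-1)
    order = tf.argsort(t, direction="DESCENDING")      # risk sets become prefixes
    h = tf.gather(h, order)
    e = tf.cast(tf.gather(e, order), h.dtype)
    log_risk = tf.math.cumulative_logsumexp(h)
    return -tf.reduce_sum((h - log_risk) * e) / tf.maximum(tf.reduce_sum(e), 1.0)

def build_deepltrc(n_features):
    reg = keras.regularizers.l2(0.001)                 # L2 regularization, lambda = 0.001
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(32, activation="selu",
                           kernel_initializer="lecun_normal",  # assumed; standard with SELU
                           kernel_regularizer=reg),
        keras.layers.AlphaDropout(0.5),                # AlphaDropout, rate = 0.5
        keras.layers.Dense(32, activation="selu",
                           kernel_initializer="lecun_normal",
                           kernel_regularizer=reg),
        keras.layers.AlphaDropout(0.5),
        keras.layers.Dense(1),                         # single linear output: log-hazard
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001), loss=cox_loss)
    return model

# Training per the bullets: batch size 32, up to 1000 epochs, early stopping with
# patience 20 (the validation split used to monitor it is an assumption):
# model = build_deepltrc(n_features=20)
# model.fit(X, np.stack([t_star, event], axis=1), batch_size=32, epochs=1000,
#           validation_split=0.15,
#           callbacks=[keras.callbacks.EarlyStopping(patience=20, restore_best_weights=True)])
```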
The baseline cumulative hazard $\hat{\Lambda}_0(t)$ was computed separately within each training partition at every bootstrap replication using the Breslow estimator from the fitted CPH model. The estimated training-based baseline hazard was subsequently used to compute the transformed time variable and to derive individual cumulative hazard predictions within the DeepLTRC framework. To strictly prevent potential data leakage, the dataset was randomly divided into independent training (70%) and test (30%) subsets prior to model development, and the test data were never used in any part of the estimation or transformation of $\hat{\Lambda}_0$. This procedure was repeated across all bootstrap replications, thereby ensuring that each estimation of the baseline hazard and the corresponding transformed times relied solely on training data. This methodological design also allows for potential extension to cross-fitting or nested validation, which could further enhance robustness and mitigate any residual dependency between training and test partitions.
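The leakage-safe protocol can be summarized in code. The sketch below, assuming lifelines and hypothetical column names (entry, time, event), re-estimates the Breslow baseline hazard inside each replication on the training partition only and reuses it to transform the test times (the estimator helper repeats the one sketched in Section 2.1.3 for self-containment):

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

def breslow_H0(df):
    """Step-function Breslow baseline cumulative hazard fitted on `df` only."""
    cph = CoxPHFitter().fit(df, duration_col="time", event_col="event", entry_col="entry")
    grid = cph.baseline_cumulative_hazard_.index.values
    vals = np.concatenate([[0.0], cph.baseline_cumulative_hazard_.values[:, 0]])
    return lambda t: vals[np.searchsorted(grid, np.asarray(t), side="right")]

def bootstrap_replication(df, seed):
    """One replication: split first, estimate H0 on the training data only,
    then apply the same training-based transform to both partitions."""
    rng = np.random.default_rng(seed)
    test_mask = rng.random(len(df)) < 0.30              # 70/30 split, redrawn per replication
    train, test = df[~test_mask].copy(), df[test_mask].copy()
    H0 = breslow_H0(train)                              # test data never touch H0
    train["t_star"] = H0(train["time"]) - H0(train["entry"])
    test["t_star"] = H0(test["time"]) - H0(test["entry"])
    return train, test   # fit DeepLTRC on train (t_star, event), score iAUC on test
```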
The full likelihood for the right-censored dataset is as follows:

$L = \prod_{i=1}^{N} \lambda(y_i \mid x_i)^{\delta_i}\, S(y_i \mid x_i),$

where $y_i$ is the survival or right-censored time for subject $i$ and $\delta_i$ is an indicator of an event or a censored observation with covariate $x_i$. $\lambda(\cdot \mid x_i)$ is the hazard function and $S(\cdot \mid x_i)$ is the survival function for subject $i$.
The full likelihood for the LTRC data is as follows:

$L = \prod_{i=1}^{N} \frac{\lambda(y_i \mid x_i)^{\delta_i}\, S(y_i \mid x_i)}{S(l_i \mid x_i)},$

where $l_i$ is the left-truncated time, and $y_i$ is the survival or right-censored time for subject $i$. $\delta_i$ indicates an event or a censored observation with covariate $x_i$. $\lambda(\cdot \mid x_i)$ is the hazard function, and $S(\cdot \mid x_i)$ is the survival function for subject $i$. Thus, the log-likelihood is calculated as follows:

$\log L = \sum_{i=1}^{N} \left[ \delta_i \log \lambda(y_i \mid x_i) + \log S(y_i \mid x_i) - \log S(l_i \mid x_i) \right].$
Referring to LeBlanc and Crowley's studies, which proposed a survival tree algorithm that can directly handle LTRC survival data [,], we also effectively extended the deep neural network model to the LTRC data by replacing the observed time $T_i$ with the transformed time $T_i^{*} = \hat{\Lambda}_0(T_i) - \hat{\Lambda}_0(L_i)$.
As shown in the flow of Figure 1, five steps are needed to implement this new method, and they are described below.
First, estimate the baseline cumulative hazard $\hat{\Lambda}_0(t)$ based on all of the LTRC data using the Breslow estimator. Note that observation $i$ is only counted in the risk set for time $t$ when $L_i < t \le T_i$, where $T_i$ is the observed event or censoring time.
Second, compute $\hat{\Lambda}_0(L_i)$ and $\hat{\Lambda}_0(T_i)$ based on the estimated baseline cumulative hazard $\hat{\Lambda}_0$.
Third, obtain the transformed survival time and left-truncated time from $\hat{\Lambda}_0(T_i)$ and $\hat{\Lambda}_0(L_i)$ to calculate $T_i^{*} = \hat{\Lambda}_0(T_i) - \hat{\Lambda}_0(L_i)$ as a new time.
Fourth, we estimate the log-hazard $\hat{h}_\theta(x_i)$ using the negative partial log-likelihood by applying the new time $T_i^{*}$ as the survival time for subject $i$ with covariate vector $x_i$ and parameter vector $\theta$.
Fifth, we train the deep neural network with the log-hazard value $\hat{h}_\theta(x_i)$ as the output of the network.
We applied modern deep learning techniques using Python software (version 3.11; Python Software Foundation, Wilmington, DE, USA) with TensorFlow (version 2.13; Google LLC, Mountain View, CA, USA) to optimize the training of the network by standardizing the input, using Scaled Exponential Linear Units (SELU) [] as the activation function, and using Adam as the gradient descent algorithm with L2 regularization. We also used early stopping and a learning rate scheduler to prevent overfitting and improve model performance, respectively.
In addition to training DeepLTRC on each dataset, we ran linear Cox proportional hazard regression as a baseline model for comparison. We also fitted a random survival forest to compare the performance of DeepLTRC against a state-of-the-art nonlinear survival model.
2.2.2. Evaluation Criteria (iAUC)
As our performance metric, we used the iAUC (integrated area under the curve). We drew a graph with the $y$-axis as the AUC value (the area under the ROC (receiver operating characteristic) curve at time point $t$) and the $x$-axis as the time point $t$, and integrated the area under it to obtain the integrated AUC value []. Alternative definitions of time-dependent sensitivity and specificity were adopted using the product of the covariate vector and the regression coefficient vector, $M = \beta^\top x$, as the marker, since the ROC is a curve expressing sensitivity and specificity at all possible cut-off points [,]. The ROC curve according to time is called a time-dependent ROC curve, and the incident/dynamic (I/D) ROC curve plots the incident true-positive rate as a function of the dynamic false-positive rate. The AUC for the incident/dynamic ROC curve at time $t$ can be obtained as follows:

$\mathrm{AUC}(t) = P\left( M_i > M_j \mid T_i = t,\ T_j > t \right).$

The predictive power over any time interval $[0, \tau]$ can be obtained by integrating the AUC up to $\tau$, and this is defined as the iAUC. The equation is as follows:

$\mathrm{iAUC} = \int_0^{\tau} \mathrm{AUC}(t)\, w(t)\, dt,$

where the integration horizon $\tau$ is set to the maximum observed event time for each dataset. The weighting scheme $w(t)$ assigns weight at time $t$ in proportion to the density of events $f(t)$ and the proportion of individuals at risk $S(t)$ at time $t$ []. $w(t)$ is defined as follows:

$w(t) = 2 f(t) S(t).$
Confidence intervals for iAUC were obtained through 100 bootstrap resamples of the test data.
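As an illustration of this evaluation, scikit-survival's cumulative_dynamic_auc (a cumulative/dynamic rather than incident/dynamic estimator, so only a stand-in for the definition above) yields an integrated mean AUC, and resampling the test data gives the confidence interval:

```python
import numpy as np
from sksurv.metrics import cumulative_dynamic_auc
from sksurv.util import Surv

def iauc_with_ci(event_train, time_train, event_test, time_test, risk,
                 n_boot=100, seed=1):
    """Integrated time-dependent AUC on the test data with a bootstrap 95% CI."""
    y_train = Surv.from_arrays(event_train.astype(bool), time_train)
    y_test = Surv.from_arrays(event_test.astype(bool), time_test)
    # evaluation grid inside the observed event-time range of the test data
    times = np.percentile(time_test[event_test == 1], np.linspace(10, 90, 20))
    _, iauc = cumulative_dynamic_auc(y_train, y_test, risk, times)
    rng, boot = np.random.default_rng(seed), []
    for _ in range(n_boot):                 # resample the *test* data, as in Section 2.2.2
        idx = rng.integers(0, len(risk), len(risk))
        yb = Surv.from_arrays(event_test[idx].astype(bool), time_test[idx])
        try:
            boot.append(cumulative_dynamic_auc(y_train, yb, risk[idx], times)[1])
        except ValueError:                  # a resample may lack events near some times
            continue
    lo, hi = np.percentile(boot, [2.5, 97.5])
    return iauc, (lo, hi)
```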
2.3. Simulation Study
In this section, we conducted a simulation study to evaluate the performance of the DeepLTRC model on survival datasets with different censoring rates and covariate structures.
2.3.1. Generating Right-Censored Survival Time
Simulation data were generated using all possible combinations of the listed parameters, including the censoring rate, which reflects a realistic censoring distribution. If we let $f$ denote the probability density function of the CPH model and let $T$ be the random variable with density $f$, then $S(T)$ follows a uniform distribution on the interval from 0 to 1. We simulated the survival time based on the Weibull distribution as follows:

$T = \left( \frac{-\log U}{\lambda \exp(\beta^\top x)} \right)^{1/\alpha},$

where $U \sim \mathrm{Uniform}(0, 1)$ and $\lambda$ denotes the baseline hazard scale parameter [].
When generating the survival time, a multivariate normal distribution with mean $\mu$ and variance 1 was assumed for the independent variables $x$; $U$ is a random real number between 0 and 1 generated from the uniform distribution, the shape parameter $\alpha$ is 4, and the scale parameter $\lambda$ is 0.01.
On the other hand, the censoring time $C_i$ for subject $i$ is independent of the distribution of the survival time and is generated by assuming 'non-informative censoring', which is also independent of the covariates. From this, we can generate censored data using the definitions of the CPH model as follows:

$Y_i = \min(T_i, C_i), \qquad \delta_i = I(T_i \le C_i),$

where $T_i$ is the survival time, and $I(\cdot)$ is the indicator function labeling the observed time as an event or a censored observation.
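Under these definitions, right-censored Weibull data can be generated by inverse-transform sampling. The following sketch uses illustrative coefficient values and an exponential censoring distribution tuned toward a target censoring rate:

```python
import numpy as np

rng = np.random.default_rng(1)
n, alpha, lam = 1000, 4.0, 0.01       # shape and scale as stated in Section 2.3.1
beta = np.array([1.0, -0.5])          # illustrative coefficients
X = rng.normal(size=(n, 2))

# Inverse-transform sampling: T = (-log U / (lam * exp(x'beta)))^(1/alpha)
U = rng.uniform(size=n)
T = (-np.log(U) / (lam * np.exp(X @ beta))) ** (1 / alpha)

# Non-informative censoring: C is independent of T and of the covariates.
C = rng.exponential(scale=np.quantile(T, 0.8), size=n)  # tune scale for a target rate
time = np.minimum(T, C)                                  # Y_i = min(T_i, C_i)
event = (T <= C).astype(int)                             # delta_i = I(T_i <= C_i)
print("empirical censoring rate:", 1 - event.mean())
```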
2.3.2. Generating LTRC Survival Time
To generate LTRC survival times, three distinct times, each with a different distribution, were generated. First, we generated $T_1$, following the baseline hazard $\lambda_1$ and beta coefficient $\beta_1$:

$T_1 = \left( \frac{-\log U_1}{\lambda_1 \exp(\beta_1^\top x)} \right)^{1/\alpha}.$

Second, we generated the waiting time $T_W$ with the following parameters: baseline hazard $\lambda_2$ and beta coefficient $\beta_2$:

$T_W = \left( \frac{-\log U_2}{\lambda_2 \exp(\beta_2^\top x)} \right)^{1/\alpha}.$

An intermediate event occurs depending on the waiting time $T_W$ and $T_1$. If an intermediate event occurs, we generate a left-truncated distribution of the survival time $T$. Specifically, if $T_W \ge T_1$, then $T = T_1$, and if $T_W < T_1$, then we generate a random variable $T_2$ with the baseline hazard $\lambda_3$ and beta coefficient $\beta_3$:

$T_2 = \left( \frac{-\log U_3}{\lambda_3 \exp(\beta_3^\top x)} \right)^{1/\alpha},$

where $T = T_W + T_2$ []. The three distinct survival times above are generated from random variables following a uniform distribution: $U_1$, $U_2$, and $U_3$.
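One plausible reading of this generation scheme is sketched below; the baseline hazards and beta coefficients are illustrative placeholders, and the composition T = T_W + T_2 with entry time T_W follows the reconstruction above:

```python
import numpy as np

rng = np.random.default_rng(1)

def weibull_time(lam, alpha, lin_pred, u):
    # Inverse-transform Weibull draw, as in Section 2.3.1.
    return (-np.log(u) / (lam * np.exp(lin_pred))) ** (1 / alpha)

n, alpha = 1000, 4.0
x = rng.normal(size=n)
u1, u2, u3 = rng.uniform(size=(3, n))

T1 = weibull_time(0.01, alpha, 1.0 * x, u1)    # lam1, beta1: placeholder values
Tw = weibull_time(0.01, alpha, 0.5 * x, u2)    # waiting time to the intermediate event
T2 = weibull_time(0.01, alpha, -0.5 * x, u3)   # lam3, beta3: placeholder values

# If the intermediate event occurs before T1 (Tw < T1), the subject enters the risk
# set late: total time T = Tw + T2 with left-truncation time L = Tw; otherwise T = T1.
T = np.where(Tw < T1, Tw + T2, T1)
entry = np.where(Tw < T1, Tw, 0.0)
event = np.ones(n, dtype=int)   # right censoring would be added on top, as in 2.3.1
```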
2.3.3. Simulation Scenarios
Survival data was generated based on the Cox proportional hazards (CPH) model, which served as the baseline. The predictive performance of the deep neural network (DNN) model and random survival forest (RSF) was compared with that of the CPH model for right-censored data. For left-truncated and right-censored (LTRC) data, the performance of DeepLTRC and RSF was also compared with the baseline model.
In Scenario 1, we generated 20 covariates, including 15 relevant and 5 irrelevant (noise) variables.
The mean values were set as follows:
- Relevant variables: (0, 1, 2, 1, 0, 0, 1, 2, 1, 0, 0, 1, 2, 1, 0).
- Irrelevant variables: (0, 1, 0, 1, 0).
- Survival times followed a Weibull distribution with shape parameter α = 4 and scale parameter λ = 0.001.
The regression coefficients for generating survival times were defined as follows:
- Independent covariates: $\beta_1$ = (1, 2, 1, −1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0); $\beta_2$ = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0.5, 1, −1, 1); $\beta_3$ = (0, 0, 0, 0, 0, 0.5, 1, −0.1, −1, 1, 0, 0, 0, 0, 0).
- Interaction covariates: $\gamma_1$ = (1, 0); $\gamma_2$ = (0, 0); $\gamma_3$ = (0, 1).
Different censoring rates (20%, 40%, 60%) were applied to assess model performance under varying levels of censoring. Each simulation used 1000 samples and was repeated 100 times.
Scenario 2 introduces nonlinear and interaction effects. A total of 20 covariates were again generated, consisting of 15 relevant and 5 irrelevant variables with the same mean structure as in Scenario 1:
The mean values were set as follows:
- Relevant variables: (0, 1, 2, 1, 0, 0, 1, 2, 1, 0, 0, 1, 2, 1, 0);
- Irrelevant variables: (0, 1, 0, 1, 0).
To introduce nonlinearity, three covariates (3rd, 8th, 13th) were squared. Two interaction covariates were added by multiplying pairs of variables: (1st × 3rd) and (4th × 5th).
- Survival times followed a Weibull distribution with shape parameter α = 4 and scale parameter λ = 0.001.
The regression coefficients for generating survival times were defined as follows:
- Independent covariates: $\beta_1$ = (1, 2, 1, −1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0); $\beta_2$ = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0.5, 1, −1, 1); $\beta_3$ = (0, 0, 0, 0, 0, 0.5, 1, −0.1, −1, 1, 0, 0, 0, 0, 0).
- Interaction covariates: $\gamma_1$ = (1, 0); $\gamma_2$ = (0, 0); $\gamma_3$ = (0, 1).
Each of the two scenarios has 18 sub-cases, for a total of 36 cases. The 36 different datasets were created for the experiments depending on the set of covariates, censoring type, censoring rate, and model. Each dataset contained 1000 observations and 20 covariates, with the simulation repeated 100 times under varying censoring conditions (20%, 40%, 60%). Survival times followed a Weibull distribution with shape parameter α = 4 and scale parameter λ = 0.001. All datasets were generated based on the CPH model, which served as the baseline for comparison with DeepLTRC, RSF, and the standard DNN models (Table 1).
Table 1.
Scenarios for simulation.
3. Results
3.1. Simulation Scenario 1
In Scenario 1, we estimate the prediction performance of four models (a deep neural network for right-censored survival data, DeepLTRC for LTRC data, CPH, and RSF) using only 15 independent covariates and 5 noise variables, without squared or interaction covariates.
3.1.1. Right-Censored Survival Data
Table 2 describes the prediction performance of the deep neural network for right-censored survival data without squared or interaction covariates, compared with the performance of CPH and RSF.
Table 2.
Prediction performance of deep neural network, Cox proportional hazard, and random survival forest in right-censored survival data without squared or interaction covariates (N = 1000).
With a censoring rate of 20%, the deep neural network showed slightly lower performance, with an iAUC of 0.749, than CPH (0.834) and RSF (0.815). As the censoring rate increased to 40% and 60%, the performance of the deep neural network improved to an iAUC of 0.850, similar to the performance of CPH (0.876) and RSF (0.853).
3.1.2. LTRC Survival Data
Table 3 describes the prediction performance of DeepLTRC for LTRC survival data without squared or interaction covariates, compared with the performance of CPH and RSF. For a censoring rate of 20%, the iAUC of DeepLTRC was 0.649, which is lower than the performance of CPH (0.760) and RSF (0.807). As the censoring rate increased to 40% and 60%, the performance of DeepLTRC improved to an iAUC of 0.745, which was still lower than RSF (0.829) but similar to the performance of CPH (0.779).
Table 3.
Prediction performance of DeepLTRC, CPH, and RSF in LTRC survival data excluding squared or interaction covariates (N = 1000).
3.2. Simulation Scenario 2
In Scenario 2, we estimate the prediction performance of four models: the deep neural network for right-censored survival data, DeepLTRC for LTRC survival data, and CPH and RSF, with 12 independent covariates, 3 squared covariates, 2 interaction covariates, and 5 noise variables.
3.2.1. Right-Censored Survival Data
Table 4 describes the prediction performance of the deep neural network for right-censored survival data, compared with the performance of CPH and RSF.
Table 4.
Prediction performance of deep neural network, Cox proportional hazard, and random survival forest with right-censored survival data (N = 1000).
For the censoring rate of 20%, the deep neural network showed an iAUC of 0.847, slightly lower than that of CPH (0.904) but higher than RSF (0.801). As the censoring rate increased to 40% and 60%, the performance of the deep neural network improved to an iAUC of 0.928, which was higher than RSF (0.853) and similar to the performance of CPH (0.938).
3.2.2. LTRC Survival Data
Table 5 describes the prediction performance of DeepLTRC for LTRC data, compared with the performance of CPH and RSF. For a censoring rate of 20%, the iAUC of DeepLTRC was 0.589, which was lower than the performance of CPH (0.814) and RSF (0.800). As the censoring rate increased to 40% and 60%, the performance of DeepLTRC improved to an iAUC of 0.717, which was still slightly lower than CPH (0.843) and RSF (0.836).
Table 5.
Prediction performance of DeepLTRC, Cox proportional hazard and random survival forest with LTRC survival data (N = 1000).
3.3. Real Data Analysis
We used the bone marrow transplant dataset presented by Klein and Moeschberger, one of the most well-known examples of competing-risk analysis in leukemia treatment, which includes left-truncated data for patients who had not yet received a bone marrow transplant at study entry []. The study cohort comprised transplantations performed at the participating institutions between 1 March 1984 and 30 June 1989, with a maximum follow-up duration of seven years. In total, 42 relapses and 41 deaths occurred during the remission period. A total of 26 patients had an episode of acute graft-versus-host disease (aGVHD), and 17 relapses or deaths occurred in the remission period without platelets returning to normal levels. Bone marrow transplant (BMT), the standard treatment for acute leukemia, has a complex recovery process. The likelihood of recovery may be influenced by pre-transplantation risk factors, including recipient and donor age or sex, cytomegalovirus (CMV) status, and disease classification, among others. Another important risk factor is the intermediate event: development of chronic graft-versus-host disease (cGVHD), development of acute graft-versus-host disease (aGVHD), and the return of the patient's platelet count to a self-sustaining level. The general characteristics and distribution of the bone marrow transplant data are presented in Table 6. In this study, we mainly included the 42 patients who relapsed after transplantation as having an intermediate event with left-truncation effects.
Table 6.
Characteristics and distribution of the bone marrow transplant data (N = 137).
Table 7 presents the analysis results for a subset of 42 patients who experienced relapse after transplantation, which was extracted from the full BMT cohort. This subset was specifically used to evaluate the model performance under left-truncation conditions. In this context, the left-truncation variable was defined as the waiting time between study enrollment and bone marrow transplantation, during which the subject was not yet at risk of relapse or death. The distribution of the left-truncated BMT dataset is presented in the Supplementary Material (Table S1).
Table 7.
Prediction performance of DeepLTRC, Cox proportional hazard, and random survival forest with real bone marrow transplant dataset (N = 42).
The prediction performance of the three models on real data samples (BMT) is summarized in Table 7. The iAUC of DeepLTRC was 0.575, which was higher than random survival forest (0.504) and slightly lower than CPH (0.776).
4. Discussion
This study proposed and evaluated DeepLTRC, a deep neural network that jointly accounts for left truncation and right censoring in survival analysis. Results from simulations and real-world data show that DeepLTRC maintains strong predictive performance across various censoring rates and data structures, extending deep learning’s applicability to settings where conventional methods often fail.
In Simulation Scenario 1, which included only independent covariates and noise variables (no squared or interaction terms), the deep neural network for right-censored data improved with increasing censoring, achieving an iAUC of 0.850 at 60%, comparable to the CPH model and RSF. When left truncation was introduced, DeepLTRC initially showed slightly lower performance (iAUC = 0.649 at 20% censoring), which improved steadily as censoring increased (iAUC = 0.745 at 60% censoring), approaching the performance of the CPH model (0.779). These results indicate that DeepLTRC effectively learns survival structures even when truncation reduces the available information.
In Simulation Scenario 2, which incorporated nonlinear and interaction effects, the network achieved high accuracy for right-censored data (iAUC = 0.928 at 60% censoring), comparable to CPH (0.938) and outperforming RSF (0.866). For left-truncated data, DeepLTRC again showed improved performance with higher censoring (iAUC = 0.589 to 0.717), showing its capacity to model complex nonlinear covariate relationships while maintaining consistent survival estimation under truncation.
Application to the BMT dataset further confirmed its practical utility. DeepLTRC achieved an iAUC of 0.575, which outperformed RSF (0.504) but was slightly lower than CPH (0.776). Because this dataset involves delayed entry after transplantation, the results underscore DeepLTRC’s ability to manage incomplete at-risk periods, which are common in clinical survival studies. Despite the small sample size, its performance was comparable to advanced deep survival models such as DeepSurv [] and DeepHit [], which were trained on much larger datasets. The DeepSurv model, for example, used the United Network for Organ Sharing (UNOS) database, which contains data from 60,400 patients who underwent heart transplantation between 1985 and 2015, including 29,436 uncensored (48.7%) and 30,964 censored (51.3%) cases with 50 clinical features. It also utilized the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset, consisting of 2092 patients (999 uncensored and 1093 censored) with 21 gene expression and clinical variables. The C-index [] was 0.573 (0.555–0.571) for UNOS and 0.648 (0.636–0.660) for METABRIC, while DeepHit achieved values of 0.589 and 0.691, respectively. These comparisons demonstrate that DeepLTRC delivers competitive predictive accuracy even with limited data, confirming its robustness and adaptability to left-truncated survival settings.
Overall, DeepLTRC bridges the methodological gap between traditional linear-effect survival models and deep learning models that overlook truncation. By integrating the Breslow estimator in its output layer, it combines the interpretability of semi-parametric models with the flexibility of neural networks. Thus, DeepLTRC advances methodologically transparent and generalizable deep learning approaches for complex clinical survival data, particularly in cancer prognosis and registry-based research.
In this simulation framework, the Weibull data-generating process assumes a log-linear relationship between covariates and the hazard function, which inherently favors the CPH model. As a result, DeepLTRC did not consistently outperform CPH in settings that strictly follow linear hazard structures. However, the adaptability of DeepLTRC reveals its potential under nonlinear effects, higher censoring, or complex covariate interactions, demonstrating its capability to effectively model survival patterns that extend beyond the parametric constraints of the CPH model.
Despite these promising results, several limitations should be noted. First, the current simulation design used a moderate number of covariates; therefore, further validation on high-dimensional genomic or imaging data is needed. Second, the model’s hyperparameters were tuned manually. Future research should apply automated hyperparameter optimization (AutoHPO) techniques such as population-based training (PBT) [], Bayesian optimization with Hyperband (BOHB) [], or optimization frameworks like Optuna [] to improve efficiency and reproducibility. Third, the current study focused on single-event survival analysis. Extending the framework to competing risks or multi-state survival modeling could further increase its clinical utility. Fourth, although this study employed the integrated area under the time-dependent ROC curve (iAUC) as the primary metric for evaluating the discriminative performance of the survival models, the Integrated Brier Score (IBS) also represents a comprehensive measure for assessing overall calibration and discrimination performance over time [,]. Therefore, future studies should incorporate IBS as a complementary performance metric to enable a more rigorous evaluation of model reliability and predictive accuracy over time. In addition, it is necessary to incorporate the Δ-iAUC (model minus reference) with paired bootstrap confidence intervals and the time-dependent Brier and calibration plots for a more rigorous and comprehensive performance assessment []. Although the DeepLTRC framework effectively handled delayed entry in the BMT dataset, a limitation of this study is that competing risks, such as death before relapse, were not explicitly modeled. This omission may influence the interpretation of relapse-related survival probabilities, as event dependencies can alter hazard estimates in real-world clinical settings. Future work should extend the model to a competing-risks or multi-state survival framework to more accurately capture multiple event processes. Finally, incorporating explainable artificial intelligence (XAI) methods may help identify significant prognostic variables and enhance interpretability for clinical applications [].
5. Conclusions
In conclusion, this study demonstrated that deep neural networks can achieve performance comparable to traditional gold-standard methods such as the Cox proportional hazards model and random survival forest when applied to right-censored survival data with an adequate sample size and appropriately structured covariates. The proposed DeepLTRC model achieved an iAUC of 0.745 (95% CI: 0.705–0.785), which was comparable to that of the CPH model (iAUC = 0.779, 95% CI: 0.739–0.812) and RSF (iAUC = 0.829, 95% CI: 0.795–0.858). Although DeepLTRC did not outperform these conventional models, it showed a clear tendency to reduce the performance gap at higher censoring rates. This finding suggests that DeepLTRC offers a robust and scalable framework for modeling left-truncated and right-censored survival data, thereby contributing to the advancement of deep learning applications in medical survival analysis.
Supplementary Materials
The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/app152212093/s1, Table S1: Characteristics and distribution of the left truncated bone marrow transplant data (N = 42).
Author Contributions
Conceptualization, M.-J.L.; methodology, B.-J.S.; software, B.-J.S.; validation, M.-J.L.; formal analysis, B.-J.S.; investigation, M.-J.L.; resources, M.-J.L.; data curation, B.-J.S.; writing—original draft preparation, B.-J.S.; writing—review and editing, M.-J.L.; visualization, B.-J.S.; supervision, M.-J.L.; project administration, B.-J.S. All authors have read and agreed to the published version of the manuscript.
Funding
This study was supported by a research grant from Kongju National University [grant numbers 2021-0349-01, 2021].
Institutional Review Board Statement
Not applicable. This study is a simulation-based methodological study on survival analysis and does not involve human participants or identifiable personal data; therefore, ethical approval was not required.
Informed Consent Statement
Not applicable. This study is a simulation-based methodological investigation in survival analysis and does not involve human participants or the use of identifiable personal data; therefore, informed consent was not required.
Data Availability Statement
The datasets generated and analyzed during the current study are available from the corresponding author upon reasonable request.
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| aGVHD | Acute graft-versus-host disease |
| AML | Acute myeloid leukemia |
| BMT | Bone marrow transplant |
| C-index | Concordance index |
| CPH | Cox proportional hazard |
| cGVHD | Chronic graft-versus-host disease |
| CI | Confidence interval |
| CMV | Cytomegalovirus |
| DNN | Deep neural network |
| FAB | French–American–British classification of acute myeloid leukemia |
| IBS | Integrated Brier score |
| iAUC | Integrated area under the curve |
| LTRC | Left-truncated and right-censored |
| METABRIC | Molecular Taxonomy of Breast Cancer International Consortium |
| MTX | Methotrexate |
| OOB | Out-of-bag |
| PBT | Population-based training |
| RF | Random forest |
| ROC | Receiver operating characteristic |
| RSF | Random survival forest |
| SELU | Scaled exponential linear unit |
| UNOS | United Network for Organ Sharing |
| XAI | Explainable artificial intelligence |
References
- Howards, P.P.; Hertz-Picciotto, I.; Poole, C. Conditions for bias from differential left truncation. Am. J. Epidemiol. 2007, 165, 444–452. [Google Scholar] [CrossRef]
- Applebaum, K.M.; Malloy, E.J.; Eisen, E.A. Left truncation, susceptibility, and bias in occupational cohort studies. Epidemiology 2011, 22, 599–606. [Google Scholar] [CrossRef]
- Lisonkova, S.; Joseph, K.S. Left truncation bias as a potential explanation for the protective effect of smoking on preeclampsia. Epidemiology 2015, 26, 436–440. [Google Scholar] [CrossRef] [PubMed]
- Chen, J.G.; Chen, H.Z.; Zhu, J.; Shen, A.G.; Sun, X.Y.; Parkin, D.M. Cancer survival: Left truncation and comparison of results from hospital-based cancer registry and population-based cancer registry. Front. Oncol. 2023, 13, 1173828. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
- McGough, S.F.; Incerti, D.; Lyalina, S.; Copping, R.; Narasimhan, B.; Tibshirani, R. Penalized regression for left-truncated and right-censored survival data. Stat. Med. 2021, 40, 5487–5500. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
- Zhai, Y.; Amadou, A.; Mercier, C.; Praud, D.; Faure, E.; Iwaz, J.; Severi, G.; Mancini, F.R.; Coudon, T.; Fervers, B.; et al. The impact of left truncation of exposure in environmental case-control studies: Evidence from breast cancer risk associated with airborne dioxin. Eur. J. Epidemiol. 2022, 37, 79–93. [Google Scholar] [CrossRef] [PubMed]
- Goldberg, Y. A primer on neural network models for natural language processing. J. Artif. Intell. Res. 2016, 57, 345–420. [Google Scholar] [CrossRef]
- Graves, A.; Mohamed, A.; Hinton, G. Speech recognition with deep recurrent neural networks. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 6645–6649. [Google Scholar]
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
- Liao, L.; Ahn, H. Combining deep learning and survival analysis for asset health management. Int. J. Progn. Health Manag. 2016, 16, 020-1. [Google Scholar] [CrossRef]
- Ranganath, R.; Perotte, A.; Elhadad, N.; Blei, D. Deep survival analysis. arXiv 2016, arXiv:1608.02158v2. [Google Scholar]
- Faraggi, D.; Simon, R. A neural network model for survival data. Stat. Med. 1995, 14, 73–82. [Google Scholar] [CrossRef]
- Katzman, J.L.; Shaham, U.; Cloninger, A.; Bates, J.; Jiang, T.; Kluger, Y. DeepSurv: Personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med. Res. Methodol. 2018, 18, 24. [Google Scholar] [CrossRef]
- Mobadersany, P.; Yousefi, S.; Amgad, M.; Gutman, D.A.; Barnholtz-Sloan, J.S.; Vega, J.E.V.; Brat, D.J.; Cooper, L.A.D. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc. Natl. Acad. Sci. USA 2018, 115, 2970–2979. [Google Scholar] [CrossRef] [PubMed]
- Li, H.; Boimel, P.; Janopaul-Naylor, J.; Xiao, Y.; Ben-Josef, E.; Fan, Y. Deep convolutional neural networks for imaging data based survival analysis of rectal cancer. In Proceedings of the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Venice, Italy, 8–11 April 2019; pp. 846–849. [Google Scholar]
- Miscouridou, X.; Perotte, A.; Elhadad, N.; Ranganath, R. Deep survival analysis: Nonparametrics and missingness. In Proceedings of the Machine Learning for Healthcare Conference, Palo Alto, CA, USA, 17–18 August 2018; pp. 244–256. [Google Scholar]
- Gensheimer, M.F.; Narasimhan, B. A scalable discrete-time survival model for neural networks. PeerJ 2019, 7, e6257. [Google Scholar] [CrossRef]
- Klein, J.P.; Moeschberger, M.L. Semiparametric Proportional Hazards Regression with Fixed Covariates. In Survival Analysis. Statistics for Biology and Health; Springer: New York, NY, USA, 2003. [Google Scholar] [CrossRef]
- Fu, W.; Simonoff, J.S. Survival trees for left-truncated and right-censored data, with application to time-varying covariate data. Biostatistics 2017, 18, 352–369. [Google Scholar] [CrossRef]
- Cox, D.R. Regression models and life-tables. J. R. Stat. Soc. Ser. B (Methodol.) 1972, 34, 187–220. [Google Scholar] [CrossRef]
- Ishwaran, H.; Kogalur, U.B.; Blackstone, E.H.; Lauer, M.S. Random survival forests. Ann. Appl. Stat. 2008, 2, 841–860. [Google Scholar] [CrossRef]
- Ishwaran, H.; Kogalur, U.B.; Kogalur, M.U.B. Package ‘randomForestSRC’. Available online: https://kogalur.r-universe.dev/builds (accessed on 14 August 2025).
- Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958. [Google Scholar]
- LeBlanc, M.; Crowley, J. Relative Risk Trees for Censored Survival Data. Biometrics 1992, 48, 411–425. [Google Scholar] [CrossRef] [PubMed]
- Klambauer, G.; Unterthiner, T.; Mayr, A.; Hochreiter, S. Self-normalizing neural networks. In Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Heagerty, P.J.; Lumley, T.; Pepe, M.S. Time dependent ROC curves for censored survival data and a diagnostic marker. Biometrics 2000, 56, 337–344. [Google Scholar] [CrossRef] [PubMed]
- Etzioni, R.; Pepe, M.; Longton, G.; Hu, C.; Goodman, G. Incorporating the time dimension in receiver operating characteristic curves: A case study of prostate cancer. Med. Decis. Mak. 1999, 19, 242–251. [Google Scholar] [CrossRef]
- Slate, E.H.; Turnbull, B.W. Statistical models for longitudinal biomarkers of disease onset. Stat. Med. 2000, 19, 617–637. [Google Scholar] [CrossRef]
- Uno, H.; Cai, T.; Pencina, M.J.; D’Agostino, R.B.; Wei, L.J. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat. Med. 2011, 30, 1105–1117. [Google Scholar] [CrossRef] [PubMed]
- Bender, R.; Augustin, T.; Blettner, M. Generating survival times to simulate Cox proportional hazards models. Stat. Med. 2005, 24, 1713–1723. [Google Scholar] [CrossRef] [PubMed]
- Nam, C.M.; Zelen, M. Comparing the Survival of Two Groups with an Intermediate Clinical Event. Lifetime Data Anal. 2001, 7, 5–19. [Google Scholar] [CrossRef]
- Lee, C.; Zame, W.R.; Yoon, J.; van der Schaar, M. Deephit: A deep learning approach to survival analysis with competing risks. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, Palo Alto, CA, USA, 2–7 February 2018; AAAI Press: Palo Alto, CA, USA, 2018; Volume 32. [Google Scholar]
- Harrell, F.E.; Califf, R.M.; Pryor, D.B.; Lee, K.L.; Rosati, R.A. Evaluating the yield of medical tests. J. Am. Med. Assoc. 1982, 247, 2543–2546. [Google Scholar] [CrossRef]
- Jaderberg, M.; Czarnecki, W.M.; Green, T.; Dalibard, V. Population Based Training of Neural Networks. U.S. Patent 11,604,985, 14 March 2023. [Google Scholar]
- Falkner, S.; Klein, A.; Hutter, F. BOHB: Robust and efficient hyperparameter optimization at scale. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; PMLR: New York, NY, USA, 2018; pp. 1437–1446. [Google Scholar]
- Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2623–2631. [Google Scholar]
- Gerds, T.A.; Schumacher, M. Consistent estimation of the expected Brier score in general survival models with right-censored event times. Biom. J. 2006, 48, 1029–1040. [Google Scholar] [CrossRef]
- Mogensen, U.B.; Ishwaran, H.; Gerds, T.A. Evaluating random forests for survival analysis using prediction error curves. J. Stat. Softw. 2012, 50, 1–23. [Google Scholar] [CrossRef]
- Ogutu, S.; Mohammed, M.; Mwambi, H. Deep learning models for the analysis of high-dimensional survival data with time-varying covariates while handling missing data. Discov. Artif. Intell. 2025, 5, 176. [Google Scholar] [CrossRef]
- Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 4766–4777. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).