Sensitivity Analysis for Survival Prognostic Prediction with Gene Selection: A Copula Method for Dependent Censoring

Prognostic analysis for patient survival often employs gene expressions obtained from high-throughput screening for tumor tissues from patients. When dealing with survival data, a dependent censoring phenomenon arises, and thus the traditional Cox model may not correctly identify the effect of each gene. A copula-based gene selection model can effectively adjust for dependent censoring, yielding a multi-gene predictor for survival prognosis. However, methods to assess the impact of various types of dependent censoring on the multi-gene predictor have not been developed. In this article, we propose a sensitivity analysis method using the copula-graphic estimator under dependent censoring, and implement relevant methods in the R package “compound.Cox”. The purpose of the proposed method is to investigate the sensitivity of the multi-gene predictor to a variety of dependent censoring mechanisms. In order to make the proposed sensitivity analysis practical, we develop a web application. We apply the proposed method and the web application to a lung cancer dataset. We provide a template file so that developers can modify the template to establish their own web applications.


Introduction
Prognostic analysis for survival often employs gene expressions obtained from highthroughput screening for tumor tissues from patients [1][2][3][4][5][6]. Here, the primary interest is often to find a set of genes that are strong predictors of survival. The simplest and yet most commonly used approach is to select a small number of significant genes found in a sequence of univariate Cox models fitted to patients' survival data [5][6][7][8][9].
However, Cox regression critically relies on the independent censoring assumption: survival time and censoring time need to be statistically independent. This assumption is easily violated when patients drop out from the study due to the worsening of his/her

Backgrounds
In this section, we shall first review gene selection methods [1,9,41] based on survival data. Gene selection is a required step before building a prognostic prediction method. Next, we shall review prognostic prediction methods based on selected genes.
We define basic notations as follows: T i is survival time and x i = (x i1 , . . . , x ip ) is a p-vector of genes from individuals i = 1, 2, . . . , n. The conditional survival function given x i is denoted by S(t|x i ) = P(T i > t|x i ), and the conditional hazard function given x i is defined by h(t|x i ) = [−dS(t|x i )/dt]/S(t|x i ). Similarly, h(t x ij ) denotes the conditional hazard function given the j-th gene x ij .

Classical Gene Selection Method
The approach called univariate selection [1,9] is performed as follows: In the initial step, a significance of each gene is examined by univariate Cox regression one-by-one. Then, a subset of significant genes is selected with a p-value threshold, such as 0.05, 0.01, and 0.001. This is the most standard approach for gene selection in medical research.
In the univariate selection, a significance of gene x ij on survival time T i is measured by fitting a Cox proportional hazards model [58] given by h t x ij = h 0j (t)e β j x ij , j = 1, . . . , p for each gene j. Here, the parameter β j measures the association between survival and the gene, and the baseline hazard function h 0j (.) is a nuisance parameter. Cox's partial likelihood estimator [58], denoted byβ j , is used to obtain the p-value for the Wald test for H 0j : β j = 0 vs. H 1j : β j = 0 without specifying h 0j (.). One selects genes that exhibit smaller p-values than a threshold value. While the Cox's approach is simple and widely accepted, the validity of the gene selection method relies on an important assumption.
To describe the assumption for censoring mechanisms, we define U i as random censoring time. The (training) samples consist of (t i , δ i , x i ), i = 1, 2, . . . , n, where t i = min(T i , U i ) and δ i = 1{T i ≤ U i }, where 1{·} is the indicator function. The samples are often referred to as training samples since they are used to train the model (1). For the estimatorβ j to capture the true β j in Equation (1), one needs to impose the following assumption [37,41]: Independent censoring assumption: Survival time and censoring time are independent conditionally on x ij . That is, If this assumption is violated, Cox's estimatorβ j is biased for the true value of β j [37,41]. This assumption is violated when U i is dropout time from the study due to the worsening of his/her health status or U i is time at removal for transplantation [36,37].

Copula-Based Gene Selection Method
Copulas provide a general model to relax the assumption of independent censoring. Copula-based dependent censoring models have gained popularity in recent years for dealing with survival data [41][42][43][44][45][46][47][48][49][50]55,56]. For the purpose of gene selection, [41] introduced the following copula model: where C α is a copula [59] with a parameter α that specifies the strength of dependence. The copula model (2) can specify a variety of dependent censoring structures. For implementation, a copula with a simple form, such as the Clayton, Gumbel, or Frank copula, is suggested. Furthermore, a copula is better parameterized so that α = 0 gives the independence copula C α=0 (v, w) = vw. An example is the Clayton copula of the form This copula accommodates the independence copula as its limiting case by C α (v, w) → vw as α → 0 . As the value of α departs from zero, the level of dependence increases. To measure the degree of dependence, it is convenient to use Kendall's tau, defined as τ = α/(α + 2) under the Clayton copula. The value of τ ranges from independence (τ = 0) to perfect positive dependence (τ = 1). While the Clayton copula permits only positive dependence, other copulas (e.g., the Frank copula) give negative dependence.
Under the Clayton copula model, a significance of gene x ij on survival time T i is measured by fitting the Cox model (1), namely, Here, β j measures the association between survival and a gene. The significance of a gene is measured by the estimator proposed by [41], denoted asβ j (α). Accordingly, a subset of significant genes is constructed via a p-value threshold for testing H 0j : β j = 0 vs. H 1j : β j = 0. The value of α can be estimated byα that maximizes the cross-validated c-index [37,41]. The computation is implemented by the R package, compound.Cox [9], and explained by the book [37].

Prognostic Prediction Method
Once genes are selected based on survival data, one often wishes to assess their prognostic ability for future patients. In many cases, the metric of interest is the ability of the genes to separate good and poor prognosis groups. For this purpose, researchers typically use a multi-gene predictor, defined as a weighted sum of gene expressions [1,[5][6][7][8][9].
The PI is a weighted sum of genes whose weights reflect the degree of univariate association. A high (low) value of the PI gives a high (low) risk of death. Thus, a patient may be classified into either a low-risk group or a high-risk group using a cut-off value (e.g., the median of the PI values calculated from the training samples).
If α = 0 were assumed in PI(α), the PI would be the classical predictor based on univariate Cox regression under the independent censoring assumption [1,[5][6][7][8][9]. As the value of α is estimated by maximizing the cross-validated c-index [37,41], the resultant PI(α) may have an improved predictive ability over the traditional PI. In medical research, it is the standard to display two Kaplan-Meier (KM) estimates for survival functions calculated for the two groups to see if there is a clear separation between them. However, if dependent censoring exists in test samples, the KM estimators are biased [52]. Therefore, we shall propose a method to examine a sensitivity of the PI to dependent censoring.

Proposed Methods
In this section, we propose a sensitivity analysis method using the CG estimator to assess the impact of dependent censoring on survival prognostic prediction.
This section deals with test samples that are different from the training samples used to select genes and develop the PI (Section 2). In our proposed method, the primary role of the test samples is to examine the sensitivity of the prognostic ability of the PI to dependent censoring. In many medical studies, random multiple splits for training/test sets are applied to develop and validate a PI. However, the goal of the proposed sensitivity analysis is different. A sensitivity analysis is performed to examine the potential effect of dependent censoring that may be present in the test samples.
Define T i as survival time, U i as censoring time, and x i1 , . . . , x iq as a q-vector of genes from individuals i = 1, 2, . . . , n in the test samples. The test samples consist of (t i , δ i , x i ), i = 1, 2, . . . , n, where t i = min(T i , U i ) and δ i = 1{T i ≤ U i }, where the sample size n is possibly different from the size of the training samples. Let PI i (α) =β 1 (α)x i1 + · · · +β q (α)x iq be the PI for the i-th patient in the test samples, wherê β j (α) is computed by the training samples (Section 2.2). Figure 1 shows how the PI is applied to classify patients into a high-risk (low-risk) group if his/her PI is greater (less) than a cut-off value c. Concretely, {i; PI i (α) ≤ c} is a good prognosis (low-risk) group, and {i; PI i (α) > c} is a poor prognosis (highrisk) group. If the classification is successful, the conditional survival function The cut-off value c may be chosen to be the 50th percentile (median) of the PIs that are computed by the training samples. The median was chosen to ensure a desirable statistical power to detect the difference of the two groups and to achieve robust groupings by eliminating the instability due to unevenly allocated sample sizes. Additionally, this rule has been commonly employed in the analysis of survival prognostic prediction with a multi-gene predictor [6,20,25,26].

Sensitivity Analysis via the Copula-Graphic Estimator
This section describes the proposed method of assessing the sensitivity of the PI by using the CG estimator.

Sensitivity Analysis via the Copula-Graphic Estimator
This section describes the proposed method of assessing the sensitivity of the PI by using the CG estimator.
Let { ; PI ( ) ≤ } be a good prognosis (low-risk) group, and { ; PI ( ) > } be a poor prognosis (high-risk) group, where is a cut-off value. Let = ∑ 1{PI ( ) ≤ } and = ∑ 1{PI ( ) > } be the sizes of the groups. To assess the sensitivity of the prognostic ability of the PI, one needs to understand how well the two survival probabilities are differentiated between the two groups ( Figure 1). Accordingly, one needs to estimate two conditional survival functions: We suggest estimating them by the CG estimator of [52] that adjusts for the effect of dependent censoring . In order to apply the CG estimator, it is necessary to impose the following Archimedean copula models: where is a generator function that specifies the dependence structure. It is a continuous and decreasing function such that (0) = ∞ and (1) = 0 [59]. The value of in Equations (3) and (4) can be chosen arbitrarily and can be different from the value estimated by the training samples.
If the models (3) and (4) are assumed, one can estimate ( | ) and ( | ) consistently by the following CG estimators: Let {i; PI i (α) ≤ c} be a good prognosis (low-risk) group, and {i; PI i (α) > c} be a poor prognosis (high-risk) group, where c is a cut-off value. Let n Good = ∑ n i=1 1{PI i (α) ≤ c} and n Poor = ∑ n i=1 1{PI i (α) > c} be the sizes of the groups. To assess the sensitivity of the prognostic ability of the PI, one needs to understand how well the two survival probabilities are differentiated between the two groups ( Figure 1). Accordingly, one needs to estimate two conditional survival functions: We suggest estimating them by the CG estimator of [52] that adjusts for the effect of dependent censoring { XE "Kendall's tau:Kendall's tau"}. In order to apply the CG estimator, it is necessary to impose the following Archimedean copula models: where φ α is a generator function that specifies the dependence structure. It is a continuous and decreasing function such that φ α (0) = ∞ and φ α (1) = 0 [59]. The value of α in Equations (3) and (4) can be chosen arbitrarily and can be different from the valueα estimated by the training samples.
If the models (3) and (4) are assumed, one can estimate S Good (t|c) and S Poor (t|c) consistently by the following CG estimators: Note that the CG estimator reduces to the KM estimator by φ 0 (t) = − log(t), and hence, the resultant estimator reduces to the KM estimator, e.g., For instance, for the Clayton copula, one has φ α (t) This means that the KM estimator is the CG estimator under the independent censoring assumption.
To compute the CG estimators, a parametric form of the copula must be specified (examples of parametric copulas will be shown in Section 3.2). That is, both the form of φ α (.) and the parameter value of α must be specified. However, both of them are difficult to estimate with test samples since they provide modest information about the copula [51,52,60].
Our strategy to tackle this difficulty is a sensitivity analysis that is performed under a variety of copula models [52][53][54][55][56]. More precisely, we suggest plottingŜ CG Good (t c, α) andŜ CG Poor (t c, α) for the three forms of φ α (.) and many different parameter values of α. If S CG Good (t c, α) andŜ CG Poor (t c, α) are clearly separated for a variety of φ α (.) and α, we conclude that the prognostic ability the PI is shown to be robust against dependent censoring. This strategy is in line with previous sensitivity analyses [52][53][54][55][56], yet it is a new method in the context of prognostic survival analysis.
An objective metric of "clear separation" betweenŜ CG Good (t c, α) andŜ CG Poor (t c, α) is obtained via a significance test proposed by Emura and Chen [37,41]. Formally, one can reject the null hypothesis H 0 : S Good (t|c) = S Poor (t|c) against H 1 : S Good (t|c) = S Poor (t|c) for a large value of |D|, where is the largest time point where survivors are seen for both good and poor groups.
The actual implementation of the proposed sensitivity analysis is technical. Thus, the development of the software implementation and the online web application are desirable for clinicians and patients to use. We leave this topic to Section 4. The following subsection provides specific examples of parametric copulas that are currently implemented in our developed R functions and web application.

Examples of Parametric Copulas
This subsection describes some examples of parametric copulas in order to perform a sensitivity analysis.

Software and Web Application
This section describes the development of software to establish a Shiny-based web application for the practical implementation of the proposed sensitivity analysis.
In order to implement the aforementioned sensitivity analyses (Section 3.1), we developed computational/graphical facilities for the CG estimators. First, we refined the computational routines for the CG estimators in the R package compound.Cox [9]. The package now includes the functions CG.Clayton(.), CG.Frank(.), and CG.Gumbel(.), which can compute the CG estimators under the three copulas. These functions compute the CG estimators and graphically display the plot possibly colored by an option (the S.col option). This is useful for visually distinguishing the good prognosis group (specify S.col="blue") and poor prognosis group (specify S.col="red"). Second, we devise a new function, CG.test(.), for testing the prognostic difference for the good and poor prognosis groups based on the difference statistic between two CG estimates. The p-value is computed by the permutation test (Section 3.1). This is useful for objectively measuring the prognostic ability of the PI. While N = 10, 000 is recommended, this number may be reduced to N = 200 to produce a web application that works reasonably fast.
An example of the CG estimator under the Clayton copula is given in Figure 2. Users may run the R code (upper-left panel) after installing the compound.Cox package. Then, the plot of the CG estimates is displayed in the right panel. The numerical values of the estimates (survival probabilities at t i ) are shown in the lower-left panel. This example uses data t i =1, 3,5,4,7,8,10,13; δ i =1, 0, 0, 1, 0, 1, 0. The data are artificial data, which may be regarded as patients from a good prognosis group (hence, we specify the S.col ="blue" option to make the plot blue-colored). The Clayton copula's parameter is specified as α = 18 (Kendall's tau is 0.9). If one specifies the copula parameter as α = 0, this yields the option to make the plot blue-colored). The Clayton copula's parameter is specified as = 18 (Kendall's tau is 0.9). If one specifies the copula parameter as = 0, this yields the KM estimator. We see that the numerical results are identical to the results computed by the survfit(.) function (lower-left panel). An example for testing the difference for the good and poor prognosis is given in Figure 3. This example uses prognostic indices PI = 8, 7, 6, 5, 4, 3, 2, 1 to predict the survival outcomes and already defined. The figure displays the plots of the Clayton CG estimates that are separated between good and poor prognoses group. In addition, the figure shows the p-value computed by the permutation test. More detailed results, such as the difference ( ) and the RMSTD ( ), are given in the left panel. An example for testing the difference for the good and poor prognosis is given in Figure 3. This example uses prognostic indices PI i = 8, 7, 6, 5, 4, 3, 2, 1 to predict the survival outcomes t i and δ i already defined. The figure displays the plots of the Clayton CG estimates that are separated between good and poor prognoses group. In addition, the figure shows the p-value computed by the permutation test. More detailed results, such as the difference (D) and the RMSTD (τD), are given in the left panel.
While we developed the infrastructures of the computational/graphical facilities for the CG estimators, the actual implementation of the sensitivity analysis is still complex. This is because the sensitivity analysis requires the understanding of the R programming and the proposed methods of Section 3. To further make our methodologies accessible to users, we propose a method to produce an interactive web application that can be manipulated by users with minimal knowledge (e.g., clinicians and patients).
We suggest an R package Shiny to make an online web application that can display the plots ofŜ CG Good (t|c, α) andŜ CG Poor (t|c, α) given a user-specified form of φ α (.) and a value of α. With Shiny, one can easily convert R programs into a web application. This process can be carried out by creating the "app.R" file, a program file written in the R language. We develop a template file (available in Supplementary Materials) so that developers can modify the template for their own prediction settings. Once the template is appropriately tuned by developers, the web application is immediately built and can be made publicly available through a platform shinyapps.io (https://www.shinyapps.io/ (accessed on 10 January 2023). A concrete implementation of the web application shall be described in the next section. While we developed the infrastructures of the computational/graphical facilities for the CG estimators, the actual implementation of the sensitivity analysis is still complex. This is because the sensitivity analysis requires the understanding of the R programming and the proposed methods of Section 3. To further make our methodologies accessible to users, we propose a method to produce an interactive web application that can be manipulated by users with minimal knowledge (e.g., clinicians and patients).
We suggest an R package Shiny to make an online web application that can display the plots of ( | , ) and ( | , ) given a user-specified form of (. ) and a value of . With Shiny, one can easily convert R programs into a web application. This process can be carried out by creating the "app.R" file, a program file written in the R language. We develop a template file (available in Supplementary Materials) so that developers can modify the template for their own prediction settings. Once the template is appropriately tuned by developers, the web application is immediately built and can be made publicly available through a platform shinyapps.io (https://www.shinyapps.io/ (accessed on 10 January 2023). A concrete implementation of the web application shall be described in the next section.

Results
The purpose of this section is to demonstrate the proposed methods of Section 3 by using the lung cancer data originally reported in [6] and now available in the Lung object in the compound.Cox package [9]. We first introduce the dataset.

Lung Cancer Data
Briefly, the Lung data are a survival dataset containing = 63 training samples and = 62 test samples of 125 surgically treated non-small-cell lung cancer patients [6]. The endpoint of interest is overall survival , i.e., months from surgery to death. Overall

Results
The purpose of this section is to demonstrate the proposed methods of Section 3 by using the lung cancer data originally reported in [6] and now available in the Lung object in the compound.Cox package [9]. We first introduce the dataset.

Lung Cancer Data
Briefly, the Lung data are a survival dataset containing n = 63 training samples and n = 62 test samples of 125 surgically treated non-small-cell lung cancer patients [6]. The endpoint of interest is overall survival { XE "overall survival:overall "}, i.e., months from surgery to death. Overall survival time and censoring status (death or alive) were recorded. Additionally, available are high-dimensional gene expressions that may predict survival. All the gene expressions were coded as 1, 2, 3, or 4 according to the quartile levels following the original study [6].
In the following analysis, we selected the 16 genes and then constructed the multigene predictor as described in Sections 2.2 and 2.3 (also reported in [37,41]). The resultant predictor takes the form PI i (α) =β 1 (α)x i,1 + · · · +β 16 (α)x i,16 for the i-th patient in the test samples, whereβ j (α) is computed by the training samples given the copula parameter α = 18 (Kendall's tau = 0.90), where (x 1 , · · · , x 16 ) are gene expression{ XE "gene expression:gene expression "}s taking values 1, 2, 3, and 4. In terms of gene symbols, the 16-gene predictor is In order to examine the sensitivity of the PI under a variety of dependent censoring scenarios, we shall apply the proposed sensitivity analysis methods.

Sensitivity Analysis via the Web App
As prescribed in Section 3, we formed a good prognosis group {i; PI i (α) ≤ c} and a poor prognosis group {i; PI i (α) > c}, where the cut-off value c was set to be the 50th percentile (median) of the PIs in the training samples (as we proposed in Section 3). The test samples were then separated into two groups. Then, we evaluated the prognostic performance of the PI by displaying the estimated survival functionsŜ CG Good (t|c, α) and S CG Poor (t|c, α) given a copula and a value of α.
We constructed a web application (Figure 4) by the methods of Section 4. The interactive web version is available in https://takeshi.shinyapps.io/lung/ (accessed on 10 January 2023). The main goal of this web application is to visualize how the two prognosis groups are separated by the PI for a user-specified choice of copulas and any value of α. In order to examine the sensitivity of the PI under a variety of dependent censoring scenarios, we shall apply the proposed sensitivity analysis methods.

Sensitivity Analysis via the Web App
As prescribed in Section 3, we formed a good prognosis group { ; PI ( ) ≤ } and a poor prognosis group { ; PI ( ) > }, where the cut-off value was set to be the 50th percentile (median) of the PIs in the training samples (as we proposed in Section 3). The test samples were then separated into two groups. Then, we evaluated the prognostic performance of the PI by displaying the estimated survival functions ( | , ) and ( | , ) given a copula and a value of . We constructed a web application (Figure 4) by the methods of Section 4. The interactive web version is available in https://takeshi.shinyapps.io/lung/ (accessed on 10 January 2023). The main goal of this web application is to visualize how the two prognosis groups are separated by the PI for a user-specified choice of copulas and any value of .  Using this web application, we tried the Clayton, Gumbel, and Frank copulas under several values of α. For the Clayton copula and α = 2, one can see a good separation between the plots ofŜ CG Good (t|c, α) andŜ CG Poor (t|c, α) as shown in the right panel of Figure 4; one can also see the p-value (=0.04 based on N = 200 permutations). Table 1 Figure 4). One can observe that the p-values obtained are less than or around 0.05 for positive dependence (Kendall's tau = 0.75 or 0.50), while they are greater than 0.10 for negative dependence and independence (tau = −0.75, −0.46, 0). Therefore, the PI would lose the ability of prognostic prediction when censoring in the test data is negatively dependent on survival. Therefore, our analysis reveals a potential drawback for the PI developed by Emura and Chen [41] when negatively dependent censoring arises. On the other hand, the PI gains its improved prognostic ability when censoring is positively dependent on survival. The p-value reaches nearly 1% level for strongly positive dependence (tau = 0.88 or 0.91). This is reasonable since the PI is developed under the Clayton copula with strongly positive dependence (α = 18, tau = 0.90). Table 1 The PI is either 16-gene PI of Emura and Chen [41] or the 16-gene PI of Chen et al. [6]. The p-value is obtained by N = 10,000 permutations.
Recall that the cut-off value c was set to be the median of PI i (α)'s. If we change the cut-off value to be the first or third quartile (25% or 75% points) of PI i (α)'s, then the separation between the plots ofŜ CG Good (t|c, α) andŜ CG Poor (t|c, α) is not clear. This confirms our suggestion to apply the median as the cut-off value.

Comparison with the 16-Gene Pedictor Developed by Chen et al. [6]
The proposed sensitivity analysis is useful for any type of established predictor, not restricted to the predictor developed by the copula-based gene selection method. Furthermore, the proposed web application allows one to compare different predictors in a user-friendly manner. In order to demonstrate this feature, we performed a sensitivity analysis for the PI based on Chen et al. [6] Table 1 shows the p-values testing the difference of survival prognosis (good vs. poor) made by the PI. We observe that the p-values are mostly less than 0.05 for the three copulas with selected values of α (Kendall's tau takes −0.75, −0.46, 0, 0.33, 0.46, 0.50, 0.75). Therefore, the predictive ability of the PI is justified for a variety of dependent censoring models. This extremely confirms the validity and robustness of the PI developed by Chen et al. [6], even though they are developed under the independent censoring model. On the other hand, the ability of the PI under strong positive dependence is reduced compared to the PI developed by Emura and Chen [41]. If strong positive dependence exists in the test samples, the latter PI better prognosticates survival. However, we do not suggest determining the single best copula and the single value of α in the sensitivity analysis based on the CG estimators [52][53][54][55][56]. Instead, we suggest tying different copulas and their parameters to understand the sensitivity of the PI.

Conclusions and Discussion
In this article, we propose a novel sensitivity analysis method for a multi-gene predictor for survival. The primary tool for the proposed method is the CG estimator [52], which can effectively adjust for the impact of dependent censoring on survival data. Consequently, a unique feature of the proposed method is the ability to handle dependent censoring, which has been of increasing concern to medical statisticians in a variety of survival analysis methods [41][42][43][44][45][46][47][48][49][50][51][52][53][54][55][56]. The lung cancer data analysis demonstrates the implementation of the proposed method on a web application (Figure 4; https://takeshi.shinyapps.io/lung/ (accessed on 10 January 2023)). A user of this application can easily perform a sensitivity analysis on survival prognostic models using a smartphone.
From a methodological point of view, the proposed method is a novel implementation of the CG estimator [52] to the context of prognostic prediction. So far, the CG estimator has been applied to a single population setting [44,[51][52][53][54], two-sample comparison [55], simple regression [43], multiple regression [60,61], factorial designs [56], and survival forests [47]. The objective of survival forests is to classify patients into several groups, which is similar to the goal of the proposed method. The difference is that survival forests use a decision tree while ours uses a multi-gene predictor. However, survival forests have not been adapted to deal with high-dimensional gene expressions. Hence, it is of great interest to extend survival forests to handle both dependent censoring and high-dimensional covariates as in a survival tree [62].
In this article, we employed the R package Shiny to transfer our sensitivity analysis formulas to a user-friendly web application. Some authors have used the Shiny package in other contexts. A series of works by Fournier et al. [63], Asar et al. [64], and Lenain et al. [65] made a Shiny-based web application for the dynamic prediction of long-term kidney graft failure. Similarly, the dynamic prediction method of Emura et al. [21] (see also the updated works of [66]) was transferred to web applications [67]. Clearly, this type of survival prognostic prediction tool can promote the development of personalized medicine, allowing clinicians to utilize powerful prognostic prediction models.
We note that the issue of dependent censoring in survival prognostic prediction is not restricted to medical research, which can be seen in the reliability analysis of mechanical items or systems [68][69][70][71][72][73]. In the reliability prediction of mechanical equipment, the survival probability is usually modeled by parametric distributions, such as the Weibull distribution. The analysis of a prognostic prediction model in engineering settings poses additional challenges tailored to parametric analyses. Recall that the CG estimator is a semiparametric estimator that does not assume any parametric form of the survival function. The semiparametric CG estimator [52] is not easily modified to parametric models since the estimator may require some numerical integrations.
We focus on the three Archimedean copulas: the Clayton, Gumbel, and Frank copulas, which have been extensively used in survival analysis [74][75][76][77][78] and other fields [79][80][81]. Owing to their remarkable popularity, our choices of copulas are natural. Furthermore, these Archimedean copulas allow simple formulas to compute the CG estimators [52]. However, there are a large number of non-Archimedean copulas popular in a variety of applications, such as the FGM copula, Gaussian copula, trigonometric copula, and Celebioglu-Cuadras copula [82][83][84][85][86][87]. As the CG estimators have not been considered for these non-Archimedean copulas, it is of great interest to develop computational tools for them.
Supplementary Materials: The following supporting information can be downloaded at https: //www.mdpi.com/article/10.3390/biomedicines11030797/s1, the "app.R" file, a program file written in the R language. Readers may run this program (after installing the Shiny R package) to see how our web application was created.