Article

The Slash Half-Normal Distribution Applied to a Cure Rate Model with Application to Bone Marrow Transplantation

by Diego I. Gallardo 1,*, Yolanda M. Gómez 1,2, Héctor J. Gómez 3, María José Gallardo-Nelson 2 and Marcelo Bourguignon 4

1 Mathematics Department, Faculty of Engineering, University of Atacama, Copiapó 1530000, Chile
2 Medicine Department, Faculty of Medicine, University of Atacama, Copiapó 1530000, Chile
3 Department of Mathematical and Physical Sciences, Faculty of Engineering, Catholic University of Temuco, Temuco 4780000, Chile
4 Statistics Department, Federal University of Rio Grande do Norte, Natal 59078-970, Brazil
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(3), 518; https://doi.org/10.3390/math11030518
Submission received: 19 December 2022 / Revised: 10 January 2023 / Accepted: 13 January 2023 / Published: 18 January 2023
(This article belongs to the Special Issue Probability, Statistics & Symmetry)

Abstract

This paper proposes, for the first time, the use of an asymmetric positive and heavy-tailed distribution in a cure rate model context. In particular, it introduces a cure-rate survival model by assuming that the time-to-event of interest follows a slash half-normal distribution and that the number of competing causes of the event of interest follows a power series distribution, which defines six new cure rate models. Several properties of the model are derived and an alternative expression for the cumulative distribution function of the model is presented, which is very useful for the computational implementation of the model. A procedure based on the expectation–maximization algorithm is proposed for the parameter estimation. Two simulation studies are performed to assess some properties of the estimators, showing the good performance of the proposed estimators in finite samples. Finally, an application to a bone marrow transplant data set is presented.

1. Introduction and Motivation

Survival models that incorporate a proportion of cured subjects are becoming increasingly popular in the analysis of data from clinical trials and other studies, mainly due to advances in medicine. In a cure rate model context, specifically using a competitive risk model framework, it is assumed that the event of interest is produced by $M \in \mathbb{N}_0$ carcinogenic cells (a latent variable), each of them with an associated survival function $S(\cdot;\lambda)$, where $\lambda$ is a vector of unknown parameters from a distribution defined on the positive line, and $M = 0$ represents the cured individuals. Different models are assumed for the number of cells ($M$) and the survival function $S(\cdot;\lambda)$. For the latter, we can mention the Weibull, log-normal, generalized gamma and Birnbaum–Saunders distributions, among others. In particular, we focus on the extension proposed in Olmos et al. [1]. This model (parsimonious, since it has only two parameters) is called the slash half-normal (SHN) distribution. This heavy-tailed distribution has not yet been used in a cure rate model context. The objective of the proposed model is to increase the kurtosis with respect to its half-normal baseline distribution, which makes it more useful for modeling data sets that may have heavy tails. A version of the SHN reparameterized in terms of the mean with covariates was presented in Gómez et al. [2].
On the other hand, bone marrow transplantation is an important treatment for more than 50 fatal diseases such as leukemia, lymphoma, solid organ neoplasia, autoimmune diseases, among others. Bone marrow transplantation has unique biological characteristics and associated problems that markedly differ from solid organ transplantation, such as kidney, liver, or heart (Hardy et al. [3]). The problems are associated with the fact that immunocompetent cells can reject each other, which can cause graft rejection and generate graft-versus-host disease (GVHD). Furthermore, successful engraftment requires a strict match with the major histocompatibility complex (MHC) class II antigen of the donor and the recipient. Added to this is that patients must receive prior treatment with cytotoxic and myeloablative agents to avoid graft rejection (Chinen and Buckley [4], Simpson and Dazzi [5]). For these reasons, bone marrow transplantation is associated with considerable morbidity and mortality.
From a biological point of view, it makes sense to assume a heavy-tailed distribution when patients are followed over a period of time and their survival is used to assess transplant efficacy. These models can describe in detail the progression of the disease and allow us to make predictions for future clinical trials.
The motivation to introduce the SHN distribution in a cure rate model is given by a real data set application from the European Society for Blood and Marrow Transplantation (EBMT). The EBMT is a nonprofit medical, educational and scientific organization founded more than forty years ago, focused on the field of hematopoietic stem cell transplantation. The EBMT has more than 5000 members from 85 countries. This organization is focused on innovation, research and advancement in the field of hematopoietic stem cell transplantation. The EBMT hosts a unique patient registry providing a pool of data to researchers all over the world to perform studies and assess new improvements in the field (EBMT [6]). Acute myeloid leukemia (AML) in children is rare in comparison with acute lymphoid leukemia and has an overall survival of almost 75% at five years. Patients with high-risk characteristics need to achieve remission and undergo bone marrow transplantation, with an overall survival of 70% (de Rooij et al. [7]). Post-transplant relapse is common and involves almost 80% of treated AML, of which 75% occur in the first year after transplantation (Sander et al. [8]).
The data were presented in Fiocco et al. [9] and are available in the mstate R package [10], labeled as ebmt4. The data set includes 2279 patients transplanted between 1985 and 1998. We considered the time in years from transplantation to relapse as the response variable, which had 84% of censored times. Figure 1 shows the Kaplan–Meier (KM) estimator for this data set and the histogram for the failure times.
Note that there is a plateau in the KM estimator, suggesting the presence of cured individuals for this problem. In addition, the histogram of failure times suggests a heavy-tailed distribution for the noncured individuals. Some descriptive statistics for such failure times are mean = 1.14, median = 0.52, standard deviation = 1.84 and kurtosis = 18.71. The high kurtosis for these data, again, suggests the use of a heavy-tailed model for this problem.
Allogeneic stem cell transplantation (allo-HSCT) is a therapeutic approach used for a wide array of malignant and nonmalignant hematologic disorders. Its effectiveness is related to mortality and morbidity. Several complications can occur, including hemorrhages, infections, GVHD and endothelial-damage-related toxicities. There are new immunological and genetic therapies to avoid these complications and reduce mortality (see Horio et al. [11], Radujkovic et al. [12] and Tsai et al. [13]). Other immunodeficiency diseases such as ataxia-telangiectasia can be treated by bone marrow transplantation, as mentioned in Sabino Pinho de Oliveira et al. [14]. Further studies with a larger cohort of patients and a longer follow-up period are required to validate and monitor the associated risks of transplantation. For this reason, it is important to use a heavy-tailed distribution such as the SHN distribution to study survival in this type of bone marrow transplant and therapies. We emphasize that, up to this moment, this distribution has not been used in a cure rate model context. In this context, we define the power series–SHN (PS–SHN) class of univariate cure rate models obtained by compounding the SHN and power series distributions. The proposed model defines six new cure rate models as special cases.
The contents of this paper are organized in the following manner. In Section 2, we introduce the SHN distribution and derive some of its distributional properties. In Section 3, we introduce a general cure rate model. In Section 4, we describe parameter estimation by the maximum likelihood method using the expectation–maximization (EM) algorithm. In Section 5, we present two simulation studies, where the first one assesses the performance of the maximum likelihood estimation (MLE) using the EM algorithm and the second one studies misspecification of the time-to-event distribution for the concurrent causes. In Section 6, we present a real data set on bone marrow transplantation. Finally, some conclusions are included in Section 7.

2. The SHN Distribution

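The SHN distribution [1] with parameters $\sigma > 0$ and $\alpha > 0$ (denoted by $\mathrm{SHN}(\sigma, \alpha)$) is a model with positive support. Its probability density function (pdf) is given by
$$f_{SHN}(t;\lambda) = \alpha \sqrt{\frac{2^{\alpha}}{\pi}}\, \sigma^{\alpha}\, t^{-(\alpha+1)}\, \gamma\!\left(\frac{t^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right), \quad t > 0, \tag{1}$$
where $\lambda = (\alpha, \sigma)$ and $\gamma(x, a) = \int_{0}^{x} u^{a-1} e^{-u}\, du$ is the lower incomplete gamma function. The cumulative distribution function (cdf) of the SHN model is given by
$$F_{SHN}(t;\lambda) = \int_{0}^{t} f_{SHN}(w;\lambda)\, dw = \alpha \sqrt{\frac{2^{\alpha}}{\pi}}\, \sigma^{\alpha} \int_{0}^{t} w^{-(\alpha+1)}\, \gamma\!\left(\frac{w^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right) dw.$$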
In the next proposition, we present an alternative expression for the cdf.
Proposition 1.
The cdf of the SHN distribution is given by
$$F_{SHN}(t;\lambda) = 2\Phi\!\left(\frac{t}{\sigma}\right) - 1 - \sqrt{\frac{2^{\alpha}}{\pi}} \left(\frac{\sigma}{t}\right)^{\alpha} \Gamma\!\left(\frac{\alpha+1}{2}\right) G\!\left(\frac{t^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right), \quad t > 0,$$
where $\Phi(\cdot)$ denotes the cdf of the standard normal model and $G(\cdot, a)$ is the cdf of the gamma model with shape and rate parameters equal to $a$ and 1, respectively.
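Proof.
By definition,
$$F_{SHN}(t;\lambda) = \alpha \sqrt{\frac{2^{\alpha}}{\pi}}\, \sigma^{\alpha} \int_{0}^{t} w^{-(\alpha+1)}\, \Gamma\!\left(\frac{\alpha+1}{2}\right) G\!\left(\frac{w^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right) dw.$$
Integrating by parts with
$$u = G\!\left(\frac{w^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right) \;\Rightarrow\; du = \frac{w}{\sigma^{2}}\, g\!\left(\frac{w^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right) dw = \frac{w}{\sigma^{2}\, \Gamma\!\left(\frac{\alpha+1}{2}\right)} \left(\frac{w^{2}}{2\sigma^{2}}\right)^{\frac{\alpha+1}{2}-1} \exp\!\left(-\frac{w^{2}}{2\sigma^{2}}\right) dw,$$
$$dv = w^{-(\alpha+1)}\, dw \;\Rightarrow\; v = -\frac{w^{-\alpha}}{\alpha},$$
where $g(\cdot,\cdot)$ denotes the density function of the gamma distribution, we obtain
$$F_{SHN}(t;\lambda) = \alpha \sqrt{\frac{2^{\alpha}}{\pi}}\, \sigma^{\alpha}\, \Gamma\!\left(\frac{\alpha+1}{2}\right) \times \left[ -G\!\left(\frac{t^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right) \frac{t^{-\alpha}}{\alpha} + \frac{2}{\Gamma\!\left(\frac{\alpha+1}{2}\right) \alpha\, \sigma^{\alpha}} \sqrt{\frac{\pi}{2^{\alpha}}} \int_{0}^{t} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{w^{2}}{2\sigma^{2}}\right) dw \right],$$
and the result follows after identifying the pdf of the normal distribution within the integral, since $\int_{0}^{t} (\sqrt{2\pi}\,\sigma)^{-1} \exp\{-w^{2}/(2\sigma^{2})\}\, dw = \Phi(t/\sigma) - 1/2$. □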
Corollary 1.
The survival function for the SHN model is given by
$$S_{SHN}(t;\lambda) = 2\Phi\!\left(-\frac{t}{\sigma}\right) + \sqrt{\frac{2^{\alpha}}{\pi}} \left(\frac{\sigma}{t}\right)^{\alpha} \Gamma\!\left(\frac{\alpha+1}{2}\right) G\!\left(\frac{t^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right), \quad t > 0.$$
Note that Corollary 1 provides an alternative way to program the survival function, which is convenient from a computational point of view because many software tools include the cdf of the standard normal and gamma models and the lower incomplete gamma function.
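As an illustration of how Proposition 1 and Corollary 1 can be programmed, the following is a minimal sketch in Python using only the standard library. The function names (`gamma_cdf`, `shn_cdf`, `shn_sf`, `shn_pdf`) are hypothetical and are not taken from any package mentioned in this paper. Because the standard library has no gamma cdf, $G(\cdot, a)$ is computed here with a textbook series expansion and continued fraction, which is an implementation choice of this sketch.

```python
import math

EPS = 1e-15     # relative tolerance for the series / continued fraction
FPMIN = 1e-300  # guard against division by zero in the continued fraction


def gamma_cdf(x, a):
    """G(x, a): cdf of a gamma variable with shape a and rate 1."""
    if x <= 0.0:
        return 0.0
    log_pref = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series expansion, suitable when x is small relative to a.
        ap, term = a, 1.0 / a
        total = term
        for _ in range(10000):
            ap += 1.0
            term *= x / ap
            total += term
            if abs(term) < abs(total) * EPS:
                break
        return total * math.exp(log_pref)
    # Continued fraction (modified Lentz) for the upper tail 1 - G(x, a).
    b = x + 1.0 - a
    c = 1.0 / FPMIN
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = FPMIN if abs(d) < FPMIN else d
        c = b + an / c
        c = FPMIN if abs(c) < FPMIN else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < EPS:
            break
    return 1.0 - math.exp(log_pref) * h


def _tail_term(t, sigma, alpha):
    """sqrt(2^alpha/pi) * (sigma/t)^alpha * Gamma(a) * G(t^2/(2 sigma^2), a)."""
    a = (alpha + 1.0) / 2.0
    return (math.sqrt(2.0 ** alpha / math.pi) * (sigma / t) ** alpha
            * math.gamma(a) * gamma_cdf(t * t / (2.0 * sigma ** 2), a))


def shn_cdf(t, sigma, alpha):
    # 2*Phi(t/sigma) - 1 equals erf(t / (sigma*sqrt(2))).
    return math.erf(t / (sigma * math.sqrt(2.0))) - _tail_term(t, sigma, alpha)


def shn_sf(t, sigma, alpha):
    # 2*Phi(-t/sigma) equals erfc(t / (sigma*sqrt(2))).
    return math.erfc(t / (sigma * math.sqrt(2.0))) + _tail_term(t, sigma, alpha)


def shn_pdf(t, sigma, alpha):
    # The pdf equals alpha/t times the same tail term.
    return alpha / t * _tail_term(t, sigma, alpha)
```

Using `erfc` for the survival function avoids the cancellation that `1 - shn_cdf(...)` would suffer for large $t$. The sketch is intended for $t > 0$ and moderate values of $\alpha$.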
The SHN distribution has a convenient stochastic representation. If $X \sim \mathrm{HN}(\sigma)$ and $Y \sim U(0,1)$ are independent (i.e., the half-normal and uniform distributions, respectively), then
$$T = \frac{X}{Y^{1/\alpha}} \sim \mathrm{SHN}(\sigma, \alpha).$$
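The mean, variance, asymmetry ($\sqrt{\beta_1}$) and kurtosis ($\beta_2$) coefficients of this distribution are
  • $E(T) = \sqrt{\dfrac{2}{\pi}}\, \dfrac{\alpha \sigma}{\alpha - 1}$, $\alpha > 1$.
  • $\mathrm{Var}(T) = \dfrac{\alpha \sigma^{2} \left[ (\pi - 2)\alpha(\alpha - 2) + \pi \right]}{\pi (\alpha - 1)^{2} (\alpha - 2)}$, $\alpha > 2$.
  • $\sqrt{\beta_1} = \dfrac{\pi \sqrt{2(\alpha - 2)} \left[ \frac{4}{\pi} \alpha^{2} (\alpha - 2)(\alpha - 3) - (\alpha - 1)^{2} (\alpha - 4)(\alpha + 1) \right]}{\sqrt{\alpha}\, (\alpha - 3) \left[ \alpha(\pi - 2)(\alpha - 2) + \pi \right]^{3/2}}$, $\alpha > 3$.
  • $\beta_2 = \dfrac{3\alpha (\alpha - 2)^{2} (\alpha - 3) \left[ \pi^{2} (\alpha - 1)^{4} - 4\alpha^{3} (\alpha - 4) \right] - 4\pi \alpha^{2} (\alpha - 1)^{2} (\alpha - 2)(\alpha - 4)(\alpha^{2} - 3\alpha + 8)}{\alpha^{2} (\alpha - 3)(\alpha - 4) \left[ \alpha(\pi - 2)(\alpha - 2) + \pi \right]^{2}}$, $\alpha > 4$.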
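The stochastic representation gives a direct way to simulate SHN variates, and the closed-form mean offers a quick sanity check. The sketch below is only an illustration: the function names `rshn` and `shn_mean`, the seed and the parameter values are arbitrary choices, not part of the original study.

```python
import math
import random


def rshn(n, sigma, alpha, rng):
    """Draw n values of T = X / Y**(1/alpha), with X ~ HN(sigma), Y ~ U(0, 1)."""
    draws = []
    for _ in range(n):
        x = abs(rng.gauss(0.0, sigma))  # half-normal as |N(0, sigma^2)|
        y = 1.0 - rng.random()          # uniform on (0, 1], avoids y == 0
        draws.append(x / y ** (1.0 / alpha))
    return draws


def shn_mean(sigma, alpha):
    """Closed-form mean sqrt(2/pi) * alpha * sigma / (alpha - 1), alpha > 1."""
    return math.sqrt(2.0 / math.pi) * alpha * sigma / (alpha - 1.0)


rng = random.Random(2023)  # arbitrary seed, for reproducibility only
sample = rshn(200_000, sigma=1.0, alpha=5.0, rng=rng)
empirical_mean = sum(sample) / len(sample)
```

With $\alpha = 5$ the variance is finite, so the empirical mean should settle close to `shn_mean(1.0, 5.0)`. For $\alpha \le 2$ the variance does not exist and the sample mean converges much more slowly.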
Note that $\sigma$ is a scale parameter and $\alpha$ is related mainly to the kurtosis of the model. In fact, as discussed in Olmos et al. [1], low values of $\alpha$ accommodate distributions with greater kurtosis coefficients. On the other hand, when $\alpha \to \infty$, the HN distribution is recovered. Besides being a parsimonious model, the SHN distribution has the interesting property that its tails are heavier than those of common distributions with positive support, as stated in the following proposition.
Proposition 2.
Suppose a distribution with positive support satisfying the following conditions:
(A1) Its density function can be written in the form $f(t;\lambda) = C\, t^{a} \exp\{-g(t;\lambda)\}$, $t > 0$, where $C$ does not depend on $t$, $a \in \mathbb{R}$ and $g(t;\lambda) \ge 0$, $\forall\, t > 0$.
(A2) $\lim_{t \to \infty} t^{a'} \exp\{-g(t;\lambda)\} = 0$, $\forall\, a' > a + 1$.
Then, the SHN model has heavier tails than such a distribution.
Proof.
To show that the SHN model has heavier tails than a model with a density function given as in assumption A1, it is enough to prove that
$$\lim_{t \to +\infty} \frac{f(t;\lambda)}{f_{SHN}(t;\sigma,\alpha)} = 0.$$
We have
$$\lim_{t \to +\infty} \frac{f(t;\lambda)}{f_{SHN}(t;\sigma,\alpha)} = \lim_{t \to +\infty} \frac{C\, t^{a} \exp\{-g(t;\lambda)\}}{\alpha\, \sigma^{\alpha} \sqrt{\frac{2^{\alpha}}{\pi}}\, t^{-(\alpha+1)}\, \gamma\!\left(\frac{t^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right)} = \frac{C}{\alpha\, \sigma^{\alpha} \sqrt{\frac{2^{\alpha}}{\pi}}} \lim_{t \to +\infty} \frac{t^{a+\alpha+1} \exp\{-g(t;\lambda)\}}{\gamma\!\left(\frac{t^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right)}.$$
As $\lim_{t \to +\infty} \gamma(t, c) = \Gamma(c)$, $\forall\, c > 0$, and $a + \alpha + 1 > a + 1$ because $\alpha > 0$, the result follows by assumption A2 and by the continuity of the limit. □
Corollary 2.
If $X_1$ and $X_2$ are two random variables with density functions satisfying conditions A1 and A2 in Proposition 2 and $Y = pX_1 + (1-p)X_2$, $p \in (0,1)$, then the SHN model also has heavier tails than the distribution related to $Y$.
Remark 1.
Common distributions used in a cure rate model context satisfy the conditions A1 and A2 in Proposition 2. For instance:
(a) If $C = \alpha/\sigma^{\alpha}$, $a = \alpha - 1$ and $g(t;\lambda) = (t/\sigma)^{\alpha}$, the Weibull model is recovered (Chen et al. [15], Rodrigues et al. [16], Pal and Balakrishnan [17]).
(b) If $C = [\Gamma(\alpha)\sigma^{\alpha}]^{-1}$, $a = \alpha - 1$ and $g(t;\lambda) = t/\sigma$, the gamma distribution is recovered (Balakrishnan and Pal [18], Wiangnak and Pal [19], Ortega et al. [20]).
(c) If $C = [\sigma\sqrt{2\pi}]^{-1}$, $a = -1$ and $g(t;\lambda) = (\log t - \alpha)^{2}/(2\sigma^{2})$, we obtain the log-normal model (Balakrishnan and Pal [21], Gallardo et al. [22]).
(d) If $C = \sqrt{\sigma/(2\pi)}\, \exp(\sigma/\alpha)$, $a = -3/2$ and $g(t;\lambda) = (\sigma/2)\left(t/\alpha^{2} + 1/t\right)$, the inverse Gaussian model is recovered. We denote it as $\mathrm{IG}(\alpha, \sigma)$.
(e) If $C = \sqrt{\sigma/(2\pi)}\, \exp(\sigma\alpha)$, $a = -1/2$ and $g(t;\lambda) = (\sigma/2)\left(t/\alpha^{2} + 1/t\right)$, the density function of the reciprocal of an inverse Gaussian random variable is recovered. Up to this moment, this model has not been used in a cure rate model context. We denote it as $\mathrm{RIG}(\alpha, \sigma)$.
(f) If $C = [B(\alpha, \sigma)]^{-1}$, $a = \alpha - 1$ and $g(t;\lambda) = (\alpha + \sigma)\log(1 + t)$, the density of the beta prime distribution is recovered (Leao et al. [23]).
Remark 2.
The Birnbaum–Saunders (BS, Birnbaum and Saunders [24,25]) model with density function given by
$$f(t;\alpha,\sigma) = \frac{1}{\sqrt{2\pi}} \exp\!\left\{ -\frac{1}{2\alpha^{2}} \left( \frac{t}{\sigma} + \frac{\sigma}{t} - 2 \right) \right\} \frac{t + \sigma}{2\alpha \sqrt{\sigma t^{3}}}, \quad t > 0,$$
can be written as a mixture between $\mathrm{IG}(\sigma, \sigma/\alpha^{2})$ and $\mathrm{RIG}(\sigma, 1/(\sigma\alpha^{2}))$ with equal probabilities (see Desmond [26] for details). Therefore, by Corollary 2, the SHN distribution has heavier tails than the BS distribution.
In the next section, we discuss the use of the SHN model in the presence of long-term survivors.

3. A General Cure Rate Model and the SHN Distribution

To exemplify the use of the SHN distribution in a cure rate model context, we consider a general class of models based on a competitive risk scheme. The reader is referred to Chen et al. [15] for more details. Let M be a random variable denoting the initial number of carcinogenic cells of an individual. Several models have been considered for this scheme, among others, Bernoulli [27], Poisson [15] and negative binomial [16].
Let $W_a$, $a = 1, \ldots, M$, be the random variable representing the time at which the $a$th concurrent cause produces the event of interest. For instance, in a cancer context, $W_a$ represents the progression time for the $a$th tumor cell. As shown in Appendix A, a general class of models is produced when it is assumed that $M$ follows a power series distribution (PS, Cancho et al. [28]). As is common in this context, it is also assumed that, conditional on $M$, the $W_a$'s are independent and identically distributed with a common cumulative distribution function $F(\cdot;\lambda)$ and survival function $S(\cdot;\lambda) = 1 - F(\cdot;\lambda)$, where $\lambda$ is a vector of unknown parameters. Moreover, it is assumed that the sequence $W_1, W_2, \ldots$ is independent of $M$. Assuming that only one concurrent cause can produce the event of interest, the observable time of the occurrence of the event of interest is given by
$$T = \begin{cases} +\infty, & \text{if } M = 0, \\ \min(W_1, \ldots, W_M), & \text{if } M > 0. \end{cases}$$
Under this setting, the population survival function is given by
$$S_{pop}(t;\theta,\lambda) = \frac{A(\theta S(t;\lambda))}{A(\theta)}, \tag{4}$$
where $A(\cdot)$ is related to the PS distribution. Details about the function $A(\cdot)$ for different particular models are presented in Appendix A. From Equation (4), it is immediate that the cure rate of the model is given by $q_0 = a_0/A(\theta)$ and the survival function for the susceptible individuals is
$$S_{sus}(t;\theta,\lambda) = \frac{a_0 - A(\theta S(t;\lambda))}{a_0 - A(\theta)}. \tag{5}$$
For heterogeneous populations, we consider a set of covariates of dimension $r + 1$ for the $i$th individual, say $x_i = (1, x_{i1}, x_{i2}, \ldots, x_{ir})^{\top}$ (i.e., considering the intercept term), related to the cure rate term of the model. For the Poisson, binomial and COM-Poisson models, such covariates can be introduced as
$$\theta_i = \exp\!\left(x_i^{\top}\beta\right), \quad i = 1, \ldots, n, \tag{6}$$
where $n$ is the sample size and $\beta = (\beta_0, \beta_1, \ldots, \beta_r)^{\top}$ is the vector of unknown regression coefficients. For the rest of the discrete models, the covariates can be included as
$$\theta_i = \frac{\exp\!\left(x_i^{\top}\beta\right)}{1 + \exp\!\left(x_i^{\top}\beta\right)}, \quad i = 1, \ldots, n. \tag{7}$$
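To make the roles of $A(\cdot)$, $a_0$ and the two link functions concrete, here is a small sketch for the Poisson and geometric cases (with $a_0 = 1$ and $A(\theta)$ as listed in Appendix A). All function and variable names are hypothetical, and $S(t;\lambda)$ is passed in as a plain number so that any baseline survival function can be plugged in.

```python
import math

# (a_0, A) pairs for two members of the power series family.
POISSON = (1.0, lambda th: math.exp(th))        # a_m = 1/m!, A = exp(theta)
GEOMETRIC = (1.0, lambda th: 1.0 / (1.0 - th))  # NB with q = 1, A = 1/(1-theta)


def theta_log_link(x, beta):
    """theta_i = exp(x_i' beta), for models with theta in (0, inf)."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, beta)))


def theta_logit_link(x, beta):
    """theta_i = exp(x_i' beta) / (1 + exp(x_i' beta)), for theta in (0, 1)."""
    eta = sum(xj * bj for xj, bj in zip(x, beta))
    return 1.0 / (1.0 + math.exp(-eta))


def cure_rate(theta, model):
    """q_0 = a_0 / A(theta)."""
    a0, A = model
    return a0 / A(theta)


def s_pop(s, theta, model):
    """Population survival A(theta * S) / A(theta), with S = S(t; lambda)."""
    _, A = model
    return A(theta * s) / A(theta)


def s_sus(s, theta, model):
    """Survival of susceptible individuals, (a_0 - A(theta*S)) / (a_0 - A(theta))."""
    a0, A = model
    return (a0 - A(theta * s)) / (a0 - A(theta))
```

The three quantities are tied together by the mixture identity $S_{pop} = q_0 + (1 - q_0) S_{sus}$, which is a convenient check when adding further members of the family.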
In the literature, many models have been considered for $S(\cdot;\lambda)$. As discussed in Remark 1, the Weibull, gamma and log-normal models are common alternatives, to name a few. However, we remark that, up to this moment, no distribution with heavy tails has been used in this context and, in particular, the SHN distribution has not been considered to model the survival function for the concurrent causes. Henceforth, we refer to this model as PS–SHN. It is interesting to explore the use of the SHN distribution in a cure rate model context because it is usual that most failures occur up to a certain time, after which mainly censoring times are observed; accordingly, for many diseases the patient is considered "cured" after a certain time. In practice, however, and especially in clinical trials with large sample sizes, failure times may still be observed at unusually late times, later than expected.

4. Estimation

In order to estimate the model parameters, we use the maximum likelihood (ML) method. Let us consider the situation when the time-to-event is not completely observed and is subject to right-censoring. Let $c_i$ denote the censoring time for the $i$th individual. We observe $y_i = \min(t_i, c_i)$ and $\delta_i = I(t_i \le c_i)$, where $\delta_i = 1$ if $t_i$ is a time-to-event and $\delta_i = 0$ if it is right-censored, $i = 1, \ldots, n$. From $n$ observed vectors $(y_1, \delta_1, x_1), \ldots, (y_n, \delta_n, x_n)$, the corresponding log-likelihood function, under noninformative censoring, can be expressed as
$$\ell(\psi) = \sum_{i=1}^{n} \left\{ \delta_i \log[f_{pop}(y_i;\theta_i,\lambda)] + (1-\delta_i) \log[S_{pop}(y_i;\theta_i,\lambda)] \right\} = \sum_{i=1}^{n} \Big\{ (1-\delta_i) \log[A(\theta_i S_{SHN}(y_i;\lambda))] - \log[A(\theta_i)] + \delta_i \big[ \log(\theta_i) + \log[f_{SHN}(y_i;\lambda)] + \log[A'(\theta_i S_{SHN}(y_i;\lambda))] \big] \Big\}, \tag{8}$$
where $\psi = (\beta, \lambda)$, $A'(\cdot)$ denotes the first derivative of $A(\cdot)$, and $f_{SHN}(\cdot;\cdot)$ and $S_{SHN}(\cdot;\cdot)$ are defined in (1) and (2). Details for the construction of Equation (8) can be seen in Cancho et al. [28] and Gallardo et al. [29].
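The structure of the log-likelihood can be sketched in a few lines. The function below is a hypothetical illustration, not the implementation used by the authors: it takes the baseline density and survival values $f(y_i;\lambda)$ and $S(y_i;\lambda)$ as precomputed lists, and the functions $\log A$ and $\log A'$ as arguments. The Poisson case, where $A(\theta) = A'(\theta) = e^{\theta}$, is used as an example with made-up input values.

```python
import math


def loglik(delta, theta, f, s, log_A, log_A_prime):
    """Right-censored log-likelihood of a power series cure rate model.

    delta: event indicators (1 = event, 0 = censored)
    theta: theta_i values
    f, s:  baseline density f(y_i; lambda) and survival S(y_i; lambda)
    """
    total = 0.0
    for d, th, fi, si in zip(delta, theta, f, s):
        total += (1 - d) * log_A(th * si) - log_A(th)
        if d == 1:
            total += math.log(th) + math.log(fi) + log_A_prime(th * si)
    return total


def poisson_log_A(u):
    """log A(u) for the Poisson case; log A'(u) is the same function."""
    return u


# Made-up inputs, for illustration only.
delta = [1, 0]
theta = [0.8, 1.2]
f_vals = [0.4, 0.1]
s_vals = [0.6, 0.3]
value = loglik(delta, theta, f_vals, s_vals, poisson_log_A, poisson_log_A)
```

In the Poisson case the sum reduces to $\sum_i \{-\theta_i [1 - S(y_i;\lambda)] + \delta_i \log[\theta_i f(y_i;\lambda)]\}$, which gives an independent way to check the generic function.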
The MLE of $\psi$, denoted by $\hat{\psi}$, can be obtained by maximizing Equation (8) with respect to $\psi$. However, this procedure can be computationally expensive and does not necessarily provide a global maximum, especially when the dimension of $\beta$ is high. To facilitate the computation of the MLE of $\psi$, we can use the EM algorithm developed in Gallardo et al. [29] for this class of models. The $k$th iteration of this procedure consists of the following steps:
  • E-step: For $i = 1, \ldots, n$, compute
    $$\tilde{M}_i^{(k)} = (1 - \delta_i)\, E\!\left(M_i;\, \theta_i^{(k-1)} S_i^{(k-1)}\right) + \delta_i\, \frac{E\!\left(M_i^{2};\, \theta_i^{(k-1)} S_i^{(k-1)}\right)}{E\!\left(M_i;\, \theta_i^{(k-1)} S_i^{(k-1)}\right)},$$
    where $E(M_i; \mu_i)$ and $E(M_i^{2}; \mu_i)$ denote the first two moments of the PS distribution with parameter $\mu_i$, given by
    $$E(M_i; \mu_i) = \mu_i \frac{\partial \log A(\mu_i)}{\partial \mu_i} \quad \text{and} \quad E(M_i^{2}; \mu_i) = \mu_i \frac{\partial E(M_i; \mu_i)}{\partial \mu_i} + E^{2}(M_i; \mu_i).$$
  • M1-step: Given $M^{(k)} = (\tilde{M}_1^{(k)}, \ldots, \tilde{M}_n^{(k)})$, find $\beta^{(k)}$ that maximizes $Q_1(\beta \mid \psi^{(k)})$ with respect to $\beta$, where
    $$Q_1(\beta \mid \psi^{(k)}) = \sum_{i=1}^{n} \left\{ \tilde{M}_i^{(k)} \log(\theta_i) - \log[A(\theta_i)] \right\}.$$
  • M2-step: Given $M^{(k)}$, find $\lambda^{(k)}$ that maximizes $Q_2(\lambda \mid \psi^{(k)})$ with respect to $\lambda$, where
    $$Q_2(\lambda \mid \psi^{(k)}) = \sum_{i=1}^{n} \Bigg\{ \left(\tilde{M}_i^{(k)} - \delta_i\right) \log\!\left[ 2\Phi\!\left(-\frac{y_i}{\sigma}\right) + \sqrt{\frac{2^{\alpha}}{\pi}} \left(\frac{\sigma}{y_i}\right)^{\alpha} \Gamma\!\left(\frac{\alpha+1}{2}\right) G\!\left(\frac{y_i^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right) \right] + \delta_i \left[ \log(\alpha) + \frac{\alpha}{2}\log(2) + \alpha \log(\sigma) - (\alpha+1)\log(y_i) + \log \gamma\!\left(\frac{y_i^{2}}{2\sigma^{2}}, \frac{\alpha+1}{2}\right) \right] \Bigg\}.$$
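The E-step above only needs the first two moments of the PS distribution. As a hedged illustration, the sketch below evaluates it for the Poisson and geometric cases, whose moments follow in closed form from the two moment formulas ($A(\mu) = e^{\mu}$ and $A(\mu) = (1-\mu)^{-1}$, respectively). The function names are hypothetical, and `surv` is assumed to hold the baseline survival values $S_i^{(k-1)}$ evaluated at the current estimate of $\lambda$.

```python
def poisson_moments(mu):
    """E(M; mu) and E(M^2; mu) when A(mu) = exp(mu)."""
    return mu, mu + mu ** 2


def geometric_moments(mu):
    """E(M; mu) and E(M^2; mu) when A(mu) = 1 / (1 - mu), 0 < mu < 1."""
    m1 = mu / (1.0 - mu)
    m2 = mu * (1.0 + mu) / (1.0 - mu) ** 2
    return m1, m2


def e_step(delta, theta, surv, moments):
    """Return the list of M-tilde values for one EM iteration.

    Assumes surv[i] > 0 whenever delta[i] == 1, so the ratio is defined.
    """
    out = []
    for d, th, s in zip(delta, theta, surv):
        m1, m2 = moments(th * s)
        out.append((1 - d) * m1 + d * m2 / m1)
    return out
```

For the Poisson model the E-step simplifies to $\tilde{M}_i = \theta_i S_i + \delta_i$, which makes the M1-step equivalent to maximizing a Poisson-regression-type objective in $\beta$.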
The E-, M1- and M2-steps are repeatedly alternated until a suitable convergence rule is satisfied, e.g., the difference in successive values of the estimates is less than a prespecified tolerance. Note that this procedure requires two independent maximization procedures with respect to $\beta$ and $\lambda$, of dimensions $r + 1$ and 2, respectively, instead of a single maximization procedure with respect to $\psi$, which has dimension $r + 3$. Standard errors for $\hat{\psi}$ can be computed based on the Hessian matrix, i.e., $H(\hat{\psi}) = -\partial^{2} \ell(\psi) / \partial \psi\, \partial \psi^{\top} \big|_{\psi = \hat{\psi}}$. This matrix can be estimated numerically using, for instance, the package pracma [30] of R [31]. As the estimators obtained via the EM algorithm are MLEs, they satisfy the asymptotic distribution
$$\sqrt{n}\left(\hat{\psi} - \psi\right) \xrightarrow{D} N\!\left(0, H(\hat{\psi})^{-1}\right), \quad \text{as } n \to +\infty,$$
where $\xrightarrow{D}$ denotes convergence in distribution. This result allows us to compute an approximate confidence interval for the parameters. Moreover, based on the delta method (see Sen et al. [32] for details), we can also build a confidence interval for any scalar function of $\psi$, say $g(\psi)$, using
$$\sqrt{n}\left(g(\hat{\psi}) - g(\psi)\right) \xrightarrow{D} N\!\left(0, \nabla g(\psi)^{\top} H(\hat{\psi})^{-1} \nabla g(\psi)\right), \quad \text{as } n \to +\infty, \tag{10}$$
where $\nabla g(\psi) = \partial g(\psi) / \partial \psi$ denotes the gradient of $g(\psi)$. In particular, a very useful case is $g(\psi) = a_0 / A(\theta_i)$, i.e., the cure rate of the corresponding model. The described estimation procedure was incorporated in the package PScr [33] of R [31].
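As a worked illustration of the delta method for the cure rate, consider the Poisson case, where $q_0 = \exp\{-\exp(x^{\top}\beta)\}$ and the gradient with respect to $\beta$ is $-\exp(x^{\top}\beta)\, q_0\, x$. The sketch below assumes that `cov` is an already estimated covariance matrix of $\hat{\beta}$ (for example, the relevant block of the inverse observed information). The function name and the numbers in the usage lines are made up for illustration and are not results from this paper.

```python
import math


def poisson_cure_rate_ci(x, beta, cov, z=1.96):
    """Delta-method interval for q0 = exp(-exp(x' beta)).

    cov: estimated covariance matrix of beta-hat, as a list of lists.
    Returns (estimate, standard error, (lower, upper)).
    """
    eta = sum(xj * bj for xj, bj in zip(x, beta))
    q0 = math.exp(-math.exp(eta))
    # Gradient of q0 with respect to beta: -exp(eta) * q0 * x
    grad = [-math.exp(eta) * q0 * xj for xj in x]
    p = len(x)
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(p) for j in range(p))
    se = math.sqrt(var)
    return q0, se, (q0 - z * se, q0 + z * se)


# Illustrative inputs only.
beta_hat = [math.log(-math.log(0.7)), 0.0]
cov_hat = [[0.04, 0.0], [0.0, 0.01]]
est, se, (lo, hi) = poisson_cure_rate_ci([1.0, 0.0], beta_hat, cov_hat)
```

Because the interval is symmetric on the probability scale, it can leave $(0, 1)$ when $q_0$ is close to the boundary; applying the delta method to a transformed cure rate is a common alternative.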

5. Simulation Studies

In this section, we present two simulation studies. The first one assesses the performance of the MLE using the EM algorithm discussed in Section 4. The second study is devoted to assessing the performance of the PS–SHN compared with the PS–Weibull model when data are drawn from the PS–Weibull model with a modified scheme. The software tool used for this study was R [31].

5.1. Assessing the Performance of MLE in a Finite Sample Size

In this subsection, we present a simulation study to evaluate the performance of the ML estimators of the PS–SHN model in finite samples. We considered three particular distributions belonging to the PS model for the number of concurrent causes: Poisson, logarithmic and geometric. We also considered three sample sizes: 100, 200 and 400. For each individual, we drew two covariates, say $x_1$ and $x_2$, where $x_1$ was drawn from the Bernoulli distribution with a success probability of 0.5 and $x_2$ from the standard normal model. For this reason, $\beta = (\beta_0, \beta_1, \beta_2)$ had dimension three. For each particular distribution for the concurrent causes, we considered two scenarios, named high and low cure rates. For the high cure rate scenario, with $\beta_2$ fixed at zero, the cure rates for the two levels of $x_1$ were set to 0.7 and 0.45. With such information, the values of $\beta_0$ and $\beta_1$ could be determined from the equations
$$\exp\{-\exp(\beta_0)\} = 0.7 \quad \text{and} \quad \exp\{-\exp(\beta_0 + \beta_1)\} = 0.45$$
for the Poisson case and
$$\frac{1}{A\!\left(\frac{\exp(\beta_0)}{1 + \exp(\beta_0)}\right)} = 0.7 \quad \text{and} \quad \frac{1}{A\!\left(\frac{\exp(\beta_0 + \beta_1)}{1 + \exp(\beta_0 + \beta_1)}\right)} = 0.45,$$
for the logarithmic and geometric cases. A similar procedure was applied for the low cure rate case, where, with $\beta_2$ fixed at zero, the cure rates for the two levels of $x_1$ were set to 0.5 and 0.25. Summarizing, for the Poisson model, we had $\beta_0 = -1.031$, $\beta_1 = 0.806$ and $\beta_2 = 0$ for the high cure rate case and $\beta_0 = -0.367$, $\beta_1 = 0.693$ and $\beta_2 = 0$ for the low cure rate case; for the logarithmic distribution, we had $\beta_0 = 0.132$, $\beta_1 = 1.588$ and $\beta_2 = 0$ for the high cure rate case and $\beta_0 = 1.366$, $\beta_1 = 2.534$ and $\beta_2 = 0$ for the low cure rate case; and for the geometric model, we had $\beta_0 = -0.847$, $\beta_1 = 1.048$ and $\beta_2 = 0$ for the high cure rate case and $\beta_0 = 0$, $\beta_1 = 1.099$ and $\beta_2 = 0$ for the low cure rate case. Finally, we also considered two values for $\alpha$, 0.75 and 2.5, and two values for $\sigma$, 1 and 3. The combinations of distributions for the concurrent causes, sample size, type of cure rate (high or low), $\alpha$ and $\sigma$ produced 72 different scenarios. However, we only present the cases for the Poisson and geometric models, although conclusions are similar for the logarithmic distribution. For each scenario, we considered 1000 replicates and computed the MLE based on the EM algorithm described in Section 4. The results are summarized in Table 1 and Table 2 through the average of the estimated bias (bias), the average of the estimated standard error (s.e.) based on the Hessian matrix numerically evaluated at $\psi = \hat{\psi}$, the root of the estimated mean squared error (RMSE) and the 95% coverage probability (CP). As expected, the bias of the estimators was reduced and the s.e. and RMSE terms became closer as the sample size increased, suggesting the consistency of the estimators. Additionally, the CP terms were closer to the nominal values used for their construction as $n$ increased.
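For the Poisson and geometric models, the equations that fix $\beta_0$ and $\beta_1$ from the two target cure rates have closed-form solutions, which the following sketch evaluates (function names are hypothetical). For the geometric model, $1/A(\theta) = 1 - \theta$, so the logit link gives $x^{\top}\beta = \log\{(1 - p)/p\}$ for a target cure rate $p$. The logarithmic case requires a numerical root finder and is not covered here.

```python
import math


def poisson_betas(p0, p1):
    """Solve exp(-exp(b0)) = p0 and exp(-exp(b0 + b1)) = p1."""
    b0 = math.log(-math.log(p0))
    b1 = math.log(-math.log(p1)) - b0
    return b0, b1


def geometric_betas(p0, p1):
    """Solve 1 - theta = p with theta = exp(eta) / (1 + exp(eta))."""
    b0 = math.log((1.0 - p0) / p0)
    b1 = math.log((1.0 - p1) / p1) - b0
    return b0, b1


high_poisson = poisson_betas(0.7, 0.45)
low_poisson = poisson_betas(0.5, 0.25)
high_geometric = geometric_betas(0.7, 0.45)
low_geometric = geometric_betas(0.5, 0.25)
```

Rounded to three decimals, these solutions agree with the Poisson and geometric coefficient values listed in the text.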

5.2. Misspecification for the Time-to-Event for the Concurrent Causes

In this subsection, we assessed the performance of the MLE for the Poisson–SHN and geometric–SHN models when data were drawn from an artificial model with heavy tails based on the Weibull distribution.
The generation of the data was very similar to that in the last subsection, considering the following changes:
  • The times related to the concurrent causes were drawn from the Weibull model with mean $E(W) = \mu$ and variance 1. We considered three values for $\mu$: 2.5, 5.0 and 7.5. For the parameterization considered in part (a) of Remark 1, this corresponded to $\alpha = 2.70$ and $\sigma = 7.90$, $\alpha = 5.80$ and $\sigma = 29.16$, and $\alpha = 8.97$ and $\sigma = 62.75$, respectively.
  • A fixed number of failure times (2% of the sample size, i.e., two observations for $n = 100$, four observations for $n = 200$ and eight observations for $n = 400$) were imputed as $\max(t_{fail}) + U \sigma_{t_{fail}}$, where $\max(t_{fail})$ and $\sigma_{t_{fail}}$ denote the maximum and the standard error of the failure times and $U \sim U(1, 2)$. This scheme artificially simulated a distribution for the time-to-event of the concurrent causes with a heavier tail than the Weibull distribution, but not corresponding to the SHN.
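The imputation scheme in the second item can be sketched as follows. Two details are assumptions of this sketch rather than statements from the text: the failure times to be replaced are chosen at random, and $\sigma_{t_{fail}}$ is computed as the sample standard deviation of the failure times. The function name `contaminate` is hypothetical.

```python
import random
import statistics


def contaminate(fail_times, n, rng, frac=0.02):
    """Replace round(frac * n) failure times by max(t_fail) + U * sd(t_fail).

    fail_times: observed failure times (at least round(frac * n) of them)
    n:          total sample size, which fixes how many times are replaced
    rng:        a random.Random instance; U is drawn from U(1, 2)
    """
    k = round(frac * n)
    t_max = max(fail_times)
    t_sd = statistics.stdev(fail_times)  # assumption: sample standard deviation
    out = list(fail_times)
    # Assumption: the replaced observations are picked uniformly at random.
    for idx in rng.sample(range(len(out)), k):
        out[idx] = t_max + rng.uniform(1.0, 2.0) * t_sd
    return out
```

Every imputed value lies between one and two standard deviations above the largest original failure time, so the contaminated sample has exactly `k` observations beyond the original maximum.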
The main goal of this study was to compare the coefficients related to the cure (i.e., the components of the vector $\beta$) obtained with the Poisson–SHN and Poisson–Weibull models (when the true model for $M$ is the Poisson distribution) and with the geometric–SHN and geometric–Weibull models (when the true model is the geometric distribution). We considered all combinations of $E(W)$, sample size, type of cure rate (high/low) and model for $M$ (Poisson/geometric), totaling 36 cases. Each case was replicated 1000 times. Table 3 summarizes the bias and RMSE for each combination. Note that, in all the considered cases, the bias and RMSE were lower (or at least the same) for the models based on the SHN than for those based on the Weibull model. Such differences in favor of the SHN decreased as the sample size increased.

6. Application to Bone Marrow Transplantation

In this section, we present a real data set application from the EBMT presented in the introduction section. Additionally, the prophylaxis covariate (dichotomous, 1730 patients with “no” and 549 patients with “yes”) was available.
For the analysis, we considered some existing models in the literature, all of them belonging to the power series cure rate model, such as the Poisson, logarithmic, negative binomial, Bernoulli, COM-Poisson and polylogarithm models. Additionally, we considered three common models used for the concurrent causes, namely the Weibull, gamma and BS models. Table 4 shows the Akaike information criterion (AIC) [34] and the Bayesian information criterion (BIC) [35] for those models in the ebmt4 data. Note that both criteria favored models where the SHN was used for the concurrent causes.
Table 5 shows the estimated parameters for the Bern/SHN cure rate model and their standard errors. Note that both parameters related to the cure were significant at the 5% significance level. We also performed a residual analysis for the selected model based on the Cox–Snell residuals [36,37] and the randomized quantile residuals [38]. The first residual was compared with its Nelson–Aalen estimator and the second with the quantiles of the standard normal distribution. In both cases, if the model was correct, we expected to observe the identity function in the respective graphs. According to Figure 2, such residuals suggested that the chosen model was correct. Finally, we also computed the Kaplan–Meier (K–M) estimator for patients with and without prophylaxis and the respective point estimate and 95% confidence interval for the survival function at each time. This confidence interval was based on the delta method discussed in Equation (10). Figure 3 shows those results. Note the closeness between the estimated survival function and that provided by the K–M estimator. As expected, patients without prophylaxis had a lower survival function than patients with prophylaxis.

7. Conclusions

In this work, we introduced the PS–SHN cure rate model. We provided a mathematical treatment of the new distribution, including an alternative expression for the cumulative distribution function. An EM algorithm to compute the maximum likelihood estimates was discussed. Two simulation studies were performed to assess some properties of the estimators, showing the good performance of the proposed estimation procedure in finite samples. In the analysis of the bone marrow transplant data, the SHN cure rate model yielded a better fit when compared to other well-known models in the literature, according to the AIC and BIC criteria. We developed an R package to perform inference in the power series cure rate models. As part of future research, we plan to explore other estimation methods for the proposed cure rate model, for instance, a Bayesian approach. A second line of future research will be a bias-reduction methodology to estimate the parameters of the proposed regression model.

Author Contributions

Conceptualization, D.I.G. and Y.M.G.; formal analysis, Y.M.G. and H.J.G.; investigation, Y.M.G. and M.J.G.-N.; software, D.I.G. and M.B.; validation H.J.G., M.J.G.-N. and M.B.; writing—original draft preparation H.J.G.; writing—review and editing D.I.G., Y.M.G., M.J.G.-N. and M.B.; Supervision, D.I.G. and Y.M.G. All of the authors contributed significantly to this research article. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data set used in Section 6 was duly referenced.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Parametrization Used for Some Discrete Models

The power series distribution has a probability mass function (p.m.f.) given by
$$P(M = m; \theta) = \frac{a_m \theta^m}{A(\theta)}, \quad m = 0, 1, 2, \ldots,$$
where $A(\theta) = \sum_{m=0}^{\infty} a_m \theta^m$ and $\theta \in \Theta$, with $\Theta$ the parameter space for $\theta$.
Table A1. Some particular cases of the power series distribution.

| | Poisson | Negative Binomial | Binomial | Logarithmic | Polylogarithm | COM-Poisson |
|---|---|---|---|---|---|---|
| $a_m$ | $(m!)^{-1}$ | $\binom{m+q-1}{m}$ | $\binom{q}{m}$ | $(m+1)^{-1}$ | $(m+1)^{-q}$ | $(m!)^{-q}$ |
| $A(\theta)$ | $e^{\theta}$ | $(1-\theta)^{-q}$ | $(1+\theta)^{q}$ | $-\log(1-\theta)/\theta$ | $Li_q(\theta)\,\theta^{-1}$ | $Z(\theta, q)$ |
| $\Theta$ | $(0, \infty)$ | $(0, 1)$ | $(0, \infty)$ | $(0, 1)$ | $(0, 1)$ | $(0, \infty)$ |
| Notation | Po($\theta$) | NB($q, \theta$) | Bin($q, \theta$) | Lo($\theta$) | PL($q, \theta$) | COM-Po($q, \theta$) |

NOTE: $Li$ denotes the polylogarithm, defined as $Li_q(\theta) = \sum_{m=1}^{\infty} m^{-q} \theta^m$, and $Z(\theta, q) = \sum_{m=0}^{\infty} (m!)^{-q} \theta^m$. In the NB, Bin, PL and COM-Po distributions, $q$ is considered known.
Observations:
  • NB($1, \theta$) = Geo($\theta$), i.e., the geometric distribution.
  • Bin($1, \theta$) = Bern($\theta$), i.e., the Bernoulli distribution.
  • NB ( q , θ ) = Bin ( q , θ ) , if q 1 N and 0 < θ / q < 1 .
  • $\lim_{q \to \infty}$ COM-Po($q, \theta$) = Bern($(1+\theta)^{-1}$).
  • $\lim_{q \to 0}$ COM-Po($q, \theta$) = Geo($1-\theta$), if $\theta < 1$.
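
The p.m.f. above and some of the special cases in Table A1 can be checked numerically. The following Python sketch is illustrative only: the function names are ours, $A(\theta)$ is approximated by a truncated sum (200 terms, an arbitrary choice that is adequate for the values of $\theta$ used here), and the coefficients are handled on the log scale to avoid overflow in the factorials.

```python
import math

def ps_pmf(m, theta, log_a, terms=200):
    """P(M = m; theta) = a_m * theta**m / A(theta), with A(theta) as a truncated sum."""
    log_theta = math.log(theta)
    A = sum(math.exp(log_a(k) + k * log_theta) for k in range(terms))
    return math.exp(log_a(m) + m * log_theta) / A

# log(a_m) for three columns of Table A1 (q is treated as known).
def log_a_poisson(m):
    return -math.lgamma(m + 1)                 # a_m = (m!)^{-1}

def log_a_logarithmic(m):
    return -math.log(m + 1)                    # a_m = (m + 1)^{-1}

def log_a_com_poisson(q):
    return lambda m: -q * math.lgamma(m + 1)   # a_m = (m!)^{-q}

theta = 0.4
# Poisson: A(theta) = exp(theta), so P(M = 0) = exp(-theta).
print(ps_pmf(0, theta, log_a_poisson), math.exp(-theta))
# Logarithmic: A(theta) = -log(1 - theta) / theta.
print(ps_pmf(0, theta, log_a_logarithmic), -theta / math.log(1 - theta))
# COM-Poisson with q = 1 coincides with the Poisson case.
print(ps_pmf(2, theta, log_a_com_poisson(1.0)), ps_pmf(2, theta, log_a_poisson))
# q = 0: a_m = 1, so P(M = m) = (1 - theta) * theta**m for theta < 1.
print(ps_pmf(2, theta, log_a_com_poisson(0.0)), (1 - theta) * theta**2)
# Large q: the mass concentrates on {0, 1}, with P(M = 0) close to 1 / (1 + theta).
print(ps_pmf(0, theta, log_a_com_poisson(60.0)), 1 / (1 + theta))
```

Each `print` shows the truncated-sum value next to the closed form it should match. The last two lines illustrate the limiting behaviour of the COM-Poisson case as $q \to 0$ and as $q$ grows large.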

References

  1. Olmos, N.M.; Varela, H.; Gómez, H.W.; Bolfarine, H. An extension of the half-normal distribution. Stat. Pap. 2012, 53, 875–886.
  2. Gómez, Y.M.; Gallardo, D.I.; De Castro, M. A regression model for positive data based on the slash half-normal distribution. Revstat 2021, 19, 553–573.
  3. Hardy, R.E.; Ikpeazu, E.V. Bone marrow transplantation: A review. J. Natl. Med. Assoc. 1989, 81, 518–523.
  4. Chinen, J.; Buckley, R.H. Transplantation immunology: Solid organ and bone marrow. J. Allergy Clin. Immunol. 2010, 125, S324–S335.
  5. Simpson, E.; Dazzi, F. Bone Marrow Transplantation 1957–2019. Front. Immunol. 2019, 10, 1246.
  6. The European Society for Blood and Marrow Transplantation (s.f.). The EBMT. Available online: https://www.ebmt.org (accessed on 10 December 2022).
  7. de Rooij, J.D.; Zwaan, C.M.; van den Heuvel-Eibrink, M. Pediatric AML: From Biology to Clinical Management. J. Clin. Med. 2015, 4, 127–149.
  8. Sander, A.; Zimmermann, M.; Dworzak, M.; Fleischhack, G.; von Neuhoff, C.; Reinhardt, D.; Kaspers, G.J.; Creutzig, U. Consequent and Intensified Relapse Therapy Improved Survival in Pediatric AML: Results of Relapse Treatment In 379 Patients of Three Consecutive Aml-Bfm Trials. Leukemia 2010, 24, 1422–1428.
  9. Fiocco, M.; Putter, H.; van Houwelingen, H.C. Reduced-rank proportional hazards regression and simulation-based prediction for multi-state models. Stat. Med. 2008, 27, 4340–4358.
  10. Putter, H.; de Wreede, L.; Fiocco, M. Mstate: Data Preparation, Estimation and Prediction in Multi-State Models. R package version 2016.0-2. 2016. Available online: https://CRAN.R-project.org/package=mstate (accessed on 10 December 2022).
  11. Horio, T.; Morishita, E.; Mizuno, S.; Uchino, K.; Hanamura, I.; Espinoza, J.L.; Morishima, Y.; Kodera, Y.; Onizuka, M.; Kashiwase, K.; et al. Donor Heme Oxygenase-1 Promoter Gene Polymorphism Predicts Survival after Unrelated Bone Marrow Transplantation for High-Risk Patients. Cancers 2020, 12, 424.
  12. Radujkovic, A.; Kordelas, L.; Bogdanov, R.; Müller-Tidow, C.; Beelen, D.W.; Dreger, P.; Luft, T. Interleukin-18 and Hematopoietic Recovery after Allogeneic Stem Cell Transplantation. Cancers 2020, 12, 2789.
  13. Tsai, X.C.; Chen, T.T.; Gau, J.P.; Wang, P.N.; Liu, Y.C.; Lien, M.Y.; Li, C.C.; Yao, M.; Ko, B.S. Outcomes of Different Haploidentical Transplantation Strategies from the Taiwan Blood and Marrow Transplantation Registry. Cancers 2022, 14, 1097.
  14. Sabino Pinho de Oliveira, B.; Putti, S.; Naro, F.; Pellegrini, M. Bone Marrow Transplantation as Therapy for Ataxia-Telangiectasia: A Systematic Review. Cancers 2020, 12, 3207.
  15. Chen, M.H.; Ibrahim, J.G.; Sinha, D. A new Bayesian model for survival data with a surviving fraction. J. Am. Stat. Assoc. 1999, 94, 909–919.
  16. Rodrigues, J.; Cancho, V.G.; de Castro, M.A.; Louzada-Neto, F. On the unification of the long-term survival models. Stat. Probab. Lett. 2009, 79, 753–759.
  17. Pal, S.; Balakrishnan, N. Likelihood inference for COM-Poisson cure rate model with interval-censored data and Weibull lifetimes. Stat. Methods Med. Res. 2017, 26, 2093–2113.
  18. Balakrishnan, N.; Pal, S. Likelihood Inference for Flexible Cure Rate Models with Gamma Lifetimes. Commun. Stat. Theory Methods 2015, 19, 4007–4048.
  19. Wiangnak, P.; Pal, S. Gamma Lifetimes and Associated Inference for Interval Censored Cure Rate Model with COM-Poisson Competing Cause. Commun. Stat. Theory Methods 2018, 47, 1491–1509.
  20. Ortega, E.M.M.; Cordeiro, G.M.; Hashimoto, E.M.; Suzuki, A.K. Regression models generated by gamma random variables with long-term survivors. Commun. Stat. Appl. Methods 2017, 24, 43–65.
  21. Balakrishnan, N.; Pal, S. Lognormal lifetimes and likelihood-based inference for flexible cure rate models based on COM-Poisson family. Comput. Stat. Data Anal. 2013, 67, 41–67.
  22. Gallardo, D.I.; Bolfarine, H.; Pedroso-de-Lima, A.C. An EM algorithm for estimating the destructive weighted Poisson cure rate model. J. Stat. Comput. Simul. 2016, 86, 1497–1515.
  23. Leao, J.; Bourguignon, M.; Saulo, H.; Santos-Neto, M.; Calsavara, V. The Negative Binomial Beta Prime Regression Model with Cure Rate: Application with a Melanoma Dataset. J. Stat. Theory Pract. 2021, 15, 63.
  24. Birnbaum, Z.W.; Saunders, S.C. A new family of life distributions. J. Appl. Probab. 1969, 6, 319–327.
  25. Birnbaum, Z.W.; Saunders, S.C. Estimation for a family of life distributions with applications to fatigue. J. Appl. Probab. 1969, 6, 328–347.
  26. Desmond, A.F. On the relationship between two fatigue models. IEEE Trans. Reliab. 1986, 35, 167–169.
  27. Berkson, J.; Gage, R. Survival curve for cancer patients following treatment. J. Am. Stat. Assoc. 1952, 47, 501–515.
  28. Cancho, V.G.; Louzada, F.; Ortega, E.M. The power series cure rate model: An application to a cutaneous melanoma data. Commun. Stat. Simul. Comput. 2013, 42, 586–602.
  29. Gallardo, D.I.; Romeo, J.S.; Meyer, R. A simplified estimation procedure based on the EM algorithm for the power series cure rate model. Commun. Stat. Simul. Comput. 2017, 46, 6342–6359.
  30. Borchers, H.W. Pracma: Practical Numerical Math Functions. R Package Version 2.4.2. 2022. Available online: https://CRAN.R-project.org/package=pracma (accessed on 10 December 2022).
  31. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2022; Available online: https://www.R-project.org/ (accessed on 10 December 2022).
  32. Sen, P.K.; Singer, J.M.; Pedroso-de-Lima, A.C. From Finite Sample to Asymptotic Methods in Statistics; Cambridge University Press: New York, NY, USA, 2010.
  33. Gallardo, D. PScr: Estimation for the Power Series Cure Rate Model. R Package Version 1.0. 2022. Available online: https://CRAN.R-project.org/package=PScr (accessed on 10 December 2022).
  34. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723.
  35. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464.
  36. Conlon, A.S.; Taylor, J.M.; Sargent, D.J. Multi-state models for colon cancer recurrence and death with a cured fraction. Stat. Med. 2014, 33, 1750–1766.
  37. Cox, D.R. A general definition of residuals. J. R. Stat. Soc. Ser. B 1968, 30, 248–275.
  38. Dunn, P.K.; Smyth, G.K. Randomized quantile residuals. J. Comput. Graph. Stat. 1996, 5, 236–244.
Figure 1. Kaplan–Meier estimator for the ebmt4 data set (left panel) and histogram for the failure times (right panel).
Figure 2. Cox–Snell residuals versus their Nelson–Aalen estimator (left panel) and the randomized quantile residuals versus the N(0,1) quantiles and its p-values for the Kolmogorov–Smirnov test (right panel).
Figure 3. Kaplan–Meier (K–M) estimator for patients with and without prophylaxis and the respective point estimation and 95% confidence interval for the survival function for each time.
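Table 1. Simulation study to assess the performance of MLE in the Poisson–SHN model.

| Cure rate | $\sigma$ | $\alpha$ | Parameter | bias (n = 100) | s.e. (n = 100) | RMSE (n = 100) | CP (n = 100) | bias (n = 200) | s.e. (n = 200) | RMSE (n = 200) | CP (n = 200) | bias (n = 400) | s.e. (n = 400) | RMSE (n = 400) | CP (n = 400) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| High | 1 | 0.75 | $\hat{\beta}_0$ | 0.023 | 0.370 | 0.255 | 0.978 | 0.012 | 0.248 | 0.200 | 0.974 | 0.010 | 0.177 | 0.147 | 0.965 |
| | | | $\hat{\beta}_1$ | −0.019 | 0.335 | 0.302 | 0.961 | −0.006 | 0.234 | 0.210 | 0.953 | −0.001 | 0.164 | 0.145 | 0.952 |
| | | | $\hat{\beta}_2$ | 0.005 | 0.165 | 0.144 | 0.963 | 0.003 | 0.116 | 0.104 | 0.958 | −0.002 | 0.081 | 0.074 | 0.949 |
| | | | $\hat{\alpha}$ | 0.294 | 0.979 | 0.751 | 0.975 | 0.172 | 0.600 | 0.505 | 0.960 | 0.055 | 0.376 | 0.335 | 0.951 |
| | | | $\hat{\sigma}$ | 0.082 | 0.452 | 0.372 | 0.965 | 0.052 | 0.321 | 0.276 | 0.955 | 0.015 | 0.227 | 0.206 | 0.952 |
| | | 2.5 | $\hat{\beta}_0$ | 0.002 | 0.208 | 0.180 | 0.962 | 0.002 | 0.145 | 0.124 | 0.959 | 0.001 | 0.102 | 0.089 | 0.956 |
| | | | $\hat{\beta}_1$ | −0.014 | 0.316 | 0.281 | 0.959 | −0.013 | 0.221 | 0.192 | 0.958 | −0.008 | 0.155 | 0.140 | 0.953 |
| | | | $\hat{\beta}_2$ | −0.011 | 0.155 | 0.140 | 0.942 | −0.005 | 0.109 | 0.103 | 0.943 | −0.003 | 0.077 | 0.071 | 0.951 |
| | | | $\hat{\alpha}$ | 0.287 | 1.637 | 1.416 | 0.929 | 0.140 | 1.342 | 1.091 | 0.937 | 0.127 | 0.995 | 0.854 | 0.946 |
| | | | $\hat{\sigma}$ | −0.039 | 0.302 | 0.229 | 0.933 | −0.011 | 0.211 | 0.171 | 0.941 | −0.010 | 0.146 | 0.131 | 0.943 |
| | 3 | 0.75 | $\hat{\beta}_0$ | −0.140 | 0.922 | 0.307 | 0.962 | −0.125 | 0.738 | 0.250 | 0.952 | −0.101 | 0.614 | 0.212 | 0.950 |
| | | | $\hat{\beta}_1$ | 0.013 | 0.371 | 0.327 | 0.960 | 0.009 | 0.258 | 0.226 | 0.954 | 0.003 | 0.181 | 0.170 | 0.953 |
| | | | $\hat{\beta}_2$ | 0.006 | 0.183 | 0.164 | 0.948 | 0.002 | 0.127 | 0.111 | 0.948 | 0.000 | 0.089 | 0.080 | 0.950 |
| | | | $\hat{\alpha}$ | 0.687 | 1.658 | 1.304 | 0.972 | 0.657 | 1.389 | 1.269 | 0.961 | 0.584 | 1.193 | 1.169 | 0.955 |
| | | | $\hat{\sigma}$ | −0.269 | 1.073 | 0.846 | 0.957 | −0.088 | 0.892 | 0.719 | 0.956 | −0.040 | 0.652 | 0.602 | 0.953 |
| | | 2.5 | $\hat{\beta}_0$ | 0.100 | 0.505 | 0.263 | 0.969 | 0.085 | 0.339 | 0.201 | 0.963 | 0.078 | 0.223 | 0.160 | 0.958 |
| | | | $\hat{\beta}_1$ | −0.011 | 0.325 | 0.295 | 0.958 | −0.009 | 0.226 | 0.201 | 0.952 | −0.008 | 0.159 | 0.144 | 0.951 |
| | | | $\hat{\beta}_2$ | 0.005 | 0.161 | 0.151 | 0.939 | 0.005 | 0.112 | 0.097 | 0.942 | 0.003 | 0.078 | 0.068 | 0.949 |
| | | | $\hat{\alpha}$ | −0.791 | 1.976 | 1.583 | 0.938 | −0.549 | 1.797 | 1.496 | 0.941 | −0.409 | 1.490 | 1.442 | 0.949 |
| | | | $\hat{\sigma}$ | −0.641 | 1.428 | 0.893 | 0.919 | −0.409 | 1.237 | 0.691 | 0.928 | −0.319 | 0.939 | 0.592 | 0.930 |
| Low | 1 | 0.75 | $\hat{\beta}_0$ | 0.013 | 0.441 | 0.296 | 0.977 | 0.009 | 0.285 | 0.226 | 0.960 | 0.005 | 0.206 | 0.172 | 0.954 |
| | | | $\hat{\beta}_1$ | −0.015 | 0.413 | 0.369 | 0.960 | −0.013 | 0.289 | 0.270 | 0.952 | −0.001 | 0.202 | 0.181 | 0.951 |
| | | | $\hat{\beta}_2$ | 0.013 | 0.198 | 0.180 | 0.951 | 0.008 | 0.138 | 0.125 | 0.950 | 0.003 | 0.096 | 0.088 | 0.950 |
| | | | $\hat{\alpha}$ | 0.293 | 1.127 | 0.747 | 0.972 | 0.262 | 0.778 | 0.632 | 0.965 | 0.094 | 0.459 | 0.391 | 0.959 |
| | | | $\hat{\sigma}$ | 0.084 | 0.531 | 0.385 | 0.944 | 0.063 | 0.392 | 0.324 | 0.947 | 0.037 | 0.272 | 0.243 | 0.949 |
| | | 2.5 | $\hat{\beta}_0$ | 0.004 | 0.238 | 0.210 | 0.961 | 0.002 | 0.163 | 0.144 | 0.953 | 0.001 | 0.114 | 0.104 | 0.951 |
| | | | $\hat{\beta}_1$ | −0.010 | 0.384 | 0.329 | 0.965 | −0.009 | 0.270 | 0.239 | 0.957 | −0.004 | 0.189 | 0.178 | 0.953 |
| | | | $\hat{\beta}_2$ | 0.012 | 0.185 | 0.166 | 0.960 | 0.008 | 0.129 | 0.118 | 0.959 | 0.004 | 0.091 | 0.081 | 0.951 |
| | | | $\hat{\alpha}$ | 0.175 | 1.975 | 1.271 | 0.924 | 0.134 | 1.722 | 1.144 | 0.938 | 0.087 | 1.198 | 0.968 | 0.944 |
| | | | $\hat{\sigma}$ | −0.057 | 0.353 | 0.258 | 0.912 | −0.016 | 0.244 | 0.190 | 0.940 | −0.002 | 0.170 | 0.146 | 0.943 |
| | 3 | 0.75 | $\hat{\beta}_0$ | −0.151 | 0.999 | 0.346 | 0.941 | −0.139 | 0.805 | 0.282 | 0.948 | −0.105 | 0.679 | 0.227 | 0.950 |
| | | | $\hat{\beta}_1$ | −0.012 | 0.464 | 0.402 | 0.969 | −0.010 | 0.323 | 0.279 | 0.962 | −0.007 | 0.225 | 0.202 | 0.958 |
| | | | $\hat{\beta}_2$ | 0.012 | 0.221 | 0.206 | 0.943 | 0.010 | 0.153 | 0.140 | 0.946 | 0.004 | 0.107 | 0.098 | 0.950 |
| | | | $\hat{\alpha}$ | 0.526 | 1.660 | 1.336 | 0.971 | 0.371 | 1.378 | 1.251 | 0.965 | 0.294 | 1.276 | 1.154 | 0.959 |
| | | | $\hat{\sigma}$ | −0.467 | 1.285 | 0.938 | 0.929 | −0.176 | 2.077 | 0.719 | 0.936 | −0.025 | 1.785 | 0.619 | 0.940 |
| | | 2.5 | $\hat{\beta}_0$ | 0.107 | 0.586 | 0.295 | 0.973 | 0.093 | 0.405 | 0.225 | 0.962 | 0.086 | 0.261 | 0.177 | 0.959 |
| | | | $\hat{\beta}_1$ | 0.013 | 0.398 | 0.365 | 0.955 | 0.008 | 0.276 | 0.239 | 0.954 | 0.000 | 0.194 | 0.179 | 0.951 |
| | | | $\hat{\beta}_2$ | −0.017 | 0.192 | 0.173 | 0.965 | −0.002 | 0.133 | 0.119 | 0.961 | 0.000 | 0.093 | 0.081 | 0.959 |
| | | | $\hat{\alpha}$ | −0.912 | 4.181 | 1.545 | 0.934 | −0.682 | 3.910 | 1.489 | 0.936 | −0.493 | 3.601 | 1.462 | 0.942 |
| | | | $\hat{\sigma}$ | −0.743 | 1.490 | 0.969 | 0.961 | −0.526 | 1.287 | 0.764 | 0.960 | −0.379 | 1.038 | 0.647 | 0.958 |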
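
The columns of Table 1 summarize Monte Carlo replicates. As a rough sketch of how such summaries are usually obtained, the Python function below computes them from a list of estimates and their estimated standard errors. The definitions used here are assumptions on our part, since this section does not spell them out: "s.e." is taken as the mean of the estimated standard errors, RMSE as the root of the mean squared error around the true value, and CP as the empirical coverage of nominal 95% Wald intervals. The function name `summarize` is hypothetical.

```python
import math

def summarize(estimates, std_errors, truth, z=1.96):
    """Monte Carlo summaries for one parameter (definitions assumed, see lead-in)."""
    n = len(estimates)
    bias = sum(estimates) / n - truth
    mean_se = sum(std_errors) / n
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / n)
    # Share of replicates whose Wald interval e +/- z * s contains the true value.
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z * s <= truth <= e + z * s)
    return {"bias": bias, "s.e.": mean_se, "RMSE": rmse, "CP": covered / n}

# Tiny made-up example: three replicates of an estimator whose true value is 2.0.
print(summarize([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], truth=2.0))
```

In a full study, `estimates` and `std_errors` would hold one entry per simulated data set for each parameter and each scenario (cure rate, $\sigma$, $\alpha$, n).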
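Table 2. Simulation study to assess the performance of MLE in the geometric–SHN model.

| Cure rate | $\sigma$ | $\alpha$ | Estimator | bias (n = 100) | s.e. (n = 100) | MSE (n = 100) | CP (n = 100) | bias (n = 200) | s.e. (n = 200) | MSE (n = 200) | CP (n = 200) | bias (n = 400) | s.e. (n = 400) | MSE (n = 400) | CP (n = 400) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| High | 1 | 0.75 | $\hat{\beta}_0$ | 0.025 | 0.556 | 0.391 | 0.970 | 0.008 | 0.374 | 0.278 | 0.983 | 0.006 | 0.259 | 0.219 | 0.966 |
| | | | $\hat{\beta}_1$ | 0.023 | 0.469 | 0.424 | 0.955 | 0.021 | 0.328 | 0.289 | 0.956 | 0.001 | 0.230 | 0.204 | 0.955 |
| | | | $\hat{\beta}_2$ | 0.004 | 0.234 | 0.213 | 0.954 | 0.003 | 0.162 | 0.150 | 0.942 | 0.001 | 0.114 | 0.106 | 0.936 |
| | | | $\hat{\alpha}$ | 0.284 | 1.203 | 0.741 | 0.996 | 0.218 | 0.796 | 0.627 | 0.982 | 0.109 | 0.495 | 0.431 | 0.961 |
| | | | $\hat{\sigma}$ | 0.078 | 0.546 | 0.379 | 0.949 | 0.070 | 0.394 | 0.342 | 0.950 | 0.033 | 0.280 | 0.251 | 0.948 |
| | | 2.5 | $\hat{\beta}_0$ | −0.014 | 0.353 | 0.302 | 0.959 | −0.003 | 0.241 | 0.218 | 0.947 | 0.003 | 0.168 | 0.152 | 0.949 |
| | | | $\hat{\beta}_1$ | 0.016 | 0.453 | 0.403 | 0.959 | 0.012 | 0.315 | 0.294 | 0.956 | 0.008 | 0.221 | 0.199 | 0.941 |
| | | | $\hat{\beta}_2$ | 0.006 | 0.225 | 0.200 | 0.959 | 0.004 | 0.156 | 0.143 | 0.944 | 0.002 | 0.109 | 0.100 | 0.943 |
| | | | $\hat{\alpha}$ | 0.242 | 1.871 | 1.434 | 0.854 | 0.163 | 1.752 | 1.270 | 0.910 | 0.081 | 1.332 | 1.018 | 0.946 |
| | | | $\hat{\sigma}$ | −0.078 | 0.375 | 0.272 | 0.911 | −0.017 | 0.266 | 0.206 | 0.944 | −0.014 | 0.184 | 0.153 | 0.964 |
| | 3 | 0.75 | $\hat{\beta}_0$ | −0.153 | 0.609 | 0.417 | 0.989 | −0.135 | 0.488 | 0.329 | 0.960 | −0.111 | 0.323 | 0.262 | 0.964 |
| | | | $\hat{\beta}_1$ | −0.012 | 0.510 | 0.448 | 0.959 | −0.003 | 0.353 | 0.310 | 0.961 | −0.001 | 0.247 | 0.225 | 0.946 |
| | | | $\hat{\beta}_2$ | −0.006 | 0.253 | 0.226 | 0.948 | −0.004 | 0.174 | 0.155 | 0.959 | −0.003 | 0.121 | 0.105 | 0.961 |
| | | | $\hat{\alpha}$ | 0.534 | 1.577 | 1.389 | 1.000 | 0.381 | 1.402 | 1.313 | 1.000 | 0.194 | 1.356 | 1.309 | 1.000 |
| | | | $\hat{\sigma}$ | −0.490 | 1.302 | 0.974 | 0.910 | −0.197 | 1.153 | 0.747 | 0.972 | −0.029 | 0.825 | 0.638 | 0.987 |
| | | 2.5 | $\hat{\beta}_0$ | 0.121 | 0.505 | 0.385 | 0.989 | 0.095 | 0.380 | 0.289 | 0.990 | 0.083 | 0.232 | 0.215 | 0.985 |
| | | | $\hat{\beta}_1$ | −0.005 | 0.460 | 0.427 | 0.961 | −0.003 | 0.321 | 0.279 | 0.961 | −0.002 | 0.225 | 0.199 | 0.957 |
| | | | $\hat{\beta}_2$ | 0.012 | 0.228 | 0.196 | 0.960 | 0.002 | 0.159 | 0.137 | 0.955 | −0.001 | 0.111 | 0.099 | 0.949 |
| | | | $\hat{\alpha}$ | −0.921 | 2.029 | 1.755 | 0.841 | −0.709 | 1.903 | 1.570 | 0.835 | −0.569 | 1.661 | 1.535 | 0.819 |
| | | | $\hat{\sigma}$ | −0.788 | 1.333 | 1.004 | 0.851 | −0.555 | 1.052 | 0.825 | 0.886 | −0.407 | 0.899 | 0.672 | 0.914 |
| Low | 1 | 0.75 | $\hat{\beta}_0$ | 0.008 | 0.477 | 0.344 | 0.981 | 0.003 | 0.338 | 0.254 | 0.969 | 0.001 | 0.226 | 0.194 | 0.971 |
| | | | $\hat{\beta}_1$ | 0.004 | 0.422 | 0.384 | 0.954 | 0.003 | 0.294 | 0.256 | 0.957 | 0.003 | 0.207 | 0.189 | 0.951 |
| | | | $\hat{\beta}_2$ | −0.011 | 0.212 | 0.197 | 0.940 | −0.006 | 0.147 | 0.133 | 0.953 | −0.003 | 0.103 | 0.095 | 0.948 |
| | | | $\hat{\alpha}$ | 0.280 | 1.043 | 0.702 | 0.998 | 0.190 | 0.693 | 0.583 | 0.981 | 0.097 | 0.441 | 0.387 | 0.957 |
| | | | $\hat{\sigma}$ | 0.071 | 0.480 | 0.349 | 0.960 | 0.059 | 0.352 | 0.300 | 0.960 | 0.037 | 0.251 | 0.217 | 0.952 |
| | | 2.5 | $\hat{\beta}_0$ | 0.012 | 0.309 | 0.274 | 0.954 | 0.010 | 0.213 | 0.186 | 0.956 | 0.004 | 0.149 | 0.136 | 0.953 |
| | | | $\hat{\beta}_1$ | 0.010 | 0.409 | 0.360 | 0.957 | 0.007 | 0.286 | 0.247 | 0.964 | 0.003 | 0.201 | 0.180 | 0.942 |
| | | | $\hat{\beta}_2$ | −0.007 | 0.206 | 0.190 | 0.961 | −0.005 | 0.143 | 0.125 | 0.957 | −0.002 | 0.100 | 0.084 | 0.962 |
| | | | $\hat{\alpha}$ | 0.353 | 1.622 | 1.301 | 0.868 | 0.274 | 1.438 | 1.258 | 0.932 | 0.269 | 1.025 | 0.968 | 0.939 |
| | | | $\hat{\sigma}$ | −0.078 | 0.340 | 0.260 | 0.900 | −0.040 | 0.244 | 0.184 | 0.956 | −0.015 | 0.167 | 0.144 | 0.952 |
| | 3 | 0.75 | $\hat{\beta}_0$ | −0.155 | 0.524 | 0.379 | 0.970 | −0.128 | 0.415 | 0.284 | 0.973 | −0.109 | 0.266 | 0.240 | 0.961 |
| | | | $\hat{\beta}_1$ | 0.009 | 0.448 | 0.399 | 0.963 | 0.005 | 0.311 | 0.271 | 0.962 | 0.003 | 0.219 | 0.187 | 0.956 |
| | | | $\hat{\beta}_2$ | −0.016 | 0.223 | 0.196 | 0.965 | −0.006 | 0.155 | 0.136 | 0.959 | −0.003 | 0.109 | 0.099 | 0.950 |
| | | | $\hat{\alpha}$ | 0.605 | 1.893 | 1.506 | 1.000 | 0.481 | 1.501 | 1.388 | 1.000 | 0.276 | 1.372 | 1.292 | 1.000 |
| | | | $\hat{\sigma}$ | −0.340 | 1.372 | 0.894 | 0.929 | −0.089 | 1.116 | 0.703 | 0.966 | −0.053 | 0.773 | 0.612 | 0.993 |
| | | 2.5 | $\hat{\beta}_0$ | 0.090 | 0.607 | 0.347 | 0.987 | 0.074 | 0.439 | 0.268 | 0.990 | 0.051 | 0.282 | 0.201 | 0.982 |
| | | | $\hat{\beta}_1$ | 0.011 | 0.415 | 0.378 | 0.954 | 0.005 | 0.289 | 0.270 | 0.946 | 0.001 | 0.203 | 0.180 | 0.949 |
| | | | $\hat{\beta}_2$ | −0.005 | 0.208 | 0.187 | 0.952 | −0.001 | 0.145 | 0.135 | 0.937 | −0.001 | 0.101 | 0.094 | 0.944 |
| | | | $\hat{\alpha}$ | −0.865 | 1.780 | 1.612 | 0.852 | −0.630 | 1.708 | 1.517 | 0.837 | −0.449 | 1.629 | 1.462 | 0.838 |
| | | | $\hat{\sigma}$ | −0.695 | 1.283 | 0.927 | 0.887 | −0.492 | 1.097 | 0.762 | 0.905 | −0.341 | 0.732 | 0.605 | 0.924 |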
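Table 3. Misspecification simulation study for Poisson–SHN and geometric–SHN models.

| Case | Cure rate | $E(W)$ | Estimator | bias SHN (n = 100) | bias Weibull (n = 100) | RMSE SHN (n = 100) | RMSE Weibull (n = 100) | bias SHN (n = 200) | bias Weibull (n = 200) | RMSE SHN (n = 200) | RMSE Weibull (n = 200) | bias SHN (n = 400) | bias Weibull (n = 400) | RMSE SHN (n = 400) | RMSE Weibull (n = 400) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Poisson | High | 2.5 | $\hat{\beta}_0$ | 0.059 | 0.089 | 0.394 | 0.431 | 0.053 | 0.084 | 0.248 | 0.309 | 0.051 | 0.073 | 0.168 | 0.221 |
| | | | $\hat{\beta}_1$ | −0.018 | −0.066 | 0.356 | 0.430 | −0.006 | −0.049 | 0.248 | 0.316 | −0.003 | −0.048 | 0.175 | 0.227 |
| | | | $\hat{\beta}_2$ | −0.002 | −0.003 | 0.176 | 0.255 | −0.002 | −0.003 | 0.122 | 0.187 | −0.001 | −0.002 | 0.086 | 0.131 |
| | | 5.0 | $\hat{\beta}_0$ | −0.109 | −0.159 | 0.344 | 0.347 | −0.077 | −0.126 | 0.225 | 0.239 | −0.058 | −0.123 | 0.134 | 0.169 |
| | | | $\hat{\beta}_1$ | −0.014 | −0.020 | 0.402 | 0.773 | −0.007 | −0.018 | 0.280 | 0.601 | −0.005 | −0.017 | 0.197 | 0.451 |
| | | | $\hat{\beta}_2$ | −0.005 | −0.006 | 0.197 | 0.288 | −0.003 | −0.004 | 0.138 | 0.209 | 0.000 | 0.000 | 0.097 | 0.146 |
| | | 7.5 | $\hat{\beta}_0$ | −0.307 | −0.359 | 0.246 | 0.297 | −0.276 | −0.355 | 0.201 | 0.210 | −0.251 | −0.322 | 0.107 | 0.159 |
| | | | $\hat{\beta}_1$ | −0.017 | −0.025 | 0.443 | 0.933 | −0.012 | −0.015 | 0.311 | 0.773 | −0.011 | −0.013 | 0.219 | 0.650 |
| | | | $\hat{\beta}_2$ | 0.011 | 0.012 | 0.218 | 0.317 | −0.003 | −0.003 | 0.152 | 0.230 | 0.001 | 0.001 | 0.107 | 0.160 |
| | Low | 2.5 | $\hat{\beta}_0$ | 0.087 | 0.097 | 0.287 | 0.314 | 0.070 | 0.079 | 0.181 | 0.211 | 0.058 | 0.063 | 0.139 | 0.144 |
| | | | $\hat{\beta}_1$ | −0.014 | −0.069 | 0.437 | 0.448 | −0.012 | −0.061 | 0.306 | 0.334 | −0.007 | −0.052 | 0.215 | 0.228 |
| | | | $\hat{\beta}_2$ | −0.001 | 0.000 | 0.207 | 0.297 | 0.000 | 0.000 | 0.146 | 0.215 | −0.001 | −0.001 | 0.102 | 0.155 |
| | | 5.0 | $\hat{\beta}_0$ | −0.093 | −0.100 | 0.234 | 0.255 | −0.071 | −0.092 | 0.165 | 0.176 | −0.052 | −0.078 | 0.110 | 0.121 |
| | | | $\hat{\beta}_1$ | −0.029 | −0.048 | 0.489 | 0.809 | −0.022 | −0.025 | 0.344 | 0.656 | −0.008 | −0.012 | 0.244 | 0.487 |
| | | | $\hat{\beta}_2$ | −0.003 | −0.004 | 0.231 | 0.336 | −0.001 | −0.001 | 0.164 | 0.241 | −0.001 | −0.001 | 0.116 | 0.172 |
| | | 7.5 | $\hat{\beta}_0$ | −0.204 | −0.285 | 0.197 | 0.217 | −0.194 | −0.282 | 0.133 | 0.155 | −0.192 | −0.279 | 0.102 | 0.111 |
| | | | $\hat{\beta}_1$ | −0.012 | −0.018 | 0.533 | 0.893 | −0.003 | −0.014 | 0.379 | 0.742 | −0.002 | −0.004 | 0.267 | 0.583 |
| | | | $\hat{\beta}_2$ | −0.008 | −0.008 | 0.252 | 0.372 | −0.005 | −0.006 | 0.179 | 0.265 | −0.003 | −0.003 | 0.127 | 0.187 |
| Geometric | High | 2.5 | $\hat{\beta}_0$ | 0.071 | 0.092 | 0.196 | 0.244 | 0.053 | 0.082 | 0.163 | 0.190 | 0.031 | 0.048 | 0.124 | 0.139 |
| | | | $\hat{\beta}_1$ | −0.105 | −0.127 | 0.209 | 0.281 | −0.083 | −0.094 | 0.153 | 0.218 | −0.054 | −0.056 | 0.148 | 0.164 |
| | | | $\hat{\beta}_2$ | −0.003 | −0.005 | 0.254 | 0.435 | 0.001 | 0.002 | 0.176 | 0.299 | 0.001 | 0.001 | 0.123 | 0.207 |
| | | 5.0 | $\hat{\beta}_0$ | −0.091 | −0.129 | 0.135 | 0.154 | −0.088 | −0.119 | 0.098 | 0.103 | −0.071 | −0.106 | 0.042 | 0.071 |
| | | | $\hat{\beta}_1$ | −0.089 | −0.130 | 0.352 | 0.427 | −0.080 | −0.120 | 0.287 | 0.329 | −0.057 | −0.092 | 0.172 | 0.243 |
| | | | $\hat{\beta}_2$ | 0.001 | 0.000 | 0.274 | 0.478 | −0.001 | −0.001 | 0.192 | 0.330 | 0.001 | 0.001 | 0.135 | 0.229 |
| | | 7.5 | $\hat{\beta}_0$ | −0.288 | −0.333 | 0.094 | 0.125 | −0.251 | −0.311 | 0.067 | 0.082 | −0.242 | −0.271 | 0.049 | 0.059 |
| | | | $\hat{\beta}_1$ | −0.055 | −0.099 | 0.493 | 0.512 | −0.054 | −0.096 | 0.314 | 0.389 | −0.042 | −0.053 | 0.290 | 0.301 |
| | | | $\hat{\beta}_2$ | 0.009 | −0.010 | 0.293 | 0.524 | 0.005 | −0.008 | 0.204 | 0.362 | 0.001 | 0.001 | 0.143 | 0.248 |
| | Low | 2.5 | $\hat{\beta}_0$ | −0.075 | −0.109 | 0.245 | 0.298 | −0.062 | −0.081 | 0.219 | 0.234 | −0.046 | −0.069 | 0.147 | 0.165 |
| | | | $\hat{\beta}_1$ | −0.094 | −0.130 | 0.217 | 0.245 | −0.074 | −0.083 | 0.133 | 0.192 | −0.050 | −0.061 | 0.128 | 0.138 |
| | | | $\hat{\beta}_2$ | 0.012 | 0.016 | 0.235 | 0.397 | 0.011 | 0.012 | 0.163 | 0.270 | 0.002 | 0.003 | 0.114 | 0.187 |
| | | 5.0 | $\hat{\beta}_0$ | −0.224 | −0.259 | 0.197 | 0.209 | −0.216 | −0.237 | 0.101 | 0.144 | −0.153 | −0.203 | 0.061 | 0.089 |
| | | | $\hat{\beta}_1$ | −0.108 | −0.126 | 0.303 | 0.385 | −0.097 | −0.106 | 0.251 | 0.283 | −0.084 | −0.099 | 0.146 | 0.207 |
| | | | $\hat{\beta}_2$ | 0.002 | 0.002 | 0.253 | 0.426 | 0.002 | 0.002 | 0.176 | 0.292 | 0.001 | 0.001 | 0.123 | 0.201 |
| | | 7.5 | $\hat{\beta}_0$ | −0.260 | −0.459 | 0.116 | 0.150 | −0.256 | −0.424 | 0.096 | 0.106 | −0.202 | −0.392 | 0.042 | 0.073 |
| | | | $\hat{\beta}_1$ | −0.050 | −0.112 | 0.337 | 0.471 | −0.049 | −0.104 | 0.274 | 0.359 | −0.035 | −0.091 | 0.202 | 0.268 |
| | | | $\hat{\beta}_2$ | −0.002 | −0.004 | 0.269 | 0.456 | −0.001 | −0.002 | 0.187 | 0.315 | −0.001 | −0.001 | 0.131 | 0.216 |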
Table 4. AIC and BIC criteria for some cure rate models in the literature based on a competing risk scheme for the ebmt4 data set. Each entry is AIC/BIC.

| Model for $M$ | Weibull | Gamma | BS | SHN |
|---|---|---|---|---|
| Po | 4650.0/4672.9 | 4683.0/4706.0 | 4580.8/4603.8 | 4537.9/4560.8 |
| Lo | 4628.7/4651.6 | 4639.6/4662.6 | 4628.1/4651.0 | 4547.3/4570.3 |
| NB | 4561.4/4590.1 | 4547.4/4576.1 | 4583.0/4611.7 | 4540.2/4568.8 |
| Bern | 4665.1/4688.0 | 4712.8/4735.8 | 4577.0/4599.9 | 4535.8/4558.8 |
| COM-Po | 4639.3/4667.9 | 4659.5/4688.1 | 4578.9/4607.6 | 4537.8/4566.5 |
| PL | 4612.1/4658.0 | 4615.9/4661.8 | 4585.1/4630.9 | 4545.3/4591.2 |
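
The model choice based on Table 4 can be reproduced mechanically. The short Python sketch below transcribes the AIC/BIC pairs from the table and picks the combination with the smallest value of each criterion. The helper functions `aic` and `bic` state the standard definitions (with k parameters, sample size n and maximized log-likelihood `loglik`) and are included only for reference; the log-likelihoods themselves are not reported in this section, so the functions are not applied to the data.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion: k log n - 2 log L."""
    return k * math.log(n) - 2 * loglik

# (AIC, BIC) pairs transcribed from Table 4; keys are (model for M, lifetime model).
table4 = {
    ("Po", "Weibull"): (4650.0, 4672.9), ("Po", "Gamma"): (4683.0, 4706.0),
    ("Po", "BS"): (4580.8, 4603.8), ("Po", "SHN"): (4537.9, 4560.8),
    ("Lo", "Weibull"): (4628.7, 4651.6), ("Lo", "Gamma"): (4639.6, 4662.6),
    ("Lo", "BS"): (4628.1, 4651.0), ("Lo", "SHN"): (4547.3, 4570.3),
    ("NB", "Weibull"): (4561.4, 4590.1), ("NB", "Gamma"): (4547.4, 4576.1),
    ("NB", "BS"): (4583.0, 4611.7), ("NB", "SHN"): (4540.2, 4568.8),
    ("Bern", "Weibull"): (4665.1, 4688.0), ("Bern", "Gamma"): (4712.8, 4735.8),
    ("Bern", "BS"): (4577.0, 4599.9), ("Bern", "SHN"): (4535.8, 4558.8),
    ("COM-Po", "Weibull"): (4639.3, 4667.9), ("COM-Po", "Gamma"): (4659.5, 4688.1),
    ("COM-Po", "BS"): (4578.9, 4607.6), ("COM-Po", "SHN"): (4537.8, 4566.5),
    ("PL", "Weibull"): (4612.1, 4658.0), ("PL", "Gamma"): (4615.9, 4661.8),
    ("PL", "BS"): (4585.1, 4630.9), ("PL", "SHN"): (4545.3, 4591.2),
}

best_aic = min(table4, key=lambda key: table4[key][0])
best_bic = min(table4, key=lambda key: table4[key][1])
print(best_aic, best_bic)  # ('Bern', 'SHN') ('Bern', 'SHN')
```

Both criteria point to the Bern/SHN combination, which is the model reported in Table 5.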
Table 5. Estimates and standard errors (s.e.) for the Bern/SHN cure rate model in the ebmt4 data set.

| Parameter | $\beta_0$ | $\beta_1$ | $\alpha$ | $\sigma$ |
|---|---|---|---|---|
| Estimate | 0.4912 | 0.2224 | 0.3844 | 0.3120 |
| s.e. | 0.0907 | 0.1030 | 0.0753 | 0.0310 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Gallardo, D.I.; Gómez, Y.M.; Gómez, H.J.; Gallardo-Nelson, M.J.; Bourguignon, M. The Slash Half-Normal Distribution Applied to a Cure Rate Model with Application to Bone Marrow Transplantation. Mathematics 2023, 11, 518. https://doi.org/10.3390/math11030518
