Using First-Passage Times to Analyze Tumor Growth Delay

: A central aspect of in vivo experiments with anticancer therapies is the comparison of the effect of different therapies, or doses of the same therapeutic agent, on tumor growth. One of the most popular clinical endpoints is tumor growth delay, which measures the effect of treatment on the time required for tumor volume to reach a speciﬁc value. This effect has been analyzed through a variety of statistical methods: conventional descriptive analysis, linear regression, Cox regression, etc. We propose a new approach based on stochastic modeling of tumor growth and the study of ﬁrst-passage time variables. This approach allows us to prove that the time required for tumor volume to reach a speciﬁc value must be determined empirically as the average of the times required for the volume of individual tumors to reach said value instead of the time required for the average volume of the tumors to reach the value of interest. In addition, we deﬁne several measures in random environments to compare the time required for the tumor volume to multiply k times its initial volume in control, as well as treated groups, and the usefulness of these measures is illustrated by means of an application to real data.


Introduction
In oncological clinical practice, the scarcity of dose-response data makes it difficult to assess the response of human tumors to a given treatment or combination of treatments. However, preclinical modeling of anticancer therapies can overcome this problem. Since the development of the first tumor models in vivo, modeling solid tumors in mice has been a concern of tumor biology, and a useful tool in the search for new drugs that may improve treatment [1]. A remarkable breakthrough, human tumor xenograft models in immuno-deficient mice, allows researchers to conduct preclinical evaluations of the effect of experimental therapies on tumor growth. When in vivo experiments are performed on mice, adequate modeling of the effect of therapies makes it possible to understand when these are efficient and how administration protocols must be arranged.
In general, preclinical therapeutic trials compare tumor volume (or tumor volume relative to initial volume) in various treated groups and a control (untreated) one in order to find the strongest antitumor agents and the most effective treatments. When tumors are treated with one therapy (or a combination of therapies), the surviving tumor cells regrow with a delay induced by treatment. Tumor regrowth will depend on both growth and the cell destruction rates.
Several measures have been proposed to evaluate the effectiveness of a therapy. In general, these measures compare the volume a tumor reaches or the time required for the tumor to reach a certain volume in both treated and control groups.
The tumor growth inhibition (TGI) rate and relative tumor volume variation (RTVV) are two of the most recently used metrics to quantify treatment response based on tumor volume (see Koning et al. [2]).
On the other hand, one of the most used clinical endpoints is the time taken for the tumor to reach a specific volume (usually a multiple of its initial volume). With this measure as a starting point, others have been considered such as tumor growth delay (TGD), defined as the difference between the time taken for the tumor to reach a specific volume in the treated and control group (see, for example, Yarnold et al. [3] and Karsch et al. [4]), or specific tumor growth delay (see, for example, Rofstad et al. [5] and Rygaard et al. [6]), used to compare therapeutic effects on tumors that exhibit different growth rates prior to treatment.
Demidenko [7] referred to some relevant articles in relation to these last endpoints, reviewed some basic facts about exponential and Gompertz-like growth, mathematically defined doubling time and tumor growth delay using the exponential growth model, and proposed their statistical estimation by applying a mixed effects model to the combined data set of the different treatments.
Concerning the calculation of the instant of time at which a tumor reaches a certain volume, the works of Thomlinson et al. [8] and Denekamp [9] set the basis for how to proceed. In the former, the average of the times it takes for each individual tumor to reach a given size was employed, while in the second, the average growth curve of the tumors was considered and used to calculate the specific time point. However, although simple and quite widespread, such empirical methods have some drawbacks: • They can be imprecise and contain substantial bias, which can lead to erroneous conclusions about the effect of therapy. On the one hand, they require using some interpolation method because the individual or average growth curves usually reach the specified volume between two instants of time. On the other hand, the individual growth curves are irregular and able to reach the volume specified at more than one time instant. • If the mean of a group is analyzed instead of the individual trajectories, the standard errors necessary for the comparison of groups by t-tests or F-tests will not be available. Furthermore, when significant variations in the response occur, the analysis of the mean can lead to incorrect estimates of the mean time to reach a specific volume.
Data modeling is the tool of choice when data follow a credible growth pattern and implies making certain assumptions about the dynamics of tumor growth and regrowth. Traditionally, in order to estimate the mean time for which a specific volume is reached, an exponential or Gompertz curve has been fit to the average tumor growth data, depending on the size of the tumors or the duration of the experiment.
The fact that deterministic models do not consider the usual fluctuations in the dynamics of real systems gave way to the consideration of stochastic models, among which diffusion processes stand out. These models are obtained from stochastic differential equations obtained by adding a noise to the ordinary differential equation that gives rise to the growth curve. Using this methodology, Capocelli and Ricciardi [10,11] obtained the lognormal and Gompertz diffusion processes, associated with the exponential and Gompertz growth curves, respectively. Later, Gutiérrez et al. [12] obtained a Gompertz-type diffusion process associated with a Gompertz growth curve whose limit value depends on the initial value.
Stochastic modeling of tumor growth in the presence of therapies raises the need to modify the available models that fit control (untreated) groups by including in their trend the effect of an antitumor treatment through time-dependent functions. In this sense, Albano et al. [13,14] developed a methodology to determine the effect of a treatment in therapeutic trials. The idea is to model the treated group using a modified Gompertz (or Gompertz-like) process. For this, time functions are introduced into its infinitesimal mean representing the effect of the therapy on the growth and/or death rates. Later, said effects are estimated from some relationship between the characteristics of both processes. These models were extended later to the case in which the therapy also affects the variability of tumor growth. For this, another term is included in the infinitesimal variance of the process that models the control group (Román et al. [15] and Albano et al. [16]).
Thus, suitable probabilistic models are currently available to model tumor growth data and to determine the effect of treatments. In the present paper, we use stochastic modeling of tumor growth to study the time in which a specific volume is reached in a random environment, and we propose some new endpoints of the effect of a therapy. To do this, we employ the study of temporal variables associated with the diffusion processes considered, such as the first-passage times (f.p.t.). The fact that the distribution of the temporal variable that describes the phenomenon under study is available will allow more complete information to be gathered on the evolution of tumors in the presence of a certain therapy. As of today, such considerations are already being incorporated into epidemiological studies (see Gómez-Corral et al. [17]).
We focus our attention on the time required for the relative volume of the tumor to reach a value k ∈ N, with k > 1, resulting in the time doubling, tripling, quadrupling, etc., the initial tumor volume. In a random environment, we assume that the growth curves of the relative volume of the tumor for each individual considered are discretized sample trajectories of a diffusion process {X(t); t ∈ I} defined on a real interval I = [t 0 , T] (t 0 ≥ 0), taking values on E ⊆ R, with infinitesimal mean and variance A 1 (x, t) and A 2 (x, t), respectively, being x ∈ E and t ∈ I, and initial distribution P[X(t 0 ) = 1] = 1. Consequently, the time required for X(t) to reach the value k for the first time is a random variable, specifically the variable f.p.t. of the process through the constant boundary S(t) = k, provided X(t 0 ) = 1, that is: Let us note that knowledge of the distribution of T k,1 is very important since it allows us to answer questions of interest such as: • What is the average time required for the tumor to multiply k times its initial volume for the first time? • Do tumors show great variability in relation to the time it takes to multiply k times their initial volume for the first time? • What time is required for a certain percentage of tumors to multiply k times their initial volume for the first time?
The rest of the article is structured as follows: Section 2 provides a brief perspective on the definition of first-passage time and the way in which the density function of such a variable can be obtained. Taking into account the random modeling of tumor growth, Section 3 discusses a relevant question in clinical practice, namely how the time required to reach a specific volume should be determined empirically. Section 4 establishes the new proposed criteria to evaluate the therapeutic effect in a random environment. Finally, Section 5 illustrates the use of the proposed criteria through a real-life application.

A Brief Overview of First-Passage Times
Let I = [t 0 , T] be a real interval (t 0 ≥ 0) and {X(t); t ∈ I} a diffusion process, in general time-non-homogeneous, taking values on E ⊆ R, with infinitesimal moments A 1 (x, t) and A 2 (x, t), being x ∈ E and t ∈ I, and consider S a C 1 -class function in I.
The time variable f.p.t. through the boundary S, conditioned to X(t 0 ) = x 0 , is defined as: whose density function is denoted by g(S(t), t|x 0 , t 0 ). Note that sometimes, the conditioning that appears in the definition is not established on a fixed value, but on variable X(t 0 ). In such cases, and if J denotes the range of variation of X(t 0 ), the density of the f.p.t. can be obtained from the family of densities {g(S(t), t|x 0 , t 0 ), x 0 ∈ J − {S(t 0 )}} by means of the expression: is the density function of X(t 0 ). Therefore, this case can be reduced to the former, and to that end, we focus on the formulation given by (2).
Historically, the study of the procedures for obtaining the density function of the f.p.t. through time-dependent boundaries for diffusion processes has undergone a substantial evolution. Initially, this type of study was restricted to specific diffusion processes and boundaries. Subsequently, the class of diffusion processes where the problem was studied, as well as their associated class of boundaries, were generalized. In this way, in the context of time-homogeneous diffusion processes, Giorno et al. [18] proved that the probability density function of the f.p.t. variable (2), through a C 1 [t 0 , T]-class function S, verifies a second kind Volterra integral equation with a non-singular kernel. For more details, see Ricciardi et al. [19], where a description of the development of the study of this problem up to that date was made, with numerous references that include both theoretical studies and applications to several fields.
Concerning the non-homogeneous case, a first generalization of the integral equation was made by considering some special kind of diffusion processes (see Gutiérrez et al. [20]). Later, Gutiérrez et al. [21] extended their results and also showed their validity for the class of time-non-homogeneous diffusion processes. More in detail, if S ∈ C 1 [t 0 , T], g(S(t), t|x 0 , t 0 ) is the solution of the second kind Volterra integral equation: where ρ = Sgn(S(t 0 ) − x 0 ) and: f (x, t|y, s) being the transition probability density function of the process. We must note that this equation is also valid in somewhat more general contexts such as Gauss-Markov processes, as proven by Di Nardo et al. [22]. According to the general theory of Volterra integral equations, the non-singularity of the kernel is equivalent to: lim If, additionally, it is verified that: the probability density of the f.p.t. is explicitly given by: Nevertheless, and apart from some particular processes and boundaries, closed-form solutions for the integral equation are not available. For this reason, in the cases without explicit solutions, numerical procedures are needed. The most usual methods are based on numerical quadrature procedures, as proposed by Buonocore et al. [23] on the basis of the composite trapezoid method. However, in the application of this type of algorithm, some problems can be found, leading to undesirable solutions. For instance, considering an inadequate integration step for the variation range of the f.p.t. variable may lead to bad approximations of its density function. Another possible drawback is related to a potential unnecessarily high computational cost. This may be due to a poor choice of the initial instant for the application of the algorithm or to the use of stopping rules, which are unable to detect tails in the distribution of the variable. To solve these problems, Roman et al. [24] developed a general heuristic strategy that allows an efficient implementation of the algorithm, avoiding the aforementioned drawbacks. This strategy is based on the so-called FPTLfunction (Roman et al. [25]) and was implemented in the R package fptdApprox [26].

Determining the Time Required to Reach a Specific Volume
In experimental trials, it is usual to define the average time to reach a specific volume as the time at which the mean of the tumor growth curves reaches said volume, instead of considering the average of the times required for each tumor growth curve to individually reach it. This practice can lead to very different estimates of the average time to reach a particular volume, as is made evident in the following examples.
First, let us consider a set of data (kindly provided by the Laboratory of Preclinical Investigation of the Translational Research Department of the Institute Curie, Paris) about the growth of the BC297MONp5tumor in an experimental group of seven mice treated with a combination of cyclophosphamide/doxorubicin. Figure 1 shows the individual and mean growth curves of the relative tumor volume. From these data, interpolation reveals that the average time required for each individual tumor to quadruple its initial volume is 21.38 days, while the time required for the average of the relative volumes of tumors to reach a value of four is 15.75 days. The disparity between the values is mainly due to the fact that the average relative volume of the tumors is a rather limited representation of the variability of the response to treatment.
The time taken by each individual tumor to quadruple its initial volume can be interpreted as a random sample of the f.p.t. variable for relative tumor volume X(t) through constant boundary S = 4, conditioned to X(1) = 1, that is: As a consequence, the average of the times required by each individual tumor to quadruple its volume should be considered from an empirical standpoint.
The following example based on simulated data and related to exponential tumor growth helps clarify the matter. Likewise, it shows how random modeling presents clear advantages over empirical procedures in correctly determining the average time to reach a specific volume.
Let us suppose that the relative volume of the tumor is modeled by a lognormal diffusion process {X(t), t 0 ≤ t ≤ T}, taking values on R + , with infinitesimal moments A 1 (x) = mx, m = 0, and A 2 (x) = σ 2 x 2 , σ > 0, and initial distribution X(t 0 ) = 1.
It is known that the transition probability density function of the process is that corresponding to a lognormal variable: so E(X(t)) = E(X(t) | X(t 0 ) = 1) = exp(m(t − t 0 )) (see Román et al. [27] for details). As a consequence, the time required to quadruple the average volume of the tumors is t = t 0 + ln(4)/m, regardless of the variability of X(t). This means that if we consider simulated trajectories for different groups of individuals, generated with the same value of m and different values of σ, it is expected that the sample mean function in each group will be similar and will provide approximately the same time required to quadruple initial tumor volume. However, the average times required by the individual trajectories in each group to quadruple the initial volume will be affected by the different variability of the simulated processes and will yield quite different values. From the lognormal diffusion process {X(t), 1 ≤ t ≤ 10} with degenerate initial distribution P[X(1) = 1] = 1 and parameter m = 0.4, we simulated twelve sample paths by considering different values of σ, concretely 0.05, 0.1, and 0.15, in time instants t j = j, j = 1, . . . , 10. Figure 2 shows, for each value of σ, the simulated sample paths, as well as the corresponding sample mean functions.  Table 1. In view of the results, it can be concluded that, as a matter of fact, times t 4 are quite close to one another, whereas larger differences between valuest 4 are observed, as well, between these values and the previous ones, the differences becoming larger as the variability increases. Next, we compare the results obtained by studying the f.p.t. of the process through boundary S = 4. We must note that, from the probability density function of (6), for the lognormal diffusion process and boundary S(t) = A exp(Bt), A > 0, Equation (4) is verified. Additionally, the probability density function of the f.p.t. variable through this boundary is given by Equation (5) (see [20] for details). Therefore, by considering B = 0, the explicit expression for a constant boundary S(t) = A = x 0 is available. More in detail, the expression is: Obviously, in order to use (7) in this example, it is necessary to previously estimate the values of m and σ. Obtaining the maximum likelihood estimates of these parameters makes it possible to approximate the density function of the f.p.t. variable, through the boundary S = 4, for the process fitting the simulated sample paths. Finally, the mean and variance of said variable are considered to estimate the mean and variance of the time required to quadruple the initial tumor volume.
For each value of σ, Table 2 shows the maximum likelihood estimates of the parameters of m and σ, together with the theoretical and estimated values of the mean and variance of the f.p.t. variable T 4,1 . In all cases, the fitted diffusion processes provide good approximations of the parameters of the theoretical model and fit adequately both the mean and the variability of the simulated trajectories. Figures 3-5 show the mean and variance of the simulated trajectories and the mean function of the lognormal diffusion process adjusted for σ = 0.05, 0.1, and 0.15, respectively. The approximations of the density function of the time required to quadruple the initial tumor volume for the different values of σ are represented in Figure 6.
The comparison between Tables 1 and 2 allows us to establish two important conclusions: • It is empirically more accurate and convenient for comparing treatments to determine the average of the times required for the volume of individual tumors to reach a specific value instead of calculating the time required for the average volume of the tumors to reach said value. • Random modeling of tumor growth through diffusion processes provides a better estimate of the mean time required for the tumor volume to reach a specific value, as well as other characteristics of said variable, such as its variability.

Tumor Growth Delay Measurements in a Random Environment
As mentioned in Section 1, the scientific literature has defined tumor growth delay as a measure of the effect of the therapy on the time required for the tumor volume to reach a specific value. It has thus been determined by different authors as the difference between the times taken for the tumor, in the treated and the control group, to reach the volume of interest, usually a multiple of the initial tumor volume.
Let us consider a tumoral preclinical trial in which the relative volume of a tumor is observed in a control group and in one or more treated groups. Based on the results of the previous section, the difference between the average of the time instants required for the relative tumor volume to reach a specific value for the first time in the treated and control groups is an appropriate measure of the delay in tumor growth. Alternatively, the ratio between such times can be considered. Quite often, the interest is focused on the time required for the tumor to multiply k times its initial volume, so that an adequate empirical measure of the tumor growth delay is given by: for k = 2, 3, 4, . . ., wheret T k andt C k are, respectively, the average of the time instants required for the relative tumor volume to reach value k for the first time in the treated and control groups.
TGD k is a simple measure of the change caused by treatment in the time required for the tumor to multiply k times its volume. In a random environment, this time is provided by the random variable T k,1 , considered in the previous section. Therefore, a possible alternative to TGD k is to compare the behavior of the studied variable in the control, as well as in the treated groups.
Let T C k,1 and T T k,1 be the random variables "time required for the tumor to multiply k times its initial volume" in the control and treated groups, respectively. The distributions of these variables can be compared according to one or more of their characteristics. In this sense, a measure analogous to TGD k in a random environment would be given by: which determines the rate of increase of the mean time required for the tumor volume to multiply k times its initial volume when it is subjected to a therapy. However, other alternatives can be considered to evaluate, or compare, the effect of the treatments. Among them, we can highlight the analysis of survival or hazard functions associated with T k,1 in the control and treated groups, which are not usually used in this context by medical researchers. These functions can easily be determined from the density function g(k, t|1, t 0 ) of T k,1 .
Survival function S T k,1 (t) = P(T k,1 > t) provides the probability that the tumor volume does not multiply k times its initial volume before t. Then, the time t k,q in which a specified proportion q of tumors has not yet multiplied k times their initial volume is a solution of equation S T k,1 (t k,q ) = q, that is P(T k,1 ≤ t k,q ) = 1 − q and t k,q = Q T k,1 (1 − q), where Q T k,1 (p) is the quantile function of T k,1 .
Clearly, a therapy will be more effective the longer the time in which a significant proportion of tumors have not yet multiplied k times their initial volume. Thus, we propose to quantify the delay in tumor growth using: for some q ≥ 1/2 set previously by the researcher. RTGD k,q provides the proportion that increases the time in which a proportion q of tumors has not yet multiplied k times their initial volume when the tumors are subjected to therapy. Nevertheless, given that the effectiveness of a therapy varies with time, a comparison of the survival functions of T k,1 in the control and treated groups over time should not be dispensed with. The therapy will be effective in those periods of time in which the survival function of T k,1 in the treated group is greater than in the control group.
On the other hand, hazard function: measures the instantaneous probability that the tumor will multiply k times its initial volume over time (if it has not done so before). Based on this function, a therapy will be effective in those time periods in which the hazard function of T k,1 in the treated group is lower than in the control group. The hazard ratio function: allows us to compare hazard functions h T T k,1 and h T C k,1 over time. Values close to one of HR T k,1 (t) indicate that the therapy does not modify the instantaneous probability that the tumor multiplies k times its initial volume, whereas values below one reveal a decrease of said probability. This decrease becomes larger as the value of HR T k,1 (t) diminishes. In a random environment, the comparison of the effect of two therapies can be carried out using RTGD k and RTGD k,q , as well as comparing the survival and hazard functions of T k,1 for each therapy. Obviously, this requires tumor growth to have been appropriately modeled in advance.

Application to Real Data
With the purpose of studying the delay in tumor growth induced by a therapy in a random environment, we consider again the data regarding the relative volume of tumor BC297MONp5 in mice that were studied by Albano et al. [13] and Román-Román et al. [15]. In these studies, the modeling of the relative volume of the tumor was carried out through the use of Gompertz diffusion processes. More specifically, a homogeneous version was considered for the control group, while for the group treated with cisplatin, it was necessary to modify the previous one by introducing two functions C(t) and V(t) in the infinitesimal moments. These functions represent the effect of therapy on growth rate and volume variability, respectively.
We focus our attention on the time required for the tumor to quadruple its initial volume, T 4,1 . Using the R package fptdApprox, the density functions of the f.p.t. of the Gompertz models estimated through S = 4 were obtained (see Figure 7). From these densities, Table 3 was established, which summarizes the characteristics of the variable T 4,1 necessary to determine the proposed measures.  The results show that the mean time required for the tumor to quadruple its initial volume in the control group was 15.58948 days, while in the group treated with cisplatin, this time increased to 18.75715 days. Therefore, RTGD 4 = 1.2032, i.e., cisplatin therapy, increased the mean time required for tumor volume to quadruple its initial volume by 20.32%. Furthermore, the variability of the time required for this event to occur was slightly higher in the treated group.
On the other hand, RTGD 4,0.5 = 1.22, RTGD 4,0.9 = 1.2803, and RTGD 4,0.99 = 1.2373, which allowed us to conclude that the therapy increased by 22% the time in which 50% of the tumors had not yet quadrupled their initial volume and by 28.03 % the time in which 90 % of the tumors had not yet reached such a value, 23.73% being the increase for 99% of the tumors that had not exceeded that level.
Finally, Figure 8 shows the survival and hazard functions of T 4,1 in the control and treated groups, while Figure 9 depicts the difference between the survival functions and the hazards ratio function.
In Figure 9a, the therapy is shown to increase the probability that initial tumor volume does not quadruple before any time instant between Days 7 through 41. Initially, the increase in said probability grows to a maximum of approximately 0.34 by Day 16, when it starts decreasing until it becomes negligible by Day 41. Figure 9b shows that the therapy makes the instantaneous probability of tumors multiplying their initial volume by four over time (if they have not done so before) to be reduced between Days 5 through 45, approximately. The maximum reduction, around 96%, takes place at the beginning of this period, until a minimum reduction rate, of around 8%, is reached by Day 21. The reduction in the instantaneous probability of interest is increased again until day 39, when a new maximum is reached at around 30%. Finally, between days 39 and 45, the effect of therapy on the reduction of the instantaneous probability of tumors multiplying their initial volume by four diminishes until it vanishes completely.
From Day 45 onward, therapy does not reduce the instantaneous probability of the tumor multiplying its initial volume by four over time. Many reasons can affect the behavior of tumor growth during treatment, one being that tumor cells that are resistant to a given drug may initiate a process of active proliferation. This may therefore influence, at that moment, the instantaneous probability of the tumor multiplying its initial volume to be reduced. The main contribution of our study is that it provides early warning to medical researchers when therapies behave as described in the previous lines, so that they can be investigated and compensated, perhaps by increasing or anticipating the dose of the therapeutic agent.

Conclusions
In this work, stochastic modeling of tumor growth is used to study the first-passage time variable defined by the time it takes the tumor volume to reach a specific value. This procedure allows us to clarify how such a time should be determined in a conventional descriptive analysis and, more importantly, to establish, for in random environments, new clinical endpoints derived from the conventional measurement of tumor growth delay and the techniques of the survival analysis.
The attrition rate in anticancer drug development is extremely high. To enhance the chance of a drug gaining market authorization, extremely accurate preclinical data are required. If our methodology is validated on large series of experimental data, we believe it could be very useful for improving both the evaluation of the effectiveness of therapeutic treatments and their administration schedules.
The results obtained in this work allow us to broaden the horizon on future lines of action. One such line concerns survival theory, since our contribution is linked to concepts such as the hazard and survival functions. Likewise, it is interesting to contrast the possibilities of the methodology described herein with the results that can be deduced from the classic models, such as general regression models or more specific models such as Cox's regression. In another potential line of application, the TGD can be an interesting source of information for tackling complex problems such as determining the most optimal therapies for a treatment. This is a problem that is related to that of model selection, in which the Bayesian perspective may be of interest (in a similar way as it is in the selection of variables in linear models). Furthermore, the Bayesian approach may also be of interest given the importance of inferential aspects in this type of study. As a matter of fact, we cannot forget that first-pass time densities depend on the transition densities of the process, and therefore, the estimation of its parameters as its eventual approximation (when an exact functional form is not available) is an issue that must be taken into account. In this regard, recent progress from a Bayesian point of view includes the works of Pieschner et al. [28] and Fuchs [29].