Article

Inferring the Timing of Antiretroviral Therapy by Zero-Inflated Random Change Point Models Using Longitudinal Data Subject to Left-Censoring

1 Department of Biostatistics, College of Public Health, University of Kentucky, Lexington, KY 40506, USA
2 Institute for Implementation Science in Population Health, City University of New York, New York, NY 10027, USA
3 Bureau of Hepatitis, HIV, and Sexually Transmitted Infections, Department of Health and Mental Hygiene, New York, NY 10013, USA
4 Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY 10461, USA
5 Department of Medicine, Albert Einstein College of Medicine and Montefiore Medical Center, Bronx, NY 10461, USA
6 Department of Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, NY 10027, USA
* Author to whom correspondence should be addressed.
Algorithms 2025, 18(6), 346; https://doi.org/10.3390/a18060346
Submission received: 15 April 2025 / Revised: 28 May 2025 / Accepted: 1 June 2025 / Published: 5 June 2025

Abstract

We propose a new random change point model that utilizes routinely recorded individual-level HIV viral load data to estimate the timing of antiretroviral therapy (ART) initiation in people living with HIV. The change point is assumed to follow a zero-inflated exponential distribution, the longitudinal data are subject to left-censoring, and the underlying data-generating mechanism is a nonlinear mixed-effects model. We extend the Stochastic EM (StEM) algorithm by combining a Gibbs sampler with Metropolis–Hastings sampling. We apply the method to real HIV data to infer the timing of ART initiation since diagnosis. Additionally, we conduct simulation studies to assess the performance of the proposed method.

1. Introduction

Antiretroviral therapy (ART) is the cornerstone of HIV care, and timely initiation is critical for improving clinical outcomes and reducing HIV transmission. While current guidelines endorse immediate ART initiation upon diagnosis, in practice, the time between HIV diagnosis and treatment initiation varies considerably. These delays often arise due to multiple barriers, including anticipated side effects (i.e., perceived collateral effects of ART), stigma, mental health or substance use challenges, and structural issues such as inconsistent access to care. As a result, viral load trajectories observed in clinical data reflect substantial heterogeneity in ART uptake timing.
While HIV case surveillance in most U.S. states collects viral load data through mandatory electronic reporting, information on ART initiation is typically not recorded. Public health departments often infer treatment initiation indirectly using viral suppression data or through analyses of clinical cohorts. Motivated by Braunstein et al. [1], who developed a rule-based method using viral load surveillance data to estimate ART initiation timing in New York City, this paper proposes a statistical modeling approach to infer ART initiation time more formally and flexibly.
Random change point models, allowing for individual-specific changes in longitudinal outcomes, are widely used in medical research [2]. Typically, these models assume a normal distribution for the change point and use linear mixed-effects models for the segments before and after the change point [3,4,5,6]. However, the Gaussian assumption may not always be ideal. In our case, many individuals likely initiated ART at the time of diagnosis (time zero), making a zero-inflated distribution more suitable. Additionally, longitudinal data may be censored, meaning measurements below a certain threshold cannot be accurately quantified. For viral load data, the detection limit is generally between 50 copies/mL and 400 copies/mL. A linear model based on the observed data might not be suitable for the unobserved data. Alternatively, if a mechanistic or scientific model is available for the longitudinal data, it can provide better predictions for the unobserved data and improve change point estimation. Such a mechanistic model often takes a nonlinear form.
Accounting for a random change point within a likelihood framework poses a major challenge due to the lack of closed-form expressions [7,8,9]. This challenge is compounded by the non-Gaussian distribution of the change point, nonlinear models, and data censoring. In this paper, we propose a zero-inflated exponential (ZIE) distribution-based random change point model with segmented NLME sub-models to analyze left-censored data. To facilitate full likelihood-based inference, we extend the Stochastic Expectation-Maximization (StEM) algorithm, initially introduced by [10]. Our extension of the StEM involves a Gibbs sampler coupled with Metropolis–Hastings sampling for the mixed-type random-effects structure of the random change point model.
The remainder of this paper is structured as follows. Section 2 introduces the ZIE-based nonlinear random change point model. Section 3 presents the general model and details of the extended StEM algorithm. Section 4 analyzes an HIV cohort dataset, and Section 5 evaluates the proposed method through simulations. Finally, Section 6 concludes the article with a discussion.

2. ZIE-Based Nonlinear Random Change Point Model

In longitudinal data analysis, understanding how a specific outcome changes over time is crucial. Random effects change point models offer a valuable framework for examining changes in time trajectories by incorporating individual change points. These changes are typically induced by external events, causing deviations from the original data pattern.
Our objective is to determine when individuals initiate HIV treatment using their HIV viral load measurements over time. Without ART, the viral load fluctuates significantly after HIV infection until it stabilizes at a set point. If untreated, the viral load eventually increases, leading to AIDS [11]. Starting ART, however, results in substantial decreases in HIV viral load. To simplify our modeling approach, we assume the HIV diagnosis occurs after the viral set point and focus on modeling the changes in viral load dynamics post-ART initiation.
Traditional random effects change point models often assume that the longitudinal outcome can be described by a segmented linear mixed-effects model. However, as mentioned in the introduction, linear assumptions may not be suitable for many real-world applications. While linear models might adequately fit the observed data, they may not be appropriate for data subject to censoring, which is common with HIV viral load measurements, especially after ART initiation.
Extensive research has been conducted to understand the dynamics of viral load following ART initiation and to evaluate the effectiveness of these drugs in treating HIV. Building upon biological and clinical knowledge, ref. [12] proposed a virological model to approximate the patterns observed in viral load data. This model is represented by the equation:
V(t) = P_1 e^{-\gamma t} + P_2,
where V(t) represents the total virus at time t, and P_1 and P_2 are baseline values. The parameter γ corresponds to the viral decay rate and can be interpreted as the turnover rate of productively infected cells and long-lived or latently infected cells in an ideal therapy setting. Refs. [13,14] provide detailed discussions of this model.
For our problem, we consider the following random change point model to describe the viral load y_ij for individual i at the j-th visit, at time t_ij after HIV diagnosis:
y_{ij} = a_{1i}(t_{ij} - \tau_i)_{-} + \log_{10}\!\big(b_{1i} e^{-b_{2i}(t_{ij} - \tau_i)_{+}} + b_{3i}\big) + e_{ij}. \quad (1)
Here, y_ij is the log10-transformed viral load, τ_i is the (single) change point that induces the change in the viral load trajectory, and e_ij is the error term. We use the log10 transformation in line with standard practice in HIV clinical research and surveillance, where viral load thresholds and treatment response metrics (e.g., a 1-log reduction) are conventionally interpreted on the log10 scale. While natural logarithms are common in general modeling, the base-10 scale facilitates direct clinical relevance and comparability with prior studies.
The functions x_{-} and x_{+} correspond to min(x, 0) and max(x, 0), respectively. The quantity a_{1i} represents the subject-specific regression coefficient capturing the viral load slope before the change point, while b_{1i}, b_{2i}, and b_{3i} represent subject-specific mixed effects governing the viral trajectory after the change point. We define these as
a_{1i} = \alpha_1 + \alpha_{1i}, \quad b_{1i} = \beta_1 + \beta_{1i}, \quad b_{2i} = \beta_2 + \beta_{2i}, \quad b_{3i} = \beta_3 + \beta_{3i}.
Here, α_1, β_1, β_2, and β_3 represent the population parameters (fixed effects), while α_{1i}, β_{1i}, β_{2i}, and β_{3i} denote the random effects, typically assumed to follow a normal distribution with mean zero. It is worth noting that the pre-change point segment and the post-change point segment meet at the intercept log10(b_{1i} + b_{3i}) when t_ij = τ_i.
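To make the segmented structure concrete, the following R sketch evaluates the mean trajectory implied by model (1); the parameter values here are illustrative only, not estimates from our data.

## Mean trajectory of model (1); (x)_- and (x)_+ are pmin(x, 0) and pmax(x, 0).
traj <- function(t, a1, b1, b2, b3, tau) {
  a1 * pmin(t - tau, 0) +                            # pre-ART segment
    log10(b1 * exp(-b2 * pmax(t - tau, 0)) + b3)     # post-ART decay on log10 scale
}

t  <- seq(0, 3, by = 0.05)                           # years since diagnosis
vl <- traj(t, a1 = 0.4, b1 = 3e4, b2 = 4, b3 = 50, tau = 0.5)
plot(t, vl, type = "l", xlab = "Years since diagnosis",
     ylab = "log10 viral load")

Note that the two segments join at log10(b1 + b3) at t = tau, as stated above.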
The choice of distribution for the random change points is a modeling assumption that depends on the specific investigation. For our application, a significant proportion of individuals presumably received ART at diagnosis, i.e., were "test-and-treated". The rest would initiate HIV treatment after the diagnosis date. The zero-inflated exponential (ZIE) distribution combines a point mass at zero with an exponential distribution for the positive values [15]. It assumes that with probability π the only possible observation is 0, and with probability 1 − π an exponential random variable is observed. For our change point model, we have
\tau_i \sim \begin{cases} 0 & \text{with probability } \pi, \\ \mathrm{Exp}(\lambda) & \text{with probability } 1 - \pi, \end{cases}
where λ is the rate parameter of the exponential distribution, so the positive change points have mean 1/λ.
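A minimal R sketch of the ZIE change point distribution follows, using the rate parameterization of the exponential.

## Draw n change points from ZIE(pi, lambda): point mass at zero w.p. pi,
## otherwise an exponential with rate lambda.
rzie <- function(n, pi, lambda) {
  ifelse(rbinom(n, 1, pi) == 1, 0, rexp(n, rate = lambda))
}

## Mixed mass/density of the ZIE: probability mass pi at tau = 0 and density
## (1 - pi) * lambda * exp(-lambda * tau) for tau > 0.
dzie <- function(tau, pi, lambda) {
  ifelse(tau == 0, pi, (1 - pi) * dexp(tau, rate = lambda))
}

set.seed(1)
tau <- rzie(1e4, pi = 0.31, lambda = 1.11)
mean(tau == 0)   # approximately pi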

3. Estimation Procedure Based on StEM

3.1. The Models and Notations

In this section, we describe the models and methods in a general form to illustrate their applicability beyond our HIV setting. Let y_ij, with j = 1, 2, …, n_i, represent the longitudinal measurements for subject i = 1, 2, …, n, taken at time t_ij. We consider a general ZIE-based nonlinear random change point model:
y_{ij} = g\big((t_{ij} - \tau_i)_{-}, a_i\big) + h\big((t_{ij} - \tau_i)_{+}, b_i\big) + e_{ij}, \quad a_i \sim N(\alpha, A), \quad b_i \sim N(\beta, B), \quad \tau_i \sim \mathrm{ZIE}(\pi, \lambda), \quad e_{ij} \mid a_i, b_i, \tau_i \sim N(0, \sigma_e^2). \quad (2)
Here, g(·) and h(·) are known nonlinear functions, and α and β are vectors of population parameters. The random effects a_i and b_i follow normal distributions with means α and β and covariance matrices A and B, respectively. The random change point τ_i is assumed to follow a zero-inflated exponential distribution with parameters π and λ. The within-individual variance is denoted σ_e². The functions x_{-} and x_{+} are defined as before. We assume that a_i and b_i are independent, and that both are independent of τ_i, which is introduced externally.
To estimate and infer model (2) for left-censored data, we employ a likelihood-based estimation procedure using the observed data {(q_ij, c_ij), i = 1, …, n, j = 1, …, n_i}, where c_ij is the censoring indicator such that y_ij is observed if c_ij = 0 and left-censored if c_ij = 1. That is, y_ij = q_ij if c_ij = 0, and y_ij ≤ q_ij (= d) if c_ij = 1, where d is the detection limit. Extension to the "doubly-censored" case, in which the response may be either left-censored or right-censored, is straightforward.
Let θ = (α, β, A, B, σ_e², π, λ) denote the collection of all unknown parameters, and let f(·) be a generic density function, with f(X | Y) denoting the conditional density of X given Y. The observed-data likelihood is given by
L(\theta) = \prod_{i=1}^{n} \int \Big\{ \prod_{j=1}^{n_i} f(y_{ij} \mid a_i, b_i, \tau_i; \theta)^{1 - c_{ij}} \, F(d \mid a_i, b_i, \tau_i; \theta)^{c_{ij}} \Big\} f(a_i) f(b_i) f(\tau_i) \, \mathrm{d}a_i \, \mathrm{d}b_i \, \mathrm{d}\tau_i, \quad (3)
where F(d \mid a_i, b_i, \tau_i; \theta) = \int_{-\infty}^{d} f(y_{ij} \mid a_i, b_i, \tau_i; \theta) \, \mathrm{d}y_{ij}. Directly maximizing the likelihood (3) is challenging due to the presence of mixed-type distributions, nonlinear models, and nested integrals. Numerical methods, e.g., Gauss–Hermite quadrature [16], can be computationally prohibitive. We therefore resort to EM algorithm-based methods. By treating y_cen,i, the censored component of y_i, and the unobserved random effects a_i, b_i, and τ_i as "missing data", we have the "complete data" (y_i, a_i, b_i, τ_i), i = 1, …, n. The complete-data log-likelihood function for individual i is expressed as
\ell_c^{(i)}(\theta) = \log f(y_i \mid a_i, b_i, \tau_i; \theta) + \log f(a_i; A) + \log f(b_i; B) + \log f(\tau_i; \theta).

3.2. The Estimation Procedure

The EM algorithm, introduced by [17], is a classical approach for estimating the parameters of models with unobserved or incomplete data. We briefly review the principle. Denote by z the vector of unobserved data, by (y, z) the complete data, and by L_c(y, z; θ) the log-likelihood of the complete data; the EM algorithm maximizes the function Q(θ | θ′) = E(L_c(y, z; θ) | y; θ′) in two steps. At the k-th iteration, the E-step evaluates Q^{(k)}(θ) = Q(θ | θ^{(k−1)}), and the M-step updates θ^{(k−1)} by maximizing Q^{(k)}(θ).
For cases where the E-step has no analytic form, ref. [18] introduced the MCEM algorithm, which computes the conditional expectations in the E-step via many simulations within each iteration and is hence computationally intensive; the choice of replicate size is the central issue in guaranteeing convergence. Ref. [10] introduced a stochastic version of the EM algorithm, the StEM, which replaces the E-step with a single imputation of the complete data and then averages the last batch of M estimates in the iterative sequence to obtain the point estimate of the parameters. The imputed data z^{(k)} at the k-th iteration are a random draw from the conditional distribution of the missing data given the observed data and the parameter values estimated at the (k−1)-th iteration, f(z^{(k)} | y, θ^{(k−1)}). As z^{(k)} depends only on z^{(k−1)}, {z^{(k)}}_{k≥1} is a Markov chain. Provided that z^{(k)} takes values in a compact space and the kernel of the Markov chain is positive and continuous with respect to Lebesgue measure, the Markov chain is ergodic, which ensures the existence of a unique stationary distribution [19,20].
In extending the StEM algorithm to the ZIE-based nonlinear random change point model, the imputation step is a crucial part of the process. At the k-th iteration, we aim to draw the missing data (y_{cen,i}^{(k)}, a_i^{(k)}, b_i^{(k)}, τ_i^{(k)}), where direct sampling from the joint conditional distribution is often intractable. To address this, we employ a Metropolis-within-Gibbs sampler, wherein each component is updated conditionally. For variables with tractable full conditional distributions (e.g., censored outcomes), we use standard Gibbs updates. For components lacking closed-form conditionals, we embed Metropolis–Hastings steps within the Gibbs framework to sample from the appropriate target distributions. Unlike the EM or MCEM algorithms, this procedure does not require monotonic increases in the likelihood, but instead ensures that the Markov chain explores the parameter space in a way consistent with the joint posterior distribution [21,22].
As an example, after initializing θ^{(0)} and (y_{cen,i}^{(0)}, a_i^{(0)}, b_i^{(0)}, τ_i^{(0)}), we update y_{cen,i}^{(k)}, k = 1, 2, …, as follows:
Step 1:
simulate y_{cen,i}^{*} from TN(d, μ_i^{(k−1)}, Σ_i^{(k−1)}), a multivariate truncated normal distribution with
  • d the upper bound of the truncation (left-censored values lie below the detection limit),
  • mean μ_i^{(k−1)} = (μ_{i,1}^{(k−1)}, …, μ_{i,n_i}^{(k−1)}), where μ_{ij}^{(k−1)} = g((t_{ij} − τ_i^{(k−1)})_{−}, a_i^{(k−1)}) + h((t_{ij} − τ_i^{(k−1)})_{+}, b_i^{(k−1)}), j = 1, …, n_i,
  • and variance Σ_i^{(k−1)} = σ_e^{2(k−1)} I_{n_i}, where I_{n_i} is the n_i × n_i identity matrix,
and, independently, sample η from the Uniform(0, 1) distribution;
Step 2:
calculate ρ = exp{ℓ_c^{(i)}(θ^{(k−1)} | y_{cen,i}^{*}, y_{obs,i}, a_i^{(k−1)}, b_i^{(k−1)}, τ_i^{(k−1)}) − ℓ_c^{(i)}(θ^{(k−1)} | y_{cen,i}^{(k−1)}, y_{obs,i}, a_i^{(k−1)}, b_i^{(k−1)}, τ_i^{(k−1)})};
Step 3:
if η ≤ ρ, update y_{cen,i}^{(k)} = y_{cen,i}^{*}; otherwise, set y_{cen,i}^{(k)} = y_{cen,i}^{(k−1)}.
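The update above can be sketched in R as follows. This is a simplified, hypothetical implementation: with the diagonal Σ_i, the multivariate truncated normal factorizes into univariate draws (handled here by the truncnorm package used in our implementation), and cdll stands in for the complete-data log-likelihood ℓ_c^{(i)} as a function of the censored values, with all other quantities held at iteration k − 1.

library(truncnorm)  # rtruncnorm() for truncated normal draws

## One Metropolis-Hastings update of the censored components for subject i.
## `mu_cen` holds the model means at the censored visits, `d` the detection
## limit, and `cdll(ycen)` the complete-data log-likelihood as a function of
## the censored values only (a closure over the other current quantities).
update_ycen <- function(ycen_old, mu_cen, sigma_e, d, cdll) {
  ## Step 1: propose below the detection limit (left-censoring)
  ycen_star <- rtruncnorm(length(ycen_old), a = -Inf, b = d,
                          mean = mu_cen, sd = sigma_e)
  eta <- runif(1)
  ## Step 2: likelihood ratio via the difference of log-likelihoods
  rho <- exp(cdll(ycen_star) - cdll(ycen_old))
  ## Step 3: accept the proposal or retain the previous imputation
  if (eta <= rho) ycen_star else ycen_old
}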
The maximization step of the StEM algorithm maximizes the complete-data log-likelihood Σ_{i=1}^{n} ℓ_c^{(i)}(θ | y_{obs,i}, y_{cen,i}, a_i, b_i, τ_i) to update the parameters {α, β, A, B, σ_e², π, λ} given the current imputed missing data. Since (y_{cen,i}, a_i, b_i, τ_i) are regarded as data, the complete-data log-likelihood no longer involves integrals, which substantially simplifies the maximization. Solving the corresponding score equations yields the following estimates:
\alpha = \frac{1}{n}\sum_{i=1}^{n} a_i, \quad \beta = \frac{1}{n}\sum_{i=1}^{n} b_i, \quad A = \frac{1}{n}\sum_{i=1}^{n} (a_i - \alpha)^T (a_i - \alpha), \quad B = \frac{1}{n}\sum_{i=1}^{n} (b_i - \beta)^T (b_i - \beta),
\sigma_e^2 = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{n_i} \sum_{j=1}^{n_i} \big\{ y_{ij} - g((t_{ij} - \tau_i)_{-}, a_i) - h((t_{ij} - \tau_i)_{+}, b_i) \big\}^2.
The likelihood of the change points, combining the probability mass and density components of the ZIE distribution, can be written as
L(\tau_1, \ldots, \tau_n; \pi, \lambda) = \prod_{\{i:\, \tau_i = 0\}} \pi \prod_{\{i:\, \tau_i > 0\}} (1 - \pi)\, \lambda \exp(-\lambda \tau_i).
Denoting I_i = 1 if τ_i = 0 and I_i = 0 otherwise, we have
\log L = \sum_{i=1}^{n} I_i \log \pi + \sum_{i=1}^{n} (1 - I_i) \log\big\{ (1 - \pi)\, \lambda \exp(-\lambda \tau_i) \big\}.
Solving the score equations ∂ log L/∂π = 0 and ∂ log L/∂λ = 0 yields
\pi = \frac{1}{n} \sum_{i=1}^{n} I_i, \qquad \lambda = \frac{\sum_{i=1}^{n} (1 - I_i)}{\sum_{i=1}^{n} \tau_i}.
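The M-step thus reduces to moment calculations. A sketch in R, assuming a is an n-vector of imputed pre-change point slopes, b an n × q matrix of imputed post-change point effects, tau the n-vector of imputed change points, and resid2 a list of per-subject squared-residual vectors:

## Closed-form M-step updates from the current imputed complete data.
m_step <- function(a, b, tau, resid2) {
  n          <- length(a)
  alpha      <- mean(a)
  beta       <- colMeans(b)
  A          <- sum((a - alpha)^2) / n             # scalar variance here
  B          <- crossprod(sweep(b, 2, beta)) / n   # q x q covariance (MLE, 1/n)
  sigma2     <- mean(sapply(resid2, mean))         # (1/n) sum_i (1/n_i) sum_j e_ij^2
  pi_hat     <- mean(tau == 0)                     # fraction at the point mass
  lambda_hat <- sum(tau > 0) / sum(tau)            # exponential rate MLE
  list(alpha = alpha, beta = beta, A = A, B = B,
       sigma2 = sigma2, pi = pi_hat, lambda = lambda_hat)
}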
In general, increasing the number of random effects does not substantially increase the complexity of the maximization step. However, the imputation step will be more complicated, as this increases the dimensions of missing data that need to be imputed.
As with the likelihood defined in (3), the Fisher information matrix of the ZIE-based nonlinear random change point model has no closed-form solution. To obtain the variance-covariance matrix of the MLE θ̂, we consider the approximate formula of [23]. Denote the score function of the complete-data log-likelihood by S_c^{(i)} = ∂ℓ_c^{(i)}/∂θ. Then an approximate formula for the variance-covariance matrix of θ̂ is
\mathrm{Cov}(\hat{\theta}) = \Big[ \sum_{i=1}^{n} E\big(S_c^{(i)} \mid y_{obs,i}; \hat{\theta}\big) \, E\big(S_c^{(i)} \mid y_{obs,i}; \hat{\theta}\big)^T \Big]^{-1},
where the conditional expectation given the observed data can be approximated by the mean over the Monte Carlo samples of the missing data.
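A sketch of this approximation in R, under the assumption that a hypothetical cdll_i(theta, mis) returns the complete-data log-likelihood for subject i at imputed missing data mis, and that draws[[i]] holds post-convergence Monte Carlo imputations for subject i; numerical differentiation (numDeriv) stands in for analytic scores:

library(numDeriv)  # grad() for numerical score evaluation

cov_stem <- function(theta_hat, cdll_i, draws) {
  p    <- length(theta_hat)
  info <- matrix(0, p, p)
  for (i in seq_along(draws)) {
    ## E(S_c^(i) | observed data) approximated by averaging the
    ## complete-data score over the Monte Carlo imputations
    scores <- sapply(draws[[i]], function(mis)
      grad(function(th) cdll_i(th, mis), theta_hat))
    s_i  <- rowMeans(scores)
    info <- info + tcrossprod(s_i)   # s_i %*% t(s_i)
  }
  solve(info)  # approximate Cov(theta_hat)
}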

3.3. Convergence Diagnosis

Determining convergence is a critical aspect of the StEM algorithm, yet it remains an open question in the literature. The most commonly used approach for convergence diagnostics involves visual examination of trace plots [24,25,26]. Recently, ref. [27] proposed a Geweke statistic-based method, which we adopt in our implementation for a more rigorous assessment of convergence. Specifically, for each run, after initializing the Markov chain with the specified initial values, we determine stationarity using a batch procedure based on the Geweke statistic [28]. A Geweke statistic is computed at each increment of w iterations using a moving window with batch size M. The procedure consists of the following steps:
1. Initialization. Set B = 0 and run the StEM algorithm to obtain the initial series of estimates {θ^{(wB+1)}, …, θ^{(wB+M)}}.
2. Check stationarity. For each entry p of θ, compute the Geweke statistic z_p from the Markov chain {θ_p^{(wB+1)}, …, θ_p^{(wB+M)}}. The Geweke statistic is the standardized difference between the means of the first p_1 and last p_2 portions of the chain, where p_1 and p_2 can be fine-tuned for a specific application. We consider stationarity to be reached when all |z_p| are sufficiently small, i.e., Σ_{p=1}^{P} z_p² < εP, where P is the total number of parameters and ε is another tuning parameter.
3. Update. If stationarity is not reached, perform w additional runs of the chain, increase B by 1, and repeat step 2.
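The batch check can be sketched with the coda package (an assumption of this sketch, not necessarily our implementation), whose geweke.diag() computes the z_p statistics with window fractions corresponding to p_1 and p_2:

library(coda)  # geweke.diag() for the Geweke statistic

## `chain` is the M x P matrix of the most recent batch of parameter draws;
## returns TRUE when sum_p z_p^2 < eps * P, the stopping rule above.
is_stationary <- function(chain, p1 = 0.1, p2 = 0.5, eps = 1.5) {
  z <- geweke.diag(mcmc(chain), frac1 = p1, frac2 = p2)$z
  sum(z^2) < eps * ncol(chain)
}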

4. Data Analysis

The HIV clinical cohort database (HCCD), maintained by the Einstein-Rockefeller-CUNY Center for AIDS Research, contains de-identified data on people living with HIV who receive care at hospitals and clinics affiliated with Montefiore Medical Center, the largest provider of HIV care in the Bronx, New York City. Patients in the HCCD are demographically representative, with respect to age, sex, race/ethnicity, and HIV transmission risk, of the overall population of people living with HIV in the Bronx as described by public health surveillance data [29].
For this study, we included all patients living with HIV in the HCCD diagnosed between 2005 and 2015 with last follow-up by 31 December 2017. Additional inclusion criteria were age ≥ 13 years and at least two HIV-1 viral loads recorded during the period. The final analytic dataset contains 2475 persons with a median of 5 viral load measurements per person (interquartile range, 3 to 11). Notably, approximately 60% of the viral load measurements were below the detection limits.
In addition to the primary model (2), we also fit a model in which the post-ART segment follows a two-compartment model [12], commonly used for viral load after treatment. We therefore have the following two random change point models:
M1: y_{ij} = a_{1i}(t_{ij} - \tau_i)_{-} + \log_{10}\!\big(b_{1i} e^{-b_{2i}(t_{ij} - \tau_i)_{+}} + b_{3i}\big) + e_{ij},
M2: y_{ij} = a_{1i}(t_{ij} - \tau_i)_{-} + \log_{10}\!\big(b_{1i} e^{-b_{2i}(t_{ij} - \tau_i)_{+}} + b_{3i} e^{-b_{4i}(t_{ij} - \tau_i)_{+}}\big) + e_{ij},
where M1 allows an individual-specific baseline value b_{3i} for the second phase of viral decay, while M2 also captures the second-phase viral decay rate through the random effect b_{4i}. Our preliminary analysis indicates that it is sufficient to model the pre-change point segment with a linear mixed-effects model and the post-change point segment with a diagonal variance-covariance random-effects structure. As a result, we make use of the following assumptions:
M1: a_i \sim N(\alpha, \sigma_A^2), \quad b_i \sim N\big((\beta_1, \beta_2, \beta_3)^T, \mathrm{diag}(\sigma_{B11}^2, \sigma_{B22}^2, \sigma_{B33}^2)\big), \quad \tau_i \sim \mathrm{ZIE}(\pi, \lambda), \quad e_{ij} \sim N(0, \sigma_e^2),
M2: a_i \sim N(\alpha, \sigma_A^2), \quad b_i \sim N\big((\beta_1, \beta_2, \beta_3, \beta_4)^T, \mathrm{diag}(\sigma_{B11}^2, \sigma_{B22}^2, \sigma_{B33}^2, \sigma_{B44}^2)\big), \quad \tau_i \sim \mathrm{ZIE}(\pi, \lambda), \quad e_{ij} \sim N(0, \sigma_e^2).
We implemented the StEM algorithm in R [30] with the following tuning parameters: (w, M, p_1, p_2, ε) = (10, 300, 0.1, 0.5, 1.5); we also restrict β_1 > β_3 for M2 to ensure the model is identifiable. This configuration typically yields convergence within 3000 iterations in most model-fitting runs. Owing to the stochastic nature of the StEM algorithm, the choice of starting values is quite flexible, and initial values were set randomly within the plausible range.
To simulate from the multivariate truncated normal distribution for the left-censored viral loads, we utilized the R package truncnorm [31]. For the ZIE-distributed change points, we set the value to zero with the current estimated probability π and, with probability 1 − π, drew it from an exponential distribution with the current estimated rate λ.
Figure 1 and Figure 2 display the trace plots of the Markov chains for each parameter under models M1 and M2, respectively. Table 1 summarizes the estimation results for the fixed effects and dispersion parameters of both models. The estimates are comparable between M1 and M2, except for β_3, which is necessarily smaller in M1 due to the single baseline parameter in the one-compartment model. In contrast, this baseline value is estimated to be larger in M2, where it, along with the viral decay rate β_4, captures the dynamics of the viral trajectory during the second phase. Both models estimate a similar positive slope of viral load before ART initiation (0.43 for M1 and 0.42 for M2), and similar proportions of individuals who started ART at time zero (the time of diagnosis): 31% for M1 and 32% for M2.
We also assess the performance of our model by comparing the observed viral load values with the predicted values for each individual in the HCCD. Individual random effects, including the change point, are predicted using the conditional mean, which is obtained by averaging the parameters from extra iterations of the imputation step after convergence.
To further contextualize our model-based estimates, we also implemented an approach denoted log1plus*, adapted from the rule-based algorithm developed by [1], which uses viral load surveillance data to infer ART initiation timing. Specifically, log1plus* detects ART initiation when there is a decline of more than one log10 unit in viral load between two consecutive measurements occurring within a defined time window (e.g., three months). Additionally, ART initiation is inferred when a subject transitions from being detectable to being undetectable (i.e., a left-censored measurement). While the original method was primarily used to identify ART initiation in a subset of individuals affected by the treatment, log1plus* generalizes this logic to the entire sample. Furthermore, instead of imputing ART initiation at the midpoint between the two relevant viral load measurements, log1plus* assigns the initiation time to the earlier measurement in the pair, as sketched below. This adjustment is biologically motivated, reflecting the expectation that viral load suppression begins shortly after ART initiation.
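A sketch of the log1plus* rule in R is given below; the three-month window (0.25 years) is one choice of the tuning value mentioned above.

## log1plus*: scan consecutive viral load pairs and return the time of the
## earlier measurement of the first qualifying pair, or NA if none qualifies.
## `t` is in years, `logvl` the log10 viral loads, and `cens` the
## left-censoring indicators (1 = below the detection limit).
log1plus_star <- function(t, logvl, cens, window = 0.25) {
  for (j in seq_len(length(t) - 1)) {
    drop_gt1 <- (logvl[j] - logvl[j + 1] > 1) && (t[j + 1] - t[j] <= window)
    to_undet <- cens[j] == 0 && cens[j + 1] == 1
    if (drop_gt1 || to_undet) return(t[j])  # assign the earlier time of the pair
  }
  NA_real_  # no ART initiation detected
}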
Figure 3 showcases the findings for nine selected individuals, chosen to represent typical patterns. For each individual, we present the predicted trajectories from M1 and M2 in addition to the observed viral loads. The predicted change point and the corresponding viral load at the time of ART initiation are indicated on the plot with different symbols for the two models. The ART initiation predicted by log1plus* is also highlighted. Note that the trajectories produced by the two models exhibit slightly different trends, with some trajectories subject to censoring (IDs 4 to 9) and others without censoring (IDs 1, 2, and 3).
Our model fits the fully observed data quite well and predicts a reasonable ART initiation time. For example, the change point is estimated right before the decline of the viral load for ID 1, while for ID 3, the change point is estimated beyond the last observed viral load, where the observed viral loads are ever-increasing, indicating no ART initiation. For ID 2, the ART initiation time occurs at about the time when the viral load starts to decline.
In contrast to ID 1, ID 4 has only one fully observed viral load; the other is left-censored at around 4.8 months (0.4 years). Based on this information, M1 predicts that the viral trajectory dipped under the detection limit around 1.2 months (0.1 years), while M2 places the time of viral suppression around 2.4 months (0.2 years). Nevertheless, both models predict ART initiation at around the time the observed viral load was recorded, which is reasonable given the subsequent viral suppression.
For IDs 5 and 6, there are two fully observed viral loads before viral suppression. The values and the positions of the two viral loads influenced the shape of the entire trajectory. For example, compared to the slope of ID 6, the slope of the two viral loads for ID 5 is relatively flat; therefore, both M1 and M2 predict a flat pre-change point line.
IDs 7, 8, and 9 illustrate cases where more fully observed viral loads are available. Here again, our model fits those data points quite well, and the predicted ART initiation times are reasonable. Importantly, ID 7 represents cases where the ART initiation time is estimated to be zero, the time of HIV diagnosis; in such cases, the random change point model effectively degenerates to an NLME model. Similarly, ID 3 represents a case in which the change point model degenerates to an LME model, as the change occurs beyond the last observed viral load.
Visual inspection of Figure 3 reveals that the ART initiation times estimated by the log1plus* method generally align with the model-based predictions (M1 and M2) when viral load measurements show a sharp and well-timed decline (e.g., IDs 4, 5, and 7). However, notable discrepancies occur in cases with sparse measurements or censoring. For instance, for IDs 6 and 8, the log1plus* method identifies ART initiation earlier than both M1 and M2, likely due to its reliance on observed declines rather than inferred trajectories. These differences underscore the advantage of the model-based approach in accommodating censoring, nonlinear post-ART dynamics, and between-subject variability when estimating ART initiation timing.

5. Simulation Studies

The performance of the StEM algorithm was evaluated through simulation. We designed two simulation studies, inspired by the real data analysis, to assess and compare estimation accuracy under the two specifications for the post-ART segment.
We model irregular viral load recording times since HIV diagnosis using a progressive state-transition model, assuming a first-order Markov process. This means that the length of time between viral load records depends on the previous recording time. To generate stochastic measurement times (t_{i1}, t_{i2}, …, t_{in_i}), we use parameters obtained from fitting the model to the actual viral load test dates in the HCCD. Specifically, we assume that the gap between viral load measurement times follows an exponential distribution with rate ξ > 0; given the previous recording time u > 0, the next recording time is T | u = −(1/ξ) log(X) + u, where X ~ Uniform(0, 1). For this simulation, we set ξ = 1.6, estimated from the real data. Recording terminates once the time exceeds 3 years, mimicking the real data. For left-censoring, in addition to simulating 60% censoring as in the real data, we also run the model without any censoring to assess the StEM algorithm under an ideal scenario.
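The visit-time generator can be sketched as follows; the inverse-transform step −log(X)/ξ produces the exponential gaps.

## Simulate one subject's recording times: exponential gaps with rate xi,
## stopping once follow-up exceeds max_t years (3 years, as in the HCCD).
sim_times <- function(xi = 1.6, max_t = 3) {
  times <- numeric(0)
  u <- 0
  repeat {
    u <- u - log(runif(1)) / xi   # T | u = -(1/xi) log(X) + u
    if (u > max_t) break
    times <- c(times, u)
  }
  times
}

set.seed(2)
sim_times()   # irregular visit times for one simulated subject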
To efficiently manage computational resources, each simulated dataset included 1000 individuals. To thoroughly evaluate the accuracy and bias of the estimated parameter values, we conducted 200 simulation runs for each scenario. Across these replicates, we computed the bias and the mean squared error (MSE) by comparing the estimates with the true values:
\mathrm{Bias} = \frac{1}{S} \sum_{s=1}^{S} \big(\hat{\theta}^{(s)} - \theta_{\mathrm{true}}\big), \qquad \mathrm{MSE} = \frac{1}{S} \sum_{s=1}^{S} 100 \times \frac{\big(\hat{\theta}^{(s)} - \theta_{\mathrm{true}}\big)^2 + \mathrm{SE}\big(\hat{\theta}^{(s)}\big)}{|\theta_{\mathrm{true}}|}.
Here, S is the number of replicates, and θ̂^{(s)} is the estimate from simulation s.
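Given an S × P matrix of estimates est, a matching matrix of standard errors se, and the true parameter vector truth, the two summaries can be computed as in this sketch:

## Replicate summaries used in Tables 2 and 3.
bias_mse <- function(est, se, truth) {
  dev  <- sweep(est, 2, truth)                                # theta_hat - theta_true
  bias <- colMeans(dev)
  mse  <- colMeans(sweep(100 * (dev^2 + se), 2, abs(truth), "/"))
  rbind(Bias = bias, MSE = mse)
}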
Table 2 and Table 3 display the simulation results for studies 1 and 2, respectively. There is no major performance difference between M1 and M2. For each model, the fixed effects are generally estimated better than the dispersion parameters. The largest MSE under M1 occurred in the estimation of σ_A² when 60% of viral loads are left-censored. For M2, σ_B11² is estimated with the largest bias, while β_4, the second-phase decay rate, is estimated with the largest MSE. Such sub-optimal performance is likely due to sparse observation frequency, with insufficient data available to support more precise estimation.

6. Conclusions and Discussion

In this paper, we extend the StEM algorithm by incorporating a combination of Gibbs sampling and Metropolis–Hastings steps to address the complexity of modeling individual-specific change points in longitudinal data. Specifically, we model change points using a zero-inflated exponential (ZIE) distribution, allowing us to capture both immediate and delayed antiretroviral therapy (ART) initiation. We also generalize standard random change point models by incorporating nonlinear mixed-effects (NLME) specifications before and after the change point.
Although our algorithm uses MCMC-based imputation steps reminiscent of Bayesian methods, it is embedded within a maximum likelihood framework. This hybrid structure avoids full posterior sampling—particularly of high-dimensional variance components—resulting in improved computational efficiency and estimation stability. It also preserves desirable asymptotic properties of maximum likelihood estimators while allowing for flexible inference via stochastic approximation. In addition, although our primary emphasis has been on point estimation, the framework supports approximate inference via a score-based variance-covariance estimator, enabling the construction of confidence intervals and facilitating hypothesis testing for model parameters.
Our method is evaluated through simulation studies and real data analysis. Compared to empirical rule-based approaches, which often suffer from instability due to measurement error or sparse sampling, our model-based framework leverages the full structure of the data to yield more reliable estimates. Application to clinical HIV cohort data demonstrates the utility of the method, revealing that a substantial proportion of individuals initiate ART immediately upon diagnosis—a biologically meaningful finding in light of “test-and-treat” policies.
Beyond statistical modeling, the clinical context of ART initiation is critical. While immediate initiation is now the recommended standard, delays remain common due to factors such as patient hesitancy, anticipated side effects, co-occurring mental health or substance use issues, stigma, or gaps in follow-up care. By accommodating both immediate and delayed ART initiation, the zero-inflated change point model enhances public health relevance and interpretability.
On the computational side, our use of Metropolis-within-Gibbs sampling and Geweke diagnostics enables convergence within 3000 iterations, even for complex, high-dimensional missing data structures. While our framework is flexible enough to incorporate other zero-inflated distributions (e.g., gamma, Weibull, or log-normal), we limited this work to the exponential case to preserve computational tractability. Future work will explore more efficient variants of the algorithm, such as independent sample approaches [32], to accommodate these extensions.
Several methodological developments offer promising directions. One is the integration of machine learning (ML) techniques—for example, to model post-ART viral dynamics or improve scalability. While ML algorithms may enhance flexibility, they often lack the interpretability and inferential tools required for surveillance-oriented estimation tasks. Hybrid approaches combining ML with structured statistical models could offer the best of both worlds and merit future study.
Another important extension involves incorporating subject-level covariates (e.g., age, gender, or transmission risk) into the fixed effects components of the model. This would allow parameters such as α 1 , π , λ , β 1 , β 2 , and β 3 to vary across individuals. However, this generalization would require replacing the closed-form M-step with iterative procedures such as Newton–Raphson, significantly increasing computational demands—particularly in the presence of censoring and complex random effects. We are actively pursuing this extension using parallel computing strategies to support large-scale applications.
It will also be important to assess the robustness of our method under misspecified change point distributions. While we focused on the ZIE distribution here, the algorithm can, in principle, accommodate zero-inflated log-Gaussian, gamma, or Weibull alternatives, as mentioned above. Incorporating these would, however, increase the complexity of Metropolis–Hastings updates and reduce sampling efficiency. Simulation-based robustness assessments under alternative distributions are planned as the next step.

Author Contributions

Conceptualization, S.L.B., D.N. and H.Z.; methodology, H.Z.; preliminary data preparation, M.R.; data curation, D.B.H. and U.R.F.; writing—original draft preparation, review and editing, L.W. and H.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by NIH grant R21AI147933. It was also partially supported by the High-Performance Computing Center at the University of Kentucky. The Einstein-Rockefeller-CUNY Center for AIDS Research (P30-AI-124414) is supported by the following NIH Co-Funding and Participating Institutes and Centers: NIAID, NCI, NICHD, NHLBI, NIDA, NIDDK, NIGMS, NIMH, NIMHD, NIA, FIC, and OAR.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Braunstein, S.; Robertson, M.; Myers, J.; Abraham, B.; Nash, D. Increase in CD4+ T-Cell Count at the Time of HIV Diagnosis and Antiretroviral Treatment Initiation Among Persons with HIV in New York City. J. Infect. Dis. 2016, 214, 1682–1686.
  2. Dominicus, A.; Ripatti, S.; Pedersen, N.; Palmgren, J. A random change point model for assessing variability in repeated measures of cognitive function. Stat. Med. 2008, 27, 5786–5798.
  3. Cudeck, R.; Klebe, K. Multiphase mixed-effects models for repeated measures data. Psychol. Methods 2002, 7, 41–63.
  4. Rudoy, D.; Yuen, S.; Howe, R.; Wolfe, P. Bayesian change-point analysis for atomic force microscopy and soft material indentation. J. R. Stat. Soc. Ser. C (Appl. Stat.) 2010, 59, 573–593.
  5. Moss, A.; Juarez-Colunga, E.; Nathoo, F.; Wagner, B.; Sagel, S. A comparison of change point models with application to longitudinal lung function measurements in children with cystic fibrosis. Stat. Med. 2016, 35, 2058–2073.
  6. Buhule, O.; Choo-Wosoba, H.; Albert, P. Modeling repeated labor curves in consecutive pregnancies: Individualized prediction of labor progression from previous pregnancy data. Stat. Med. 2020, 39, 1068–1083.
  7. Naumova, E.; Musta, A.; Laird, N. Evaluating the impact of critical periods in longitudinal studies of growth using piecewise mixed effects models. Int. J. Epidemiol. 2001, 30, 1332–1341.
  8. Hall, C.; Ying, J.; Kuo, L.; Lipton, R. Bayesian and profile likelihood change point methods for modeling cognitive function over time. Comput. Stat. Data Anal. 2003, 42, 91–109.
  9. Muggeo, M.; David, C.; Robert, J.; Sona, D. Segmented mixed models with random changepoints: A maximum likelihood approach with application to treatment for depression study. Stat. Model. 2014, 14, 293–313.
  10. Diebolt, J.; Celeux, G. Asymptotic properties of a stochastic EM algorithm for estimating mixture proportions. Stoch. Model. 1993, 9, 599–613.
  11. Mei, Y.; Wang, L.; Holte, S. A comparison of methods for determining HIV viral set point. Stat. Med. 2008, 27, 121–139.
  12. Wu, H.; Ding, A. Population HIV-1 dynamics in vivo: Applicable models and inferential tools for virological data from AIDS clinical trials. Biometrics 1999, 55, 410–418.
  13. Grossman, Z.; Polis, M.; Feinberg, M. Ongoing HIV dissemination during HAART. Nat. Med. 1999, 5, 1099–1103.
  14. Perelson, A.; Neumann, A.; Markowitz, M. HIV-1 dynamics in vivo: Virion clearance rate, infected cell life-span, and viral generation time. Science 1996, 271, 1582–1586.
  15. Huang, D.; Hu, H.; Li, Y. Zero-inflated exponential distribution of casualty rate in ship. J. Shanghai Jiaotong Univ. (Sci.) 2019, 24, 739–744.
  16. Olver, F.; Lozier, D.; Boisvert, R.; Clark, C. Quadrature: Gauss–Hermite Formula. In NIST Handbook of Mathematical Functions; Cambridge University Press: Cambridge, UK, 2010.
  17. Dempster, A.; Laird, N.; Rubin, D. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 1977, 39, 1–38.
  18. Wei, G.; Tanner, M. A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms. J. Am. Stat. Assoc. 1990, 85, 699–704.
  19. Ip, E. A Stochastic EM Estimator in the Presence of Missing Data: Theory and Application. Ph.D. Dissertation, Stanford University, Stanford, CA, USA, 1994.
  20. Nielsen, S. The stochastic EM algorithm: Estimation and asymptotic results. Bernoulli 2000, 6, 457–489.
  21. Robert, C.; Casella, G. Introducing Monte Carlo Methods with R; Springer: New York, NY, USA, 2010.
  22. Baey, C.; Trevezas, S.; Cournède, P. A nonlinear mixed effects model of plant growth and estimation via stochastic variants of the EM algorithm. Commun. Stat. Theory Methods 2016, 45, 1643–1669.
  23. McLachlan, G.; Krishnan, T. The EM Algorithm and Extensions; Wiley: New York, NY, USA, 1997.
  24. Yang, F. A Stochastic EM Algorithm for Quantile and Censored Quantile Regression Models. Comput. Econ. 2018, 52, 555–582.
  25. Wang, R.; Bing, A.; Wang, C.; Hu, Y.; Bosch, R.; DeGruttola, V. A flexible nonlinear mixed effects model for HIV viral load rebound after treatment interruption. Stat. Med. 2020, 39, 2051–2066.
  26. Huang, R.; Xu, R.; Dulai, P.S. Sensitivity analysis of treatment effect to unmeasured confounding in observational studies with survival and competing risk outcomes. Stat. Med. 2020, 39, 3397–3411.
  27. Zhang, S.; Chen, Y.; Liu, Y. An improved stochastic EM algorithm for large-scale full-information item factor analysis. Br. J. Math. Stat. Psychol. 2020, 73, 44–71.
  28. Gelman, A.; Rubin, D. Inference from iterative simulation using multiple sequences. Stat. Sci. 1992, 7, 457–472.
  29. Felsen, U.R.; Hanna, D.B.; Ginsberg, M.S. The Einstein-Montefiore Center for AIDS Research HIV Integrated Clinical Database: Cohort development and description. In Proceedings of the New York City Epidemiology Forum, New York, NY, USA, 28 February 2014.
  30. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2024.
  31. Kotecha, J.; Djurić, P. Gibbs sampling approach for generation of truncated multivariate Gaussian random variables. In Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, Phoenix, AZ, USA, 15–19 March 1999; pp. 1757–1760.
  32. Karimi, B.; Lavielle, M.; Moulines, E. f-SAEM: A fast stochastic approximation of the EM algorithm for nonlinear mixed effects models. Comput. Stat. Data Anal. 2019, 141, 123–138.
Figure 1. Trace plots for all parameters in M1.
Figure 2. Trace plots for all parameters in M2.
Figure 3. Individual trajectory prediction plots. Open circles (∘) connected by a solid line represent the observed viral loads. Dashed lines and dash-dot lines indicate the predicted trajectories from M1 and M2, respectively. The detection limit (for IDs 4–9) is shown by a dotted line. A triangle (▲) marks the predicted ART initiation time and corresponding viral load under M1, while a solid circle (•) does so for M2. The vertical dash-dot line represents the ART initiation selected by the log1plus* method. Visual comparison shows that this method often yields estimates similar to the model-based predictions when there is a clear viral load decline, but may differ in cases with sparse or censored data (e.g., ID 6 or ID 8), highlighting the value of a model-based approach.
Table 1. Estimates from fitting the HCCD data with the StEM algorithm under the one-compartment (M1) and two-compartment (M2) specifications for the post-ART trajectory.

Method        α      β_1     β_2     β_3     β_4      π       λ
M1    est   0.43   10.51    3.92    1.69     —      0.31    1.11
      se    0.02    0.04    0.02    0.02     —      0.02    0.04
M2    est   0.42   10.52    3.89    2.72   −0.10    0.32    1.09
      se    0.02    0.03    0.02    0.02    0.07    0.02    0.04

Estimates of the variance components: M1: σ_A² = 0.49, (σ_B11², σ_B22², σ_B33²) = (4.61, 0.64, 1.92), σ_e² = 0.10; M2: σ_A² = 0.34, (σ_B11², σ_B22², σ_B33², σ_B44²) = (4.23, 0.80, 0.70, 0.50), σ_e² = 0.11.
Table 2. Simulation results on the performance of the StEM algorithm when the post-ART trajectory is modeled with the one-compartment model (M1).

θ         α      β_1     β_2     β_3    σ_A²   σ_B11²  σ_B22²  σ_B33²   σ_e²     π       λ
True    0.43   10.51    3.92    1.69    0.49    4.61    0.64    1.92    0.10    0.31    1.11
0% left-censored
Est     0.36   10.40    3.92    1.67    0.57    5.27    0.65    1.98    0.11    0.32    1.03
MSE     0.40    0.02    0.02    0.08    0.40    0.20    0.25    0.14    0.29    0.31    0.16
Bias   −0.07   −0.11    0.00   −0.02    0.08    0.66    0.01    0.06    0.03    0.01   −0.11
60% left-censored
Est     0.39   10.38    4.02    2.01    1.20    6.59    0.76    1.34    0.14    0.34    0.99
MSE     0.49    0.03    0.04    0.26    1.50    0.45    0.35    0.52    0.43    0.31    0.17
Bias   −0.04   −0.13    0.10    0.32    0.71    1.98    0.12   −0.58    0.05   −0.02   −0.09
Table 3. Simulation results on the performance of the StEM algorithm when the post-ART trajectory is modeled with the two-compartment model (M2).

θ         α      β_1     β_2     β_3     β_4    σ_A²   σ_B11²  σ_B22²  σ_B33²  σ_B44²   σ_e²     π       λ
True    0.42   10.52    3.89    2.72   −0.10    0.34    4.23    0.80    0.70    0.50    0.11    0.32    1.09
0% left-censored
Est     0.38   10.39    3.87    2.68   −0.47    0.37    5.82    0.90    1.42    0.50    0.12    0.32    1.07
MSE     0.33    0.03    0.04    0.05    0.39    0.39    0.44    0.23    1.14    0.25    0.56    0.33    0.18
Bias   −0.04   −0.13   −0.02   −0.04   −0.30    0.03    1.59    0.01    0.72    0.00    0.07   −0.00   −0.02
60% left-censored
Est     0.42   10.23    3.78    2.23   −0.57    0.97    8.16    1.08    1.06    1.00    0.15    0.29    1.15
MSE     0.45    0.04    0.05    0.60    3.50    1.93    0.94    0.32    0.83    1.25    0.57    0.34    0.21
Bias    0.00   −0.29   −0.11   −1.79   −1.80    0.63    3.93    0.25    0.42    0.59    0.09   −0.03    0.06