Abstract
In longitudinal studies, subjects are repeatedly observed at a set of distinct time points until the terminal event time. The time-varying coefficient model extends parametric methods and captures the dynamic trajectories of time-dependent covariate effects, thus describing how the longitudinal variable evolves over the observed time points. In this study, we propose a novel approach to the estimation of medical costs using a symmetric kernel smoothing method in a time-varying coefficient joint model. A smooth function of medical costs is derived by weighting the values of the longitudinal data at all distinct observed time points through a combination of the kernel method and the inverse probability weighting method. In the simulation study, we first specify the true time-varying coefficient functions, then generate random samples of covariates and censored survival times, from which the longitudinal response variables are produced. Numerical experiments are conducted by applying the proposed method, implemented in R, to the generated data, and the estimates of the parameters and non-parametric functions are compared across different settings. The numerical results show that the bias and model-based standard errors decrease as the sample size increases, so the performance improves with larger samples. The estimated functions in the model almost coincide with the true functions, as shown in the figures of the simulation study. Furthermore, the consistency of the proposed estimator is established through theoretical analysis. The proposed model is applied to a real-world data set acquired from a multicenter automatic defibrillator implantation trial (MADIT).
1. Introduction
In medical cost studies, researchers need to use appropriate methods to evaluate the average medical cost of a patient across their whole life. Due to censoring mechanisms, the survival function is generally not identifiable. Bang and Tsiatis [1] introduced a class of weighted estimators that appropriately account for censoring; although extensive simulation studies showed that the estimators perform well in finite samples, even with heavily censored data, the estimator is not efficient, and the computations are complex. Lin et al. [2] partitioned the entire time period of interest into a number of small intervals and estimated the average total cost to minimize the bias induced by censoring. Furthermore, the estimators were proven to be asymptotically normal. Huang and Lovato [3] formulated weighted log-rank statistics in a marked point process framework and developed the asymptotic theory. The above methods have been applied to estimate the cumulative mean function. However, in addition to censoring mechanisms, data often consist of longitudinal outcomes, time-dependent/independent covariates, event times, and censoring times. In longitudinal studies, subjects are repeatedly observed at a set of distinct time points until the terminal event time. The joint model [4] contains both a longitudinal sub-model, to express the effect on longitudinal measurements from time-dependent covariates, and a survival sub-model, to reveal the survival function association with the longitudinal part. Time-independent covariates are generally present in the survival sub-model due to the absence of repeated measurements.
Deng [5] considered a linear parametric regression model to describe longitudinal data and used the joint modeling technique and the inverse probability weighting method [6] to estimate the cumulative mean function. Although this method also handles time-dependent covariates and right-censored time-to-event data, it may still be restrictive, mainly because the effects of the covariates on the longitudinal outcomes are treated as time-independent constants.
Thus, some researchers have focused on non-parametric longitudinal sub-models (e.g., Hoover et al. [7]; Zhao et al. [8]; Li et al. [9]; Do et al. [10]). Numerous approaches to constructing non-parametric estimators have appeared in the recent literature, such as kernel, smoothing spline, regression spline, and wavelet-based methods. For instance, Eubank and Speckman [11] proposed a well-behaved non-parametric kernel regression model in a small-sample study with bias-corrected confidence bands and proved its asymptotic properties. Wu et al. [12] minimized a local least-squares criterion and obtained the asymptotic distributions of the kernel estimators. However, when the covariate dimension is high, smooth estimation in general multivariate non-parametric regression may require a large sample size, and the smoothing results may be difficult to interpret.
Several time-varying coefficient longitudinal models have been considered in recent studies. For instance, You et al. [13] modeled the time-varying coefficients with polynomial regression splines, proposed a mixed-effects model for multiple longitudinal outcomes using the local polynomial method, and provided tuning parameter and variable selection. However, this model can only be used with continuous longitudinal outcomes, whereas observed data tend to be discrete in many applications; moreover, time-independent covariates were not considered in [13]. A joint model with kernel-smoothed varying coefficients in the longitudinal sub-model can be used to estimate the cumulative mean function with discrete data. Therefore, in this study, we estimate the cumulative mean function with time-dependent/independent covariates using the kernel method in a time-varying coefficient joint model based on right/interval-censored history process data. In our method, the estimator of the mean state function is unbiased at time points where all subjects are observed, because smoothing with the kernel method does not change the values at the original sample points; thus, we can utilize all the available information without introducing bias. Moreover, in our simulation study, the estimator of the cumulative mean state function is almost equal to the value of the preassigned function at any time point.
The remainder of the paper is organized as follows: Section 2 establishes the joint varying coefficient model and proposes the estimators of time-varying coefficients based on our method. Section 3 demonstrates the feasibility of this method through numerical simulations. Section 4 applies the proposed model to a real-world data set from a multicenter automatic defibrillator implantation trial (MADIT). Section 5 discusses the influence of bandwidth h selection and concludes this paper. Finally, the appendices provide the proofs of the main results.
2. Estimation for a Time-Varying Coefficient Model
It is assumed that the history process data are right-censored. Let T denote the terminal event time and C denote the censoring event time. Let denote the state process, which is related to the time-dependent covariate and the time-independent covariate . The state process satisfies when . For , let and denote the true values of T and C for the ith subject. Further, denotes the censoring indicator. Assume that the censoring time is independent of the terminal event time and the state process. Let , the q-dimensional vector , and the p-dimensional vector be the observed histories of , , and for the ith subject.
Now, the time-varying coefficient joint model can be formed as follows:
where is the time-varying coefficient parameter, is the association coefficient linking the longitudinal outcome to the hazard of the event, is the baseline hazard function, which is assumed known in our model, and is the random error with mean zero. Further, it is assumed that is independent of the terminal event time T conditional on and W.
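For concreteness, one common way of writing such a joint model is sketched below; this is an illustrative reconstruction under assumed notation (the symbols Y_i, X_i, W_i, beta(t), gamma, alpha, lambda_0, and epsilon_i are our labels, not necessarily those of the original displayed equation).

```latex
% Illustrative form of the time-varying coefficient joint model (assumed notation):
\begin{aligned}
Y_i(t) &= X_i(t)^{\top}\beta(t) + \varepsilon_i(t),
  && \text{(longitudinal sub-model)}\\
\lambda_i(t) &= \lambda_0(t)\exp\bigl\{\gamma^{\top}W_i + \alpha\,Y_i(t)\bigr\},
  && \text{(survival sub-model)}
\end{aligned}
\qquad \text{with } \mathrm{E}\{\varepsilon_i(t)\}=0 .
```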
2.1. Estimation of Time-Varying Coefficient in Longitudinal Model
Note that the observations for the ith subject are not continuous but can only be obtained at certain discrete times , . The state process can be observed only until the event time . Thus, let denote the observed state history. For the sake of illustration, assume when .
For convenience, define sets for and , where are the observed distinct time points for all subjects. N denotes the number of all the distinct observed time points.
The estimator of the time-varying coefficient can be calculated by minimizing the following equation:
where
is the kernel weight function. The bandwidth, which is generally selected according to the observed time points, controls how the weight decreases with the distance between t and . A major distinction between the estimation in Equation (2) and that in Wu et al. [12] is the censoring indicator , which is necessary to handle data that are incomplete due to censoring.
Remark 1.
We assume that is the Epanechnikov kernel function. The cross-validation method can be used to select the appropriate kernel function. The kernel function with the smallest induced error is considered the best. More details on kernel function selection can be found in [14].
It is assumed that is also invertible. By minimizing Equation (3), can be expressed as the following q-dimensional column vector:
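A minimal R sketch of this closed-form kernel-weighted least-squares estimator is given below. The data layout (one row per subject visit, with columns id, time, y, a censoring indicator delta, and covariate columns such as x1, x2) and the function names are our own illustrative choices, not the authors' code; the Epanechnikov kernel follows Remark 1.

```r
# Epanechnikov kernel
epan_kernel <- function(u) 0.75 * pmax(1 - u^2, 0)

# Kernel-weighted least-squares estimate of the time-varying coefficient at t0.
# 'dat' has columns time, y, delta, and the covariates named in 'xvars'.
beta_hat_at <- function(t0, dat, xvars, h) {
  w  <- dat$delta * epan_kernel((dat$time - t0) / h) / h   # censoring-aware kernel weights
  X  <- as.matrix(dat[, xvars, drop = FALSE])
  WX <- X * w
  A  <- crossprod(WX, X)        # sum_i w_i x_i x_i^T, assumed invertible
  b  <- crossprod(WX, dat$y)    # sum_i w_i x_i y_i
  drop(solve(A, b))             # q-dimensional coefficient vector at t0
}

# Example: evaluate the coefficient path on the observed time grid
# grid      <- sort(unique(dat$time))
# beta_path <- t(sapply(grid, beta_hat_at, dat = dat, xvars = c("x1", "x2"), h = 1))
```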
The traditional method for bandwidth selection is K-fold cross-validation (CV), but time-varying coefficients complicate the calculation. However, the 'leave-one-subject-out' cross-validation proposed by Rice and Silverman [15] can be used in such scenarios. In this case, the kernel weight function is , and the estimator of the time-varying coefficient is . We minimize the following equation:
where is the kernel estimator computed with all measurements except those of the ith subject. The cross-validation bandwidth is then obtained as the minimizer.
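A sketch of this 'leave-one-subject-out' bandwidth selection in R, reusing the hypothetical beta_hat_at() helper from the previous sketch (again an illustration, not the authors' implementation):

```r
# Leave-one-subject-out CV score for a candidate bandwidth h: for each subject,
# refit the kernel estimator without that subject's measurements and accumulate
# the squared prediction errors on the subject's observed (uncensored) rows.
loso_cv_score <- function(h, dat, xvars) {
  score <- 0
  for (i in unique(dat$id)) {
    train <- dat[dat$id != i, ]
    test  <- dat[dat$id == i & dat$delta == 1, ]
    if (nrow(test) == 0) next
    for (j in seq_len(nrow(test))) {
      b_hat <- beta_hat_at(test$time[j], train, xvars, h)
      x_j   <- as.numeric(as.matrix(test[j, xvars]))
      score <- score + (test$y[j] - sum(x_j * b_hat))^2
    }
  }
  score
}

# h_grid <- seq(0.5, 3, by = 0.25)
# h_cv   <- h_grid[which.min(sapply(h_grid, loso_cv_score, dat = dat, xvars = c("x1", "x2")))]
```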
2.2. Estimation of Survival Model
Define , and then the estimate of is
The data related to the survival sub-model consist of .
For each given ,
The Cox partial likelihood function [4] is
The log-partial likelihood function for Equation (5) is
Then, replacing by the estimator , the estimator can be obtained by maximizing Equation (6).
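As an illustration of this step, the Cox sub-model with the smoothed longitudinal value entering as a time-dependent covariate can be fitted in counting-process form with the survival package; the data frame cox_dat and the variable names (id, start, stop, event, w, y_hat) are hypothetical stand-ins.

```r
library(survival)

# Counting-process (start, stop] format: one row per subject and time interval,
# with the fitted longitudinal value y_hat carried as a time-dependent covariate.
fit_cox <- coxph(Surv(start, stop, event) ~ w + y_hat, data = cox_dat)
summary(fit_cox)   # regression coefficient and association parameter estimates
```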
We define the survival function of T as:
The estimate of can be obtained from the hazards function in joint model:
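Since the displayed formula is not reproduced here, a plausible form under the illustrative notation sketched after Equation (1) is the usual exponentiated cumulative hazard:

```latex
% Conditional survival implied by the joint model's hazard (illustrative notation):
S_i\bigl(t \mid \mathcal{X}_i, W_i\bigr)
  = \exp\Bigl\{-\int_0^{t}\lambda_0(s)\,
      \exp\bigl\{\gamma^{\top}W_i + \alpha\,Y_i(s)\bigr\}\,\mathrm{d}s\Bigr\}.
```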
Theoretically, the estimate depends on the values of the covariates for . Since is generally not observed continuously, this estimate cannot be derived even at the observed time points; it can be calculated only if can be observed continuously. Thus, we replace with the Kaplan–Meier estimator [16]. Alternatively, by the law of large numbers, the survival function can also be estimated as follows:
Moreover, the estimate of the survival function of for can be obtained in a similar way.
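The Kaplan–Meier plug-in for the marginal survival function can be computed directly with survfit(); the data frame surv_dat and the variable names surv_time and death are illustrative.

```r
library(survival)

# Kaplan-Meier estimator of the marginal survival function of T, used as a
# plug-in when the covariate processes are not observed continuously.
km_fit <- survfit(Surv(surv_time, death) ~ 1, data = surv_dat)
S_hat  <- stepfun(km_fit$time, c(1, km_fit$surv))   # step function S_hat(t)
# S_hat(5)   # estimated probability of surviving beyond t = 5
```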
2.3. Estimation of Cumulative Mean State Function
Suppose is the mean function of , that is, given , . Because of the censoring, the proposed estimator for the mean state function at any time is as follows:
where is the fitted value of at time point t.
Then, the cumulative mean function for any time point t can be obtained as:
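Once the mean state function has been fitted on a grid of time points, its cumulative version can be approximated numerically; the trapezoidal-rule sketch below is one such approximation (grid and mu_hat are placeholders for the evaluation times and fitted mean values, not quantities defined in the paper).

```r
# Cumulative mean function by the trapezoidal rule on an evaluation grid.
# grid:   increasing vector of time points
# mu_hat: fitted values of the mean state function at those points
cumulative_mean <- function(grid, mu_hat) {
  inc <- diff(grid) * (head(mu_hat, -1) + tail(mu_hat, -1)) / 2
  c(0, cumsum(inc))   # cumulative mean evaluated at each grid point
}

# M_hat <- cumulative_mean(grid, mu_hat)
```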
2.4. The Asymptotic Property of Estimators
Here, we discuss the asymptotic properties of the estimators. To state the results rigorously, we introduce some notation. Let denote the observed covariate processes, such as baseline information, study time, and so on; that is, . Then, let denote the longitudinal covariate history prior to time t and denote the response history prior to time t; that is, and . Furthermore, we use to denote the Euclidean norm and to denote the derivative of with respect to time t. It is assumed that all observed time points are independent of each other and follow a distribution with density .
Assumption 1.
is Lipschitz-continuous with order , for any and in support of and some , and and are Lipschitz-continuous with orders of and , respectively.
Assumption 2.
and are continuous.
Assumption 3.
is a square-integrable kernel that integrates to one and satisfies , , and , while and as .
Assumption 4.
The bandwidth satisfies for some constant .
Assumption 5.
for some
Theorem 1.
Under Assumptions (1)–(5), the estimator defined in Equation (4) is asymptotically multivariate normal for any as .
The asymptotic normality of at a fixed point , where , is established in Appendix A.
Remark 2.
Most of these assumptions are regularity conditions similar to those in Wu et al. [12]. Assumptions 1 and 2 are rigorous statements ensuring that is asymptotically positive-definite and invertible. Assumption 3 ensures that has compact support on R. Assumptions 4 and 5 provide the bandwidth rate and finite-moment conditions that complement Assumptions 1 and 2. More discussion of the asymptotic risk of the kernel estimators can be found in [12].
Theorem 2.
Assume that for , . Then the estimator defined in Equation (8) is an unbiased estimator for any .
The proof can be found in Appendix B.
Based on Zeng and Cai [17], the following assumptions are imposed on the joint model.
Assumption 6.
For any , the covariate process is fully observed, and conditional on , , and , the distribution of depends only on . Moreover, is continuously differentiable in and with probability one.
Assumption 7.
The censoring time C depends only on and for any conditional on , , and .
Assumption 8.
The full-rank matrix is positive-definite. Additionally, if there exists a constant vector satisfying for a deterministic function for all with positive probability, then and .
Assumption 9.
The true value of parameter satisfies for a known positive constant .
Assumption 10.
The baseline hazard function is bounded and positive in .
Assumption 11.
There exists a positive constant satisfying .
Remark 3.
Assumption 6 serves as a fundamental statement in joint models, indicating that the association between the history process and the survival time is due to observed covariate processes, such as baseline information, study time, and so on, denoted by . Assumption 7 means that there exist some appropriate measures such that the intensity function of exists. Assumption 8 is the identifiability assumption in a linear mixed-effects model. Assumptions 9–11 imply that, conditional on and , the probability of a subject surviving after time τ is at least some positive constant. Theorem 3.1 in Zeng and Cai [17] states the strong consistency of the maximum likelihood estimator. More discussions on the assumptions can be found in [17].
Theorem 3.
The proof can be found in Appendix C.
3. Simulation
In this section, some numerical results are presented. In our simulation, the joint model can be described as:
where is the state function, is the vector of covariates for the regression parameters, are the time-varying coefficients, w is the covariate for the regression parameter , is the association parameter, and .
The standard deviation (Std.dev) and the root mean square error (RMSE) of the overall estimates, computed in R 4.3.1 [18], are used to assess the quality of the estimators. We summarize the steps in the following procedure:
- Set the sample size n, the true function , the true value of parameters , and the rate of censoring r;
- Generate a random sample ;
- Derive the random sample of lifetime with the hazards function ;
- Generate a random sample of censoring ;
- Set ;
- Generate the random sample of the time-dependent covariates and the baseline hazard function ;
- For , generate the response variables ;
- Output the estimated function , the estimated values of the parameters and , the bias and the Std. Err. of the parameters and , and the RMSE of the estimated function .
We utilize packages of ‘MASS’, ‘splines’, ‘survival’, ‘nlme’, ‘JM’, ‘lattice’, ‘mvtnorm’, ‘tibble’ and ‘ggplot2’ in our simulation study.
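A compact R sketch of one way to carry out the data-generation steps of the procedure above is given below. For brevity it assumes a constant baseline hazard and exponential censoring, and it omits the dependence of the hazard on the longitudinal outcome; all function names, variable names, and numerical values are illustrative, not the simulation code used in the paper.

```r
set.seed(1)
n <- 200

# Time-independent covariate and illustrative true parameter values
w       <- rbinom(n, 1, 0.5)
gamma0  <- 0.5
lambda0 <- 0.1

# Survival times by inverse transform under a constant baseline hazard;
# censoring times from an exponential tuned to give a target censoring rate.
T_true <- rexp(n, rate = lambda0 * exp(gamma0 * w))
C      <- rexp(n, rate = 0.05)
obs_t  <- pmin(T_true, C)
death  <- as.integer(T_true <= C)
mean(1 - death)                          # empirical censoring rate

# Longitudinal responses at integer visit times up to the follow-up time,
# with a time-varying coefficient acting on a time-dependent covariate.
beta_fun <- function(t) sin(pi * t / 10) # illustrative true beta(t)
long_dat <- do.call(rbind, lapply(seq_len(n), function(i) {
  visits <- 0:floor(obs_t[i])
  x      <- rnorm(length(visits))
  data.frame(id = i, time = visits, x = x, delta = 1,
             y = x * beta_fun(visits) + rnorm(length(visits), sd = 0.5))
}))
```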
Now, we consider the following scenarios:
Scenario 1: Set , where , , , , , , , , , and .
Scenario 2: Set , where , , , , , , , , , and .
The following is the form based on the joint model:
After 1000 replications in R, we obtain the estimation results for different sample sizes. In each scenario, we control the censoring rate at approximately .
In Scenario 1, we set the true values of the parameters as and . In Scenario 2, we set the true values of the parameters as and . From the proposed estimators given in Equations (8) and (9), fitted values of the state function and the cumulative mean function at any time point can be computed.
Table 1 and Table 2 present the root mean square errors (RMSEs) of the estimated state function and cumulative mean function for different numbers of time points N in the two scenarios. The RMSE is small when because the estimators of are unbiased at the observed time points. Since the estimators are biased at other time points, the RMSE increases as N increases.
Table 1.
The estimates of RMSE with different numbers of N for scenario 1.
Table 2.
The estimates of RMSE with different numbers of N for scenario 2.
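The RMSE and bias summaries reported in these tables can be computed with helpers like the following (a sketch with hypothetical argument names; 'est' collects the estimates over replications and 'truth' holds the true values on the evaluation grid):

```r
# est:   (replications x N) matrix of estimated values on the evaluation grid
# truth: length-N vector of true values on the same grid
rmse_fun <- function(est, truth) sqrt(mean(sweep(est, 2, truth)^2))
bias_fun <- function(est, truth) colMeans(est) - truth
```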
Table 3 and Table 4 summarize the main findings for the fixed parameters. The results show that as the sample size increases, the bias and the model-based standard errors decrease, which agrees reasonably well with the empirical results; the performance improves with larger sample sizes. Note that it is common for the Std. Err. of to be very large in small samples for linear mixed-effects models, sometimes reaching (see Table 1 in [5]). In the semi-parametric model estimation based on polynomial regression, the Std. Err. of even reaches (see Tables 1 and 2 in [19]). Compared with these methods, the Std. Err. of in our paper is smaller.
Table 3.
The estimate results of the event process for scenario 1.
Table 4.
The estimate results of the event process for scenario 2.
Figure 1 and Figure 2 present the proposed estimated functions of as continuous curves and the true values of at the observed time points as a series of points. Figure 3 and Figure 4 present the estimated functions of obtained with the local polynomial regression method. In Scenario 1, our method is not significantly superior to polynomial regression. However, in Scenario 2, the polynomial regression method does not fit well, whereas our method produces estimates close to the true values. This is because is set up as a trigonometric function rather than a linear combination of power functions. Thus, while local polynomial regression is suitable for power basis functions, the kernel smoothing method fits the function better even in such cases.
Figure 1.
True values at observed time points and estimated function of for scenario 1.
Figure 2.
True values at observed time points and estimated function of for scenario 2.
Figure 3.
Polynomial regression estimator of for scenario 1.
Figure 4.
Polynomial regression estimator of for scenario 2.
Furthermore, Figure 5 and Figure 6 show the true curves and fitted values of the state function and the cumulative mean function . In each figure, the estimated functions closely approximate the true functions. Figure 7 and Figure 8 show the corresponding results for the local polynomial regression method. Similar to the previous conclusion, our proposed method works better.
Figure 5.
True values at observed time points and estimated function of and for scenario 1.
Figure 6.
True values at observed time points and estimated function of and for scenario 2.
Figure 7.
Polynomial regression estimator of and for scenario 1.
Figure 8.
Polynomial regression estimator of and for scenario 2.
4. An Application to MADIT Data
In this section, we validate the proposed estimator with a real data set from a multicenter automatic defibrillator implantation trial (MADIT). The MADIT data contain 181 subjects (patients) from 36 centers in the USA who were observed at a total of 134,853 discrete time points. Of the 181 patients, 89 had cardiac defibrillators implanted, while the other 92 did not. Throughout this section, we encode the 'implanted' group as and the 'not implanted' group as . Since the treatment (whether or not to implant a cardiac defibrillator, ICD) did not directly induce any medical costs but did affect the expected survival time, we consider it as the time-independent covariate in the survival sub-model.
The observed patients have six types of costs recorded daily from the start until death or censoring: Type 1: hospitalization and emergency department visits; Type 2: outpatient tests and procedures; Type 3: physician/specialist visits; Type 4: community services; Type 5: medical supplies; Type 6: medication. These cost types directly drive the total medical costs, so they are treated as time-dependent covariates in the longitudinal sub-model. It should be pointed out that the full daily records contain a very large number of data points, so the R code cannot handle the daily cost data directly. Therefore, we aggregate the data by combining every 12 days into one time unit.
These data also include the patients’ ID codes, observed survival times in days, and death indicators.
Now, in the data set, we encode the total 181 patients as follows:
- ID code (from 1 to 181);
- Treatment code (1 for ICD and 0 for conventional);
- Observation of survival time;
- Death indicator (1 for death, 0 for censored);
- Merged medical costs of types 1–6.
To analyze this data set, we describe the model as follows:
where for , if type = 1 and otherwise; for the ICD group and for the 'not implanted' group. The estimated parameters and in the survival sub-model are obtained by the Cox partial likelihood method, and they are asymptotically normal. The estimates of and in this paper are obtained by calling 'coxph()' and using 'method = "piecewise-PH-GH"' in the JM package. The Std. Err. and p-values are produced automatically.
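A sketch of how such a fit is typically obtained with the JM package is shown below; the data frames madit_long and madit_surv, the variable names (id, time_unit, log_cost, surv_time, death, icd), and the chosen spline basis are hypothetical stand-ins for the processed MADIT data, not the exact model formula used in the paper.

```r
library(splines)
library(JM)   # also loads nlme and survival

# Longitudinal sub-model for the log medical cost with a flexible time trend
lme_fit <- lme(log_cost ~ ns(time_unit, 3), random = ~ 1 | id, data = madit_long)

# Survival sub-model with the treatment indicator (1 for ICD, 0 for conventional)
cox_fit <- coxph(Surv(surv_time, death) ~ icd, data = madit_surv, x = TRUE)

# Joint model with a piecewise-constant baseline hazard (Gauss-Hermite quadrature)
joint_fit <- jointModel(lme_fit, cox_fit, timeVar = "time_unit",
                        method = "piecewise-PH-GH")
summary(joint_fit)   # Std. Err. and p-values for the association and treatment effects
```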
In this case, can be interpreted as the relationship between the natural logarithm of the medical cost and the time unit. The fitted curve is illustrated in Figure 9. Time-varying parameters describe the effect of the covariates on the state function over time more accurately; for example, after a certain time point, no longer affects the state function, although it does in the early stage.
Figure 9.
Estimated function of .
Table 5 shows the estimates of the survival sub-model. The association effect is positive, which corresponds to reality: patients in serious condition require more medical attention. In other words, serious illness leads to higher medical costs and, correspondingly, lower survival rates. This further supports that our model is efficient and reasonable. However, due to the large p-value of this estimate, the conclusion is not definitive and should be further tested with a larger sample in future research. The treatment effect is negative, which also corresponds to reality: automatic defibrillator implantation can reduce the risk of death. In other words, the 'implanted' group (ICD = 1) has a lower risk of death. This conclusion is supported by the small p-value of the estimate.
Table 5.
Estimate results of the event process.
Table 6 presents the estimated cumulative costs for 5 years and for the total treatment period. Comparing sub-figure (a) with sub-figure (b) in Figure 10 and Figure 11, we see that the fitted points for the mean medical cost based on the current approach better describe the detailed trajectories of medical costs. The fitted points of the cumulative mean medical cost based on the current approach exhibit a trend similar to the result of Li et al. [19]; however, our result is more accurate and shows continuous change, which, in turn, leads to an improved understanding of the real-world data.
Table 6.
Estimated cumulative costs for 5 years and total period.
Figure 10.
Fitted points of cost based on current approach and the semi-parametric estimating approach.
Figure 11.
Fitted points of cumulative cost based on current approach and the semi-parametric estimating approach.
Compared with parametric/semi-parametric estimation, our method suggests a smoother time-varying mean medical cost function, which is no longer a straight line with a constant change rate but instead exhibits more fluctuation over time. In practical situations, government agencies or insurance companies would not know which treatment a patient selected. The estimated cumulative mean function provides them with a more accurate statistical basis for macro-level allocation decisions; for example, they should anticipate that medical costs may suddenly increase or decrease at particular time points.
Moreover, the estimated marginal survival function is smoother than the Kaplan–Meier estimator. The result is similar to that of Li et al. [19], as shown in Figure 12. The solid line is the estimated survival rate, and the dotted lines represent the confidence interval of the estimated survival rate.
Figure 12.
Marginal survival compared with Kaplan–Meier estimator.
5. Discussion
Generally, some estimation methods can lead to biased estimates in the model, and the time-varying coefficient model compensates for this shortcoming. In this study, we estimate the time-varying coefficient using the kernel function technique. When the covariates are continuous, a continuous state function can be obtained, which in turn allows the continuous cumulative mean state function to be derived. In summary, our method is more precise because the kernel function smooths the estimator; more importantly, the proposed estimator is unbiased at the originally observed time points, as demonstrated by Theorem 2. In our numerical simulation and application analysis, we take the observation interval to be one unit and choose a bandwidth h of 1 in the kernel smoothing function. Notably, when the observation time interval is not 1 or involves interval censoring, an appropriate bandwidth must be chosen. Many studies on kernel smoothing have proposed bandwidth selection methods. In general, too large a bandwidth distorts the data, and too small a bandwidth does not yield a smooth continuous function. Moreover, as , reduces to a series of constants, that is, a time-independent parametric estimator.
In this study, we do not consider the random effect coefficients in the longitudinal data model, and the correlation that may exist among repeated observations within individuals is ignored. If we consider the random effect coefficients, we will not obtain Theorem 2, but we can still obtain Theorem 3. In future studies, random effects should be considered in longitudinal models, which may yield better results.
Furthermore, the estimation of the time-varying coefficient using the kernel function technique can be extended to multiple longitudinal outcomes, in which case the survival sub-model would require a multi-dimensional association coefficient. The methods proposed in [20] may be applicable.
Author Contributions
Conceptualization, S.L., D.D. and Y.H.; methodology, S.L. and D.D.; software, S.L. and D.D.; validation, D.D. and Y.H.; formal analysis, S.L.; investigation, S.L.; resources, D.D.; data curation, S.L. and D.D.; writing—original draft preparation, S.L.; writing—review and editing, D.D.; visualization, Y.H.; supervision, D.D. and Y.H.; project administration, D.D.; funding acquisition, Y.H. All authors have read and agreed to the published version of the manuscript.
Funding
Han’s work is partially supported by the National Natural Science Foundation of China (grant number 11871244); Deng’s work is partially supported by the Natural Sciences and Engineering Research Council of Canada.
Data Availability Statement
The data presented in this study are available upon request from the corresponding author.
Conflicts of Interest
The authors declare no conflicts of interest.
Appendix A. The Asymptotic Normality of
For each , denote
For all , denote
Then,
Appendix B. Proof of Theorem 2
Proof of Theorem 2.
For each ,
where
When ,
when ,
can be written as
Under the assumption that is independent of , and , it follows from the Gauss–Markov theorem that
Combining Equations (7) and (8), the estimator of mean state function at time point is
where is the fitted value of at time point from the joint model:
By Assumption , we have
Thus, for any , from Equation (A1), we have
This completes the proof of Theorem 2. □
Appendix C. Proof of Theorem 3
Proof of Theorem 3.
From Equation (8), we have
Under the assumption that is independent of T and C conditional on and W, from the law of large numbers,
For the second term , we have
By Assumption (6), for some positive constants . By Theorem 3.1 in Zeng and Cai [17] and Assumption (11), we have that
By Theorem 1 in Wu et al. [12] and Assumptions (1)–(5), we have that
Also, from the law of large numbers,
Similarly, by Assumption (9), Theorem 3.1 in Zeng and Cai [17], and Theorem 2 in Phadia and Ryzin [21], we have that
Now, converges to almost uniformly in .
Then, based on Egoroff's theorem in Bartle [22], converges to in probability. This completes the proof of Theorem 3. □
References
- Bang, H.; Tsiatis, A. Estimating medical costs with censored data. Biometrika 2000, 87, 329–343.
- Lin, D.Y.; Feuer, E.J.; Etzioni, R.; Wax, Y. Estimating medical costs from incomplete follow-up data. Biometrics 1997, 53, 419–434.
- Huang, Y.; Lovato, L. Tests for lifetime utility or cost via calibrating survival time. Stat. Sin. 2002, 12, 707–723.
- Rizopoulos, D. Joint Models for Longitudinal and Time-to-Event Data: With Applications in R; CRC Press: Boca Raton, FL, USA, 2012.
- Deng, D. Estimating the cumulative mean function for history process with time-dependent covariates and censoring mechanism. Stat. Med. 2016, 35, 4624–4636.
- Korn, E.L. On estimating the distribution function for quality of life in cancer clinical trials. Biometrika 1993, 80, 535–542.
- Hoover, D.R.; Rice, J.A.; Wu, C.O.; Yang, L. Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data. Biometrika 1998, 85, 809–822.
- Zhao, X.; Tong, X.; Sun, L. Joint analysis of longitudinal data with dependent observation times. Stat. Sin. 2012, 22, 317–336.
- Li, C.; Xiao, L.; Luo, S. Joint model for survival and multivariate sparse functional data with application to a study of Alzheimer's Disease. Biometrics 2022, 78, 435–447.
- Do, H.; Nandi, S.; Putzel, P.; Smyth, P.; Zhong, J. A joint fairness model with applications to risk predictions for under-represented populations. Biometrics 2022, 79, 826–840.
- Eubank, R.L.; Speckman, P.L. Confidence bands in nonparametric regression. J. Am. Stat. Assoc. 1993, 88, 1287–1301.
- Wu, C.O.; Chiang, C.T.; Hoover, D.R. Asymptotic confidence regions for kernel smoothing of a varying-coefficient model with longitudinal data. J. Am. Stat. Assoc. 1998, 93, 1388–1402.
- You, L.; Qiu, P. Joint modeling of multivariate nonparametric longitudinal data and survival data: A local smoothing approach. Stat. Med. 2021, 40, 6689–6706.
- Silverman, B.W. Density Estimation: For Statistics and Data Analysis; Chapman & Hall: London, UK, 2018.
- Rice, J.A.; Silverman, B.W. Estimating the mean and covariance structure nonparametrically when the data are curves. J. R. Stat. Soc. Ser. B Methodol. 1991, 53, 233–243.
- Kaplan, E.L.; Meier, P. Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 1958, 53, 457–481.
- Zeng, D.; Cai, J. Asymptotic results for maximum likelihood estimators in joint analysis of repeated measurements and survival time. Ann. Stat. 2005, 33, 2132–2163.
- R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2023.
- Li, S.; Deng, D.; Han, Y.; Zhang, D. Joint model for estimating the asymmetric distribution of medical costs based on a history process. Symmetry 2023, 15, 2130.
- Kenyon, J.R. Analysis of multivariate survival data. Technometrics 2002, 44, 86–87.
- Phadia, E.G.; Ryzin, J.V. A note on convergence rates for the product limit estimator. Ann. Stat. 1980, 8, 673–678.
- Bartle, R.G. The Elements of Integration and Lebesgue Measure; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2011.