1. Introduction
The normal distribution serves as a fundamental element in statistical modeling because of its analytical tractability and its essential role in limit theorems. However, real-world data frequently exhibit heavier or lighter tails, asymmetry, and, in certain contexts, multimodality, which a single Gaussian density cannot adequately represent [1]. Generalized classes of the normal distribution have been proposed to address this [2]; in particular, flexible symmetric families can provide better fits and may accommodate bimodality. For rates and proportions on the unit interval, beta regression is a standard choice [3].
Within this landscape, two-piece constructions provide an appealing route to flexibility by allowing distinct shapes on either side of a central location while preserving closed-form expressions for key quantities. Building on this idea, the two-piece normal family (symmetric double normal, with an asymmetric extension) offers a tractable kernel with a shape parameter that controls departures from normality, enabling richer behavior than the symmetric Gaussian distribution and supporting likelihood-based inference with standard tools [4,5].
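As a concrete illustration of such a kernel, the following Python sketch uses the equal-weight two-component normal mixture form of the double normal referenced in this Introduction; the paper's exact parameterization is given in Section 2, so the function name, default shape value, and mixture form here are assumptions for illustration only:

```python
import math

def norm_pdf(z):
    # standard normal density
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def double_normal_pdf(x, mu=0.0, sigma=1.0, delta=1.5):
    """Illustrative double-normal kernel: equal-weight mixture of
    N(mu - delta*sigma, sigma^2) and N(mu + delta*sigma, sigma^2).
    Symmetric about mu; reduces to N(mu, sigma^2) when delta = 0,
    and is bimodal when delta > 1."""
    z = (x - mu) / sigma
    return 0.5 * (norm_pdf(z - delta) + norm_pdf(z + delta)) / sigma
```

Under this form, the shape parameter delta alone governs the transition from unimodality to bimodality, which is the kind of controlled departure from normality described above.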
These features are consequential in applications involving censoring and bounded support. We leverage them to develop the following: (i) a censored model for limited outcomes, (ii) a doubly truncated model on (0,1), and (iii) a survival specification with a log-two-piece baseline and Gamma frailty. In biostatistics and environmental monitoring, detection limits induce mass accumulation at thresholds; in econometrics, corner solutions yield limited dependent variables; and for proportions, support is intrinsically constrained to (0,1) [1]. A coherent framework based on the double normal (two-component normal mixture) family accommodates such constraints while allowing for flexible shapes (including possible bimodality) and straightforward computation [4,5]. Note that the specification considered in Section 2 is symmetric by construction; any asymmetry in the observed data distribution may arise from censoring/truncation mechanisms rather than from the latent error distribution itself.
This paper develops two extensions tailored to these scenarios. First, a censored specification of the double normal distribution provides a Tobit-type model that allows for flexible modality while accounting for accumulation at a censoring boundary [1]. Second, a doubly truncated specification on (0,1) offers a natural competitor to beta regression for proportions and rates, with the added ability to represent platykurtic shapes and, when supported by the data-generating mechanism, multimodality through a principled truncation operator [3,6]. We also consider a survival formulation with Gamma frailty, linking heterogeneity modeling to established survival-analysis methodology [7].
Our contributions are as follows. First, we establish formal properties for the proposed models, including stochastic representations, generating functions, moments, and conditions for modality [4,5]. Second, we propose regression structures via suitable link functions for bounded outcomes and survival contexts [1,7]; for completeness and reproducibility, we also provide the corresponding log-likelihood expressions under censoring/truncation and the analytical derivatives (score functions and observed Fisher information) used for numerical maximization. Third, we conduct Monte Carlo experiments to assess finite-sample performance (bias and root mean squared error) across sample sizes, censoring and truncation intensities, and modality regimes, and we compare models using classical information criteria [8,9,10]. Finally, empirical illustrations in biomedical, economic, and labor datasets benchmark the proposed models against Gaussian, skewed, and beta alternatives and include formal tests of unimodality using the Hartigan–Hartigan dip test [11].
Section 2 introduces the symmetric two-piece normal (double normal) distribution and summarizes its main properties (moments, kurtosis, and modality conditions).
Section 3 presents the censored two-piece normal (CTN) model, including its distributional properties, the Tobit-type regression extension, and likelihood-based estimation components for CTN.
Section 4 develops the doubly truncated two-piece normal (TTN) model on (0,1) and its regression specification.
Section 5 introduces the survival formulation with a log-two-piece normal baseline and Gamma frailty and provides the resulting marginal likelihood.
Section 6 consolidates general inference and implementation details (asymptotics, standard errors, and boundary issues).
Section 7 reports the simulation study, and Section 8 presents the empirical applications. Finally, Section 9 concludes.
7. Simulation Study
Here, LTN denotes the log-two-piece normal baseline survival model (Section 5); left-censoring at L enters the likelihood through the corresponding cdf evaluated at L (baseline or marginal, depending on whether frailty is included), paralleling the boundary contribution induced by censoring in the CTN setting.
We conducted a Monte Carlo study to assess the finite-sample performance of the MLEs for the left-censored LTN regression model. A categorical covariate with three levels was simulated, and inference was based on 1000 replicates with sample sizes
. The true parameters were
,
,
,
, and
. For the Gamma-frailty extension, we considered
. We report the mean estimates, absolute bias, RMSE, and Monte Carlo standard error (SE). Results are shown in
Table 1,
Table 2 and
Table 3; NF denotes the LTN model without frailty. Estimation used
optim in R with the BFGS method.
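The censored-likelihood construction that the optimizer maximizes can be illustrated in a few lines. The Python sketch below is illustrative only: it substitutes a plain Gaussian kernel for the paper's two-piece kernel, so it conveys the structure of the objective (a density term for uncensored points, a cdf term at the boundary) rather than the actual model fitted with optim/BFGS in R:

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def censored_loglik(mu, sigma, data, L):
    """Left-censored log-likelihood: observations at or below the
    threshold L contribute log F(L); uncensored observations
    contribute the log density."""
    ll = 0.0
    for y in data:
        if y <= L:                       # boundary (point-mass) contribution
            ll += math.log(norm_cdf((L - mu) / sigma))
        else:                            # density contribution
            z = (y - mu) / sigma
            ll += math.log(norm_pdf(z) / sigma)
    return ll
```

Passing the negative of this function to a quasi-Newton routine (BFGS, as in the study) yields the MLEs; the same two-term structure carries over when the Gaussian kernel is replaced by the two-piece one.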
We note that we do not include a separate TTN simulation study for proportion data: the TTN model on (0,1) is a purely doubly truncated continuous specification, and its likelihood-based estimation follows the standard truncated-likelihood construction. Our simulations therefore focus on the settings that introduce the main inferential challenges of the paper, namely censoring-induced point mass (CTN) and frailty marginalization in survival models.
The data were generated as follows:
- (i)
Fix the model parameters: , , the LTN shape parameter , the frailty variance (when applicable), the sample size n, and the target left-censoring proportion .
- (ii)
For each
, draw a group indicator
with equal probabilities. Construct
and set
. In the frailty model, draw
; otherwise, set
. Conditional on
, draw
, set
, and obtain the failure time by inversion as follows:
where
is the LTN baseline survival function (no frailty).
- (iii)
Given
, choose a common threshold
L such that
by solving
using
for the model without frailty and
for the marginal frailty model. Finally, define the observed time and indicator as follows:
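Steps (i)–(iii) above can be sketched as follows. This Python fragment is illustrative rather than the paper's implementation: it substitutes the equal-weight double-normal mixture for the exact two-piece kernel and assumes the left-censored observation takes the form max(T, L) with event indicator 1 when T exceeds L:

```python
import math
import random

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def dn_cdf(z, delta):
    """CDF of the illustrative double-normal kernel (equal-weight mixture)."""
    return 0.5 * (norm_cdf(z - delta) + norm_cdf(z + delta))

def dn_quantile(p, delta, lo=-10.0, hi=10.0, tol=1e-10):
    """Invert dn_cdf by bisection (it is continuous and strictly increasing)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dn_cdf(mid, delta) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def draw_time(xb, sigma, delta, z_frailty=1.0, rng=random):
    """Step (ii): with S(t | Z) = S0(t)**Z and
    S0(t) = 1 - F_W((log t - xb) / sigma),
    solve S(t | Z) = u for t by inversion."""
    u = rng.random()
    p = 1.0 - u ** (1.0 / z_frailty)  # target value of F_W
    return math.exp(xb + sigma * dn_quantile(p, delta))

def left_censor(t, L):
    """Final step (assumed form): observed time max(t, L),
    event indicator 1 if t > L, else 0."""
    return (max(t, L), 1 if t > L else 0)
```

Setting z_frailty = 1 recovers the model without frailty; drawing z_frailty from a Gamma distribution with unit mean gives the conditional-frailty generation of step (ii).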
Figure 1,
Figure 2 and
Figure 3 display MSEs by censoring level (0%, 25%, 50%) and sample size. Panels labeled without frailty correspond to independence. The MSEs of the regression coefficients are most affected by heavier censoring, particularly under independence, and decrease as
n grows. The LTN parameters show some sensitivity in
, while
is comparatively stable. Overall, with or without frailty, the estimators track the reference values well, even under censoring.
8. Applications
We present two real-data examples to illustrate two-piece normal models for censored data, both with and without covariates, and to highlight their practical value under censoring.
8.1. HIV Dataset
As a first illustration, we analyze HIV-1 RNA measurements from the Colombian SIVIGILA surveillance system (Ministry of Health). The database preserves patient anonymity and records age, sex, date of entry, HAART (highly active antiretroviral therapy) status at different stages, CD4 count, and viral load (copies of HIV-1 RNA). We focus on 106 women with at least one year of HAART. The assay detection limit is copies/mL; on the log scale, the censoring threshold is . In this sample, of observations fall below L.
We fit the censored two-piece normal model to the viral-load data. For the 69 uncensored observations (women on HAART ≥ 1 year), the summary statistics were: mean , variance , skewness , and kurtosis . The mild right skewness and kurtosis below 3 indicate lighter-than-normal tails, a feature that CTN accommodates under censoring.
Hartigan’s dip test [11] yields ( p-value ), rejecting unimodality. Alongside CTN, we also fitted a censored bimodal normal (CBN) model and a censored flexible normal (CFN) model (see [22] for a flexible bimodal normal family).
Table 4 reports the corresponding MLEs (with standard errors), information criteria, and Anderson–Darling goodness-of-fit results. Maximum likelihood estimation was carried out using the optim function in R. Model comparison relied on AIC [8], BIC [9], and CAIC [10]. Larger p-values in the Anderson–Darling (AD) goodness-of-fit test indicate better agreement with the fitted cdf; we computed the AD statistic using the goftest package in R. Note that in the CFN competitor, the shape parameter is not restricted in sign, so negative estimates are admissible; by contrast, in our TN-based models, the shape parameter is nonnegative by construction.
As shown in Table 4, according to AIC/BIC/CAIC, CTN and CFN provide the best fits (in that order). The AD p-values likewise support CTN/CFN for these data. The observed censoring proportion is . Model-implied expected censoring proportions, computed as , were (CTN), (CFN), and (CBN), indicating that CTN and CFN closely reproduce the observed censoring level.
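This check is simple to reproduce: compare the fitted cdf evaluated at the detection limit with the empirical fraction of censored observations. A minimal Python sketch, again under the illustrative equal-weight mixture parameterization rather than the paper's exact CTN cdf:

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def implied_censoring(mu, sigma, delta, L):
    """Model-implied expected censoring proportion F(L) under the
    illustrative double-normal cdf (equal-weight mixture form)."""
    z = (L - mu) / sigma
    return 0.5 * (norm_cdf(z - delta) + norm_cdf(z + delta))

def observed_censoring(ys, L):
    """Empirical fraction of observations at or below the threshold L."""
    return sum(y <= L for y in ys) / len(ys)
```

Agreement between the two quantities at the MLEs is the informal calibration criterion reported above for CTN, CFN, and CBN.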
8.2. FoodExpenditure Dataset
Following [3], we analyze the FoodExpenditure data, where the response is the proportion of household income spent on food, and the covariates are household income and household size [23]. The data are distributed with the R package betareg [24].
We fit three regression models for Y as follows: (i) a beta regression as in [3]; (ii) a doubly truncated normal (truncated Gaussian, TG); and (iii) a doubly truncated two-piece normal (TTN), noting that TG is the limit of TTN as the shape parameter tends to 0.
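The truncation operator behind TG and TTN is the standard one: divide the kernel density by the mass it assigns to the interval. A Python sketch under the illustrative equal-weight mixture parameterization (not the paper's exact TTN form):

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def dn_pdf(z, delta):
    # illustrative double-normal kernel (equal-weight mixture)
    return 0.5 * (norm_pdf(z - delta) + norm_pdf(z + delta))

def dn_cdf(z, delta):
    return 0.5 * (norm_cdf(z - delta) + norm_cdf(z + delta))

def ttn_pdf(x, mu, sigma, delta, a=0.0, b=1.0):
    """Doubly truncated density on (a, b): renormalize the kernel
    by the probability mass F(b) - F(a) it assigns to the interval."""
    if not (a < x < b):
        return 0.0
    za, zb, zx = (a - mu) / sigma, (b - mu) / sigma, (x - mu) / sigma
    return (dn_pdf(zx, delta) / sigma) / (dn_cdf(zb, delta) - dn_cdf(za, delta))
```

Setting delta = 0 collapses ttn_pdf to the truncated Gaussian, which mirrors the limiting relationship between TG and TTN noted above.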
Table 5 reports MLEs (SEs) and the information criteria AIC, BIC, and CAIC. Estimation used optim in R; beta regression used betareg [24]. By all three criteria, TTN provides the best fit, followed by TG and then beta.
To check model adequacy and detect outliers, we used the transformed martingale residual with normal envelopes [25,26]. The QQ envelopes (Figure 4) indicate that TTN adheres more closely to the reference line than TG and beta, corroborating the information-criteria ranking. We also report the Anderson–Darling (AD) statistic with p-values in Table 5; larger p-values favor TTN over TG and beta.
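For reference, a commonly used form of this transform is the deviance-type transformation of the martingale residual r_M = delta + log S(t), which is approximately standard normal under a correctly specified model and is therefore suitable for QQ envelopes. A Python sketch (the exact variant used in [25,26] may differ in detail):

```python
import math

def martingale_residual(delta, surv):
    """r_M = delta + log S(t): event indicator plus the log of the
    fitted survival probability at the observed time."""
    return delta + math.log(surv)

def deviance_transform(delta, surv):
    """Deviance-type transform of r_M:
    sign(r_M) * sqrt(-2 * [r_M + delta * log(delta - r_M)]).
    Approximately standard normal under a well-specified model."""
    rm = martingale_residual(delta, surv)
    inner = rm + (delta * math.log(delta - rm) if delta > 0 else 0.0)
    return math.copysign(math.sqrt(-2.0 * inner), rm)
```

Plotting the ordered transformed residuals against normal quantiles, with simulated envelopes, gives the diagnostic displayed in Figure 4.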
8.3. Mroz Dataset
We analyze 753 married women from the classic labor–supply study: 428 work positive hours, while 325 do not. Let be the annual hours worked and define , where , so non-workers map to the point and workers to a continuous range below . As covariates, we use the number of children aged 6–18 (), the woman’s age (), and years of work experience (). We fit censored normal (CN), censored skew–normal (CSN), and censored two-piece normal (CTN) linear models by maximum likelihood.
Table 6 reports parameter estimates (SEs) and information criteria. According to AIC, BIC, and CAIC, the CTN model provides the best fit, followed by CSN and CN. QQ envelopes of the transformed martingale residuals (Figure 5) also favor CTN over the competitors.
The envelopes for CN, CSN, and CTN are shown in Figure 5a–c. The CTN envelopes adhere most closely to the reference, reinforcing the information-criteria ranking.
8.4. Childhood Cancer Dataset
We analyze children in Colombia (2023) with confirmed cancer who were hospitalized, obtained from the National Institute of Health via the SIVIGILA portal [27]. The outcome is the time from symptom onset to hospitalization (in days). Left-censoring arises when the onset date is unknown at the time of admission. Covariates are sex (female/male) and age in years at consultation.
Figure 6 indicates a skewed distribution of times.
In total, children were analyzed; the event proportion was . Sample summaries were as follows: mean , sd , median , , (days). Additional descriptive statistics by censoring status and sex are reported in Appendix B (Table A1).
Model Estimates
We fitted log-two-piece normal baseline models for the time distribution, both without and with Gamma frailty. The parameters are the regression vector , scale , shape , and (for frailty) the variance . Table 7 reports MLEs (SEs); bold entries denote Wald-test significance. The model without frailty is labeled LTN, and the model with Gamma frailty is labeled LTN–G. Covariates: (intercept), (male vs. female), and (age in years, centered).
Under the model without frailty, covariate effects are not statistically different from zero. In contrast, with Gamma frailty, there is strong evidence of association as follows: and , indicating (conditional on frailty) shorter expected times for males and for older children on the log–time scale. The estimated scale is (SE ), and the frailty variance suggests substantial unobserved heterogeneity. A likelihood–ratio comparison favors the frailty model (LR , ), with fit indices: LTN, , AIC , BIC , CAIC ; LTN–G, , AIC , BIC , CAIC . This specification accommodates baseline bimodality (via ) and heterogeneity (via ).
9. Conclusions
This paper develops a unified likelihood-based framework for limited outcomes using the two-piece normal construction. The framework covers a censored specification for boundary-inflated responses, a doubly truncated specification on for rates and proportions, and a survival formulation with a log-two-piece normal baseline and Gamma frailty. We provide explicit closed-form building blocks (pdf, cdf, survival, hazard, and cumulative hazard) and likelihood expressions that support routine maximum likelihood estimation and reproducible implementation.
Monte Carlo results suggest that the maximum likelihood estimators exhibit a small bias and decreasing RMSE as the sample size increases. Censoring primarily increases the variability of regression coefficients; the scale parameter is comparatively stable, whereas the shape parameter can be more sensitive under heavy censoring. In the frailty setting, the variance component is recoverable at moderate sample sizes in the scenarios considered, and the observed information provides standard errors with satisfactory frequentist behavior.
The empirical analyses broadly support these findings. For HIV-1 RNA subject to a detection limit, the censored two-piece normal attains competitive information-criterion values and passes goodness-of-fit checks; model-implied censoring proportions closely match the observed fraction. For household food expenditure on (0,1), the doubly truncated two-piece normal improves fit relative to the truncated Gaussian and beta regression in these datasets. In the labor-supply application, the censored specification improves fit over Gaussian and skew-normal alternatives. For childhood cancer survival data, adding Gamma frailty to a log-two-piece baseline improves fit and highlights covariate effects consistent with unobserved heterogeneity; however, sensitivity to starting values and parameterization should be assessed.
Limitations and future work. While the proposed models retain closed forms and integrate naturally with standard diagnostics (information criteria, Anderson–Darling tests, QQ envelopes, and residual analyses), caution is warranted under extreme censoring/truncation or weak separation, where the shape parameter may be weakly identified. Practical strategies include warm starts from simpler baselines, multiple initializations, and profile-likelihood checks; moreover, when the shape parameter approaches the boundary value 0, standard large-sample approximations for inference and likelihood-ratio testing may require adjustment. Finally, in the doubly truncated setting, regression on the location parameter does not necessarily imply mean regression, and interpretation should be made explicit. Future work includes Bayesian implementations with identifiability-aware priors, penalized and robust estimation in contaminated settings, shared-frailty and multilevel extensions, interval/combined censoring and time-varying covariates, semi-/nonparametric baselines within the two-piece construction, and tailored diagnostics for multimodality; public software and fully reproducible workflows would further support adoption in applied domains.