Abstract
Symmetry is a crucial concept in various fields of mathematics, offering a systematic approach to understanding structural properties and simplifying complex problems. This study focuses on linear mixed models, emphasizing the role of symmetry in the design of experiments and the structure of variance–covariance matrices. Building on foundational works, this paper introduces the concept of nested models and employs Wishart matrices for variance component estimation, enhancing the efficiency of complex model analysis. The methodology is particularly applicable when variance–covariance matrices conform to a commutative Jordan algebra (CJA). The practical significance of this modeling approach is demonstrated through numerical applications using both simulated and real-world data.
1. Introduction
Symmetry plays a fundamental role in various branches of mathematics and their applications, providing a framework for understanding structural properties and simplifying complex problems. In the context of linear mixed models, symmetry can manifest in the design of experiments, the structure of variance–covariance matrices, and the underlying algebraic structures used in estimation procedures.
Linear mixed models have been extensively explored in the literature, with numerous seminal contributions enhancing the theoretical and practical understanding of them. For instance, Brown and Prescott [1] provide a thorough exploration of how linear mixed models can be effectively applied in medical research, offering invaluable insights into both theoretical foundations and practical applications. Pinheiro and Bates [2] extend the applicability of mixed-effects models by delving into their implementation in S and S-PLUS, highlighting the versatility and practicality of these models. Similarly, Rao and Kleffe [3] make a foundational contribution by advancing the estimation of variance components and their applications, laying a critical groundwork for subsequent developments in the field. Sahai and Ageel [4] enrich the discourse on linear mixed models by addressing the complexities of fixed, random, and mixed models in their analysis of variance, providing a comprehensive treatment of these methodologies. Additionally, Demidenko [5] presents a modern perspective by integrating theoretical insights with practical implementations in R, bridging the gap between abstract theory and applied statistics.
Recent advancements have also addressed computational challenges in estimating variance components efficiently. Tack and Müller [6] developed fast restricted maximum likelihood (REML) estimation techniques for linear mixed models with Kronecker product covariance structures, improving computational performance in large-scale applications. Additionally, Lee et al. [7] provide a unified approach to generalized linear mixed models, offering a comprehensive treatment that extends classical methods to accommodate non-normal responses, thus broadening the scope of applications.
Beyond these influential works on linear mixed models, Fonseca et al. [8], for example, investigate the properties and applications of binary operations in linear algebra, specifically examining the scope of orthogonal normal models. Furthermore, Mexia et al. [9] contribute to this domain by developing the COBS methodology, which incorporates segregation, matching, crossing, and nesting, streamlining complex problems through innovative techniques. More recently, Ferreira et al. [10] extend these advancements by exploring inference in nonorthogonal mixed models, addressing key challenges in model estimation and interpretation. Additionally, Ferreira et al. [11] further enhance statistical methodologies by examining inference in mixed models with a mixture of distributions and controlled heteroscedasticity. Complementing these developments, Bailey et al. [12] focus on experimental design, proposing designs for half-diallel experiments with commutative orthogonal block structures, which optimize efficiency in statistical analyses.
In our study, we build on these foundational works by introducing the concept of nested models and utilizing Wishart matrices for estimating variance components. By simplifying complex models as building blocks, our approach streamlines the estimation process and makes it more efficient. Our method is particularly effective when variance–covariance matrices fall under a commutative Jordan algebra, known as CJA, and allows for the structure of variance–covariance matrices to be captured and manipulated in a way that respects both symmetry and commutativity, facilitating the estimation of variance components.
In the next section, we analyze the structure of linear models with nested random effects, that is, models whose random effects are hierarchically structured within each level of the model, and we show how to estimate the associated variance components. We then discuss the testing of hypotheses on the effects and interactions of fixed effect factors using F-tests and introduce a measure of relevance for the hypotheses, which becomes especially useful when several null hypotheses are rejected. In the special case section, we focus on a case where the mean and the vector of residuals to the mean are independent for every block of observations, and analyze the implications of this assumption for the model. The Multiple Regression Designs section extends these results to multi-treatment regression models with nested random effect factors (see Mexia [13]). Finally, we illustrate our methodology through a numerical application, followed by concluding remarks.
2. Variance Components
Let b represent the number of blocks in a mixed model that correspond to the level combinations of fixed effect factors. The observation vector for each block, , , is assumed to be normally distributed and independent, with mean vector
where r represents the number of observations within each block, is a column vector of ones with r elements, is the overall average response, and represents a vector of unknown fixed effects.
The variance–covariance structure of is given by
where w represents the number of variance components in the model, , is a known design matrix, and represents the jth variance component.
Each block l has the same factor structure, so the design matrix for the random effects in block l, denoted by , is related to the design matrices through
where represents the contribution of the jth random effect in block l. The linear mixed model for each block may then be written as
where is a vector of random effects, assumed to be normally distributed with mean zero and variance–covariance matrix , and represents the error term for block l. The elements of are typically assumed to be independent and normally distributed with mean zero and variance .
Symmetry is explicitly leveraged in this model through the use of orthogonal transformations. Let be a matrix whose row vectors constitute an orthogonal basis for the orthogonal complement, , of the range space of the column matrix . The transformed vectors
are normally distributed and independent with mean zero and variance–covariance matrix
where
This transformation simplifies the variance–covariance structure while preserving the symmetry in the model. The orthogonal decomposition facilitated by ensures that the variance components remain independent, reflecting the fundamental principles of symmetry.
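Although the paper's computations are carried out in R, the construction of such an orthogonal transformation can be sketched in Python with NumPy; the matrix and vector names below are illustrative, not the paper's notation:

```python
import numpy as np

r = 4
ones = np.ones((r, 1))  # the column of ones spanning the mean direction

# Rows of P: an orthonormal basis of the orthogonal complement of span{1_r},
# taken here from the full SVD of the all-ones column.
U, _, _ = np.linalg.svd(ones, full_matrices=True)
P = U[:, 1:].T  # (r-1) x r, with P @ ones = 0 and P @ P.T = I

y = np.array([3.0, 5.0, 4.0, 8.0])
w = P @ y  # transformed vector: the contribution of the mean is removed
```

Because P annihilates the vector of ones, the transformed observations carry no mean component, and their squared norm equals the sum of squared deviations from the block mean — the simplification of the variance–covariance structure invoked above.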
Next, consider a symmetric matrix of dimension with elements . The half-vectorization of , denoted as , is a column vector of dimension obtained by extracting the upper triangular elements (including the diagonal) of
Thus, contains the main diagonal and upper triangle of .
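As a concrete illustration of half-vectorization, here is a minimal Python sketch (the function name is an assumption; the paper's computations use R):

```python
import numpy as np

def vech(S):
    """Half-vectorization: stack the upper-triangular elements of a
    symmetric matrix S, including the diagonal, into a single vector."""
    S = np.asarray(S)
    return S[np.triu_indices(S.shape[0])]

# A 3x3 symmetric matrix has 3*(3+1)/2 = 6 distinct elements.
S = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 5.0],
              [3.0, 5.0, 6.0]])
v = vech(S)  # [1, 2, 3, 4, 5, 6]
```

For symmetric matrices, reading the upper triangle row by row produces the same vector as the common convention of stacking the lower triangle column by column, so either reading recovers all distinct elements.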
Now, defining
and
we obtain
If are linearly independent, then is full-rank, and the variance components can be estimated using the Moore–Penrose inverse
where + denotes the Moore–Penrose inverse.
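When the variance–covariance matrix is a linear combination of known symmetric matrices, the Moore–Penrose step above reduces to a least squares solve on half-vectorized matrices. A Python sketch of this idea, with illustrative names:

```python
import numpy as np

def vech(S):
    """Upper-triangle (with diagonal) half-vectorization of a symmetric matrix."""
    return np.asarray(S)[np.triu_indices(S.shape[0])]

def estimate_components(components, Sigma):
    """Recover sigma2_j from Sigma = sum_j sigma2_j * C_j, where the C_j
    are known symmetric matrices, using the Moore-Penrose inverse."""
    M = np.column_stack([vech(C) for C in components])
    return np.linalg.pinv(M) @ vech(Sigma)

# Toy check with two components and true values (2, 3).
C1, C2 = np.eye(3), np.ones((3, 3))
Sigma = 2.0 * C1 + 3.0 * C2
sigma2 = estimate_components([C1, C2], Sigma)  # approx [2.0, 3.0]
```

When the half-vectorizations of the component matrices are linearly independent, the stacked matrix has full column rank and the pseudo-inverse gives the unique least squares solution, matching the full-rank condition stated above.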
We have a set of b independent normally distributed random variables with null mean vectors and variance–covariance matrix , as given by Equation (2). For each of the random variables , we can calculate its variance–covariance matrix as
where E denotes the expectation operator. Using the linearity of expectation we can then calculate the expected variance–covariance matrix of all the random variables as
So the usual estimator of is then simply the sample variance–covariance matrix, given by
Using Equation (12), and substituting the estimator for , we obtain the estimator
for .
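The sample variance–covariance estimator used here can be checked numerically; the sketch below (illustrative, with an assumed covariance matrix) averages outer products of independent zero-mean normal vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

b, m = 20000, 3  # b independent blocks, m-dimensional transformed vectors
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
W = rng.multivariate_normal(np.zeros(m), Sigma, size=b)

# With known (zero) means, averaging the outer products w_l w_l^T gives
# an unbiased estimator of Sigma.
Sigma_hat = (W.T @ W) / b
```

As b grows, the average of the outer products concentrates around the true matrix, which is the unbiasedness and consistency relied on in the derivation above.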
To understand the properties of the estimator , we analyze the distribution and covariance structure of the sample variance–covariance matrix and its role in the estimation process.
We can write
where follows a Wishart distribution. Given that is an unbiased estimator of , we have
So the covariance matrix of the estimator is
see Anderson [14].
Furthermore, since are independent and have null mean vectors, we have
Note that the final step follows from the fact that the expected value of the sample variance–covariance matrix is just the variance–covariance matrix itself. Now, substituting into the expectation of the estimator, we have
Recalling Equation (12), , we then have
Therefore, the covariance matrix of the estimator is
This result establishes that the sample variance–covariance matrix is unbiased and provides the foundation for deriving the properties of the estimator , including its covariance structure, as shown in subsequent equations.
The estimator derived in Equation (16) is both efficient and effective in estimating the variance components. Efficiency in this context refers to the estimator having minimal variance among the class of unbiased estimators, ensuring that the variance components are estimated with the greatest possible precision given the available data. Effectiveness pertains to how well the estimator captures the underlying variance structure while maintaining desirable statistical properties such as unbiasedness and consistency. By leveraging the transformation matrix and ensuring independence in variance decomposition, the method provides an optimal approach for estimating variance components in mixed models. The explicit use of symmetry further enhances the reliability and interpretability of the estimates, making the approach both statistically sound and computationally feasible. Moreover, we can consider as a least squares estimator, measuring the quality of by its determination coefficient, , as is done in fixed effects models, to obtain the partitions
where
However, in general, the and , , are not independent. To quantify the variability of the block means , we define as their variance. This variance can be expressed in terms of the variance components as follows
with
So, to estimate , we define the estimator
which depends on the estimated variance components . However, in general, this estimator is not independent of the vector of block means
because the variance components themselves are influenced by the variability within blocks. This lack of independence introduces a complication in interpreting the variance structure.
Special Case
An interesting and important special case arises when the condition
is satisfied. Under this condition, the matrices simplify to
which implies that the transformed variance–covariance matrix becomes
In this case, the block means are independent of the deviations , . Consequently, the estimator becomes independent of the vector of block means . This independence is a desirable property because it simplifies the interpretation of the variance components. Furthermore, under this condition, the estimator is unbiased, as its expectation equals the true variance
This result underscores the importance of the condition in Equation , as it ensures that accurately reflects the variability of the block means without being confounded by the within-block deviations.
3. Multiple Regression Designs
The application of regression models with fixed and random effects has a long history in the statistical literature. Early foundational work, such as that by Henderson [15], introduced best linear unbiased predictors (BLUPs) for mixed models, setting the stage for their use in various disciplines. Later, Searle, Casella, and McCulloch [16] provided comprehensive methodologies for variance component estimation and model inference. These developments have been applied to diverse fields, including genetics, where mixed models are used to estimate heritability [17], and industrial quality control, where random effects are incorporated to account for batch-to-batch variability [18].
A fundamental principle underlying multiple regression designs is the concept of symmetry, which plays a crucial role in variance component estimation. Symmetry in experimental designs allows for an equitable decomposition of effects and interactions, ensuring orthogonality and simplifying inference. As outlined by Scheffé [19], the decomposition of effects into orthogonal components facilitates hypothesis testing and interpretation in complex models. More recently, Mexia [13] extended these ideas to encompass generalized least squares estimators and their properties under specific experimental conditions.
In this section, we analyze multiple regression designs in the context of the base model defined earlier. The focus is on assessing the influence of factors and their interactions on estimable functions. We derive least squares estimators, establish their statistical properties, and construct hypothesis tests and confidence regions. These results are critical for understanding the effects of the experimental design and for validating the model’s assumptions.
Consider the case where the observation vector , for each of the c treatments, satisfies Equation (33). The mean vector of is , and the variance–covariance matrix is , with having k linearly independent column vectors. This type of model has been applied in many situations (see Mexia [13]). To simplify computation, we replace with , derived using the Gram–Schmidt orthonormalization technique, ensuring . This yields the least squares estimator
with mean vector and variance–covariance matrix . The estimators , are mutually independent and independent from the
which will be the product by of independent central chi squares with degrees of freedom, . Thus,
with .
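The Gram–Schmidt replacement of the design matrix by an orthonormal-column counterpart, and the resulting least squares estimator, can be sketched as follows; a thin QR factorization plays the role of Gram–Schmidt, and the names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

X = rng.normal(size=(12, 3))   # design matrix with k = 3 independent columns
Q, _ = np.linalg.qr(X)         # orthonormal columns spanning the same space

y = rng.normal(size=12)
beta_tilde = Q.T @ y           # least squares coefficients in the Q basis
fitted = Q @ beta_tilde        # same fitted values as ordinary least squares
rss = y @ y - beta_tilde @ beta_tilde  # residual sum of squares
```

Orthonormal columns turn the estimator into a plain matrix product and decouple the coefficient estimates, which is the computational simplification invoked in the text.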
To analyze the behavior of the individual components of the regression coefficients, we define the vectors and . These quantities summarize the behavior of the estimators across treatments and allow us to study their distributional properties. Specifically, we define
where represents the true regression coefficients for the i-th component across all c treatments, and represents their least squares estimators.
The estimators , , will be normally distributed, with mean vectors , , and variance–covariance matrix . Furthermore, the estimators are independent of one another and also independent of the residual sums of squares . As a result, they are also independent of the overall residual sum of squares, S.
To study the influence of factors and interactions in the base design, we introduce an orthogonal partition of the space . This partition allows us to isolate and analyze the contributions of different effects and interactions through appropriate test statistics. Specifically, we assume the orthogonal partition
where ⊞ denotes the orthogonal direct sum of subspaces associated with the effects and interactions of the factors. If the row vectors of form an orthonormal basis for , , we have
if and only if The symmetric structure of this partition ensures that estimates remain unbiased and efficiently computed.
To analyze specific effects, we define the quantities
where and Using these quantities, we define
and
The hypotheses to be tested are
where rejection of the null hypotheses indicates a significant effect or interaction.
To evaluate these hypotheses, we construct the test statistics
where the test statistics follow an F-distribution with and g [or and g] degrees of freedom and non-centrality parameters [or ] for and
The uniformity in variance structure, a direct consequence of symmetry, enables the construction of uniformly most powerful (UMP) tests. Specifically, as noted by Lehmann and Romano [20], the F-tests for the hypotheses , , and , , are UMP within the class of tests whose power is determined by non-centrality parameters. Furthermore, their power increases with these parameters, as established in Mexia [21].
Let denote the p-th quantile for the central F-distribution with and g degrees of freedom. Since
and the quantities are independent from S, we have
also independent of
The pivot variables
follow a central -distribution with degrees of freedom [or ] and
This result provides a framework for constructing confidence regions for the parameters Specifically, we have
where
To validate the model, we use the values introduced earlier and the Bartlett homoscedasticity test (see Bartlett and Kendall [22]). Assuming
with independent chi squares, we test the hypothesis
Rejection of indicates heteroscedasticity, suggesting that the model may need refinement.
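For reference, Bartlett's statistic can be computed directly; the sketch below (NumPy only, function name assumed) contrasts a homoscedastic and a heteroscedastic set of samples:

```python
import numpy as np

def bartlett_stat(samples):
    """Bartlett's test statistic for equality of variances across groups.
    Under H0 it is approximately chi-square with k - 1 degrees of freedom."""
    k = len(samples)
    n = np.array([len(s) for s in samples], dtype=float)
    s2 = np.array([np.var(s, ddof=1) for s in samples])
    N = n.sum()
    sp2 = np.sum((n - 1) * s2) / (N - k)          # pooled variance
    T = (N - k) * np.log(sp2) - np.sum((n - 1) * np.log(s2))
    C = 1 + (np.sum(1 / (n - 1)) - 1 / (N - k)) / (3 * (k - 1))
    return T / C

rng = np.random.default_rng(2)
equal = [rng.normal(scale=1.0, size=50) for _ in range(4)]
unequal = [rng.normal(scale=s, size=50) for s in (0.5, 1.0, 2.0, 4.0)]
# The statistic stays in the usual chi-square(3) range for `equal`
# and is far larger for `unequal`.
```

Comparing the statistic against the upper quantile of the chi-square distribution with k − 1 degrees of freedom gives the rejection rule described above.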
Finally, we construct confidence regions for estimable functions. Let
and assume
The pivot variable
has a central F-distribution with g and degrees of freedom. This leads to the confidence region
This confidence region quantifies the uncertainty in the estimable functions, providing a practical tool for inference.
4. Numerical Application
In this section, we first illustrate the theory using simulated data and then apply the proposed model to a real dataset.
All computations and simulations were conducted in R (version 4.3.0). The program estimates variance components within a balanced experimental design, defining both a full model and a two-block model. It constructs the necessary design matrices, simulates responses from a normal distribution, and applies the formulas presented in the previous sections to estimate the variance components.
4.1. Simulated Data
In agricultural experiments, it is common to study crop yields under different fertilizers while accounting for the variability introduced by different plots of land. In this example, let us consider such data, where plots are treated as fixed effects and fertilizers as random effects. The dataset consists of crop yields for four plots (fixed effect factor) under four different fertilizers (random effect factor), as shown in Table 1. To investigate potential differences in variability across subsets of the data, the dataset was divided into two blocks: Block 1, which includes data from Plot 1 and Plot 2, and Block 2, which includes data from Plot 3 and Plot 4.
Table 1.
Crop yields under different fertilizers.
The crop yield data in Table 1 were generated based on the following:
where is a column vector of ones with eight elements , representing the intercept term; contains the observed crop yields for block l; is a vector of unknown fixed effects associated with the plots; and is a vector of random effects associated with the fertilizers, assumed to be normally distributed with mean zero and variance–covariance matrix , where , with being the variance component for the random effects and being the identity matrix. Additionally, represents the error term for block l, with its elements being independent and identically distributed, following a normal distribution with mean zero and variance .
The design matrices for the fixed and random effects are, respectively,
For data generation, the variance components were set as and . The crop yield data for each block, , were simulated in R (version 4.3.0) according to Equation (59). The variance components and were estimated as indicated in Equation (16), and the orthogonal transformation matrix used was
This transformation ensures that the variance–covariance structure becomes clearer. For example, the variance of is represented as
where
Using the transformed data, the variance components and were estimated by solving
with
To analyze the entire model (i.e., using data from all four plots without splitting into blocks), the same linear mixed model framework was applied. The dataset was treated as a single block, where the observation vector for all plots was modeled as
with being a column vector of ones with 16 elements, encoding the fixed effects for all plots, and representing the random effects for the fertilizers,
The variance components and were estimated using the same method described before.
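The block-wise procedure can be sketched end to end in Python as an illustrative analogue of the R program; the layout, seed, variable names, and the true component values (2 and 1) below are assumptions made purely for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(3)

def vech(S):
    return S[np.triu_indices(S.shape[0])]

# Assumed toy layout: one block of r = 8 observations on q = 4 fertilizers,
# each fertilizer observed twice; true components chosen for illustration.
r, q = 8, 4
Z = np.kron(np.ones((2, 1)), np.eye(q))   # random-effects design matrix
sg2, se2 = 2.0, 1.0                        # illustrative true components

# Implied covariance structure: Sigma = sg2 * Z Z^T + se2 * I.
C = [Z @ Z.T, np.eye(r)]

# Simulate centered blocks (the text removes mean effects with an orthogonal
# transformation; here we simply generate zero-mean data), average the outer
# products, and solve for the components with the Moore-Penrose inverse.
n_rep = 4000
Sigma_hat = np.zeros((r, r))
for _ in range(n_rep):
    y = Z @ rng.normal(scale=np.sqrt(sg2), size=q) \
        + rng.normal(scale=np.sqrt(se2), size=r)
    Sigma_hat += np.outer(y, y)
Sigma_hat /= n_rep

M = np.column_stack([vech(Cj) for Cj in C])
est = np.linalg.pinv(M) @ vech(Sigma_hat)  # approaches (2, 1) as n_rep grows
```

Running the same recipe separately on each block, and once on the pooled data, reproduces the block-wise versus entire-model comparison reported in Table 2.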
The estimation procedure was repeated 1000 times, yielding the mean variance component estimates in Table 2.
Table 2.
Variance component estimates for different blocks and the entire model.
As can be seen, the estimates for the variance components and differed slightly when calculated within the blocks as compared with the entire model. For the entire model, the estimates of the variance components were averaged across all plots, giving a general overview of the variability. The block-wise analysis, however, yielded slightly more refined estimates for each block, indicating the possibility of different variance structures within the blocks. So, while the differences in estimates were not large, the block approach can be beneficial when there are substantial differences between groups (blocks), as it may lead to more accurate and tailored estimates of the variance components.
4.2. Real Data
In this subsection, we apply the proposed model to a real dataset related to housing affordability across different European countries. Specifically, we analyze the standardized house-price-to-income ratio for four countries (Croatia, Spain, Ireland, and Poland) over four years (2020–2023). This ratio measures how affordable housing is in each country, with higher values indicating higher house prices relative to income.
The dataset, sourced from the Eurostat database [23] (accessed on 1 February 2025), is presented in Table 3.
Table 3.
Standardized house-price-to-income ratio—annual data.
Here, years are treated as a fixed effect, while countries are modeled as a random effect. The analysis follows the same methodology as in the simulated data case. That is, considering the entire dataset and the block approach, the dataset is divided into Block 1 (2020 and 2021) and Block 2 (2022 and 2023). The results are shown in Table 4.
Table 4.
Variance component estimates for different blocks and the entire model.
The random effect variance , which accounts for country-level differences, varies significantly across blocks (15.4309 for Block 1 vs. 19.2157 for Block 2). The error variance also differs (higher in Block 1, lower in Block 2), suggesting changes in overall variability over time. The entire model provides an averaged estimate of variance components (, ), potentially smoothing over important structural differences across years.
The block-wise model provides a more refined view of the variance structure across different periods, which is valuable when significant shifts occur in the data. In contrast, the entire model offers a more general picture, which may be useful for broad trend analysis. In this case, since and differ across blocks, using a separated model can be advantageous if the goal is to capture year-specific variability. However, if the primary interest is in overall trends, the entire model remains a reasonable choice.
5. Limitations and Future Directions
While the models proposed in this paper offer significant advancements, several limitations should be acknowledged. First, the assumptions regarding the structure of the data, such as the CJA condition for variance–covariance matrices, may not always hold in practical applications. Additionally, the computational complexity of implementing these models could present challenges, particularly in large-scale datasets or when dealing with high-dimensional random effects. Future work could explore relaxing these assumptions.
6. Final Remarks
This paper offers a comprehensive guide to linear models with nested random effects. The statistical methods developed in this paper rely on symmetry principles. From the orthogonal transformations that simplify the variance structure to the independence and unbiasedness of estimators, symmetry plays a pivotal role in ensuring the robustness and elegance of the statistical framework. A significant innovative aspect is the extension of results to multi-treatment regression models with nested random effect factors, providing a more versatile modeling approach. The models developed are particularly practical when variance–covariance matrices fall under a commutative Jordan algebra (CJA), expanding their applicability. A numerical application is also provided to illustrate the practical implementation of these concepts and methods.
Funding
NECE and this work are supported by FCT—Fundação para a Ciência e a Tecnologia, I.P., through project references UIDB/00212/2020, UIDB/04630/2020, and UIDB/00297/2020.
Data Availability Statement
Eurostat Database. Available online: https://ec.europa.eu/eurostat/databrowser/view/tipsho60/default/table?lang=en&category=prc.prc_hpi (accessed on 1 February 2025).
Conflicts of Interest
The author declares no conflicts of interest.
References
- Brown, H.; Prescott, R. Applied Mixed Models in Medicine; Wiley: New York, NY, USA, 1999. [Google Scholar]
- Pinheiro, J.C.; Bates, D.M. Mixed-Effects Models in S and S-PLUS; Springer: New York, NY, USA, 2000. [Google Scholar]
- Rao, C.R.; Kleffe, J. Estimation of Variance Components and Applications; North-Holland: Amsterdam, The Netherlands, 1988. [Google Scholar] [CrossRef]
- Sahai, H.; Ageel, M.I. Analysis of Variance: Fixed, Random and Mixed Models; Birkhäuser: Cambridge, MA, USA, 2000. [Google Scholar]
- Demidenko, E. Mixed Models: Theory and Applications with R; Wiley: New York, NY, USA, 2013. [Google Scholar]
- Tack, L.; Müller, S. Fast REML estimation for linear mixed models with Kronecker product covariance structures. J. Comput. Graph. Stat. 2021, 30, 1148–1160. [Google Scholar]
- Lee, Y.; Nelder, J.A.; Pawitan, Y. Generalized Linear Mixed Models: A Unified Approach; CRC Press: Boca Raton, FL, USA, 2017. [Google Scholar]
- Fonseca, M.; Mexia, J.T.; Zmyslony, R. Binary operation on Jordan algebras and orthogonal normal models. Linear Algebra Appl. 2006, 417, 75–86. [Google Scholar] [CrossRef]
- Mexia, J.T.; Vaquinhas, R.; Fonseca, M.; Zmyslony, R. COBS: Segregation, Matching, Crossing and Nesting. Latest Trends on Applied Mathematics, Simulation, Modeling. In Proceedings of the 4th International Conference on Applied Mathematics, Simulation, Modelling, Corfu Island, Greece, 22–25 July 2010. [Google Scholar]
- Ferreira, D.; Ferreira, S.S.; Nunes, C.; Mexia, J.T. Inference in nonorthogonal mixed models. Math. Methods Appl. Sci. 2022, 45, 3183–3196. [Google Scholar] [CrossRef]
- Ferreira, D.; Ferreira, S.S.; Antunes, P.; Oliveira, T.A.; Mexia, J.T. Inference in mixed models with a mixture of distributions and controlled heteroscedasticity. Commun. Stat.—Theory Methods 2024. [Google Scholar] [CrossRef]
- Bailey, R.A.; Cameron, P.J.; Ferreira, D.; Ferreira, S.S.; Nunes, C. Designs for Half-Diallel Experiments with Commutative Orthogonal Block Structure. J. Stat. Plan. Inference 2024, 231, 106139. [Google Scholar] [CrossRef]
- Mexia, J.T. Multi-Treatment Regression Designs; Universidade Nova de Lisboa: Lisbon, Portugal, 1987. [Google Scholar]
- Anderson, T.W. Introduction to Multivariate Statistical Analysis; John Wiley & Sons, Inc.: New York, NY, USA, 1958. [Google Scholar]
- Henderson, C.R. Estimation of genetic parameters. Ann. Math. Stat. 1950, 21, 309–310. [Google Scholar]
- Searle, S.R.; Casella, G.; McCulloch, C.E. Variance Components; John Wiley & Sons: New York, NY, USA, 1992. [Google Scholar]
- Lynch, M.; Walsh, B. Genetics and Analysis of Quantitative Traits; Sinauer Associates: Sunderland, MA, USA, 1998. [Google Scholar]
- Montgomery, D.C. Design and Analysis of Experiments, 8th ed.; John Wiley & Sons: New York, NY, USA, 2008. [Google Scholar]
- Scheffé, H. The Analysis of Variance; Wiley: New York, NY, USA, 1959. [Google Scholar]
- Lehmann, E.L.; Romano, J.P. Testing Statistical Hypotheses; Springer: New York, NY, USA, 2005. [Google Scholar]
- Mexia, J.T. Controlled Heteroscedasticity, Quocient Vector Spaces and F Tests for Hypothesis on Mean Vectors. Ph.D. Thesis, Universidade Nova de Lisboa, Lisbon, Portugal, 1987. [Google Scholar]
- Bartlett, M.S.; Kendall, D.G. The Statistical Analysis of Variance Heterogeneity and the Logarithmic Transformation. Suppl. J. R. Stat. Soc. 1946, 8, 128–138. [Google Scholar] [CrossRef]
- Eurostat Database. Available online: https://ec.europa.eu/eurostat/databrowser/view/tipsho60/default/table?lang=en&category=prc.prc_hpi (accessed on 1 February 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).