Risks
  • Article
  • Open Access

3 November 2025

Chain Ladder Under Aggregation of Calendar Periods

School of Risk and Actuarial Studies, University of New South Wales, Randwick, NSW 2052, Australia

Abstract

The chain ladder model is defined by a set of assumptions about the claim array to which it is applied. It is, in practice, applied to claim arrays whose data relate to different frequencies, e.g., yearly, quarterly, monthly, weekly, etc. There is sometimes a tacit assumption that one can shift between these frequencies at will, and that the model will remain applicable. It is not obvious that this is the case. One needs to check whether a model whose assumptions hold for annual data will continue to hold for a quarterly (for example) representation of the same data. The present paper studies this question in the case of preservation of calendar periods, i.e., (in the example) annual calendar periods are dissected into quarters. The study covers the two most common forms of chain ladder model, namely the Tweedie chain ladder and Mack chain ladder. The conclusion is broadly, if not absolutely, negative. Certain parameter sets can indeed be found for which the chain ladder structure is maintained under a change in data frequency. However, while it may be technically possible to maintain the chain ladder model under such a change to the data, it is not possible in any reasonable, practical sense.

1. Introduction

1.1. Purpose of the Paper

The present paper is concerned with the effect of data granularity on the chain ladder (; , ; ). The limitations of this model have been pointed out by many authors, e.g., () (non-stochastic model); () (model specification error); () (parameter uncertainty); () (rapid change in exposure, changing speed of claim settlement); () (changing business mix, unusual events, trends and shocks); () (outlier observations).
None of the nominated limitations involve data granularity. Seemingly, only one contribution to the prior literature does so. This is (), who considers the difference between the conventional chain ladder, applied to discrete data, and a continuous form of the model.
The effect of data granularity on the performance of the chain ladder was investigated by (, ). Performance was defined in terms of variance of forecast loss reserve. Granularity was defined in terms of the duration of the cells of the claim array; shorter durations equated to greater granularity.
The first of the two cited papers considered the EDF chain ladder, and showed that increased granularity always reduced variance. The second paper dealt with the Mack chain ladder, and considered the special case in which reductions in granularity amounted to coalescence of development periods. The same coalescences occurred over all accident periods and, in this sense, the changes in granularity “preserved development periods”.
In this case, it was found that increased granularity “usually” reduced variance, where a precise meaning was given to the qualifier “usually”.
Both of the cited papers pointed out that changes of granularity can be defined so as to preserve either development periods or calendar periods. The former case considered by () naturally prompts a question as to how the performance of chain ladder forecasts is affected by changes of granularity that preserve calendar periods.
As discussed in the prior papers, there is one preliminary issue that requires consideration before any examination of the performance of the chain ladder under changed granularity. Suppose one commences with a claim array that satisfies chain ladder assumptions, and decreases granularity by aggregating cells in one way or another.
This will define a new (aggregated) array, and one needs to consider whether or not it still satisfies chain ladder assumptions. If it does not, one cannot (validly) apply the chain ladder, and questions about chain ladder forecast variance become meaningless.
(, ) showed that chain ladder assumptions continued to hold under aggregation when that aggregation preserved development periods. It was also shown in () that, when the aggregation instead preserves calendar periods, this will not occur for the special case of a Poisson chain ladder unless the model parameters satisfy quite specific constraints. If these constraints are not met, the chain ladder will no longer be applicable, and questions about change in forecast variance will not be sensible.
However, the implications of these constraints were not followed up in the earlier paper. Although those constraints were stated definitively, the statement was abstract, and its full implications not made manifest.
The purpose of the present paper is to fill this gap by examining the parameter constraints in detail, and to determine whether the restriction they impose on calendar period aggregation is a mere minor nuisance or a fundamental impediment.
In practical cases, one might (or might not) find that aggregation or disaggregation of calendar periods creates numerically small inconsistencies with the chain ladder (see e.g., Table 1 and related discussion), and that these are inconsequential. This may even be true as far as point forecasts are concerned. However, even under such a pragmatic empirical approach, it is advisable that one be fully aware of the properties of the model in use.
Table 1. EDF chain ladder parameters.
Moreover, the further one delves into the fine detail of a model implementation, the more significant these inconsistencies might become. For example, suppose one wishes to bootstrap a chain ladder (). It will be assumed that chain ladder residuals are identically distributed, but this will not be so if the chain ladder assumption on cell mean structure is violated. Is this still inconsequential? The answer is now less clear.
Thus, while the headline results of the model might be accompanied by some robustness against small failings in the model formulation, other quantities of more delicate and sophisticated construction might be more sensitive.

1.2. Layout of the Paper

The remainder of the paper is structured as follows. Section 2 introduces notation and fundamental mathematical entities. It also defines granularity in precise terms, specifically in terms of mesh size. Section 3 introduces the two forms of chain ladder model with which the paper is concerned. These are the Tweedie and Mack models.
Section 4.1 narrows the focus of the Tweedie chain ladder to the CMVR Tweedie model from which the widely familiar chain ladder algorithm emerges. The effects of aggregation of calendar periods are considered in this context. Section 4.1.1 shows that certain of the chain ladder assumptions that hold for a given data array continue to hold when calendar periods are aggregated.
However, Section 4.1.2 finds that the cell mean algebraic structure assumed by the chain ladder is preserved under aggregation only if some constraints on the model parameters are satisfied. The sub-section proceeds to analyze in detail the extent to which these constraints compress the natural parameter space.
Section 4.2 then contains an analysis parallel to that of Section 4.1, but this time in relation to the Mack chain ladder model. Some concluding remarks appear in Section 5.

2. Notation and Mathematical Preliminaries

The notational and mathematical apparatus required here is much the same as in (, ), and the majority of it is taken from those sources.

2.1. Fundamentals

Let $i = 1, 2, \ldots, I$ denote accident period and $j = 1, 2, \ldots, I$ development period. Let $Y_{ij}$, with mean $\mu_{ij}$, denote the random variable representing the amount of claim payments during development period $j$ of accident period $i$. These quantities are usually referred to as incremental claim payments. Let $D$ denote the entire $I \times I$ data array: $D = \{Y_{ij} : i, j = 1, 2, \ldots, I\}$. A realization of $Y_{ij}$ will be denoted $y_{ij}$.
A calendar period consists of all those combinations of $i$ and $j$ such that $i + j$ is constant. For definiteness, label the calendar period $t$ if $i + j - 1 = t$.
It will also be convenient to define cumulative claim payments. Thus define
$$Z_{ij} = \sum_{k=1}^{j} Y_{ik}, \tag{1}$$
which is the random variable representing the amount of cumulative claim payments up to the end of development period $j$ of accident period $i$. Further, let $z_{ij}$ be defined in the same way in terms of the $y_{ik}$, so that $z_{ij}$ is a realization of $Z_{ij}$.
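As a small illustration of these definitions, the following sketch computes cumulative payments and calendar period labels for a hypothetical $3 \times 3$ incremental array. The array values and the function names `cumulate` and `calendar_period` are assumptions made for the example only.

```python
import numpy as np

def cumulate(y):
    """Row-wise cumulative sums: Z_ij = Y_i1 + ... + Y_ij (Equation (1))."""
    return np.cumsum(np.asarray(y, dtype=float), axis=1)

def calendar_period(i, j):
    """Calendar period label t = i + j - 1 for 1-based i and j."""
    return i + j - 1

# Hypothetical incremental payments y_ij (rows: accident periods).
y = np.array([[10.0, 6.0, 2.0],
              [12.0, 7.0, 3.0],
              [11.0, 8.0, 4.0]])

z = cumulate(y)

# Cells sharing the label t = 3 lie on one diagonal of the array.
diagonal_3 = [(i, j) for i in range(1, 4) for j in range(1, 4)
              if calendar_period(i, j) == 3]
```

Here `diagonal_3` lists the cells of one calendar period, which is the unit that the aggregation studied in this paper keeps intact.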
A $q$-dimensional Euclidean space will be denoted by $\mathbb{R}^q$, and the interior of its positive orthant by $\mathbb{R}_+^q$. The unit matrix of dimension $N$ will be denoted by $I_N$, the $N$-dimensional column vector with all entries unity by $1_N$, and $\otimes$ indicates the Kronecker matrix product, defined as follows: if $A$ is an $m \times n$ matrix with entries $a_{ij}$ and $B$ a $p \times q$ matrix, then $A \otimes B$ is the $mp \times nq$ matrix
$$A \otimes B = \begin{bmatrix} a_{11} B & a_{12} B & \cdots & a_{1n} B \\ a_{21} B & a_{22} B & \cdots & a_{2n} B \\ \vdots & \vdots & & \vdots \\ a_{m1} B & a_{m2} B & \cdots & a_{mn} B \end{bmatrix}.$$

2.2. Tweedie Distribution

The Tweedie family of distributions was introduced by () as a sub-family of the exponential dispersion family (“EDF”) (). A member of the Tweedie family has the pdf
$$p(Y = y; \mu, \phi) = a(y, \phi) \exp\left\{ \frac{1}{\phi} \left[ \frac{y \mu^{1-p}}{1-p} - \frac{\mu^{2-p}}{2-p} \right] \right\}, \tag{2}$$
for some function $a(\cdot, \cdot)$, and where $\mu = E[Y]$, $\phi > 0$ is a dispersion parameter, and $p$ is a parameter with $p \in (-\infty, 0] \cup [1, \infty)$ that defines the shape of the distribution. It will be said that $Y \sim Tw(\mu, \phi, p)$. For such $Y$,
$$\mathrm{Var}[Y] = \phi \mu^{p}. \tag{3}$$
There are known special cases for specific values of $p$. One of particular interest that was discussed in () is the case $p = 1$, which yields the over-dispersed Poisson (“ODP”) distribution.
There is a further sub-family of the Tweedie that will be of interest here. This is the constant mean-variance ratio (“CMVR”) sub-family studied by (). In relation to a given data array $D$, the $Y_{ij} \in D$ will lie within the CMVR Tweedie sub-family if $Y_{ij} \sim Tw(\mu_{ij}, \phi_{ij}, p)$ and $E[Y_{ij}] / \mathrm{Var}[Y_{ij}] = \text{const.}$, independent of $i, j$. By Equation (3), this requires that
$$\frac{\phi_{ij}}{\mu_{ij}^{1-p}} = \text{const.} \tag{4}$$
Note that, for ODP distributed observations ($p = 1$), the requirement is that $\phi_{ij} = \phi$, const.
This CMVR sub-family will be denoted $CMVR(p, K)$, where $K > 0$ is the common mean-variance ratio. Thus, every $Y_{ij} \in D$ belongs to $CMVR(p, K)$. [With a slight abuse of notation, the random variable $Y_{ij}$ will be notated as belonging to the sub-family if its distribution does].
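The step from the variance function to the CMVR requirement can be written out explicitly. The following short derivation is a sketch that uses only the definitions above, with $K$ the common mean-variance ratio:

```latex
\frac{E[Y_{ij}]}{\mathrm{Var}[Y_{ij}]}
  = \frac{\mu_{ij}}{\phi_{ij}\,\mu_{ij}^{p}}
  = \frac{\mu_{ij}^{1-p}}{\phi_{ij}}
  = K
\quad\Longleftrightarrow\quad
\phi_{ij} = \frac{\mu_{ij}^{1-p}}{K}.
```

For $p = 1$ this reduces to $\phi_{ij} = 1/K$ for every cell, which is the constant dispersion of the ODP case.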
Theorem 2.5 and related comment in () amount to the following.
Proposition 1.
The sub-family $CMVR(p, K)$ is closed under addition of stochastically independent random variables.

2.3. Mesh Size

Section 2.1 is phrased in terms of accident, development and calendar periods, without any specification of the meaning of “period”. Often, in the literature, this unit is a year. But it need not be; it might be a quarter, month, week, or any other convenient length of time. This will be referred to as the mesh size of the data array. Changes of mesh are described in Section 2.2.1 of (), from which the following explanation is extracted.
Suppose that $I = Nq$ for some strictly positive integers $N, q$. The mesh size can be changed from one unit of time to $q$ units. There will then be $N$ accident and development periods, instead of the original $I$.
An example would be the case in which the units of time in a data set are quarters and $I = 40$, $N = 10$, $q = 4$. Here, the mesh size is changed from quarters to years, and the aggregated data set contains 10 accident years and 10 development years.
It also contains aggregated calendar periods, each consisting of $q$ unaggregated calendar periods. In the case of general $N, q$, denote these by $t_k^* = \{t = (k-1)q + 1, (k-1)q + 2, \ldots, kq\}$, $k = 1, 2, \ldots$ Accident periods will also be aggregated with the same mesh size, i.e., $i_k^* = \{i = (k-1)q + 1, (k-1)q + 2, \ldots, kq\}$, $k = 1, 2, \ldots$
The aggregated triangle will then consist of cells defined by the ordered pairs $(i_k^*, t_l^*)$. These cells can be expressed in the more familiar accident period/development period form by defining $j^* = t^* - i^* + 1$. The cell defined by $(i^*, t^*)$ can also be defined by the ordered pair $(i^*, j^*)$ with $j^*$ defined as just stated.
This cell consists of all $(i, t)$ such that $q(i^* - 1) < i \le q i^*$, $q(t^* - 1) < t \le q t^*$. Consider the last of these inequalities, which is equivalent to
$$i + j - 1 \le q(i^* + j^* - 1), \quad \text{or} \quad j \le q(j^* - 1) + (q i^* - i + 1). \tag{5}$$
The penultimate inequality may be similarly treated, so that the cell $(i^*, j^*)$ consists of all $(i, j)$ such that
$$q(i^* - 1) < i \le q i^*, \quad \text{and} \quad q(j^* - 2) + (q i^* - i + 1) < j \le q(j^* - 1) + (q i^* - i + 1), \tag{6}$$
where it is required that $j \ge 1$.
Note that, by reference to the constraints on $i$ in relations (6), the quantity $q i^* - i + 1$ satisfies the following:
$$1 \le q i^* - i + 1 \le q. \tag{7}$$
This aggregation from single-unit to q -unit periods is commonly used in commercial practice. It is illustrated in Figure 1, in which a 40 × 40 quarterly array is collapsed to a 10 × 10 yearly array ( q = 4 ). The upper triangle, representing past data in a loss reserving context, is shaded yellow, the lower green. Calendar years are delineated by the red diagonals.
Figure 1. Change in mesh size from quarterly to yearly. Yellow denotes past cells; green future. Blue and purple mark yearly cells under aggregation of calendar quarters. The construction of these cells for development year 1 is exceptional, and these are shaded orange.
Development years 5 and 6 are shaded blue and purple, respectively. Note their nature, as staggered aggregations of accident quarters. Development year 1 is also shaded orange to illustrate its exceptional nature.
As an example of the application of relations (6), consider the case $(i^*, j^*) = (2, 5)$. Inequalities (6) become $4 < i \le 8$, $21 - i < j \le 25 - i$, which yields $(i, j)$ cells $(5, 17 \text{ to } 20)$, $(6, 16 \text{ to } 19)$, $(7, 15 \text{ to } 18)$ and $(8, 14 \text{ to } 17)$, and this reconciles with the diagram.
The same application of relations (6) to the case $j^* = 1$ produces non-positive values of $j$, and these require exclusion. This explains the exceptional nature of $j^* = 1$ in the diagram.
With this definition of aggregated cells, payments in these cells can be defined. Let $Y_{[i^* j^*]}$ denote claim payments made in the aggregated $(i^*, j^*)$ cell, where square brackets indicate that the suffixes are to be read in the context of the enlarged mesh. Then, by the above reasoning,
$$Y_{[i^* j^*]} = \sum_{i = q(i^*-1)+1}^{q i^*} \; \sum_{t = q(t^*-1)+1}^{q t^*} Y_{ij} = \sum_{i = q(i^*-1)+1}^{q i^*} \; \sum_{j = q(j^*-2) + (q i^* - i + 2)}^{q(j^*-1) + (q i^* - i + 1)} Y_{ij}, \tag{8}$$
where each double summation includes $q^2$ summands. Let $D_q^*$ denote the entire $N \times N$ array under the increased mesh: $D_q^* = \{Y_{[i^* j^*]} : i^*, j^* = 1, 2, \ldots, N\}$.
Similarly, let $Z_{[i^* j^*]}$ denote cumulative claim payments to the end of aggregated cell $(i^*, j^*)$. For accident period $i$ within $i^*$, the final unaggregated development period in $j^*$ is $q(j^* - 1) + (q i^* - i + 1) = q t^* - i + 1$, and so
$$Z_{[i^* j^*]} = \sum_{i = q(i^*-1)+1}^{q i^*} Z_{i,\, q t^* - i + 1} = \sum_{i = q(i^*-1)+1}^{q i^*} \; \sum_{j=1}^{q t^* - i + 1} Y_{ij}, \tag{9}$$
where the last equality follows from Equation (1).
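The cell construction of relations (6) and the aggregation of Equation (8) can be sketched in code. The following is a minimal illustration, not taken from the paper: the function names `aggregated_cell` and `aggregate` are assumptions, indices are 1-based as in the text, and non-positive values of $j$ are dropped as required for $j^* = 1$.

```python
import numpy as np

def aggregated_cell(i_star, j_star, q):
    """Unaggregated cells (i, j), 1-based, that make up aggregated cell (i*, j*)."""
    cells = []
    for i in range(q * (i_star - 1) + 1, q * i_star + 1):
        offset = q * i_star - i + 1            # lies between 1 and q, relation (7)
        lower = q * (j_star - 2) + offset      # strict lower bound in relations (6)
        upper = q * (j_star - 1) + offset      # inclusive upper bound in relations (6)
        for j in range(max(lower + 1, 1), upper + 1):   # j >= 1 is required
            cells.append((i, j))
    return cells

def aggregate(y, q):
    """Collapse an I x I incremental array (I = N q) to N x N, as in Equation (8)."""
    y = np.asarray(y, dtype=float)
    N = y.shape[0] // q
    out = np.zeros((N, N))
    for i_star in range(1, N + 1):
        for j_star in range(1, N + 1):
            out[i_star - 1, j_star - 1] = sum(
                y[i - 1, j - 1] for i, j in aggregated_cell(i_star, j_star, q))
    return out

# The worked case (i*, j*) = (2, 5) with q = 4.
cells_2_5 = aggregated_cell(2, 5, 4)

# An 8 x 8 array of ones collapsed with q = 4 simply counts cells per aggregated cell.
collapsed = aggregate(np.ones((8, 8)), 4)
```

For `cells_2_5` the function returns the staggered block described in the text, and every cell in it falls in calendar quarters 21 to 24, i.e., in a single calendar year. In `collapsed`, the first development year holds fewer than $q^2$ cells, which reflects its exceptional construction.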

3. Chain Ladder Models

3.1. Tweedie Chain Ladder

Chapter 6 of () discusses the EDF chain ladder, a version of the chain ladder in which the distributions of all observations are members of the EDF. The Tweedie family of distributions was defined in Section 2.2, as was the CMVR sub-family for a given data set. Since these are sub-families of the EDF, chain ladder models may be defined on them, as special cases of the EDF chain ladder.
Of interest here is the CMVR Tweedie chain ladder, a special case in which the distributions of all observations are members of $CMVR(p, K)$ for fixed $p, K$. The model is defined by the following assumptions.
Assumption 1.
For $Y_{ij} \in D$, a data array, all $Y_{ij} \in CMVR(p, K)$ for fixed $p, K$.
Assumption 2.
The $Y_{ij} \in D$ are stochastically independent.
Assumption 3.
For all $Y_{ij} \in D$, $\mu_{ij} = a_i b_j$ for some parameters $a_i, b_j > 0$, $i, j = 1, \ldots, I$.
This last assumption immediately yields the following result, which will be useful later.
Proposition 2.
Under Assumption 3, $\mu_{i,j+1} / \mu_{ij} = b_{j+1} / b_j$, which depends on $j$ but not $i$.

3.2. Mack Chain Ladder

The Mack chain ladder was defined by (). The same definition will be used here, and so will be repeated below.
Assumption 4 (Mack assumption (1)).
$E[Z_{i,j+1} \mid Z_{i1}, \ldots, Z_{ij}] = Z_{ij} f_j$ for all $Z_{ij} \in D$ with $j < I$.
The factor $f_j > 0$ is usually referred to as an age-to-age factor or a link ratio.
Assumption 5 (Mack assumption (2)).
Different accident periods are stochastically independent, i.e.,
$\{Z_{i1}, \ldots, Z_{iI}\}$, $\{Z_{k1}, \ldots, Z_{kI}\}$ are independent for $i \ne k$.
Assumption 6 (Mack assumption (3)).
$\mathrm{Var}[Z_{i,j+1} \mid Z_{i1}, \ldots, Z_{ij}] = Z_{ij} \sigma_j^2$ for all $Z_{ij} \in D$ with $j < I$.
This is the totality of Mack’s assumptions. () added several others, which were necessary for consideration of estimation of chain ladder parameters and application of the model to the forecast of a loss reserve. However, they are not required here.

4. Change in Mesh Size Under Preservation of Calendar Periods

This section commences with a data array $D$ that satisfies one of the two sets of chain ladder assumptions. The mesh size is then increased, while preserving calendar periods in the sense of Section 2.3, to form a new (aggregated) data array $D_q^*$, and the question of whether this array satisfies chain ladder assumptions is considered. Section 4.1 and Section 4.2 consider the CMVR Tweedie and Mack chain ladder models, respectively.

4.1. CMVR Tweedie Chain Ladder

4.1.1. Cell Distributions and Independence

Proposition 3.
For $Y_{[i^* j^*]} \in D_q^*$, all $Y_{[i^* j^*]} \in CMVR(p, K)$ for the same fixed $p, K$ as in Assumption 1.
Proof. 
Follows immediately from Assumption 1, Equation (8) and Proposition 1. □
Proposition 4.
The $Y_{[i^* j^*]} \in D_q^*$ are stochastically independent.
Proof. 
First note that the summands of the various $Y_{[i^* j^*]}$ in Equation (8) form disjoint subsets of $D$. The proposition then follows immediately from Assumption 2. □
Thus, at least Assumptions 1 and 2, suitably modified, hold for the data array $D_q^*$.

4.1.2. Cell Means

This sub-section considers whether Assumption 3 continues to hold for the data array $D_q^*$.
By Equation (8) and Assumption 3,
$$\mu_{[i^* j^*]} = \sum_{i = q(i^*-1)+1}^{q i^*} \; \sum_{j = q(j^*-2) + (q i^* - i + 2)}^{q(j^*-1) + (q i^* - i + 1)} \mu_{ij} = \sum_{i = q(i^*-1)+1}^{q i^*} a_i \sum_{j = q(j^*-2) + (q i^* - i + 2)}^{q(j^*-1) + (q i^* - i + 1)} b_j. \tag{10}$$
By convention, the non-positive values of $j$ that occur as summation indexes when $j^* = 1$ are ignored.
To lighten the notation, it will be useful to write $A_{i^*} = q(i^* - 1)$, $B_{j^*} = q(j^* - 1) + 1$. Then the lower limit of the summation over $j$ in Equation (10) may be expressed as $B_{j^*} + A_{i^*} - i + 1$. Now set $i = A_{i^*} + k$, which reduces Equation (10) to
$$\mu_{[i^* j^*]} = \sum_{k=1}^{q} a_{A_{i^*}+k} \sum_{j = B_{j^*}-k+1}^{B_{j^*}-k+q} b_j = \sum_{k=1}^{q} a_{A_{i^*}+k} \sum_{j=1}^{q} b_{B_{j^*}-k+j} = \sum_{k=1}^{q} a_{A_{i^*}+k}\, c_{B_{j^*}-k}, \tag{11}$$
where
$$c_{B_{j^*}-k} = \sum_{j=1}^{q} b_{B_{j^*}-k+j}. \tag{12}$$
It is convenient to re-express Equation (11) in vector form:
$$\mu_{[i^* j^*]} = a_{A_{i^*}}^T c_{B_{j^*}}, \tag{13}$$
where the upper $T$ denotes transposition and the vectors $a_{A_{i^*}}, c_{B_{j^*}}$ are defined as $a_{A_{i^*}} = (a_{A_{i^*}+1}, \ldots, a_{A_{i^*}+q})^T$, $c_{B_{j^*}} = (c_{B_{j^*}-1}, \ldots, c_{B_{j^*}-q})^T$. Note that $\dim a_{A_{i^*}} = \dim c_{B_{j^*}} = q$.
If an assumption parallel to Assumption 3 is to hold for $D_q^*$, then a result parallel to Proposition 2 must also hold, i.e.,
$$\frac{\mu_{[i^*, j^*+1]}}{\mu_{[i^* j^*]}} = \frac{a_{A_{i^*}}^T c_{B_{j^*+1}}}{a_{A_{i^*}}^T c_{B_{j^*}}} \ \text{ is independent of } i^* \text{ for all } i^* = 1, \ldots, N; \ j^* = 1, \ldots, N-1, \tag{14}$$
where the equality follows from Equation (13).
Even without further analysis, this is seen to be a highly restrictive condition, as it is a constraint on the allowable model parameters. However, there is benefit in evaluating the extent of the restriction. This will be taken up later in the present sub-section.
For the moment, the following numerical example deals with a case in which Equation (14) does not hold.
Numerical example. Consider the case of $q = 4$, $N = 10$, $j^* = 2$. One of the implications of Equation (14) is
$$\frac{a_{A_1}^T c_{B_3}}{a_{A_1}^T c_{B_2}} = \frac{a_{A_2}^T c_{B_3}}{a_{A_2}^T c_{B_2}}. \tag{15}$$
By definition of $A_{i^*}, B_{j^*}$, this last equality is
$$\frac{a_0^T c_9}{a_0^T c_5} = \frac{a_4^T c_9}{a_4^T c_5}, \tag{16}$$
and then $a_0 = (a_1, \ldots, a_4)^T$, $c_5 = (c_4, \ldots, c_1)^T$, where
$$c_{5-k} = \sum_{j=1}^{4} b_{5-k+j}, \quad k = 1, \ldots, 4, \tag{17}$$
with similar definitions of the other vectors in Equation (14).
Now suppose that the relevant values of $a_i, b_j$ are as in Table 1, where the values of $c_j$, calculated from Equation (12), are also displayed.
The ratio in Equation (14) can be calculated from the tabulated values, using the expressions in Equation (16), and it is found that
$$\frac{a_{A_1}^T c_{B_3}}{a_{A_1}^T c_{B_2}} = 1.679, \quad \frac{a_{A_2}^T c_{B_3}}{a_{A_2}^T c_{B_2}} = 1.663,$$
and so Equation (14) fails to hold for this set of parameters $a_i, b_j$. In this case, if the original $40 \times 40$ array satisfies the chain ladder assumptions, then its $10 \times 10$ reduction under lesser granularity fails to satisfy them.
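A check of this kind is easy to script. The sketch below does not reproduce the Table 1 values; the parameters `a` and `b` are hypothetical and chosen only for illustration, and the helper names `a_vector`, `c_vector` and `mean_ratio` are assumptions. It evaluates the ratio in Equation (14) for $i^* = 1$ and $i^* = 2$ with $q = 4$, $j^* = 2$.

```python
import numpy as np

def a_vector(a, i_star, q):
    """a_{A_{i*}} = (a_{A+1}, ..., a_{A+q})^T with A = q (i* - 1)."""
    A = q * (i_star - 1)
    return np.asarray(a[A:A + q], dtype=float)

def c_vector(b, j_star, q):
    """c_{B_{j*}} = (c_{B-1}, ..., c_{B-q})^T with B = q (j* - 1) + 1,
    where c_{B-k} is the sum of b_{B-k+j} over j = 1, ..., q (Equation (12))."""
    B = q * (j_star - 1) + 1
    c = np.zeros(q)
    for k in range(1, q + 1):
        for j in range(1, q + 1):
            idx = B - k + j              # 1-based index into b
            if idx >= 1:                 # non-positive indices are ignored
                c[k - 1] += b[idx - 1]
    return c

def mean_ratio(a, b, i_star, j_star, q):
    """Ratio of aggregated cell means in columns j* + 1 and j*, via Equation (13)."""
    a_vec = a_vector(a, i_star, q)
    return (a_vec @ c_vector(b, j_star + 1, q)) / (a_vec @ c_vector(b, j_star, q))

q = 4
b = [1.0 / j for j in range(1, 13)]             # hypothetical b_1, ..., b_12
a = [1.0, 1.1, 1.3, 1.2, 1.5, 1.4, 1.8, 2.0]    # hypothetical a_1, ..., a_8

ratio_1 = mean_ratio(a, b, 1, 2, q)             # i* = 1
ratio_2 = mean_ratio(a, b, 2, 2, q)             # i* = 2
print(ratio_1, ratio_2)
```

With these hypothetical values the two ratios differ, so the aggregated array would violate the cell mean structure even though the unaggregated array satisfies Assumption 3 by construction.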
For the purpose of investigating precisely which sets of parameters $a_i, b_j$ satisfy requirement Equation (14), it will be convenient to re-express this requirement in the following equivalent form:
For each $j^* = 1, \ldots, N-1$,
$$\frac{\mu_{[i^*, j^*+1]}}{\mu_{[i^* j^*]}} = \frac{\mu_{[k^*, j^*+1]}}{\mu_{[k^* j^*]}} \ \text{ for all } i^*, k^* = 1, \ldots, N. \tag{18}$$
Substitution of Equation (13) into this relation yields, for each $j^* = 1, \ldots, N-1$,
$$a_{A_{i^*}}^T C_{j^*} a_{A_{k^*}} = 0, \quad i^*, k^* = 1, \ldots, N, \tag{19}$$
where
$$C_{j^*} = c_{B_{j^*+1}} c_{B_{j^*}}^T - c_{B_{j^*}} c_{B_{j^*+1}}^T, \quad j^* = 1, \ldots, N-1. \tag{20}$$
It is evident that $C_{j^*}$ is a $q \times q$ real anti-symmetric matrix.
Before a systematic analysis is conducted, some properties of the $C_{j^*}$ and special cases of parameter sets that satisfy Equation (14) (equivalently Equation (19)) can be identified.
Proposition 5.
If $c_{B_{j^*+1}} = K c_{B_{j^*}}$ for some constant $K$, then $C_{j^*} = 0$ (rank zero), and in this case Equation (19) is satisfied for any $a_{A_{i^*}}, a_{A_{k^*}} \in \mathbb{R}_+^q$. Otherwise, $\operatorname{rank} C_{j^*} = 2$.
Proof. 
See Appendix B.1. □
Corollary 1.
$C_{j^*}$ can be of full rank only in the case $q = 2$.
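The rank statements can be checked numerically. The following sketch builds $C_{j^*}$ from Equation (20) for two hypothetical pairs of vectors, one generic and one proportional; the vector values and the function name `C_matrix` are assumptions for illustration.

```python
import numpy as np

def C_matrix(c_lo, c_hi):
    """C = c_hi c_lo^T - c_lo c_hi^T, with c_lo for column j* and c_hi for j* + 1."""
    c_lo = np.asarray(c_lo, dtype=float)
    c_hi = np.asarray(c_hi, dtype=float)
    return np.outer(c_hi, c_lo) - np.outer(c_lo, c_hi)

c_lo = np.array([0.6, 0.8, 1.0, 1.3])     # hypothetical c-vector for column j*
c_hi = np.array([0.4, 0.4, 0.5, 0.5])     # hypothetical, not proportional to c_lo

C_generic = C_matrix(c_lo, c_hi)
C_proportional = C_matrix(c_lo, 2.0 * c_lo)   # proportional case with K = 2

# An explicit tolerance avoids counting rounding noise as rank.
rank_generic = np.linalg.matrix_rank(C_generic, tol=1e-12)
rank_proportional = np.linalg.matrix_rank(C_proportional, tol=1e-12)
is_antisymmetric = np.allclose(C_generic, -C_generic.T)
```

Since $q = 4$ here, the generic matrix has rank 2 and so is rank deficient, in line with Corollary 1.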
Proposition 6.
(seasonal variation in exposures). Suppose that, in Assumption 3,
$$a_{A_{i^*}+k} = \gamma_{i^*} \delta_k, \quad i^* = 1, \ldots, N; \ k = 1, \ldots, q \ \text{ for constants } \gamma_{i^*}, \delta_k. \tag{21}$$
Then Equation (19) holds.
Proof. 
See Appendix B.2. □
Remark 1.
Note the form of Equation (21). In the absence of $\gamma_{i^*}$, the row effect $a_{A_{i^*}+k}$ follows a cycle with period $q$. This might be viewed as seasonal variation in risk exposure. The multiplier $\gamma_{i^*}$ adds variation from one $q$-unit accident period to another.
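The seasonal structure can be verified numerically. In the sketch below, the vector $a$ is built as a Kronecker product of hypothetical $\gamma$ and $\delta$ values, and the anti-symmetric matrix is built from two arbitrary positive vectors standing in for the $c$-vectors of two adjacent aggregated columns; all numerical values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
q, N = 4, 3

# Hypothetical positive vectors standing in for the c-vectors of columns j* and j* + 1.
c_lo = rng.uniform(0.1, 1.0, q)
c_hi = rng.uniform(0.1, 1.0, q)
C = np.outer(c_hi, c_lo) - np.outer(c_lo, c_hi)    # anti-symmetric, as in Equation (20)

gamma = np.array([1.0, 1.4, 0.9])                  # hypothetical gamma_{i*}, one per aggregated accident period
delta = np.array([1.0, 1.2, 0.8, 1.1])             # hypothetical seasonal factors delta_k
a = np.kron(gamma, delta)                          # entries gamma_{i*} * delta_k, ordered by i* then k

blocks = a.reshape(N, q)                           # row i* - 1 holds the q-vector for accident period i*
bilinear = blocks @ C @ blocks.T                   # entry (i*, k*) is the left side of Equation (19)
```

Every entry of `bilinear` vanishes up to rounding, because each entry is a multiple of $\delta^T C \delta$, which is zero for any anti-symmetric $C$.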
Equation (19) places an apparent $N^2$ constraints on the vectors $a_{A_{i^*}}$ for known matrices $C_{j^*}$. However, these constraints are not all independent. Proposition 7 determines which of the $N^2$ constraints must be retained.
Proposition 7.
To establish Equation (18) for any given $j^*$, it will be sufficient to prove that
$$\frac{\mu_{[i^*, j^*+1]}}{\mu_{[i^* j^*]}} = \frac{\mu_{[i^*+1, j^*+1]}}{\mu_{[i^*+1, j^*]}} \ \text{ for all } i^* = 1, \ldots, N-1, \tag{22}$$
and, similarly, requirement Equation (19) may be restricted to the following:
$$a_{A_{i^*}}^T C_{j^*} a_{A_{i^*+1}} = 0, \quad i^* = 1, \ldots, N-1. \tag{23}$$
No smaller set of equalities than these $N - 1$ is sufficient.
Proof. 
See Appendix B.3. □
Now consider the conditions under which Equation (23) holds. For this purpose, it will be convenient to re-express it in the following equivalent form:
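$$a^T Q_{j^*}^{(i^*)} a = 0, \quad i^*, j^* = 1, \ldots, N-1, \tag{24}$$
where $a^T$ is the $Nq$-vector
$$a^T = \left( a_{A_1}^T, \ldots, a_{A_N}^T \right) \tag{25}$$
and $Q_{j^*}^{(i^*)}$ is the $N \times N$ block matrix that consists of $q \times q$ blocks, with $C_{j^*}$ in the $(i^*, i^*+1)$ block position, $-C_{j^*}$ in the $(i^*+1, i^*)$ block position, and zero blocks elsewhere, thus
$$Q_{j^*}^{(i^*)} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & C_{j^*} & 0 \\ 0 & -C_{j^*} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \begin{matrix} \ \\ i^* \\ i^*+1 \\ \ \end{matrix}. \tag{26}$$
Because $C_{j^*}$ is anti-symmetric, $Q_{j^*}^{(i^*)}$ is symmetric and $a^T Q_{j^*}^{(i^*)} a = 2\, a_{A_{i^*}}^T C_{j^*} a_{A_{i^*+1}}$, so that Equation (24) is equivalent to Equation (23). It is seen that Equation (24) imposes the necessary $N - 1$ constraints on the $Nq$-vector $a$.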
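The block construction can be sketched as follows. This is an illustration under a stated assumption: the $(i^*+1, i^*)$ block is taken to be the negative of the $(i^*, i^*+1)$ block, which makes the block matrix symmetric and the quadratic form equal to twice the bilinear form of Equation (23). The function name `build_Q` and all numerical inputs are hypothetical.

```python
import numpy as np

def build_Q(C, i_star, N):
    """N x N block matrix of q x q blocks with C at block (i*, i* + 1)
    and -C at block (i* + 1, i*); every other block is zero."""
    q = C.shape[0]
    Q = np.zeros((N * q, N * q))
    upper = slice((i_star - 1) * q, i_star * q)      # block row or column i*
    lower = slice(i_star * q, (i_star + 1) * q)      # block row or column i* + 1
    Q[upper, lower] = C
    Q[lower, upper] = -C     # assumption: sign chosen so that Q is symmetric
    return Q

rng = np.random.default_rng(1)
q, N, i_star = 3, 4, 2
c_lo = rng.uniform(0.1, 1.0, q)
c_hi = rng.uniform(0.1, 1.0, q)
C = np.outer(c_hi, c_lo) - np.outer(c_lo, c_hi)      # anti-symmetric q x q matrix
a = rng.uniform(0.5, 2.0, N * q)                     # hypothetical parameter vector

Q = build_Q(C, i_star, N)
a_i = a[(i_star - 1) * q: i_star * q]                # sub-vector for accident period i*
a_next = a[i_star * q: (i_star + 1) * q]             # sub-vector for accident period i* + 1

quadratic_form = a @ Q @ a
bilinear_form = a_i @ C @ a_next
```

Under this sign convention `quadratic_form` equals `2 * bilinear_form`, so the quadratic constraint on the full vector and the bilinear constraint on adjacent sub-vectors vanish together.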
A key matrix in the following is
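$$C = \begin{bmatrix} C_1 \\ \vdots \\ C_{N-1} \end{bmatrix}. \tag{27}$$
Let
$$\operatorname{rank} C = r. \tag{28}$$
Remark 2.
Although, according to Proposition 5, $\operatorname{rank} C_{j^*} \le 2$ for each $j^*$, it is quite possible, even likely in practice, that $C$ will be of full rank, i.e., $\operatorname{rank} C = q$.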
Proposition 8.
Consider the condition Equation (19) and suppose that the values of $c_{B_{j^*}}, c_{B_{j^*+1}}$ are fixed and known. Suppose further that $N \ge 2$. Then the solutions of Equation (19) for $a \in \mathbb{R}_+^{Nq}$ form a set $(V \cup M) \cap \mathbb{R}_+^{Nq}$ where:
1. $V \subset \mathbb{R}^{Nq}$ is the $[Nq - (N-1)r]$-dimensional linear subspace on which $Qa = 0$ for $a \in V$, where $Q$ is defined by (A9); and
2. $M$ is a smooth $(Nq - s)$-dimensional sub-manifold of $\mathbb{R}^{Nq}$, where $s = (N-1) \min(N-1, r)$.
This set of solutions is smooth except possibly where the closure of $M$ intersects $V$.
Proof. 
See Appendix B.4. □
It is of interest to examine the dimensions of the sub-manifolds emerging in Proposition 8.
Remark 3 (the case r = 0).
By Proposition 5, this is the case $c_{B_{j^*+1}} = K c_{B_{j^*}}$ for all $j^* = 1, \ldots, N-1$. Proposition 8 then yields $\mathbb{R}_+^{Nq}$ as the set of solutions in $a$, i.e., there is no restriction on the vector $a$.
Remark 4.
More generally, one might consider the proportion of dimensions available to $a$ that are lost from its natural parameter space in consequence of the constraints Equation (19). Here, natural parameter space means that which would be available to $a$ in the absence of the constraints, namely $\mathbb{R}_+^{Nq}$. The proportion lost is
  • $\left(1 - \frac{1}{N}\right) \frac{r}{q}$ in respect of $V$;
  • $\left(1 - \frac{1}{N}\right) \min\left(\frac{N-1}{q}, \frac{r}{q}\right)$ in respect of $M$.
As it is possible that $r = q$ (Remark 2), the loss of dimensionality in $V$ and $M$ can be very severe indeed.
Example 1.
Consider again the numerical example earlier in the present sub-section in which $q = 4$, $N = 10$. As pointed out in Section 2.3, this can relate to the aggregation of a $40 \times 40$ quarterly claim array to form a $10 \times 10$ annual array. Suppose further that $r = q = 4$, whence $V$ and $M$ are subsets of $\mathbb{R}^{40}$, each of dimension 4.
This example demonstrates how, in a realistic setting, the solution set of Equation (19) may be extremely restricted. There is an almost vanishingly small proportion of cases in which the CMVR Tweedie chain ladder model remains valid after aggregation of calendar periods.
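The dimension counts of Proposition 8 and the proportions of Remark 4 are simple enough to tabulate. The following sketch evaluates them for the case of Example 1; the function names `solution_dimensions` and `proportion_lost` are assumptions for illustration.

```python
def solution_dimensions(N, q, r):
    """Dimensions of V and M from Proposition 8, given rank r of the matrix C."""
    natural = N * q                                  # dimension of the natural parameter space
    dim_V = natural - (N - 1) * r
    dim_M = natural - (N - 1) * min(N - 1, r)
    return dim_V, dim_M

def proportion_lost(N, q, r):
    """Proportion of dimensions lost relative to the natural parameter space (Remark 4)."""
    lost_V = (1 - 1 / N) * r / q
    lost_M = (1 - 1 / N) * min((N - 1) / q, r / q)
    return lost_V, lost_M

# Example 1: quarterly to annual aggregation with full rank r = q = 4.
dim_V, dim_M = solution_dimensions(10, 4, 4)
lost_V, lost_M = proportion_lost(10, 4, 4)
print(dim_V, dim_M, lost_V, lost_M)
```

In this case 36 of the 40 dimensions are lost for each of $V$ and $M$, which is the 90% reduction behind the statement that the admissible parameter set is extremely restricted.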
The analysis of the present sub-section has been conducted on the basis that the matrix $C$ (based on parameters $b_j$) is known and one must find a vector $a$ (based on parameters $a_i$) that satisfies Equation (19). However, it is apparent from Assumption 3 that the two parameter sets $a_i$ and $b_j$ contribute exchangeably to $\mu_{ij}$. This is a fact that has been noted in the previous actuarial literature.
It follows that any procedure applied to these two parameter sets will yield the same results if applied to their exchange, i.e., with $a_i \to b_j$, $b_j \to a_i$. Here, it would be necessary to replace $C_{j^*}$ by
$$\mathbf{A}_{i^*} = a_{A_{i^*+1}} a_{A_{i^*}}^T - a_{A_{i^*}} a_{A_{i^*+1}}^T, \quad i^* = 1, \ldots, N-1, \tag{29}$$
and replace $a$ by $c$ where
$$c^T = \left( c_{B_1}^T, \ldots, c_{B_N}^T \right). \tag{30}$$
Then, for given $a_i$, an aggregated claim array will satisfy CMVR Tweedie chain ladder assumptions if and only if (compare Equation (19)) $b_j$ satisfies
$$c_{B_{j^*}}^T \mathbf{A}_{i^*} c_{B_{k^*}} = 0, \quad j^*, k^* = 1, \ldots, N. \tag{31}$$
It is also necessary to replace the matrices $Q$ and $C$ by $R$ and $\mathbf{A}$, respectively, by means of the appropriate replacements in Equations (26) and (27), and to denote $\operatorname{rank} \mathbf{A}$ by $r$.
All of the results from Proposition 5 to Example 1 follow with these replacements. It is worthwhile, in particular, providing the re-statement of the main result, Proposition 8.
Proposition 9.
(Proposition 8 re-stated for exchanged parameter sets). Consider the condition Equation (31), and suppose that the values of $a_{A_{i^*}}, a_{A_{i^*+1}}$ are fixed and known. Suppose further that $N \ge 2$. Then the solutions of Equation (31) for $c \in \mathbb{R}_+^{Nq}$ form a set $(V \cup M) \cap \mathbb{R}_+^{Nq}$ where:
(i) $V \subset \mathbb{R}^{Nq}$ is the $[Nq - (N-1)r]$-dimensional linear subspace on which $Rc = 0$ for $c \in V$; and
(ii) $M$ is a smooth $(Nq - s)$-dimensional sub-manifold of $\mathbb{R}^{Nq}$, where $s = (N-1) \min(N-1, r)$.
This set of solutions is smooth except possibly where the closure of $M$ intersects $V$.
Remark 5.
All above results concerning the maintenance of a chain ladder model under a change in mesh size have been expressed in terms of enlargement of mesh size (aggregation of calendar periods). However, each such result can be re-stated in an alternative and equivalent form in terms of reduction in mesh size. That is to say, if chain ladder structure is (is not) maintained on enlargement of mesh size, then it is (is not) on the inverse change (reduction) of mesh size.

4.2. Mack Chain Ladder

As in Section 4.1, aggregation of a data array $D$ to form a new array $D_q^*$ will be considered.

4.2.1. Row Independence

Proposition 10.
Different accident periods of $D_q^*$ are stochastically independent, i.e.,
$\{Z_{[i^* 1]}, \ldots, Z_{[i^* N]}\}$, $\{Z_{[k^* 1]}, \ldots, Z_{[k^* N]}\}$ are independent for $i^* \ne k^*$.
Proof. 
Note that the accident periods contributing to distinct aggregated periods $i^*$ and $k^*$ in Equation (9) form disjoint subsets. The stated result then follows from Assumption 5. □
Thus, at least Assumption 5, suitably modified, continues to hold for the data array $D_q^*$.

4.2.2. Cell Means

Consider whether a modified version of Assumption 4 holds for $D_q^*$. Specifically, consider whether parameters $f_{[j^*]}$, $j^* = 1, \ldots, N-1$ exist such that $E[Z_{[i^* j^*]} \mid Z_{[i^* 1]}, \ldots, Z_{[i^*, j^*-1]}] = Z_{[i^*, j^*-1]} f_{[j^*-1]}$ for all $Z_{[i^* j^*]} \in D_q^*$ with $j^* = 2, \ldots, N$.
The cell $(i^*, j^*)$ is defined in terms of unaggregated cells $(i, j)$ as set out in Equation (11), whence
$$E[Z_{[i^* j^*]} \mid Z_{[i^* 1]}, \ldots, Z_{[i^*, j^*-1]}] = \sum_{k=1}^{q} Z_{A_{i^*}+k,\, B_{j^*}-k} \prod_{j=0}^{q-1} f_{B_{j^*}-k+j}. \tag{32}$$
Note that
$$Z_{[i^*, j^*-1]} = \sum_{k=1}^{q} Z_{A_{i^*}+k,\, B_{j^*}-k}, \tag{33}$$
and so
$$E[Z_{[i^* j^*]} \mid Z_{[i^* 1]}, \ldots, Z_{[i^*, j^*-1]}] = Z_{[i^*, j^*-1]}\, f_{[i^*, j^*-1]}, \tag{34}$$
where
$$f_{[i^*, j^*-1]} = \frac{\sum_{k=1}^{q} Z_{A_{i^*}+k,\, B_{j^*}-k}\, F_{B_{j^*}-k \,:\, B_{j^*}-k+q}}{\sum_{k=1}^{q} Z_{A_{i^*}+k,\, B_{j^*}-k}}, \tag{35}$$
and
$$F_{m:m+n} = \prod_{l=0}^{n-1} f_{m+l}, \tag{36}$$
a compound age-to-age factor.
Equation (35) can be expressed in the more informative form
$$f_{[i^*, j^*-1]} = \sum_{k=1}^{q} w_{A_{i^*}+k,\, B_{j^*}-k}\, F_{B_{j^*}-k \,:\, B_{j^*}-k+q}, \tag{37}$$
where the $w$ terms are weights
$$w_{A_{i^*}+k,\, B_{j^*}-k} = \frac{Z_{A_{i^*}+k,\, B_{j^*}-k}}{\sum_{l=1}^{q} Z_{A_{i^*}+l,\, B_{j^*}-l}}, \tag{38}$$
and are subject to
$$0 < w_{A_{i^*}+k,\, B_{j^*}-k} < 1, \tag{39}$$
$$\sum_{k=1}^{q} w_{A_{i^*}+k,\, B_{j^*}-k} = 1. \tag{40}$$
It will be convenient to represent the $w$ and $F$ terms from Equation (37) in vector form. Specifically, let
$$w_{A_{i^*}}^{(j^*-1)} = \left( w_{A_{i^*}+1,\, B_{j^*}-1},\ w_{A_{i^*}+2,\, B_{j^*}-2},\ \ldots,\ w_{A_{i^*}+q,\, B_{j^*}-q} \right)^T, \quad \text{and} \tag{41}$$
$$w_{A}^{(j^*-1)} = \left( \left(w_{A_1}^{(j^*-1)}\right)^T, \left(w_{A_2}^{(j^*-1)}\right)^T, \ldots, \left(w_{A_N}^{(j^*-1)}\right)^T \right)^T. \tag{42}$$
Moreover, let
$$F_{B_{j^*-1}} = \left( F_{B_{j^*}-1 \,:\, B_{j^*}-1+q},\ F_{B_{j^*}-2 \,:\, B_{j^*}-2+q},\ \ldots,\ F_{B_{j^*}-q \,:\, B_{j^*}} \right)^T, \tag{43}$$
whence Equation (37) may be re-expressed as
$$f_{[i^*, j^*-1]} = \left( w_{A_{i^*}}^{(j^*-1)} \right)^T F_{B_{j^*-1}}. \tag{44}$$
This may be compared with Equation (13), and one important difference may be noted. Whereas the leading vector in Equation (13) depended on only $i^*$, the leading vector in Equation (44) depends on both $i^*$ and $j^* - 1$. This will be of significance in the subsequent analysis.
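The dependence of the aggregated age-to-age factor on the data can be seen in a small example. The sketch below evaluates Equation (37) for $q = 2$, $N = 2$, $j^* = 2$ and two aggregated accident periods; the age-to-age factors, the cumulative payments and the function names `compound_factor` and `aggregated_factor` are all hypothetical.

```python
import numpy as np

def compound_factor(f, m, n):
    """F_{m:m+n} = f_m * f_{m+1} * ... * f_{m+n-1}, with f indexed from 1."""
    return float(np.prod(f[m - 1:m - 1 + n]))

def aggregated_factor(z, f, i_star, j_star, q):
    """Weighted average of compound age-to-age factors, as in Equation (37)."""
    A = q * (i_star - 1)
    B = q * (j_star - 1) + 1
    # Cumulative payments Z_{A+k, B-k}, k = 1, ..., q (1-based cell indices).
    z_cells = np.array([z[A + k - 1, B - k - 1] for k in range(1, q + 1)])
    weights = z_cells / z_cells.sum()
    factors = np.array([compound_factor(f, B - k, q) for k in range(1, q + 1)])
    return float(weights @ factors)

q = 2
f = np.array([2.0, 1.5, 1.2])          # hypothetical f_1, f_2, f_3

# Hypothetical cumulative payments; only the cells used below are filled in.
z = np.full((4, 4), np.nan)
z[0, 1], z[1, 0] = 300.0, 100.0        # cells (1, 2) and (2, 1) for i* = 1
z[2, 1], z[3, 0] = 200.0, 200.0        # cells (3, 2) and (4, 1) for i* = 2

factor_1 = aggregated_factor(z, f, 1, 2, q)
factor_2 = aggregated_factor(z, f, 2, 2, q)
print(factor_1, factor_2)
```

The two aggregated factors differ although the unaggregated factors are common to all accident periods, because the weights are ratios of observed cumulative payments and vary from one aggregated accident period to another.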
Now Assumption 4 requires the existence of an age-to-age factor $f_j$, independent of $i$ for each $j$. The corresponding assumption for aggregated data would be that $f_{[i^*, j^*-1]}$ be independent of $i^*$. The following proposition states the condition that ensures this.
Proposition 11.
Suppose that the values of $F_{B_{j^*}}$ are fixed and known. Suppose further that $N \ge 2$. Then, for any fixed $j^* = 2, \ldots, N$, the vectors $w_{A_{i^*}}^{(j^*-1)}$, $i^* = 1, \ldots, N$ must be chosen from an $[N(q-1)+1]$-dimensional linear subspace of $(0, 1)^{Nq}$ if the condition that $f_{[i^*, j^*-1]}$ be independent of $i^*$ is to be satisfied.
Proof. 
See Appendix B.5. □
Remark 6.
In parallel with Propositions 8 and 9, the requirement that the Mack chain ladder remain applicable to a claim array when calendar periods are aggregated requires a substantial reduction in dimension of the natural parameter space (as defined in Remark 4) available to $w_{A_{i^*}}^{(j^*-1)}$, $i^* = 1, \ldots, N$, namely $(0, 1)^{Nq}$. The proportionate reduction in dimensions is $\left(1 - \frac{1}{N}\right) \frac{1}{q}$. This occurs for every column of the claim array.
Remark 7.
Just as in the case of the CMVR Tweedie chain ladder, the loss of parameter dimensionality required for maintenance of the Mack chain ladder in the case of calendar period aggregation can be severe. However, the situation is markedly worse in the case of the Mack chain ladder, as the constrained vectors $w_{A_{i^*}}^{(j^*-1)}$ are stochastic. Constraints on stochastic quantities will almost never be satisfied.

5. Discussion and Conclusions

Taylor (2025a, 2025b) considered the question of whether increased data granularity reduced the prediction uncertainty arising from application of chain ladder models. The answer was very largely affirmative, but those papers considered only the case in which increased granularity was achieved by dissecting development periods.
A natural extension of that work consists of a similar examination of the increase in granularity achieved by the dissection of calendar periods. Before embarking on this study, one needs to check whether the chain ladder assumptions that hold for some specific data set continue to hold under a change in granularity arising from either dissection or aggregation of calendar periods.
() had already noted that maintenance of these assumptions was not automatic and, at least in the case of a Poisson chain ladder, occurred only when the model’s parameter set was subject to a set of constraints. Here, that finding has been examined in detail in relation to both the CMVR Tweedie and Mack chain ladders.
The general shape of the admissible parameter set under aggregation of calendar periods is established in Propositions 8 (CMVR Tweedie chain ladder) and 11 (Mack chain ladder). The properties of these two parameter sets are different but, in both cases, their dimensionalities are found to be very substantially less than those of the natural parameter spaces (the spaces before imposition of constraints).
In other words, the parameter constraints that are required to maintain chain ladder as a valid model are extremely strict. The consequence is that, while it may be technically possible to maintain the chain ladder model under aggregation of calendar periods, it is not possible in any reasonable, practical sense.
This conclusion is especially strong in the case of the Mack chain ladder. In this case, the constraints apply to stochastic quantities (Remark 7), and their fulfilment will depend not on the model’s parametric structure but on the happenstance of the drawing of the data set.
What does this mean for a practitioner? Consider a case in which the practitioner is pondering whether to work with a quarterly triangle or to collapse it into a yearly one. Before considering the implications of the current paper, it should not be forgotten that the quarterly form of data can be expected to produce the smaller forecast error (Taylor 2025a, 2025b).
But consider the implications of the current paper on top of this. The practitioner may well be entitled to assume that a chain ladder adequately represents either the yearly or the quarterly triangle, but not both. There will probably be little harm in selecting one or the other and working it through to broad conclusions. However, any comparative results for the alternative forms of data will be invalid.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A

Some Results in Differential Geometry

The following theorem will be required for the proof given in Appendix B.4.
The theorem is a generalization of the pre-image theorem. In its full generality (Tu 2010), it is a result in differentiable topology, but only the special case related to Euclidean space is required here.
Theorem A1 (Regular value theorem).
Let $S$ be an open connected $n$-dimensional subset of $\mathbb{R}^n$ and $F: S \to \mathbb{R}^p$ be differentiable with Jacobian matrix $J(x),\ x \in S$. A sufficient condition for the zero set $M = \{x \in S : F(x) = 0\}$ to be a smooth $(n-m)$-dimensional sub-manifold with $m = \operatorname{rank} J$ for all $x \in S$ and $1 \le m \le \min(n, p)$ is that $M$ is non-empty and a regular level set, i.e., for each $x \in M$, $J(x)$ has rank $m$ on a neighbourhood of $x$.
Theorem A2 (Application to quadratic functions).
Let $F: \mathbb{R}^n \to \mathbb{R}^p$ with $F(x)^T = \left(F_1(x), \ldots, F_p(x)\right)$ and $F_k(x) = x^T Q_k x,\ k = 1, \ldots, p$, for known $n \times n$ symmetric matrices $Q_k$. Let $Q$ be the $pn \times n$ matrix $\left(Q_1 \cdots Q_p\right)^T$ and let $q = \operatorname{rank} Q,\ 0 \le q \le n$. Then the zero set of $F$ is $V \cup M$ where:
(i) $V \subset \mathbb{R}^n$ is the $(n-q)$-dimensional linear subspace on which $Qx = 0$ for $x \in V$; and
(ii) $M$ is either empty or a smooth $(n-r)$-dimensional sub-manifold of $\mathbb{R}^n$, where $r = \min(p, q)$.
This zero set is smooth except possibly where the closure of $M$ intersects $V$.
Proof. 
First evaluate the Jacobian matrix $J_k(x)$ of $F_k$ as
$$J_k(x) = 2x^T Q_k. \tag{A1}$$
Stack these to obtain the Jacobian matrix of $F$:
$$J(x) = 2\begin{pmatrix} x^T Q_1 \\ \vdots \\ x^T Q_p \end{pmatrix} = 2\begin{pmatrix} x^T & & \\ & \ddots & \\ & & x^T \end{pmatrix} Q = 2\left(I_p \otimes x^T\right) Q, \tag{A2}$$
where $Q$ is as in the statement of the theorem.
Now consider the two cases according to whether $J$ is or is not equal to zero.
Case I: $J = 0$. This case occurs if and only if, by (A1),
$$x^T Q_k = 0 \quad \text{for all } k = 1, \ldots, p, \tag{A3}$$
or, equivalently,
$$Qx = 0. \tag{A4}$$
The values of $x$ that satisfy this condition form a linear subspace $V \subset \mathbb{R}^n$ of dimension $n - q$. It is evident that, when (A3) holds,
$$F_k(x) = 0, \tag{A5}$$
and so the zero set of $F$ includes $V$.
Case II: $J \ne 0$. This is the open set $\mathbb{R}^n \setminus V$. Consider the rank of $J$ on this set, which is given by
$$\operatorname{rank} J = \min\left(\operatorname{rank}\left(I_p \otimes x^T\right),\ \operatorname{rank} Q\right). \tag{A6}$$
From the block form of $I_p \otimes x^T$ given in (A2), this is seen to be a $p \times pn$ matrix whose rows are linearly independent, and so its rank is $p$. Substitution of this result into (A6) yields
$$\operatorname{rank} J = \min(p, q) = r, \text{ say}.$$
Theorem A1 may now be applied to conclude that the zero set of $F$ on $\mathbb{R}^n \setminus V$ is either empty or a smooth $(n-r)$-dimensional sub-manifold $M$.
Full zero set of $F$. The zero set of $F$ on $\mathbb{R}^n$ is $V \cup M$, which is smooth except possibly where the closure of $M$ intersects $V$. □
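The two ingredients of this proof, the Kronecker form of the Jacobian in (A2) and the vanishing of $F$ and $J$ on the null space of $Q$, can be checked numerically. The sketch below is illustrative only: the matrices $Q_k$ are hypothetical, built so that the stacked matrix $Q$ has a one-dimensional null space, and the names `F`, `jacobian`, `q_rank` and `v` are introduced purely for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 4, 2

# Hypothetical symmetric Q_k whose last row and column are zero,
# so that the last coordinate axis lies in the null space V of Q.
Q_list = []
for _ in range(p):
    A = rng.normal(size=(n - 1, n - 1))
    Qk = np.zeros((n, n))
    Qk[: n - 1, : n - 1] = A + A.T
    Q_list.append(Qk)

Q = np.vstack(Q_list)                              # the pn x n stacked matrix
q_rank = int(np.linalg.matrix_rank(Q, tol=1e-9))   # rank q of Theorem A2

def F(x):
    """Vector of quadratic forms F_k(x) = x^T Q_k x."""
    return np.array([x @ Qk @ x for Qk in Q_list])

def jacobian(x):
    """Jacobian in the Kronecker form 2 (I_p kron x^T) Q of (A2)."""
    return 2.0 * np.kron(np.eye(p), x.reshape(1, -1)) @ Q

# Compare with the row-by-row form 2 x^T Q_k of (A1) at a random point.
x = rng.normal(size=n)
row_by_row = np.vstack([2.0 * (x @ Qk) for Qk in Q_list])
identity_holds = np.allclose(jacobian(x), row_by_row)

# A vector spanning V: both F and J vanish there (Case I).
v = np.zeros(n)
v[-1] = 1.0
print(q_rank, identity_holds, F(v), jacobian(v))
```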

Appendix B

Proofs of Results.

Appendix B.1

Proof of Proposition 5.
By Equation (20), each column of $C_{j^*}$ is a linear combination of the two vectors $c_{B_{j^*}}$ and $c_{B_{j^*+1}}$, and so $\operatorname{rank} C_{j^*} \le 2$. Indeed, the $k$-th column is $c_{B_{j^*},k}\, c_{B_{j^*+1}} - c_{B_{j^*+1},k}\, c_{B_{j^*}}$. Suppose the two vectors are linearly independent. Then the $k$-th column of $C_{j^*}$ will be zero if and only if $c_{B_{j^*},k} = c_{B_{j^*+1},k} = 0$. All columns will be zero if and only if $c_{B_{j^*}} = c_{B_{j^*+1}} = 0$, which contradicts their linear independence. So $\operatorname{rank} C_{j^*} \ne 0$.
Consider whether $\operatorname{rank} C_{j^*}$ may equal unity. This would require all columns to be scalar multiples of one another, which cannot occur if $c_{B_{j^*}}$ and $c_{B_{j^*+1}}$ are linearly independent. In conclusion, linear independence of these vectors implies $\operatorname{rank} C_{j^*} = 2$.
If $c_{B_{j^*}}$ and $c_{B_{j^*+1}}$ are linearly dependent, then $c_{B_{j^*+1}} = K c_{B_{j^*}}$ for some constant $K$, and then Equation (20) immediately reduces to $C_{j^*} = 0$ (rank zero). □
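The rank dichotomy can be illustrated numerically. The sketch below assumes, on the basis of the column description in the proof, that the matrix of Equation (20) has the form $c_{B_{j^*+1}} c_{B_{j^*}}^T - c_{B_{j^*}} c_{B_{j^*+1}}^T$; that form, the function name `build_C` and the numerical vectors are assumptions made for the example.

```python
import numpy as np

def build_C(c_cur, c_next):
    """Assumed form of C: k-th column is c_cur[k] * c_next - c_next[k] * c_cur."""
    c_cur = np.asarray(c_cur, dtype=float)
    c_next = np.asarray(c_next, dtype=float)
    return np.outer(c_next, c_cur) - np.outer(c_cur, c_next)

# Hypothetical, linearly independent vectors (q = 4).
c_cur = np.array([4.0, 3.0, 2.0, 1.0])
c_next = np.array([1.0, 2.0, 3.0, 4.0])
rank_independent = int(np.linalg.matrix_rank(build_C(c_cur, c_next), tol=1e-9))

# Linearly dependent case: c_next = K * c_cur with K = 2.
rank_dependent = int(np.linalg.matrix_rank(build_C(c_cur, 2.0 * c_cur), tol=1e-9))

print(rank_independent, rank_dependent)
```

Under this assumed form the rank is 2 for independent vectors and 0 for dependent ones, and never 1, in agreement with the proof.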

Appendix B.2

Proof of Proposition 6.
Let $\delta^T$ denote the vector $(\delta_1, \ldots, \delta_q)$, whence Equation (21) yields $a_{A_{i^*}} = \gamma_{i^*} \delta$. Substitution of this in Equation (14) gives
$$\frac{\mu_{i^*,\,j^*+1}}{\mu_{i^* j^*}} = \frac{\delta^T c_{B_{j^*+1}}}{\delta^T c_{B_{j^*}}},$$
which is independent of $i^*$ as required. □
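A small numerical check of this cancellation follows. It assumes that Equation (14) gives the aggregated mean as the inner product $\mu_{i^* j^*} = a_{A_{i^*}}^T c_{B_{j^*}}$, which is consistent with the ratio displayed above but is not reproduced here; all numerical values are hypothetical.

```python
import numpy as np

# Hypothetical common profile delta and row-specific scales gamma.
delta = np.array([0.1, 0.2, 0.3, 0.4])
gamma = np.array([100.0, 120.0, 90.0])

# Hypothetical column vectors for aggregated columns j* and j* + 1.
c_cur = np.array([4.0, 3.0, 2.0, 1.0])
c_next = np.array([1.0, 2.0, 3.0, 4.0])

# Row i* of `a` holds gamma_{i*} * delta, the proportional structure of Equation (21).
a = np.outer(gamma, delta)
ratios = (a @ c_next) / (a @ c_cur)
expected = (delta @ c_next) / (delta @ c_cur)

# A row whose profile is not proportional to delta gives a different ratio.
a_other = np.array([0.4, 0.3, 0.2, 0.1])
ratio_other = (a_other @ c_next) / (a_other @ c_cur)

print(ratios, expected, ratio_other)
```

The scale $\gamma_{i^*}$ cancels in every row, whereas a row with a different profile breaks the equality.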

Appendix B.3

Proof of Proposition 7.
Let $\Gamma$ be an acyclic undirected complete graph with a set $S$ of $N$ vertices. Since it is complete, it contains an edge between every pair of vertices, with $i^* k^*$ denoting the edge between the $i^*$-th and $k^*$-th vertices. Let the set of edges be denoted by $E$, so that now $\Gamma = \Gamma(S, E)$.
For fixed $j^*$, associate the ratio $\mu_{i^*,\,j^*+1} / \mu_{i^* j^*}$, from Equation (18), with the $i^*$-th vertex. Now suppose that Equation (18) is satisfied, and let the generic equality in Equation (18) be associated with the edge $i^* k^*$. The graph now represents the $\binom{N}{2}$ equalities expressed by Equation (18). The question concerns the smallest subset of these that implies all $\binom{N}{2}$.
A spanning tree of $\Gamma(S, E)$ is a sub-graph $\Gamma_1(S, E_1)$ of $\Gamma$ with $E_1 \subset E$ which is a tree, i.e., contains a unique path between any two given vertices. There may be more than one spanning tree, but each contains $N - 1$ edges (Bondy and Murty 2008).
It is clear that $\Gamma_1$ is minimal in the sense that deletion of any of its edges will eliminate the path between at least one pair of vertices. Thus, any spanning tree will identify a minimal subset of equalities in Equation (18) that imply the whole of Equation (18). One such spanning tree is $\Gamma_1(S, E_1)$ with $E_1 = \{i^*(i^*+1),\ i^* = 1, \ldots, N-1\}$, and the typical member of this set is associated with the following specific equality from Equation (18):
$$\frac{\mu_{i^*,\,j^*+1}}{\mu_{i^* j^*}} = \frac{\mu_{i^*+1,\,j^*+1}}{\mu_{i^*+1,\,j^*}}.$$ □
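The counting argument can be checked with a small union-find routine: the $N - 1$ adjacent edges connect every pair of vertices, so the corresponding equalities imply all pairwise equalities by transitivity. The function name `chain_implies_all` and the choice $N = 6$ are assumptions made for the example.

```python
from itertools import combinations
from math import comb

def chain_implies_all(N):
    """Return (number of chain edges, number of pairs, whether the chain links all pairs)."""
    parent = list(range(N + 1))  # vertices are labelled 1..N; index 0 is unused

    def find(v):
        # Follow parent pointers to the representative, compressing the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Spanning tree made of the adjacent edges i*(i*+1), i* = 1..N-1.
    chain_edges = [(i, i + 1) for i in range(1, N)]
    for u, v in chain_edges:
        parent[find(u)] = find(v)

    all_pairs = list(combinations(range(1, N + 1), 2))
    connected = all(find(u) == find(v) for u, v in all_pairs)
    return len(chain_edges), len(all_pairs), connected

n_chain, n_pairs, connected = chain_implies_all(6)
print(n_chain, n_pairs, comb(6, 2), connected)
```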

Appendix B.4

Proof of Proposition 8.
Apply Theorem A2 with $F: \mathbb{R}^{Nq} \to \mathbb{R}^{(N-1)^2}$ and the substitutions $x \to a$, $Q_k \to Q_{j^* i^*}$. In this case, $Q$ becomes
$$Q = \begin{pmatrix} Q_{1,1} \\ \vdots \\ Q_{N-1,1} \\ Q_{1,2} \\ \vdots \\ Q_{N-1,2} \\ \vdots \\ Q_{1,N-1} \\ \vdots \\ Q_{N-1,N-1} \end{pmatrix}. \tag{A9}$$
Note that, in view of Equation (26), each block column of $Q$ contains each $C_{j^*},\ j^* = 1, \ldots, N-1$, listed $N-1$ times, and so that column has the same rank as $C$, i.e., $r$. The $N-1$ block columns are all linearly independent in $Q$, and so $\operatorname{rank} Q = (N-1)r$.
Thus, Theorem A2 may be applied with the replacements $n \to Nq$, $p \to (N-1)^2$, $q \to (N-1)r$. In consequence, the zero set of $F$ is $(V \cup M) \cap \mathbb{R}_+^{Nq}$ where:
(i) $V \subset \mathbb{R}^{Nq}$ is the $\left(Nq - (N-1)r\right)$-dimensional linear subspace on which $Qa = 0$ for $a \in V$; and
(ii) $M$ is either empty or a smooth $(Nq - s)$-dimensional sub-manifold of $\mathbb{R}^{Nq}$, where $s = (N-1)\min(N-1, r)$.
This zero set is smooth except possibly where the closure of $M$ intersects $V$.
There remains the question of whether $M = \emptyset$. Consider $a$ of the form
$$a = 1_N \otimes a_1 \tag{A10}$$
for arbitrary (positive) $a_1$. This is a special case of Equation (21) and so, by Proposition 6, it is a solution of Equation (19).
Now either $a \in M$ or not. If so, then immediately $M \ne \emptyset$. So now consider the case $a \notin M$, whence $a \in V$. Then $Qa = 0$ and, by (A9), (A10) and Equation (26), $C_{j^*} a_1 = 0$ for each $j^*$.
If each $C_{j^*}$ is of full rank, this cannot occur. If all are rank deficient, select any $j^*$, and suppose that $\operatorname{rank} C_{j^*} = \rho < q$. Then $C_{j^*}$ will map $\rho$ dimensions of its domain $\mathbb{R}^q$ to a $\rho$-dimensional vector space and the remaining $q - \rho$ to zero. Thus, it is possible to select $v \in \mathbb{R}^q$ such that $C_{j^*} v \ne 0$, provided that $\rho > 0$.
It is not known whether all components of $v$ are positive as required, so make the replacement $a_1 \to a_1 + Kv$ for constant $K > 0$. Then $C_{j^*}(a_1 + Kv) = K C_{j^*} v \ne 0$. Allow $K \to 0$; for sufficiently small $K$, all components of $a_1 + Kv$ will be positive.
Thus, unless $\operatorname{rank} C_{j^*} = 0$ for all $j^*$, it is always possible to find a member $a \notin V$ in the zero set, and hence $a \in M$, and $M \ne \emptyset$. Finally, in the case $\operatorname{rank} C_{j^*} = 0$ for all $j^*$, the zero set consists of the whole of $\mathbb{R}_+^{Nq}$, by Proposition 5. □

Appendix B.5

Proof of Proposition 11.
The vector $w_{A_{i^*},\,j^*-1} \in \mathbb{R}_+^q$ may be expressed in the form
$$w_{A_{i^*},\,j^*-1} = \xi_{i^*,\,j^*-1}\, F_{B_{j^*-1}} + \eta_{i^*,\,j^*-1}\, F_{B_{j^*-1}}^{\perp}, \tag{A11}$$
where $F_{B_{j^*-1}} \in \mathbb{R}_+^q$, $F_{B_{j^*-1}}^{\perp}$ lies in the subspace of $\mathbb{R}^q$ that is orthogonal to $F_{B_{j^*-1}}$, and $\xi_{i^*,\,j^*-1},\ \eta_{i^*,\,j^*-1}$ are scalars.
By substitution of (A11) into Equation (44), and taking account of orthogonality,
$$f_{i^*,\,j^*-1} = \xi_{i^*,\,j^*-1}\, F_{B_{j^*-1}}^T F_{B_{j^*-1}}. \tag{A12}$$
Similarly for $f_{k^*,\,j^*-1}$, and this will be the same as $f_{i^*,\,j^*-1}$ if and only if
$$\xi_{i^*,\,j^*-1} = \xi_{k^*,\,j^*-1}. \tag{A13}$$
Note that there is no restriction on $\eta_{i^*,\,j^*-1}$.
Now rotate co-ordinates to a new set in which one co-ordinate axis is in the direction of $F_{B_{j^*-1}}$ and the remaining $q - 1$ axes all lie within the subspace orthogonal to $F_{B_{j^*-1}}$. By (A11), $w_{A_{i^*},\,j^*-1}$ has co-ordinate $\xi_{i^*,\,j^*-1}$ in the $F_{B_{j^*-1}}$ direction. Co-ordinates in this direction are subject to the restriction (A13), which means that the $N$ dimensions that would be available to the $\xi_{i^*,\,j^*-1},\ i^* = 1, \ldots, N$, in the absence of (A13) are reduced to just one dimension, a loss of $N - 1$ dimensions. It follows that the total number of dimensions available to the vector $w_{A,\,j^*-1}$ is $Nq - (N-1) = N(q-1) + 1$. □
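The decomposition used in this proof can be illustrated numerically: vectors sharing the same coefficient along the factor vector give the same aggregated factor, whatever their orthogonal components. The sketch below is illustrative only. The factor vector, the coefficient values and the helper `orthogonal_part` are hypothetical, and the requirement that weights lie in $[0,1]$ is ignored for simplicity.

```python
import numpy as np

rng = np.random.default_rng(1)
N, q = 3, 4

# Hypothetical vector of q-step factors.
F = np.array([1.80, 1.55, 1.40, 1.30])

def orthogonal_part(u, F):
    """Remove from u its component along F, leaving a vector orthogonal to F."""
    return u - (u @ F) / (F @ F) * F

# N weight vectors with a common coefficient xi along F, as required by (A13),
# and arbitrary orthogonal components (no restriction applies to these).
xi = 0.25
W = np.array([xi * F + orthogonal_part(rng.normal(size=q), F) for _ in range(N)])
f = W @ F  # aggregated factors; each equals xi * F^T F, as in (A12)

# A vector with a different coefficient along F yields a different factor.
w_other = 0.30 * F + orthogonal_part(rng.normal(size=q), F)
f_other = float(w_other @ F)

# Dimension count from the end of the proof.
free_dimensions = N * q - (N - 1)
print(f, f_other, free_dimensions)
```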

References

  1. Berquist, James R., and Richard E. Sherman. 1977. Loss reserve adequacy testing: A comprehensive, systematic approach. Proceedings of the Casualty Actuarial Society 64: 123–84.
  2. Bondy, John Adrian, and Uppaluri Siva Ramachandra Murty. 2008. Graph Theory. GTM 244. Berlin and Heidelberg: Springer.
  3. England, Peter, and Richard Verrall. 1999. Analytic and bootstrap estimates of prediction errors in claims reserving. Insurance: Mathematics and Economics 25: 281–93.
  4. Hiabu, Munir. 2017. On the relationship between classical chain ladder and granular reserving. Scandinavian Actuarial Journal 2017: 708–29.
  5. Lo, Joseph. 2011. Implementing the Mack Model. Paper presented at GIRO Conference and Exhibition, Liverpool, UK, October 11–14; London: Institute and Faculty of Actuaries. Available online: https://www.actuaries.org.uk/system/files/documents/pdf/a9-implementing-mack-model-v1-0-giro-site.pdf (accessed on 30 September 2025).
  6. Mack, Thomas. 1993. Distribution-free calculation of the standard error of chain ladder reserve estimates. ASTIN Bulletin 23: 213–25.
  7. Nelder, John Ashworth, and Robert W. M. Wedderburn. 1972. Generalised linear models. Journal of the Royal Statistical Society, Series A 135: 370–84.
  8. Sloma, Przemyslaw. 2019. Generalized Mack Chain-Ladder Model of Reserving with Robust Estimation. Variance 12: 226–48.
  9. Taylor, Greg. 1985. Claim Reserving in Non-Life Insurance. Amsterdam: North-Holland.
  10. Taylor, Greg. 2000. Loss Reserving: An Actuarial Perspective. Boston: Kluwer Academic Publishers.
  11. Taylor, Greg. 2021. A special Tweedie sub-family with application to loss reserving prediction error. Insurance: Mathematics and Economics 101B: 262–88.
  12. Taylor, Greg. 2025a. The EDF chain ladder and data granularity. Risks 13: 65.
  13. Taylor, Greg. 2025b. The Mack chain ladder and data granularity for preserved development periods. Risks 13: 132.
  14. Tu, Loring W. 2010. An Introduction to Manifolds. New York: Springer.
  15. Tweedie, Maurice C. K. 1984. An index which distinguishes between some important exponential families. In Statistics: Applications and New Directions, Paper presented at Indian Statistical Golden Jubilee International Conference, Calcutta, 16–19 December 1981. Edited by Jayanta K. Ghosh and Jogabrata Roy. Calcutta: Indian Statistical Institute, pp. 579–604.
  16. Verrall, Richard J. 2000. An investigation into stochastic claims reserving models and the chain-ladder technique. Insurance: Mathematics and Economics 26: 91–99.
  17. Wüthrich, Mario V., and Michael Merz. 2008. Stochastic Claims Reserving Methods in Insurance. Chichester: John Wiley & Sons Ltd.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
