An Entropy-based Approach to Path Analysis of Structural Generalized Linear Models: a Basic Idea

A path analysis method for causal systems based on generalized linear models is proposed by using entropy. A practical example is introduced, and a brief explanation of the entropy coefficient of determination is given. Direct and indirect effects of explanatory variables are discussed as log odds ratios, i.e., relative information, and a method for summarizing the effects is proposed. The example dataset is re-analyzed by using the method.


Introduction
Path analysis [1] is often applied to causal systems of continuous variables through the linear structural equations model (LISREL) [2,3]. In the LISREL approach, causal relationships among variables are described by a path diagram and translated into linear equations of the variables. Causal effects can then be calculated from the regression and correlation coefficients obtained for the linear equations.

In contrast, path analysis of categorical variables is more complex than that of continuous variables because the causal system under consideration cannot be described by linear regression equations. Goodman [4][5][6][7] considered path analysis of binary variables by using logit models and discussed the effects of explanatory variables, though without discussing direct and indirect effects. Hagenaars [8] discussed path analysis of categorical variables by using a log-linear model, again without discussing direct and indirect effects. Eshima et al. [9] proposed a path analysis method for categorical variables in logit models. Kuha and Goldthorpe [10] gave a two-stage path analysis method for generalized linear models (GLMs) that uses log odds ratios. In their approach, the total, direct and indirect effects are first defined for mean differences of response variables, and the method is then applied to measuring the effects on the basis of log odds ratios. However, the additive decomposition of the total effect into the direct and indirect effects holds only approximately, and assessing effects in categorical (polytomous) variable systems becomes more complicated as the number of variable categories increases [10]. Albert and Nelson [11] proposed a path analysis method to calculate pathway effects for causal systems on the basis of GLMs, but not all pathway effects are identifiable. As in the two-stage case, when factors, intermediate variables, and response variables are categorical, pathway effects become very complicated because the variable effects are defined for mean differences of response variables.
In many fields of study, response variables in practical data are not normally distributed. There is a need for a method of path analysis with responses that are not normally distributed, especially categorical responses, and it is useful to discuss a path analysis approach for causal systems of GLMs [12,13]. When causal systems of the variables are described by GLMs, regression parameters or coefficients are related to log odds ratios [14][15][16], and so it is natural to consider the effects of factors (explanatory variables) according to odds or log odds ratios. However, results become more complicated as the number of categories of the variables increases. In such cases, it is suitable to summarize the effects of factors in GLMs. For this purpose, we use the entropy coefficient of determination (ECD), one of the entropy-based measures of predictive power for GLMs [15,16].
The remainder of this paper is organized as follows: Section 2 presents a practical example of causal systems, the British mobility data [10], and re-analyzes them by a new method of path analysis. Section 3 considers the relation between the log odds ratio and entropy, and ECD is briefly reviewed. Section 4 introduces a path analysis method for causal systems described by GLMs, and in Section 5 a method for testing effects of variables is given. The British mobility data are re-analyzed by the proposed approach in Section 6. Finally, Section 7 provides some discussion and conclusions for the present approach.

Practical Example
The British mobility data describe the effects of education on social class mobility [10]. There are three variables, which are causally ordered as shown in Figure 1: parents' social class, X; individual social class, Y; and education, Z, which mediates between X and Y. The three variables are discrete. Social classes X and Y have three categories, "salariat and employers", "middle class", and "working class"; education Z has seven levels. While the effects of X and Z on Y can be discussed through log odds ratios, the results are complicated because the number of variable categories is large. It is important to

Log Odds Ratio and Information
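Let X and Y be a p × 1 explanatory-variable vector and a response variable, respectively, and let $f(y \mid x)$ be the conditional probability or probability density function of Y given that $X = x$.
The function $f(y \mid x)$ is assumed to belong to the following family of exponential distributions:

$$f(y \mid x) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\varphi)} + c(y, \varphi) \right\}, \tag{1}$$

where $\theta$ and $\varphi$ are parameters, and $a(\varphi) > 0$, $b(\theta)$, and $c(y, \varphi)$ are specific functions.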

Remark 1. In general, the systematic component can be extended to be a function of the explanatory variable vector x. The model is then referred to as a generalized nonlinear model. For the sake of simplicity, the function is denoted by . The discussion below is applicable to this case.

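Let $\mu_X$ and $\mu_Y$ be the means of X and Y, respectively. Then, let us consider the following log odds ratio:

$$\log \frac{f(y \mid x)\, f(\mu_Y \mid \mu_X)}{f(\mu_Y \mid x)\, f(y \mid \mu_X)} = \log \frac{f(y \mid x)}{f(\mu_Y \mid x)} - \log \frac{f(y \mid \mu_X)}{f(\mu_Y \mid \mu_X)}.$$

The first and second terms of the right-hand side of the above equation are the relative information with respect to response variable Y, so the log odds ratio is the change of the relative information in explanatory variable vector X. Under (1), writing $\theta(x)$ for the value of $\theta$ at $X = x$, the log odds ratio equals $(y - \mu_Y)\{\theta(x) - \theta(\mu_X)\}/a(\varphi)$. In GLMs, the expectation of the above log odds ratio with respect to X and Y therefore reduces to an expectation with respect to X alone, namely $\mathrm{Cov}(Y, \theta)/a(\varphi)$. Then, with $g(x)$ and $f(y)$ denoting the marginal density or probability functions of X and Y, and $\mathrm{KL}(X, Y)$ denoting the symmetric Kullback-Leibler (KL) information:

$$\mathrm{KL}(X, Y) = \iint f(y \mid x)\, g(x) \log \frac{f(y \mid x)}{f(y)}\, dx\, dy + \iint f(y)\, g(x) \log \frac{f(y)}{f(y \mid x)}\, dx\, dy = \frac{\mathrm{Cov}(Y, \theta)}{a(\varphi)}. \tag{2}$$

If the variables in the above are discrete, the related integrals are replaced by summations. The ECD is then defined as:

$$\mathrm{ECD}(X, Y) = \frac{\mathrm{Cov}(Y, \theta)}{\mathrm{Cov}(Y, \theta) + a(\varphi)}.$$

ECD can also be expressed as:

$$\mathrm{ECD}(X, Y) = \frac{\mathrm{KL}(X, Y)}{\mathrm{KL}(X, Y) + 1}.$$

The measure is interpreted as the proportion of the variation in entropy of Y that is explained by X [15,16]. As shown above, GLMs explain the entropy of response variables, so it is suitable to measure the effects of explanatory variables based on entropy.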
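The relation between the covariance form and the symmetric KL form of the summary effect can be checked numerically. The following is a minimal sketch for a binary logit model with a three-level score X; the marginal distribution `x_probs` and the coefficients `beta0` and `beta1` are hypothetical values chosen for illustration, not quantities from the paper, and $a(\varphi) = 1$ is assumed as is standard for the Bernoulli family.

```python
import math

# Hypothetical toy setup (not from the paper): binary Y, three-level score X.
x_values = [0.0, 1.0, 2.0]
x_probs = [0.5, 0.3, 0.2]      # assumed marginal g(x)
beta0, beta1 = -1.0, 1.2       # assumed logit coefficients

def theta(x):
    """Canonical parameter theta(x) = beta0 + beta1 * x (logit link)."""
    return beta0 + beta1 * x

def p1(x):
    """P(Y = 1 | X = x) under the logit model."""
    return 1.0 / (1.0 + math.exp(-theta(x)))

def cond_pmf(y, x):
    """f(y | x) for y in {0, 1}."""
    return p1(x) if y == 1 else 1.0 - p1(x)

# Covariance form: Cov(Y, theta) / a(phi), with a(phi) = 1.
e_y = sum(g * p1(x) for x, g in zip(x_values, x_probs))
e_theta = sum(g * theta(x) for x, g in zip(x_values, x_probs))
e_y_theta = sum(g * p1(x) * theta(x) for x, g in zip(x_values, x_probs))
kl_cov = e_y_theta - e_y * e_theta

def marg_pmf(y):
    """Marginal f(y) for y in {0, 1}."""
    return e_y if y == 1 else 1.0 - e_y

# Symmetric KL form: the two KL terms combine into
# sum g(x) * (f(y|x) - f(y)) * log(f(y|x) / f(y)).
kl_sym = 0.0
for x, g in zip(x_values, x_probs):
    for y in (0, 1):
        f_yx, f_y = cond_pmf(y, x), marg_pmf(y)
        kl_sym += g * (f_yx - f_y) * math.log(f_yx / f_y)

ecd = kl_cov / (kl_cov + 1.0)
print(f"Cov form = {kl_cov:.6f}, symmetric KL = {kl_sym:.6f}, ECD = {ecd:.6f}")
```

The two forms agree up to floating-point error in this toy case, which is what the exponential-family structure implies when $f(y)$ is the marginal distribution of Y.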

Remark 2. When applied to the linear regression model, ECD reduces to the usual coefficient of determination $R^2$.
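Remark 2 can be illustrated with simulated data. The sketch below fits a simple linear regression by least squares and compares the sample $R^2$ with a plug-in ECD, assuming the normal model in which $\theta(x)$ is the fitted mean and $a(\varphi)$ is the residual variance estimated by its maximum likelihood value RSS/n. The data-generating values (intercept 1.0, slope 2.0, noise standard deviation 1.5) are arbitrary assumptions.

```python
import random

random.seed(0)
n = 200
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [1.0 + 2.0 * x + random.gauss(0.0, 1.5) for x in xs]

# Ordinary least squares with an intercept (closed form).
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
fitted = [b0 + b1 * x for x in xs]

# Usual coefficient of determination.
rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))
tss = sum((y - mean_y) ** 2 for y in ys)
r2 = 1.0 - rss / tss

# Plug-in ECD: theta is the fitted mean, a(phi) is the ML variance estimate.
cov_y_theta = sum((y - mean_y) * (f - mean_y) for y, f in zip(ys, fitted)) / n
sigma2_hat = rss / n
ecd = cov_y_theta / (cov_y_theta + sigma2_hat)

print(f"R^2 = {r2:.6f}, ECD = {ecd:.6f}")
```

With these plug-in choices the sample covariance equals the explained sum of squares divided by n, so the two quantities coincide exactly rather than only approximately.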

Measuring the Total, Direct, and Indirect Effects in Recursive GLM Systems
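For simplicity, in the recursive case with $X_i$ ($i = 1, 2, 3$), where $X_i$ precedes $X_{i+1}$, we discuss the effects of $X_1$ and $X_2$ on $X_3$. Let $\mu_i$ be the expectation of $X_i$ ($i = 1, 2, 3$). Then, for a GLM with the conditional density or probability function of $X_3$ when $(X_1, X_2) = (x_1, x_2)$ given by (1), the total effect of $(X_1, X_2) = (x_1, x_2)$ on $X_3 = x_3$ can be defined by using the following log odds ratio:

$$\log \frac{f(x_3 \mid x_1, x_2)\, f(\mu_3 \mid \mu_1, \mu_2)}{f(\mu_3 \mid x_1, x_2)\, f(x_3 \mid \mu_1, \mu_2)} = \frac{(x_3 - \mu_3)\{\theta(x_1, x_2) - \theta(\mu_1, \mu_2)\}}{a(\varphi)}.$$

Taking the expectation of the above effect with respect to $X_1$, $X_2$ and $X_3$, we have:

$$\frac{\mathrm{Cov}(X_3, \theta)}{a(\varphi)} = \mathrm{KL}((X_1, X_2), X_3).$$

The above KL information is the (summary) total effect of explanatory variables $(X_1, X_2)$ on $X_3$. Let $\mu_2(x_1)$ and $\mu_3(x_1)$ be the conditional expectations for $X_2$ and $X_3$, respectively, given that $X_1 = x_1$. The log odds ratio with respect to $x_2$ and $x_3$ for a given $x_1$ is:

$$\frac{(x_3 - \mu_3(x_1))\{\theta(x_1, x_2) - \theta(x_1, \mu_2(x_1))\}}{a(\varphi)}.$$

The total effect of $X_2 = x_2$ on $X_3 = x_3$ at $X_1 = x_1$ is defined by the above log odds ratio because the effect expresses the total effect of $X_2 = x_2$ on $X_3 = x_3$ when the effect of the preceding variable $X_1 = x_1$ is excluded. From this, the total effect of $X_1 = x_1$ on $X_3 = x_3$ is defined as the total effect of $(X_1, X_2) = (x_1, x_2)$ minus the total effect of $X_2 = x_2$. By taking the expectation of the above information with respect to $X_1$, $X_2$ and $X_3$, the (summary) total effect of $X_1$ on $X_3$ is given by

$$\mathrm{KL}((X_1, X_2), X_3) - \mathrm{KL}(X_2, X_3 \mid X_1), \qquad \mathrm{KL}(X_2, X_3 \mid X_1) = \frac{E[\mathrm{Cov}(X_3, \theta \mid X_1)]}{a(\varphi)}.$$

The second term implies the effect of $X_2$ by itself, that is, the effect of $X_2$ on $X_3$ when the effect of $X_1$ is excluded, and is defined as the (summary) total effect of $X_2$ on $X_3$. The direct effect of $X_1 = x_1$ on $X_3 = x_3$ given $X_2 = x_2$ can be understood according to the following log odds ratio, in which $\mu_1(x_2)$ and $\mu_3(x_2)$ are the conditional expectations of $X_1$ and $X_3$ given that $X_2 = x_2$:

$$\frac{(x_3 - \mu_3(x_2))\{\theta(x_1, x_2) - \theta(\mu_1(x_2), x_2)\}}{a(\varphi)}.$$

The above effect is derived by excluding the effect of $X_2 = x_2$, so it is defined as the direct effect of $X_1 = x_1$ on $X_3 = x_3$. Taking the expectation of the above effect, we have the (summary) direct effect of $X_1$ on $X_3$, expressed as follows:

$$\mathrm{KL}(X_1, X_3 \mid X_2) = \frac{E[\mathrm{Cov}(X_3, \theta \mid X_2)]}{a(\varphi)},$$

where the conditional KL information is defined as in (2). The above quantity is the amount of entropy of $X_3$ explained by $X_1$ alone, that is, excluding the effect of $X_2$. By subtracting the direct effect of $X_1 = x_1$ from the total effect, we have the indirect effect of $X_1 = x_1$ on $X_3 = x_3$. Taking the expectation of the above effect, the (summary) indirect effect is given by

$$\mathrm{KL}((X_1, X_2), X_3) - \mathrm{KL}(X_2, X_3 \mid X_1) - \mathrm{KL}(X_1, X_3 \mid X_2). \tag{3}$$

As in the previous section, to standardize the above effects by ECD, we define the standardized total, direct, and indirect effects of $X_1$ and $X_2$ on $X_3$ as follows. The total effect of $X_1$ and $X_2$ on $X_3$ is:

$$e_T((X_1, X_2) \to X_3) = \frac{\mathrm{KL}((X_1, X_2), X_3)}{\mathrm{KL}((X_1, X_2), X_3) + 1} = \mathrm{ECD}((X_1, X_2), X_3).$$

The total effects of $X_1$ and of $X_2$ on $X_3$ are:

$$e_T(X_1 \to X_3) = \frac{\mathrm{KL}((X_1, X_2), X_3) - \mathrm{KL}(X_2, X_3 \mid X_1)}{\mathrm{KL}((X_1, X_2), X_3) + 1}, \qquad e_T(X_2 \to X_3) = \frac{\mathrm{KL}(X_2, X_3 \mid X_1)}{\mathrm{KL}((X_1, X_2), X_3) + 1}.$$

The direct effect of $X_1$ on $X_3$:

$$e_D(X_1 \to X_3) = \frac{\mathrm{KL}(X_1, X_3 \mid X_2)}{\mathrm{KL}((X_1, X_2), X_3) + 1}.$$

The indirect effect of $X_1$ on $X_3$:

$$e_I(X_1 \to X_3) = e_T(X_1 \to X_3) - e_D(X_1 \to X_3). \tag{4}$$

In this case:

$$e_T((X_1, X_2) \to X_3) = e_T(X_1 \to X_3) + e_T(X_2 \to X_3).$$

A general approach based on the above discussion is given below. Let $X_i$ ($i = 1, 2, \ldots, K$) be variables such that the parents of $X_k$ are $(X_1, \ldots, X_{k-1})$, that is, $X_i$ precedes $X_{i+1}$. Let $f(x_K \mid x_1, \ldots, x_{K-1})$ be the conditional density or probability of $X_K$ given $(X_1, \ldots, X_{K-1}) = (x_1, \ldots, x_{K-1})$. Explaining response variable $X_K$ in a GLM framework by explanatory variables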

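$X_1, X_2, \ldots, X_{K-1}$, the effects of the explanatory variables on the response variable can be treated in terms of entropy as discussed above. From this, the standardized (summary) total effect of $X_1$ on $X_K$ is defined by:

$$e_T(X_1 \to X_K) = \frac{\mathrm{KL}((X_1, \ldots, X_{K-1}), X_K) - \mathrm{KL}((X_2, \ldots, X_{K-1}), X_K \mid X_1)}{\mathrm{KL}((X_1, \ldots, X_{K-1}), X_K) + 1}.$$

Second, the total effect of $X_2$ is defined as:

$$e_T(X_2 \to X_K) = \frac{\mathrm{KL}((X_2, \ldots, X_{K-1}), X_K \mid X_1) - \mathrm{KL}((X_3, \ldots, X_{K-1}), X_K \mid X_1, X_2)}{\mathrm{KL}((X_1, \ldots, X_{K-1}), X_K) + 1}.$$

Then, we can find the total effects of $X_i$ by induction, which yields:

$$e_T(X_i \to X_K) = \frac{\mathrm{KL}((X_i, \ldots, X_{K-1}), X_K \mid X_1, \ldots, X_{i-1}) - \mathrm{KL}((X_{i+1}, \ldots, X_{K-1}), X_K \mid X_1, \ldots, X_i)}{\mathrm{KL}((X_1, \ldots, X_{K-1}), X_K) + 1},$$

where the conditional KL information can be defined as in (2), and the second term in the numerator is taken to be zero for $i = K - 1$. In the above formulae, we have:

$$\mathrm{KL}((X_i, \ldots, X_{K-1}), X_K \mid X_1, \ldots, X_{i-1}) = \frac{E[\mathrm{Cov}(X_K, \theta \mid X_1, \ldots, X_{i-1})]}{a(\varphi)} \tag{5}$$

and:

$$e_T((X_1, \ldots, X_{K-1}) \to X_K) = \sum_{i=1}^{K-1} e_T(X_i \to X_K). \tag{6}$$

Remark 3. The total effect of $X_i = x_i$ on ( ) ( ) , respectively.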
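Let $X^{\setminus i} = (X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_{K-1})$ be parent variables of $X_K$ excluding $X_i$. The direct effect of $X_i$ on $X_K$ is defined by:

$$e_D(X_i \to X_K) = \frac{\mathrm{KL}(X_i, X_K \mid X^{\setminus i})}{\mathrm{KL}((X_1, \ldots, X_{K-1}), X_K) + 1}.$$

From this, we have the indirect effect of $X_i$:

$$e_I(X_i \to X_K) = e_T(X_i \to X_K) - e_D(X_i \to X_K). \tag{7}$$

For canonical links, $\theta = \sum_{j=1}^{K-1} \beta_j x_j$, we have:

$$\mathrm{KL}((X_1, \ldots, X_{K-1}), X_K) = \frac{\sum_{j=1}^{K-1} \beta_j\, \mathrm{Cov}(X_j, X_K)}{a(\varphi)}.$$

From (5) we have:

$$e_T(X_i \to X_K) = \frac{\sum_{j=i}^{K-1} \beta_j\, E[\mathrm{Cov}(X_j, X_K \mid X_1, \ldots, X_{i-1})] - \sum_{j=i+1}^{K-1} \beta_j\, E[\mathrm{Cov}(X_j, X_K \mid X_1, \ldots, X_i)]}{a(\varphi)\{\mathrm{KL}((X_1, \ldots, X_{K-1}), X_K) + 1\}}. \tag{8}$$

The direct effect of $X_i$ on $X_K$ is given by:

$$e_D(X_i \to X_K) = \frac{\beta_i\, E[\mathrm{Cov}(X_i, X_K \mid X^{\setminus i})]}{a(\varphi)\{\mathrm{KL}((X_1, \ldots, X_{K-1}), X_K) + 1\}}, \tag{9}$$

and the indirect effect is calculated by (8) minus (9).
Remark 4. The present approach is different from the usual approach for linear equation models and from the approach in [10], because it is based on the log odds ratio and entropy by using all the variables concerned.
Remark 5. The total effects in [10] are defined with the marginal distributions of response variables and explanatory variables. Meanwhile, the present approach defines the total effects of explanatory variables based on a recursive structure of all the variables concerned, and we have (6).
Remark 6. Indirect effects are defined by the total effects minus the direct effects, as in (3), (4) and (7); however, the interpretation can be done in terms of entropy. On the other hand, direct and indirect effects are defined in the approach of [10], though the sum of the effects does not equal the total effect.
Remark 7. Assessing the model identification and testing the goodness-of-fit of the model are based on the discussion of GLMs.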
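The definitions of this section can be traced through a small numerical case. The sketch below assumes a hypothetical recursive system of three binary variables in which $X_3$ follows a logit model with canonical link (so $a(\varphi) = 1$). It reads each summary effect as an expected conditional covariance between $X_3$ and $\theta$, standardized by the total KL information plus one. All probabilities and coefficients are invented for illustration, and the function name `expected_cov` is likewise hypothetical.

```python
import math
from itertools import product

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# All numbers below are hypothetical; a(phi) = 1 for a binary logit response.
p_x1 = {0: 0.6, 1: 0.4}                      # P(X1 = x1)

def p_x2_given_x1(x2, x1):
    p = sigmoid(-0.5 + 1.5 * x1)             # P(X2 = 1 | X1 = x1)
    return p if x2 == 1 else 1.0 - p

B0, B1, B2 = -1.0, 0.5, 1.2                  # logit coefficients for X3

def theta(x1, x2):
    return B0 + B1 * x1 + B2 * x2

# Joint probabilities of the parents and E[X3 | x1, x2].
joint = {(x1, x2): p_x1[x1] * p_x2_given_x1(x2, x1)
         for x1, x2 in product((0, 1), repeat=2)}
mu3 = {cell: sigmoid(theta(*cell)) for cell in joint}

def expected_cov(given=None):
    """E[Cov(X3, theta | X_given)]; given=None gives the plain Cov(X3, theta).

    `given` is 0 to condition on X1 and 1 to condition on X2.
    """
    strata = {}
    for cell, w in joint.items():
        key = None if given is None else cell[given]
        strata.setdefault(key, []).append((cell, w))
    total = 0.0
    for cells in strata.values():
        w_s = sum(w for _, w in cells)
        e_x3 = sum(w * mu3[c] for c, w in cells) / w_s
        e_th = sum(w * theta(*c) for c, w in cells) / w_s
        e_prod = sum(w * mu3[c] * theta(*c) for c, w in cells) / w_s
        total += w_s * (e_prod - e_x3 * e_th)
    return total

kl_all = expected_cov()            # summary total effect of (X1, X2)
kl_x2_given_x1 = expected_cov(0)   # summary total effect of X2
kl_x1_given_x2 = expected_cov(1)   # summary direct effect of X1

denom = kl_all + 1.0
ecd = kl_all / denom
e_total_x1 = (kl_all - kl_x2_given_x1) / denom
e_total_x2 = kl_x2_given_x1 / denom
e_direct_x1 = kl_x1_given_x2 / denom
e_indirect_x1 = e_total_x1 - e_direct_x1

print(f"ECD={ecd:.4f} total X1={e_total_x1:.4f} total X2={e_total_x2:.4f}")
print(f"direct X1={e_direct_x1:.4f} indirect X1={e_indirect_x1:.4f}")
```

By construction the standardized total effects of $X_1$ and $X_2$ add up to the ECD, while the direct and indirect effects of $X_1$ split its total effect. For polytomous responses the same bookkeeping would apply to a vector-valued $\theta$, which this sketch does not cover.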

Statistical Test for Effects
A result similar to that presented in Eshima & Tabata [16] can be used to show that statistic (10) is asymptotically distributed according to a chi-squared distribution with degrees of freedom equal to the difference in the number of parameters between the two conditional independence models being compared. By using statistic (10), the total effects can be tested. Similarly, the corresponding statistic for variable $X_i$ is asymptotically distributed according to a chi-squared distribution with degrees of freedom equal to the number of regression coefficients (parameters) related to variable $X_i$.
The test statistic, written here as $\chi_T^2$, is asymptotically distributed according to a non-central chi-squared distribution with non-centrality parameter $\lambda$ and appropriate degrees of freedom $\nu$, found as the difference in the number of parameters between the two conditional independence models. For suitable constants $c$ and $\nu'$, the statistic $\chi_T^2 / c$ is asymptotically distributed according to the chi-squared distribution with $\nu'$ degrees of freedom. As $\nu'$ becomes large, the chi-squared distribution tends to a normal distribution with mean $\nu'$ and variance $2\nu'$. From this, for sufficiently large sample sizes $n$, the statistic $\chi_T^2 / n$ is asymptotically normally distributed with mean $c\nu'/n$ and variance $2c^2\nu'/n^2$ [17]. From this, for sufficiently large $n$, the asymptotic standard error (ASE) of $\chi_T^2 / n$ is $2\sqrt{\chi_T^2}/n$. The asymptotic standard error of the corresponding statistic for $X_i$ is obtained similarly. By using the above results, ASEs of the estimates of the summary total and direct effects can be calculated.
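The normal approximation described here can be sketched numerically under one explicit assumption: that the constants $c$ and $\nu'$ come from the standard two-moment approximation of a non-central chi-squared variable by a scaled central chi-squared variable, $c = (\nu + 2\lambda)/(\nu + \lambda)$ and $\nu' = (\nu + \lambda)^2/(\nu + 2\lambda)$. Whether these are exactly the constants intended in the text is an assumption, and the values of `nu`, `lam`, and `n` below are hypothetical.

```python
import math

def scaled_chi2_constants(nu, lam):
    """Two-moment match of noncentral chi2(nu, lam) by c * chi2(nu_prime).

    Assumed form (standard moment matching); not taken from the paper.
    """
    c = (nu + 2.0 * lam) / (nu + lam)
    nu_prime = (nu + lam) ** 2 / (nu + 2.0 * lam)
    return c, nu_prime

# Hypothetical degrees of freedom, non-centrality, and sample size.
nu, lam, n = 12, 250.0, 1000

c, nu_prime = scaled_chi2_constants(nu, lam)

# Moments of the approximating variable c * chi2(nu_prime).
mean_stat = c * nu_prime              # equals nu + lam
var_stat = 2.0 * c ** 2 * nu_prime    # equals 2 * (nu + 2 * lam)

# Standard error of the statistic divided by n, and the rough value
# 2 * sqrt(lam) / n that it approaches when lam dominates nu.
se_scaled = math.sqrt(var_stat) / n
se_rough = 2.0 * math.sqrt(lam) / n

print(f"c = {c:.4f}, nu' = {nu_prime:.2f}")
print(f"SE of statistic / n = {se_scaled:.5f}, rough value = {se_rough:.5f}")
```

Under this assumed approximation the mean and variance of the scaled chi-squared variable reproduce those of the non-central distribution exactly, and the standard error of the statistic divided by n is close to $2\sqrt{\lambda}/n$ when $\lambda$ is large relative to $\nu$.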

Path Analysis of the British Mobility Data
The British mobility data described in Section 2 were analyzed in detail by using log odds ratios [10]. Here, the proposed path analysis method is applied to summarize the effects of parental class X and education Z on destination class Y, measured by log odds ratios as in the previous section, and to give a simple interpretation from the summary effects of X and Z on Y. The three variables are random, and the GLM system can be composed of logit models. In this example, the employed logistic model can be expressed as follows. Let X be a categorical factor with levels {1,2,3}, let Z be a score with levels {1,2,…,7}, and let Y be a categorical response variable with levels {1,2,3}. Let $X_i = 1$ if $X = i$ and $X_i = 0$ otherwise, and let $Y_j = 1$ if $Y = j$ and $Y_j = 0$ otherwise ($i, j = 1, 2, 3$). The dummy variable vectors $(X_1, X_2, X_3)$ and $(Y_1, Y_2, Y_3)$ are identified with categorical variables X and response Y, respectively. From this, the systematic component of the above model can be expressed as follows: where $\sum_u$ implies the summation over all u. Then, from Table 4 in [10], the estimated regression parameters for men are calculated as follows: Similarly, we have the estimated parameters for women as follows: From Tables 1 and 5 in [10], the joint distributions of parental class X and education Z for men and women are calculated, respectively, in Table 1.
Similarly, the effects of X and Z on Y for women can be calculated. The results are omitted to avoid redundancy. The standardized summary effects are shown in Table 5. For men, the total effect of X and Z on Y is 0.276, and so 27.6% of the variation of Y's entropy is explained by X and Z. The indirect effect of X is about twice the direct effect, and the total (direct) effect of Z on Y is about 1.5-fold that of X. Therefore, the effect of education Z on the destination class Y is large. For women, the total effect of X and Z on Y is 0.289, meaning that 28.9% of the variation of Y's entropy is explained by X and Z. The indirect effect of X on Y is about 6-fold the direct effect, and the direct effect is small. The total effect of Z on Y is about 2.7-fold that of X. The effect of Z on Y is more pronounced for women than for men.
In a comparison of men and women, the effect of Z on Y for women is about 1.3-fold the effect for men, and, conversely, the effect of X on Y for men is about 1.4-fold the effect for women. For both men and women, the direct effects of X on Y are very small, and this decomposition of effects shows that education plays an important role in determining social class as an adult.

Discussion
In the usual path analysis of continuous variable systems, use of the regression coefficients allows straightforward calculation of total, direct and indirect effects, and the total effect can be expressed as the sum of the direct and indirect effects. However, such techniques cannot be applied to structural GLMs with categorical variables or variables that are not normally distributed. Moreover, multiple variable categories make the problem more complicated in comparison with linear equation models for continuous variables. In the present paper, a path analysis approach for structural GLMs was proposed, and calculation of the direct and indirect effects was discussed. Although the effects of explanatory variables on response variables can be analyzed in detail by using log odds ratios, and the effects can be interpreted as changes of relative information, the results are generally quite complicated, as demonstrated in Tables 2-4. The present path analysis summarizes the effects, as measured by log odds ratios, and the standardized summary total, direct, and indirect effects are interpreted in the framework of entropy. The present path analysis approach has potential for wide application in practical data analyses of causal systems represented as GLMs, and is particularly well suited to categorical data analysis. The present study has provided a basic idea for path analysis of recursive systems with GLMs, where all the variables concerned are causally ordered, and further studies are needed for performing path analysis of more complicated recursive GLM systems and assessing spurious effects.
summarize causal effects measured with log odds ratios, especially in such practical examples, to assess the intermediate effect of education on social class mobility.

Figure 1. Path diagram of social class mobility.
as a symmetric type of the Kullback-Leibler (KL) information between a GLM based on (1) and the null model with $\beta = 0$ [15]; thus, we denote in this paper. Let $f(y)$ be the density or probability function for the null model $\beta = 0$ and let $g(x)$

Table 1. The estimated joint distributions of parental class X and education level Z.

On the basis of the estimated parameters shown above and the estimated joint distribution of X and Z in Table 1, the joint distributions of X, Y, and Z by sex can be estimated. The effects of X and Z on Y for men are shown in Tables 2-4, for example, the effects of

Table 2. The effects of X and Z on Y = S.

Table 3. The effects of X and Z on Y = I.

Table 4. The effects of X and Z on Y = W.

Table 5. Summary direct, indirect, and total effects of X and Z on Y.
* The numbers in parentheses are the standard errors.