# Use of Information Measures and Their Approximations to Detect Predictive Gene-Gene Interaction

## Abstract

## 1. Introduction

## 2. Measures of Interaction

#### 2.1. Interaction Information Measure

**Proposition**

**Proposition**

#### 2.2. Other Nonparametric Measures of Interaction

**Proposition**

**Proposition**

**Proposition**

**Proposition**

#### 2.3. Estimation of the Interaction Measures

## 3. Modeling Gene-Gene Interactions

#### 3.1. Logistic Modeling of Gene-Gene Interactions

**Proposition**

#### 3.2. ANOVA Model for Binary Outcome

#### 3.3. Behavior of Interaction Indices for Logistic Models

#### 3.4. Behavior of Interaction Indices When ${X}_{1}$ and ${X}_{2}$ Are Independent

#### 3.5. Behavior of Interaction Indices When ${X}_{1}$ and ${X}_{2}$ Are Dependent

## 4. Tests for Predictive Interaction

- ${t}_{0},{t}_{1}$, the numbers of observations among controls ($Y=0$) and cases ($Y=1$), which were set equal in our experiments; $n={t}_{0}+{t}_{1}$ is the total number of observations. Values of $n=500$ and $n=1000$ were considered.
- MAF, the minor allele frequency for ${X}_{1}$ and ${X}_{2}$. We set $MAF=0.25$ for both loci.
- copula, the function that determines the cumulative distribution of $\left({X}_{1},{X}_{2}\right)$ based on its marginal distributions.
- $p(i,j):=\mathbb{P}\left(Y=1|{x}_{i},{x}_{j}\right)$, the prevalence mapping, which in our experiments was either additive logistic or logistic with nonzero interaction.
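
The parameters above can be combined into a small data generator. The following is only a minimal sketch, not the procedure used in the experiments: it assumes Hardy-Weinberg proportions for the genotype marginals (consistent with the margins in Table A1), independent loci, and a constant prevalence mapping with the value $0.1$ picked purely for illustration. All function names are hypothetical.

```python
import random

def genotype_marginal(maf):
    """Genotype probabilities under Hardy-Weinberg equilibrium (an assumption).

    Index 0: two minor alleles, 1: heterozygote, 2: two major alleles.
    """
    return [maf * maf, 2 * maf * (1 - maf), (1 - maf) ** 2]

def independent_joint(marg1, marg2):
    """Joint mass function of (X1, X2) when the loci are independent."""
    return [[a * b for b in marg2] for a in marg1]

def conditional_given_y(joint, prevalence, y):
    """P(X1 = i, X2 = j | Y = y) obtained from Bayes' rule."""
    weights = [
        [joint[i][j] * (prevalence[i][j] if y == 1 else 1 - prevalence[i][j])
         for j in range(3)]
        for i in range(3)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

def sample_case_control(joint, prevalence, t0, t1, seed=0):
    """Draw t0 controls and t1 cases; returns tuples (x1, x2, y) with x in 1..3."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(3) for j in range(3)]
    data = []
    for y, size in ((0, t0), (1, t1)):
        cond = conditional_given_y(joint, prevalence, y)
        flat = [cond[i][j] for i, j in cells]
        for i, j in rng.choices(cells, weights=flat, k=size):
            data.append((i + 1, j + 1, y))
    return data

marg = genotype_marginal(0.25)                # MAF = 0.25 for both loci
joint = independent_joint(marg, marg)
flat_prev = [[0.1] * 3 for _ in range(3)]     # hypothetical constant prevalence
sample = sample_case_control(joint, flat_prev, t0=250, t1=250)  # n = 500
```

Sampling genotypes conditionally on $Y$ keeps the design balanced (${t}_{0}={t}_{1}$) regardless of the population prevalence.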

#### 4.1. Behavior of Interaction Tests When ${X}_{1}$ and ${X}_{2}$ Are Independent

#### 4.1.1. Type I Errors for Models $M0$–$M2$

#### 4.1.2. Power for Additive Logistic Models

#### 4.1.3. Power for the Logistic Model with Interactions

#### 4.2. Behavior of the Interaction Tests When ${X}_{1}$ and ${X}_{2}$ Are Dependent

#### 4.2.1. Type I Errors for Model $M0$

#### 4.2.2. Power for Additive Logistic Models When ${X}_{1}$ and ${X}_{2}$ Are Dependent

#### 4.2.3. The Powers for Logistic Models with Interaction When ${X}_{1}$ and ${X}_{2}$ Are Dependent

#### 4.3. Real Data Example

## 5. Discussion

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## Appendix A

#### Appendix A.1. Distribution of ${X}_{1},{X}_{2}$

**Table A1.** Mass function $p\left({x}_{i},{x}_{j}\right)$ of $\left({X}_{1},{X}_{2}\right)$ for $MAF=0.25$. The upper panel corresponds to independent ${X}_{1}$ and ${X}_{2}$ and the lower panel to the Frank copula with $\theta =-1$.

$\mathit{i}\backslash \mathit{j}$ | 1 | 2 | 3 | ∑ |
---|---|---|---|---|
Independent ${X}_{1}$ and ${X}_{2}$ | | | | |
1 | $0.0039$ | $0.0235$ | $0.0351$ | $0.0625$ |
2 | $0.0234$ | $0.1406$ | $0.2110$ | $0.3750$ |
3 | $0.0352$ | $0.2109$ | $0.3164$ | $0.5625$ |
∑ | $0.0625$ | $0.3750$ | $0.5625$ | 1 |
Frank copula with $\theta =-1$ | | | | |
1 | $0.0024$ | $0.0180$ | $0.0421$ | $0.0625$ |
2 | $0.0180$ | $0.1232$ | $0.2339$ | $0.3750$ |
3 | $0.0421$ | $0.2339$ | $0.2865$ | $0.5625$ |
∑ | $0.0625$ | $0.3750$ | $0.5625$ | 1 |
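
The lower panel of Table A1 can be checked numerically. The sketch below assumes the standard parametrization of the Frank copula, $C(u,v)=-\frac{1}{\theta}\ln\left(1+\frac{({e}^{-\theta u}-1)({e}^{-\theta v}-1)}{{e}^{-\theta}-1}\right)$, and obtains the cell probabilities as rectangle differences of $C$ evaluated at the marginal cumulative distribution; the function names are hypothetical.

```python
import math

def frank_copula(u, v, theta):
    """Frank copula C(u, v) for theta != 0 (standard parametrization, assumed)."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta

def joint_pmf_from_copula(marginal, theta):
    """Cell probabilities P(X1 = i, X2 = j) for two loci sharing one marginal."""
    # Marginal CDF with a leading 0: F[0] = 0, ..., F[3] = 1.
    cdf = [0.0]
    for p in marginal:
        cdf.append(cdf[-1] + p)
    cdf[-1] = 1.0  # guard against floating-point drift
    k = len(marginal)
    pmf = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            # Probability of the rectangle (F[i], F[i+1]] x (F[j], F[j+1]].
            pmf[i][j] = (
                frank_copula(cdf[i + 1], cdf[j + 1], theta)
                - frank_copula(cdf[i], cdf[j + 1], theta)
                - frank_copula(cdf[i + 1], cdf[j], theta)
                + frank_copula(cdf[i], cdf[j], theta)
            )
    return pmf

marginal = [0.0625, 0.375, 0.5625]   # margins of Table A1 (MAF = 0.25)
pmf = joint_pmf_from_copula(marginal, theta=-1.0)
for row in pmf:
    print(" ".join(f"{p:.4f}" for p in row))
```

By construction the row and column sums equal the prescribed margins, so only the dependence between the loci changes relative to the upper panel.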

#### Appendix A.2. Prevalence Mapping with the Logistic Regression Model

## References

**Figure 1.** Theoretic values of interaction measures for the additive logistic model and independent SNPs.

**Figure 2.** Theoretic values of the interaction measures for the logistic model with the interaction and independent SNPs.

**Figure 3.** Theoretic values of the interaction measures for the additive logistic model and dependent SNPs.

**Figure 4.** Theoretic values of the interaction measures for the logistic model with the interaction and dependent SNPs.

**Figure 5.** Actual Type I error rates against the nominal error rate α when ${X}_{1}$ and ${X}_{2}$ are independent for models $M{0}_{1},M{1}_{1}$ and $M{2}_{1}$.

**Figure 6.** Actual Type I error rates against the nominal error rate α when ${X}_{1}$ and ${X}_{2}$ are dependent for model $M{0}_{1}$.

**Figure 7.** Powers for models $M{3}_{\lambda}$ and $M{4}_{\lambda}$ against λ when ${X}_{1}$ and ${X}_{2}$ are independent.

**Figure 8.** Powers for models $M{3}_{\lambda}$ and $M{4}_{\lambda}$ against λ when ${X}_{1}$ and ${X}_{2}$ are dependent.

**Figure 9.** Powers for models $M0\left(\gamma \right),M3\left(\gamma \right)$ and $M4\left(\gamma \right)$ against γ when ${X}_{1}$ and ${X}_{2}$ are dependent.

**Figure 10.** Powers for models $M0\left(\gamma \right),M3\left(\gamma \right)$ and $M4\left(\gamma \right)$ against γ when ${X}_{1}$ and ${X}_{2}$ are independent.

**Figure 11.** Probability mass function for the pair (rs1131854, rs7374) (**a**) and the corresponding conditional mass functions given $Y=1$ (**b**) and $Y=0$ (**c**).

**Table 1.** Coefficients for additive logistic models with parameter λ (cf. (39)). In each model, the intercept μ was chosen such that the prevalence $P(Y=1)$ is approximately $0.1$.

Model\Coefficients | ${\mathit{\alpha}}_{1}$ | ${\mathit{\alpha}}_{2}$ | ${\mathit{\alpha}}_{3}$ | ${\mathit{\beta}}_{1}$ | ${\mathit{\beta}}_{2}$ | ${\mathit{\beta}}_{3}$ |
---|---|---|---|---|---|---|
$M{0}_{\lambda}$ | 0 | 0 | 0 | 0 | 0 | 0 |
$M{1}_{\lambda}$ | λ | 0 | 0 | 0 | 0 | 0 |
$M{2}_{\lambda}$ | λ | λ | 0 | 0 | 0 | 0 |
$M{3}_{\lambda}$ | λ | 0 | 0 | λ | 0 | 0 |
$M{4}_{\lambda}$ | λ | λ | 0 | λ | λ | 0 |

**Table 2.** Values of the coefficients ${\gamma}_{ij}$ in Model (40); all nonzero coefficients share the same constant value $\gamma$.

$\mathit{i}\backslash \mathit{j}$ | 1 | 2 | 3 |
---|---|---|---|
1 | γ | γ | 0 |
2 | γ | γ | 0 |
3 | 0 | 0 | 0 |
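
To illustrate how the coefficients of Tables 1 and 2 could enter a prevalence mapping, the sketch below assumes the form $\mathrm{logit}\,p(i,j)=\mu +{\alpha}_{i}+{\beta}_{j}+{\gamma}_{ij}$; Equations (39) and (40) are not reproduced here, so this form is an assumption. The intercept μ is found by bisection so that $P(Y=1)$ is close to $0.1$. The values $\lambda =1$ and $\gamma =0.5$, the independent joint distribution, and all function names are hypothetical choices made for the example only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def prevalence_map(mu, alpha, beta, gamma):
    """p(i, j) = sigmoid(mu + alpha_i + beta_j + gamma_ij) (assumed model form)."""
    return [[sigmoid(mu + alpha[i] + beta[j] + gamma[i][j]) for j in range(3)]
            for i in range(3)]

def overall_prevalence(joint, prev):
    """P(Y = 1) as the average of p(i, j) over the genotype distribution."""
    return sum(joint[i][j] * prev[i][j] for i in range(3) for j in range(3))

def solve_intercept(joint, alpha, beta, gamma, target=0.1, lo=-20.0, hi=20.0):
    """Bisection on mu; P(Y = 1) is strictly increasing in mu."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        prev = prevalence_map(mid, alpha, beta, gamma)
        if overall_prevalence(joint, prev) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

marg = [0.0625, 0.375, 0.5625]                      # MAF = 0.25
joint = [[a * b for b in marg] for a in marg]       # independent loci (assumed)

lam, gam = 1.0, 0.5                                 # hypothetical values
alpha = [lam, 0.0, 0.0]                             # row M3 of Table 1
beta = [lam, 0.0, 0.0]
gamma = [[gam, gam, 0.0],                           # pattern of Table 2
         [gam, gam, 0.0],
         [0.0, 0.0, 0.0]]

mu = solve_intercept(joint, alpha, beta, gamma)
prev = prevalence_map(mu, alpha, beta, gamma)
```

Setting `gamma` to all zeros gives an additive model with the coefficient pattern of Table 1, and setting `alpha` and `beta` to zero as well gives a constant prevalence.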

