# Penalized Variable Selection for Lipid–Environment Interactions in a Longitudinal Lipidomics Study

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Data and Model Settings

#### 2.2. Generalized Estimating Equations

#### 2.3. Penalized Identification

#### 2.4. Computational Algorithms

- (1) Set the initial coefficient vector ${\beta}^{(0)}$ using LASSO;
- (2)
- (3) Repeat Step (2) until the convergence criterion is satisfied.
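
The control flow of the steps above (initialize, update, repeat until convergence) can be sketched as follows. This is only an illustration under stated assumptions: the update rule of Step (2) is not reproduced in this text, so `toy_update` is a hypothetical stand-in (a proximal-gradient step for an L1-penalized least-squares problem with an identity design), and the starting value is a zero vector rather than an actual LASSO fit. The function names, the tolerance, and the iteration cap are likewise assumptions.

```python
from typing import Callable, List, Tuple


def soft_threshold(z: float, t: float) -> float:
    """Soft-thresholding operator used by L1-type penalties."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def iterate_until_converged(
    beta0: List[float],
    update: Callable[[List[float]], List[float]],
    tol: float = 1e-6,
    max_iter: int = 500,
) -> Tuple[List[float], int]:
    """Apply `update` repeatedly until the L1 change in beta drops below `tol`."""
    beta = list(beta0)
    for it in range(1, max_iter + 1):
        new_beta = update(beta)
        change = sum(abs(a - b) for a, b in zip(new_beta, beta))
        beta = new_beta
        if change < tol:  # convergence criterion (an assumed choice)
            return beta, it
    return beta, max_iter


# Hypothetical stand-in for Step (2): one proximal-gradient step for
# 0.5 * ||y - beta||^2 + lam * ||beta||_1 with step size 0.5.
y = [3.0, 0.2, -1.5]
lam = 0.5
step = 0.5


def toy_update(beta: List[float]) -> List[float]:
    return [
        soft_threshold(b - step * (b - yi), step * lam)
        for b, yi in zip(beta, y)
    ]


# Placeholder start; in the procedure above this would be the LASSO estimate.
beta_hat, n_iter = iterate_until_converged([0.0, 0.0, 0.0], toy_update, tol=1e-8)
```

For this toy problem the iterates approach the soft-thresholded values of `y`, so the small middle coefficient is set exactly to zero while the other two are shrunk toward zero.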

## 3. Results

#### 3.1. Simulation

#### 3.2. Real Data Analysis

## 4. Discussion

## Author Contributions

## Funding

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
GEE | Generalized estimating equation |
AE | Exercise and ad libitum feeding |
PE | Exercise and pair feeding |
DCR | Sedentary and 20% dietary calorie restriction |
TG | Triacylglycerol |
DG | Diacylglycerol |
LASSO | Least absolute shrinkage and selection operator |
PGEE | Penalized generalized estimating equation |
PQIF | Penalized quadratic inference function |
MCP | Minimax concave penalty |
SCAD | Smoothly clipped absolute deviation |
SNP | Single nucleotide polymorphisms |
CNV | Copy number variations |
QIF | Quadratic inference function |

## Appendix A

Setting ($n=60$, $p=30$) | Method | Overall TP | Overall FP | Main TP | Main FP | Interaction TP | Interaction FP |
---|---|---|---|---|---|---|---|
$\rho=0.5$ | A1 | 13.6(2.5) | 4.7(2.7) | 7.4(0.8) | 2.1(1.6) | 6.2(2.1) | 2.5(2.6) |
$\rho=0.5$ | A2 | 13.6(2.5) | 4.8(2.8) | 7.3(0.8) | 2.2(1.6) | 6.2(2.1) | 2.6(2.6) |
$\rho=0.5$ | A3 | 13.7(2.5) | 4.9(3.0) | 7.4(0.7) | 2.1(1.6) | 6.3(2.1) | 2.7(2.7) |
$\rho=0.5$ | A4 | 11.1(2.6) | 5.4(2.8) | 6.4(1.1) | 1.1(1.0) | 4.6(1.9) | 4.3(2.3) |
$\rho=0.5$ | A5 | 11.1(2.6) | 5.4(2.8) | 6.4(1.1) | 1.1(1.0) | 4.6(1.9) | 4.3(2.3) |
$\rho=0.5$ | A6 | 11.1(2.5) | 5.5(2.8) | 6.5(1.2) | 1.1(1.0) | 4.7(1.8) | 4.4(2.3) |
$\rho=0.8$ | A1 | 13.2(2.2) | 4.4(2.9) | 7.5(0.6) | 2.4(1.7) | 5.7(2.1) | 1.9(2.1) |
$\rho=0.8$ | A2 | 13.2(2.2) | 4.4(2.9) | 7.5(0.6) | 2.4(1.7) | 5.7(2.1) | 2.0(2.1) |
$\rho=0.8$ | A3 | 13.4(2.0) | 4.4(3.0) | 7.5(0.6) | 2.4(1.7) | 5.9(1.9) | 2.0(2.1) |
$\rho=0.8$ | A4 | 11.0(2.4) | 5.5(2.5) | 6.5(1.4) | 1.3(1.2) | 4.5(1.8) | 4.2(2.1) |
$\rho=0.8$ | A5 | 11.0(2.4) | 5.6(2.6) | 6.5(1.4) | 1.3(1.2) | 4.5(1.8) | 4.2(2.2) |
$\rho=0.8$ | A6 | 11.1(2.4) | 5.8(2.7) | 6.5(1.4) | 1.4(1.3) | 4.5(1.8) | 4.3(2.2) |

Method ($n=60$, $p=30$) | $\rho=0.5$ MSE | $\rho=0.5$ NMSE | $\rho=0.5$ TMSE | $\rho=0.8$ MSE | $\rho=0.8$ NMSE | $\rho=0.8$ TMSE |
---|---|---|---|---|---|---|
A1 | 0.9352 | 0.1928 | 0.2732 | 0.9820 | 0.2108 | 0.2944 |
A2 | 0.9387 | 0.1924 | 0.2733 | 0.9809 | 0.2105 | 0.2940 |
A3 | 0.9324 | 0.1914 | 0.2717 | 1.0098 | 0.2063 | 0.2933 |
A4 | 1.9732 | 0.1560 | 0.3528 | 1.9910 | 0.1488 | 0.3484 |
A5 | 1.9709 | 0.1556 | 0.3523 | 1.9887 | 0.1487 | 0.348 |
A6 | 1.9629 | 0.1543 | 0.3502 | 1.9795 | 0.1474 | 0.3458 |

**Table A3.** Data simulated based on the underlying main-effect-only model. Identification results for $n=250$, $p=75$, $\rho=0.8$ with an actual dimension of 304.

Method | Overall TP | Overall FP | Main TP | Main FP | Interaction TP | Interaction FP | MSE | NMSE | TMSE |
---|---|---|---|---|---|---|---|---|---|
A1 | 7.7(0.9) | 0.7(1.7) | 7.7(0.9) | 0.0(0.0) | 0.0(0.0) | 0.7(1.7) | 0.1025 | 0.0000 | 0.0014 |
A2 | 7.8(0.6) | 0.4(1.3) | 7.8(0.6) | 0.0(0.2) | 0.0(0.0) | 0.4(1.3) | 0.0730 | 0.0000 | 0.0010 |
A3 | 7.9(0.3) | 0.5(1.2) | 7.9(0.3) | 0.3(0.7) | 0.0(0.0) | 0.2(0.8) | 0.0288 | 0.0000 | 0.0004 |
A4 | 7.3(1.1) | 0.8(0.9) | 7.3(1.1) | 0.0(0.0) | 0.0(0.0) | 0.8(0.9) | 0.2530 | 0.0000 | 0.0034 |
A5 | 7.2(1.1) | 0.9(1.1) | 7.2(1.1) | 0.0(0.0) | 0.0(0.0) | 0.9(1.1) | 0.2273 | 0.0001 | 0.0031 |
A6 | 7.5(0.7) | 1.2(1.1) | 7.5(0.7) | 0.0(0.2) | 0.0(0.0) | 1.2(1.1) | 0.1932 | 0.0001 | 0.0027 |

Method | $n=250$, $p=75$, $\rho=0.5$ | $n=250$, $p=75$, $\rho=0.8$ | $n=250$, $p=150$, $\rho=0.5$ | $n=250$, $p=150$, $\rho=0.8$ | $n=500$, $p=150$, $\rho=0.5$ | $n=500$, $p=150$, $\rho=0.8$ | $n=500$, $p=300$, $\rho=0.5$ | $n=500$, $p=300$, $\rho=0.8$ |
---|---|---|---|---|---|---|---|---|
A1 | 0.00(0.00) | 0.03(0.18) | 0.03(0.18) | 0.00(0.00) | 0.00(0.00) | 0.00(0.00) | 0.00(0.00) | 0.00(0.00) |
A2 | 0.03(0.10) | 0.03(0.18) | 0.30(0.70) | 0.10(0.31) | 0.00(0.00) | 0.00(0.00) | 0.03(0.18) | 0.00(0.00) |
A3 | 0.13(0.51) | 0.17(0.44) | 0.97(1.47) | 0.77(0.81) | 0.10(0.40) | 0.50(0.20) | 0.10(0.31) | 0.10(0.25) |
A4 | 0.00(0.00) | 0.03(0.18) | 0.03(0.18) | 0.00(0.00) | 0.00(0.00) | 0.00(0.00) | 0.00(0.00) | 0.00(0.00) |
A5 | 0.03(0.10) | 0.03(0.18) | 0.30(0.70) | 0.10(0.31) | 0.00(0.00) | 0.00(0.00) | 0.03(0.18) | 0.00(0.00) |
A6 | 0.13(0.51) | 0.17(0.44) | 0.97(1.47) | 0.77(0.81) | 0.10(0.40) | 0.50(0.20) | 0.10(0.31) | 0.10(0.25) |

**Table A5.** Stability selection percentages for all 17 true effects in the simulated data when $n=250$, $p=75$, $\rho=0.8$ with an actual dimension of 304.

True Effect | A1 | A2 | A3 | A4 | A5 | A6 |
---|---|---|---|---|---|---|
1 | 1 | 1 | 1 | 1 | 1 | 1 |
2 | 0.73 | 1 | 1 | 0.82 | 0.98 | 1 |
3 | 1 | 0.80 | 1 | 1 | 1 | 1 |
4 | 1 | 1 | 1 | 1 | 1 | 1 |
5 | 1 | 0.45 | 1 | 1 | 0.93 | 0.98 |
6 | 0.13 | 0.14 | 0.38 | 0.65 | 0.98 | 0.98 |
7 | 0.58 | 0.65 | 1 | 0.99 | 1 | 0.92 |
8 | 0.61 | 0.25 | 0.45 | 0.89 | 1 | 1 |
9 | 1 | 0.84 | 1 | 0.46 | 0.02 | 0.10 |
10 | 1 | 0.86 | 1 | 0.07 | 0.01 | 0.10 |
11 | 1 | 0.83 | 1 | 0.7 | 0.66 | 0.84 |
12 | 0.77 | 0.91 | 0.72 | 0.36 | 0.87 | 0.01 |
13 | 0.77 | 0.91 | 0.73 | 0.39 | 0.94 | 0.45 |
14 | 0.75 | 0.94 | 0.77 | 0.48 | 1 | 0.98 |
15 | 0.81 | 0.82 | 0.98 | 0.30 | 0.55 | 1 |
16 | 0.80 | 0.86 | 0.99 | 0.98 | 0.75 | 0.99 |
17 | 0.80 | 0.87 | 0.99 | 0.66 | 0.93 | 1 |
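
Selection proportions like those in Table A5 are typically obtained by refitting a selection procedure on many random subsamples and recording how often each effect is chosen. The sketch below illustrates that general recipe only. The resampling settings used for the table are not given in this text, so the number of replicates, the subsample fraction, the seed, the toy data, and the correlation-threshold selector `toy_select` are all hypothetical stand-ins for a penalized fit.

```python
import random
from typing import Callable, List, Sequence


def selection_frequencies(
    n_subjects: int,
    n_features: int,
    select: Callable[[Sequence[int]], List[int]],
    n_rep: int = 100,
    frac: float = 0.5,
    seed: int = 0,
) -> List[float]:
    """Proportion of random subsamples in which each feature is selected."""
    rng = random.Random(seed)
    size = max(2, int(frac * n_subjects))
    counts = [0] * n_features
    for _ in range(n_rep):
        subset = rng.sample(range(n_subjects), size)  # subjects, without replacement
        for j in select(subset):
            counts[j] += 1
    return [c / n_rep for c in counts]


def correlation(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


# Toy data: feature 0 determines the response, feature 1 is an alternating sign.
X = [[float(i), float((-1) ** i)] for i in range(40)]
y = [2.0 * i for i in range(40)]


def toy_select(subset: Sequence[int]) -> List[int]:
    """Hypothetical selector: keep features with |correlation| above 0.5."""
    ys = [y[i] for i in subset]
    return [
        j for j in range(2)
        if abs(correlation([X[i][j] for i in subset], ys)) > 0.5
    ]


freqs = selection_frequencies(40, 2, toy_select)
```

In a longitudinal setting the subsampling would be done at the subject level, as here, so that all repeated measurements of a subject stay together.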

**Table A6.** Validation methods. Identification results for $n=250$, $p=75$ with an actual dimension of 304.

Setting ($n=250$, $p=75$) | Method | Overall TP | Overall FP | Main TP | Main FP | Interaction TP | Interaction FP |
---|---|---|---|---|---|---|---|
$\rho=0.5$ | A1 | 14.1(2.1) | 4.6(3.1) | 7.0(0.8) | 1.1(0.8) | 7.0(1.8) | 3.5(2.9) |
$\rho=0.5$ | A2 | 14.2(2.1) | 4.7(3.1) | 7.0(0.9) | 1.1(0.9) | 7.1(1.8) | 3.6(2.8) |
$\rho=0.5$ | A3 | 14.4(1.7) | 4.6(3.2) | 7.1(0.8) | 1.1(0.9) | 7.2(1.5) | 3.5(3.0) |
$\rho=0.5$ | A4 | 13.1(1.1) | 6.1(2.8) | 6.9(0.8) | 1.0(0.8) | 6.1(0.9) | 5.3(2.6) |
$\rho=0.5$ | A5 | 13.1(1.1) | 6.4(2.8) | 6.9(0.8) | 1.0(0.8) | 6.1(0.9) | 5.6(2.5) |
$\rho=0.5$ | A6 | 13.0(1.2) | 6.7(3.1) | 6.9(0.8) | 1.0(0.8) | 6.1(1.0) | 5.9(2.9) |
$\rho=0.8$ | A1 | 13.7(2.6) | 4.7(2.9) | 7.2(0.8) | 1.4(0.9) | 6.5(2.3) | 3.2(2.5) |
$\rho=0.8$ | A2 | 13.8(2.6) | 4.6(3.1) | 7.3(0.8) | 1.4(1.0) | 6.6(2.3) | 3.1(2.6) |
$\rho=0.8$ | A3 | 13.8(2.5) | 5.1(3.0) | 7.3(0.7) | 1.5(0.8) | 6.5(2.1) | 3.6(2.9) |
$\rho=0.8$ | A4 | 12.9(2.1) | 5.7(2.5) | 7.3(0.8) | 1.3(0.9) | 5.6(1.6) | 4.5(2.1) |
$\rho=0.8$ | A5 | 12.9(2.1) | 5.8(2.6) | 7.3(0.8) | 1.3(1.0) | 5.6(1.6) | 4.5(2.2) |
$\rho=0.8$ | A6 | 12.9(2.2) | 6.8(2.7) | 7.3(0.7) | 1.4(0.9) | 5.6(1.8) | 5.5(2.5) |
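
The TP and FP columns in the identification tables count correctly and incorrectly selected effects, reported overall and split into main and interaction effects. A minimal sketch of that bookkeeping for a single fitted model is given below. The effect labels (for example `"L1"` for a main effect and `"L1:E1"` for an interaction) and the helper names are hypothetical, and the tabulated entries appear to be summaries across simulation replicates rather than single-fit counts.

```python
from typing import Callable, Dict, Set, Tuple


def tp_fp(selected: Set[str], truth: Set[str]) -> Tuple[int, int]:
    """Return (true positives, false positives) for one fitted model."""
    return len(selected & truth), len(selected - truth)


def summarize(
    selected: Set[str],
    truth: Set[str],
    is_interaction: Callable[[str], bool],
) -> Dict[str, Tuple[int, int]]:
    """TP/FP counts overall and split by effect type."""
    sel_int = {e for e in selected if is_interaction(e)}
    tru_int = {e for e in truth if is_interaction(e)}
    return {
        "overall": tp_fp(selected, truth),
        "main": tp_fp(selected - sel_int, truth - tru_int),
        "interaction": tp_fp(sel_int, tru_int),
    }


# Hypothetical labels: a colon marks a lipid-environment interaction.
truth = {"L1", "L2", "L3", "L1:E1", "L1:E2"}
selected = {"L1", "L2", "L9", "L1:E1", "L7:E2"}
result = summarize(selected, truth, lambda name: ":" in name)
```

Averaging these counts over simulation replicates would give one row of such a table.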

**Table A7.** Validation methods. Estimation accuracy results for $n=250$, $p=75$ with an actual dimension of 304.

Method ($n=250$, $p=75$) | $\rho=0.5$ MSE | $\rho=0.5$ NMSE | $\rho=0.5$ TMSE | $\rho=0.8$ MSE | $\rho=0.8$ NMSE | $\rho=0.8$ TMSE |
---|---|---|---|---|---|---|
A1 | 0.1126 | 0.0074 | 0.0120 | 0.1205 | 0.0085 | 0.0134 |
A2 | 0.1095 | 0.0071 | 0.0115 | 0.1200 | 0.0085 | 0.0133 |
A3 | 0.1082 | 0.0071 | 0.0115 | 0.1245 | 0.0090 | 0.0140 |
A4 | 0.2344 | 0.0051 | 0.0150 | 0.2610 | 0.0060 | 0.0171 |
A5 | 0.2335 | 0.0050 | 0.0149 | 0.2627 | 0.0060 | 0.0171 |
A6 | 0.2302 | 0.0048 | 0.0146 | 0.2565 | 0.0058 | 0.0166 |

**Figure 1.** Identification results for $n=250$ with $p=75$ (actual dimension 304) and $p=150$ (actual dimension 604). A1–A3: methods accommodating the lipid–environment interactions with exchangeable, AR(1), and independence working correlations, respectively. A4–A6: methods not accommodating the lipid–environment interactions with exchangeable, AR(1), and independence working correlations, respectively.

**Figure 2.** Identification results for $n=500$ with $p=150$ (actual dimension 604) and $p=300$ (actual dimension 1204). A1–A3: methods accommodating the lipid–environment interactions with exchangeable, AR(1), and independence working correlations, respectively. A4–A6: methods not accommodating the lipid–environment interactions with exchangeable, AR(1), and independence working correlations, respectively.
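
The "actual dimension" values quoted in these captions are consistent with a design that contains an intercept, $q$ environment indicators, $p$ lipid main effects, and $pq$ lipid–environment interactions, with $q=3$. This is a reading of the reported numbers rather than a definition given here, so both $q=3$ and the inclusion of an intercept should be treated as assumptions:

$$
1 + q + p + pq = (1+p)(1+q), \qquad q = 3:
\quad p=75 \Rightarrow 76 \times 4 = 304,
\quad p=150 \Rightarrow 151 \times 4 = 604,
\quad p=300 \Rightarrow 301 \times 4 = 1204.
$$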

Setting ($n=250$, $p=75$) | Method | Overall TP | Overall FP | Main TP | Main FP | Interaction TP | Interaction FP |
---|---|---|---|---|---|---|---|
$\rho=0.5$ | A1 | 14.5(1.9) | 4.8(3.1) | 7.2(0.8) | 1.7(1.2) | 7.4(1.5) | 3.1(2.6) |
$\rho=0.5$ | A2 | 14.7(1.8) | 5.0(3.2) | 7.2(0.9) | 1.7(1.3) | 7.5(1.4) | 3.2(2.6) |
$\rho=0.5$ | A3 | 14.7(1.7) | 5.0(3.3) | 7.2(0.8) | 1.8(1.4) | 7.6(1.3) | 3.2(2.6) |
$\rho=0.5$ | A4 | 13.3(1.5) | 6.6(4.2) | 7.2(0.7) | 1.6(1.4) | 6.1(1.1) | 5.1(3.3) |
$\rho=0.5$ | A5 | 13.3(1.5) | 6.8(4.4) | 7.2(0.8) | 1.7(1.4) | 6.1(1.1) | 5.2(3.5) |
$\rho=0.5$ | A6 | 13.3(1.5) | 7.3(4.7) | 7.2(0.8) | 1.8(1.5) | 6.1(1.1) | 5.5(3.7) |
$\rho=0.8$ | A1 | 13.7(2.3) | 4.1(2.8) | 7.2(0.8) | 1.5(1.0) | 6.5(2.1) | 2.7(2.4) |
$\rho=0.8$ | A2 | 13.9(2.4) | 4.1(2.8) | 7.2(0.8) | 1.5(1.0) | 6.6(2.1) | 2.7(2.4) |
$\rho=0.8$ | A3 | 14.2(2.3) | 4.5(2.9) | 7.2(0.7) | 1.6(1.0) | 7.0(2.2) | 2.9(2.5) |
$\rho=0.8$ | A4 | 12.9(1.9) | 5.5(2.7) | 7.2(0.7) | 1.1(1.0) | 5.6(1.6) | 4.5(2.3) |
$\rho=0.8$ | A5 | 12.9(1.9) | 5.8(2.9) | 7.2(0.7) | 1.1(0.9) | 5.7(1.6) | 4.7(2.5) |
$\rho=0.8$ | A6 | 13.0(1.8) | 6.5(3.5) | 7.2(0.7) | 1.2(0.9) | 5.8(1.4) | 5.5(3.2) |

Setting ($n=250$, $p=150$) | Method | Overall TP | Overall FP | Main TP | Main FP | Interaction TP | Interaction FP |
---|---|---|---|---|---|---|---|
$\rho=0.5$ | A1 | 13.9(2.3) | 5.0(3.0) | 7.2(0.7) | 1.7(1.1) | 6.7(2.0) | 3.3(2.6) |
$\rho=0.5$ | A2 | 14.0(2.2) | 5.0(3.0) | 7.2(0.7) | 1.7(1.1) | 6.8(1.9) | 3.3(2.6) |
$\rho=0.5$ | A3 | 14.4(2.2) | 5.1(3.2) | 7.3(0.7) | 1.8(1.2) | 7.1(1.9) | 3.3(2.8) |
$\rho=0.5$ | A4 | 12.9(1.9) | 5.7(2.5) | 7.3(0.8) | 1.4(0.9) | 5.6(1.5) | 4.4(2.3) |
$\rho=0.5$ | A5 | 13.0(1.8) | 5.9(2.6) | 7.2(0.8) | 1.4(0.9) | 5.7(1.4) | 4.5(2.3) |
$\rho=0.5$ | A6 | 13.0(1.8) | 6.4(2.7) | 7.2(0.8) | 1.4(1.0) | 5.8(1.5) | 5.0(2.5) |
$\rho=0.8$ | A1 | 13.5(2.0) | 5.3(3.0) | 7.2(0.9) | 2.1(1.2) | 6.3(1.9) | 3.2(2.4) |
$\rho=0.8$ | A2 | 13.5(2.0) | 5.4(3.2) | 7.2(0.9) | 2.2(1.3) | 6.3(1.9) | 3.2(2.5) |
$\rho=0.8$ | A3 | 13.4(2.1) | 6.0(3.0) | 7.1(0.9) | 2.4(1.3) | 6.2(1.9) | 3.6(2.7) |
$\rho=0.8$ | A4 | 12.5(1.9) | 7.6(3.3) | 7.3(0.7) | 1.8(1.2) | 5.2(1.7) | 5.7(2.7) |
$\rho=0.8$ | A5 | 12.6(1.8) | 7.8(3.4) | 7.3(0.7) | 1.9(1.2) | 5.3(1.6) | 5.9(2.8) |
$\rho=0.8$ | A6 | 12.6(1.8) | 8.4(4.1) | 7.3(0.8) | 1.9(1.2) | 5.4(1.7) | 6.5(3.6) |

Setting ($n=500$, $p=150$) | Method | Overall TP | Overall FP | Main TP | Main FP | Interaction TP | Interaction FP |
---|---|---|---|---|---|---|---|
$\rho=0.5$ | A1 | 15.7(1.4) | 2.7(1.9) | 7.7(0.5) | 1.3(0.7) | 8.0(1.4) | 1.4(1.7) |
$\rho=0.5$ | A2 | 15.8(1.3) | 2.7(2) | 7.7(0.5) | 1.3(0.7) | 8.1(1.3) | 1.3(1.8) |
$\rho=0.5$ | A3 | 16.2(1.2) | 2.7(1.9) | 7.8(0.4) | 1.3(0.8) | 8.4(1.2) | 1.3(1.6) |
$\rho=0.5$ | A4 | 14.7(1.0) | 2.5(1.7) | 7.8(0.4) | 0.9(0.8) | 6.9(1.0) | 1.6(1.4) |
$\rho=0.5$ | A5 | 14.7(1.1) | 2.6(1.7) | 7.8(0.4) | 0.9(0.7) | 6.9(1.0) | 1.7(1.4) |
$\rho=0.5$ | A6 | 14.9(1.0) | 2.7(2.0) | 7.8(0.4) | 0.8(0.7) | 7.0(0.9) | 1.8(1.6) |
$\rho=0.8$ | A1 | 15.5(1.7) | 3.0(2.9) | 7.7(0.6) | 1.1(0.8) | 7.9(1.5) | 1.9(2.2) |
$\rho=0.8$ | A2 | 15.4(1.7) | 2.9(2.8) | 7.7(0.6) | 1.1(0.8) | 7.8(1.5) | 1.8(2.2) |
$\rho=0.8$ | A3 | 15.7(1.6) | 2.6(2.6) | 7.7(0.5) | 1.2(0.9) | 8.0(1.4) | 1.4(2.1) |
$\rho=0.8$ | A4 | 14.8(1.4) | 3.7(1.8) | 7.5(0.6) | 1.2(0.7) | 7.2(1.2) | 2.5(1.5) |
$\rho=0.8$ | A5 | 14.7(1.3) | 3.6(1.9) | 7.5(0.5) | 1.1(0.7) | 7.2(1.2) | 2.5(1.5) |
$\rho=0.8$ | A6 | 15.0(1.3) | 3.8(1.9) | 7.7(0.6) | 1.1(0.7) | 7.4(1.1) | 2.7(1.6) |

Setting ($n=500$, $p=300$) | Method | Overall TP | Overall FP | Main TP | Main FP | Interaction TP | Interaction FP |
---|---|---|---|---|---|---|---|
$\rho=0.5$ | A1 | 16.1(1.2) | 3.2(2.4) | 7.6(0.6) | 1.4(0.8) | 8.5(1.0) | 1.8(2.2) |
$\rho=0.5$ | A2 | 16.3(1.1) | 3.2(2.4) | 7.7(0.5) | 1.4(0.8) | 8.5(0.9) | 1.8(2.2) |
$\rho=0.5$ | A3 | 16.3(1) | 2.9(2.2) | 7.8(0.5) | 1.4(0.8) | 8.6(0.8) | 1.5(1.9) |
$\rho=0.5$ | A4 | 14.8(0.8) | 2.9(2.1) | 7.8(0.4) | 1.0(0.8) | 7.0(0.8) | 1.9(1.7) |
$\rho=0.5$ | A5 | 14.8(0.9) | 3.1(2.3) | 7.8(0.4) | 1.0(0.8) | 7.0(0.8) | 2.0(1.9) |
$\rho=0.5$ | A6 | 14.9(0.9) | 3.3(2.6) | 7.8(0.4) | 1.0(0.8) | 7.1(0.9) | 2.3(2.1) |
$\rho=0.8$ | A1 | 15.9(1.2) | 3(2.6) | 7.6(0.5) | 1.5(0.8) | 8.3(1.1) | 1.5(2.2) |
$\rho=0.8$ | A2 | 15.9(1.3) | 3.0(2.7) | 7.6(0.5) | 1.5(0.9) | 8.2(1.1) | 1.5(2.2) |
$\rho=0.8$ | A3 | 15.8(1.4) | 3.1(2.8) | 7.7(0.5) | 1.6(1.0) | 8.1(1.2) | 1.6(2.2) |
$\rho=0.8$ | A4 | 14.5(1.2) | 4.5(3.0) | 7.8(0.6) | 1.0(0.7) | 6.8(1.0) | 3.5(2.6) |
$\rho=0.8$ | A5 | 14.5(1.2) | 4.7(3.3) | 7.8(0.6) | 1.1(0.8) | 6.7(0.9) | 3.6(2.9) |
$\rho=0.8$ | A6 | 14.5(1.1) | 4.9(3.6) | 7.8(0.6) | 1.0(0.8) | 6.7(0.8) | 3.8(3.3) |

**Table 5.** Estimation accuracy results for $n=250$ with $p=75$ (actual dimension 304) and $p=150$ (actual dimension 604).

Setting ($n=250$) | Method | $p=75$ MSE | $p=75$ NMSE | $p=75$ TMSE | $p=150$ MSE | $p=150$ NMSE | $p=150$ TMSE |
---|---|---|---|---|---|---|---|
$\rho=0.5$ | A1 | 0.1055 | 0.0026 | 0.0043 | 0.1264 | 0.0045 | 0.0072 |
$\rho=0.5$ | A2 | 0.1042 | 0.0026 | 0.0042 | 0.1259 | 0.0045 | 0.0072 |
$\rho=0.5$ | A3 | 0.1030 | 0.0026 | 0.0042 | 0.1174 | 0.0041 | 0.0066 |
$\rho=0.5$ | A4 | 0.2321 | 0.0018 | 0.0056 | 0.2435 | 0.0032 | 0.0084 |
$\rho=0.5$ | A5 | 0.2304 | 0.0018 | 0.0055 | 0.2402 | 0.0031 | 0.0082 |
$\rho=0.5$ | A6 | 0.2288 | 0.0018 | 0.0055 | 0.2346 | 0.0030 | 0.0080 |
$\rho=0.8$ | A1 | 0.1187 | 0.0087 | 0.0135 | 0.129 | 0.0048 | 0.0075 |
$\rho=0.8$ | A2 | 0.1163 | 0.0085 | 0.0132 | 0.1295 | 0.0048 | 0.0075 |
$\rho=0.8$ | A3 | 0.1066 | 0.0075 | 0.0118 | 0.1319 | 0.0049 | 0.0077 |
$\rho=0.8$ | A4 | 0.2410 | 0.0060 | 0.0162 | 0.2531 | 0.0038 | 0.0092 |
$\rho=0.8$ | A5 | 0.2426 | 0.0060 | 0.0162 | 0.2487 | 0.0038 | 0.0091 |
$\rho=0.8$ | A6 | 0.2335 | 0.0058 | 0.0157 | 0.2431 | 0.0037 | 0.0089 |

**Table 6.** Estimation accuracy results for $n=500$ with $p=150$ (actual dimension 604) and $p=300$ (actual dimension 1204).

Setting ($n=500$) | Method | $p=150$ MSE | $p=150$ NMSE | $p=150$ TMSE | $p=300$ MSE | $p=300$ NMSE | $p=300$ TMSE |
---|---|---|---|---|---|---|---|
$\rho=0.5$ | A1 | 0.0754 | 0.0026 | 0.0042 | 0.0660 | 0.0010 | 0.0017 |
$\rho=0.5$ | A2 | 0.0731 | 0.0026 | 0.0041 | 0.0659 | 0.0010 | 0.0017 |
$\rho=0.5$ | A3 | 0.0648 | 0.0022 | 0.0035 | 0.0663 | 0.0010 | 0.0017 |
$\rho=0.5$ | A4 | 0.1872 | 0.0015 | 0.0055 | 0.1635 | 0.0007 | 0.0024 |
$\rho=0.5$ | A5 | 0.1837 | 0.0015 | 0.0054 | 0.1612 | 0.0007 | 0.0024 |
$\rho=0.5$ | A6 | 0.1792 | 0.0013 | 0.0052 | 0.1603 | 0.0007 | 0.0024 |
$\rho=0.8$ | A1 | 0.0708 | 0.0023 | 0.0037 | 0.0688 | 0.0010 | 0.0018 |
$\rho=0.8$ | A2 | 0.0716 | 0.0023 | 0.0038 | 0.0688 | 0.0011 | 0.0018 |
$\rho=0.8$ | A3 | 0.0704 | 0.0025 | 0.0039 | 0.0718 | 0.0012 | 0.0020 |
$\rho=0.8$ | A4 | 0.1480 | 0.0013 | 0.0049 | 0.1949 | 0.0007 | 0.0028 |
$\rho=0.8$ | A5 | 0.1492 | 0.0013 | 0.0045 | 0.1945 | 0.0007 | 0.0028 |
$\rho=0.8$ | A6 | 0.1479 | 0.0012 | 0.0044 | 0.1899 | 0.0007 | 0.0027 |

**Table 7.** Real data analysis result from method A1 (method accommodating the lipid–environment interactions with exchangeable working correlation).

Lipid | AE | PE | DCR | |
---|---|---|---|---|
C16:0/16:1 | 0 | 0.0117 | −0.0239 | −0.0057 |
C18:2/16:1 | 0 | 0.1544 | 3.3322 | 0.3924 |
C18:1/16:1 | 0 | 0.4857 | −0.6299 | −0.5559 |
C20:1/16:1 | 0.5966 | −2.9145 | 0.1299 | −1.4836 |
C16:0/16:0 | 0 | 1.3742 | −0.8817 | −1.8070 |
C20:6/16:0 | 0.0369 | 0 | 0 | 0 |
C20:0/18:3 | −1.3628 | 0 | 0 | 0 |
C18:0/18:2 | −1.6154 | 0 | 0 | 0 |
C22:6/18:1 | 1.1717 | 1.7526 | 0.2287 | −0.4079 |
C18:2/20:4 | 1.1497 | 0 | 0 | 0 |
C18:1/20:4 | 0.8490 | 0 | 0 | 0 |
C20:1/20:4 | 0 | −0.2169 | −0.6096 | 3.0537 |

**Table 8.** Real data analysis result from method A4 (method not accommodating the lipid–environment interactions with exchangeable working correlation).

Lipid | AE | DCR | PE | |
---|---|---|---|---|
C16:0/16:1 | 0 | 0 | −0.0024 | 0 |
C18:2/16:1 | −2.1856 | 0 | 3.2306 | 0 |
C18:1/16:1 | 0 | 0 | −1.4641 | −2.3563 |
C20:1/16:1 | 0.0042 | −2.6768 | 0 | −1.7757 |
C16:0/16:0 | 0 | 2.8757 | −0.9389 | −2.6791 |
C18:2/16:0 | 0 | 0 | 0 | −1.7688 |
C20:6/16:0 | 0.1481 | −0.1276 | 0 | 0 |
C18:1/18:3 | 0 | 0 | 1.2917 | 0 |
C20:0/18:3 | −1.6171 | 0 | 0 | 0 |
C18:0/18:2 | −1.7695 | 0 | 0 | 0 |
C22:6/18:1 | 0.8851 | 3.4714 | 0.4809 | 0 |
C18:1/18:0 | 0 | −1.2901 | 0 | 0 |
C22:7/18:0 | 0 | −0.9839 | 0 | 0 |
C18:2/20:4 | 2.5871 | 0.6150 | 0 | 1.9327 |
C18:1/20:4 | 0 | 0 | −0.0031 | 0 |
C20:1/20:4 | 0.7542 | −1.1147 | 0 | 3.5396 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zhou, F.; Ren, J.; Li, G.; Jiang, Y.; Li, X.; Wang, W.; Wu, C. Penalized Variable Selection for Lipid–Environment Interactions in a Longitudinal Lipidomics Study. *Genes* **2019**, *10*, 1002.
https://doi.org/10.3390/genes10121002
