# Compositional Data Analysis in Time-Use Epidemiology: What, Why, How

## Abstract

## 1. Introduction: The Time-Use Epidemiology Framework

## 2. Time-Use Data Convey Relative Information

## 3. The Rationale and Methods of CoDA

#### 3.1. The Descriptive Analysis of Compositional Data

## 4. Understanding the Results of CoDA Studies

#### Compositional Regression Analysis

## 5. Challenges for CoDA

#### 5.1. Zero Values

#### 5.2. Multicollinearity

#### 5.3. Non-Linearity

## 6. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

**Figure 1.** The ternary diagram: the simplex sample space for a three-part composition is a triangle. Time-use data are from the Longitudinal Study of Australian Children, birth cohort, Wave 6. The black dot represents the compositional center of the time-use dataset, surrounded by 75%, 95% and 99% predictive regions from fitting a logratio normal distribution, which reflects the relative scale of compositional data [31] (see Section 3.1 for further details). PA = physical activity; SB = sedentary behavior.

**Figure 2.** Evenly spaced distance contours around the center of the Longitudinal Study of Australian Children Wave 3 birth cohort time-use compositions (red). The left panel shows contours defined by relative (Aitchison) distance; the right panel shows contours defined by absolute (Euclidean) distance. The center of the Wave 6 time-use compositions is shown in blue. Grey dots represent data from 50 randomly sampled Wave 6 participants. PA = physical activity; SB = sedentary behavior.
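The contrast between relative and absolute distance in Figure 2 can be illustrated with a minimal sketch. It assumes the usual definition of the Aitchison distance as the Euclidean distance between centered logratio (clr) vectors. The four compositions are hypothetical values in min/d, not data from the study, and the function names are chosen here for illustration only.

```python
import math

def clr(x):
    """Centered logratio (clr) transform of a strictly positive composition."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def euclidean_distance(x, y):
    """Absolute distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def aitchison_distance(x, y):
    """Relative distance: Euclidean distance between the clr vectors."""
    return euclidean_distance(clr(x), clr(y))

# Hypothetical (sleep, SB, PA) compositions in min/d, each summing to 1440.
low_pa_before = (600.0, 810.0, 30.0)
low_pa_after = (600.0, 780.0, 60.0)    # 30 min moved from SB to PA
high_pa_before = (600.0, 600.0, 240.0)
high_pa_after = (600.0, 570.0, 270.0)  # the same 30 min moved from SB to PA

# The absolute change is identical in both cases ...
d_abs_low = euclidean_distance(low_pa_before, low_pa_after)
d_abs_high = euclidean_distance(high_pa_before, high_pa_after)

# ... but the relative change is larger when PA starts from a small value,
# because doubling PA (30 to 60) is a bigger ratio than 240 to 270.
d_rel_low = aitchison_distance(low_pa_before, low_pa_after)
d_rel_high = aitchison_distance(high_pa_before, high_pa_after)

print(d_abs_low, d_abs_high)
print(d_rel_low, d_rel_high)
```

Because the clr transform uses only ratios between parts, the Aitchison distance is unchanged when a composition is expressed in hours or proportions instead of minutes.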

**Figure 3.** Five randomly selected time-use compositions from the Longitudinal Study of Australian Children (birth cohort, Wave 6). Top panel shows the compositional mean (blue) and arithmetic mean (red) in the ternary diagram. Bottom panel shows real space isometric logratio representation, with their compositional mean (blue) and arithmetic mean (red). PA = physical activity; SB = sedentary behavior.
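The difference between the two means in Figure 3 can be sketched as follows. The example assumes that the compositional mean is the closed vector of part-wise geometric means, and it uses one possible isometric logratio (ilr) basis for three parts; other orthonormal bases are equally valid. The five compositions are hypothetical values, not the five participants shown in the figure.

```python
import math

TOTAL = 1440.0  # minutes in a day

def closure(x, total=TOTAL):
    """Rescale the parts so that they sum to `total`."""
    s = sum(x)
    return [total * v / s for v in x]

def arithmetic_mean(rows):
    """Column-wise arithmetic mean."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def compositional_mean(rows):
    """Closed vector of column-wise geometric means."""
    n = len(rows)
    geo = [math.exp(sum(math.log(r[j]) for r in rows) / n)
           for j in range(len(rows[0]))]
    return closure(geo)

def ilr(x):
    """ilr coordinates of a 3-part composition (one assumed basis)."""
    a, b, c = x
    z1 = math.sqrt(2.0 / 3.0) * math.log(a / math.sqrt(b * c))
    z2 = math.sqrt(1.0 / 2.0) * math.log(b / c)
    return [z1, z2]

# Hypothetical (sleep, SB, PA) compositions in min/d.
rows = [
    (600.0, 700.0, 140.0),
    (540.0, 780.0, 120.0),
    (660.0, 480.0, 300.0),
    (570.0, 810.0, 60.0),
    (630.0, 450.0, 360.0),
]

center = compositional_mean(rows)
naive = arithmetic_mean(rows)

# In ilr space, the ordinary arithmetic mean of the coordinates
# coincides with the ilr coordinates of the compositional mean.
mean_of_ilr = arithmetic_mean([ilr(r) for r in rows])

print(center)
print(naive)
print(ilr(center), mean_of_ilr)
```

The two means differ in the original min/d units, yet averaging in ilr coordinates and mapping back gives the compositional mean.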

**Figure 4.** Estimated zBMI response surface. Arrows indicate reallocation of time to the part in the corner, taking it equally from the remaining parts. PA = physical activity; SB = sedentary behavior; zBMI = estimated body mass index z-score.

**Figure 5.** Isotemporal substitution of 2 h. Reallocation to PA from SB (white dot) and to SB from PA (grey dot), starting from the compositional mean (black dot). Data are from the Longitudinal Study of Australian Children, Wave 6, birth cohort. All analyses adjusted for sex, age and socioeconomic position. PA = physical activity; SB = sedentary behavior; zBMI = estimated body mass index z-score.
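The one-for-one reallocation behind Figure 5 amounts to simple arithmetic on the composition, sketched below. The starting values are the center reported in Table 1, used here only as an illustrative starting point. The helper `reallocate` is a hypothetical name, and the sketch covers only the construction of the new compositions, not the regression model that predicts zBMI.

```python
def reallocate(composition, to_part, from_part, minutes):
    """Move `minutes` from one part to another, keeping the daily total fixed."""
    if composition[from_part] <= minutes:
        raise ValueError("cannot take more time than the part contains")
    new = dict(composition)
    new[to_part] += minutes
    new[from_part] -= minutes
    return new

# Illustrative starting composition in min/d (center from Table 1).
start = {"Sleep": 617.5, "SB": 553.1, "PA": 269.4}

to_pa = reallocate(start, to_part="PA", from_part="SB", minutes=120.0)
to_sb = reallocate(start, to_part="SB", from_part="PA", minutes=120.0)

print(to_pa)
print(to_sb)
```

Each new composition would then be expressed in logratio coordinates and passed to the fitted model, so that the predicted outcome can be compared with the prediction at the starting composition.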

**Figure 6.** Relationship between daily activity behaviors (specifically SB) and zBMI. Panel A: Difference in zBMI associated with difference in SB-to-remaining activities expressed as a pivot coordinate, as estimated by compositional linear regression. Panel B: Difference in zBMI associated with difference in SB-to-remaining activities expressed in min/d, as estimated by compositional linear regression. Panel C: Difference in zBMI associated with difference in SB (min/d), as estimated by linear regression. Data are from the Longitudinal Study of Australian Children, Wave 6, birth cohort. All analyses adjusted for sex, age and socioeconomic position. zBMI = body mass index z-score; SB = sedentary behavior.

Numerator of logratio | Sleep | SB | PA | Center (min/d) |
---|---|---|---|---|
Sleep |  | 0.13 | 0.39 | 617.5 |
SB | −0.11 |  | 0.78 | 553.1 |
PA | −0.83 | −0.72 |  | 269.4 |

Upper triangle: variation of the pairwise logratio. Lower triangle: mean of the pairwise logratio, with the row part as the numerator.
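The entries of this variation array can be related to each other with a short sketch. It relies on the fact that the mean of a pairwise logratio equals the logratio of the corresponding center values, so the reported means can be recomputed from the center column. The function `pairwise_logratio_variances` uses the sample variance (n − 1 denominator), which is an assumption about how the variation was computed, and `sample_rows` is hypothetical data.

```python
import math
from statistics import variance

parts = ["Sleep", "SB", "PA"]

# Center (min/d) as reported in the table.
center = {"Sleep": 617.5, "SB": 553.1, "PA": 269.4}

# Mean pairwise logratios, numerator first, rounded to two decimals.
mean_logratio = {
    (num, den): round(math.log(center[num] / center[den]), 2)
    for num in parts for den in parts if num != den
}
print(mean_logratio[("SB", "Sleep")])  # -0.11
print(mean_logratio[("PA", "Sleep")])  # -0.83
print(mean_logratio[("PA", "SB")])     # -0.72

def pairwise_logratio_variances(rows):
    """Sample variance of log(x_i / x_j) for every pair of parts."""
    out = {}
    for i, num in enumerate(parts):
        for j, den in enumerate(parts):
            if i != j:
                out[(num, den)] = variance(
                    [math.log(r[i] / r[j]) for r in rows]
                )
    return out

# Hypothetical (sleep, SB, PA) compositions in min/d.
sample_rows = [
    (600.0, 700.0, 140.0),
    (540.0, 780.0, 120.0),
    (660.0, 480.0, 300.0),
    (570.0, 810.0, 60.0),
    (630.0, 450.0, 360.0),
]
variation = pairwise_logratio_variances(sample_rows)
print(variation[("Sleep", "SB")], variation[("SB", "Sleep")])
```

The variance of a logratio does not depend on which part is the numerator, which is why the variation needs only one triangle of the array.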

**Table 2.**Regression of pivot coordinates against body mass index z-score among n = 3228 children, Wave 6 LSAC birth cohort.

Pivot | Estimate | SE | t | p |
---|---|---|---|---|
Sleep vs Remaining | −0.21 | 0.11 | −1.96 | 0.045 |
SB vs Remaining | 0.19 | 0.07 | 2.56 | 0.010 |
PA vs Remaining | 0.02 | 0.05 | 0.37 | 0.708 |
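Each row of Table 2 uses a pivot coordinate that contrasts one part with the remaining parts. A minimal sketch of how such a coordinate can be computed is given below, assuming the standard definition: the log of the ratio between the part and the geometric mean of the other parts, scaled by sqrt((D − 1)/D) for a D-part composition. The function name is hypothetical, and the input is the center from Table 1, used purely as an example composition.

```python
import math

def first_pivot_coordinate(x, k):
    """Pivot coordinate contrasting part k with the geometric mean of the rest."""
    d = len(x)
    rest = [v for i, v in enumerate(x) if i != k]
    geo_rest = math.exp(sum(math.log(v) for v in rest) / len(rest))
    return math.sqrt((d - 1) / d) * math.log(x[k] / geo_rest)

parts = ["Sleep", "SB", "PA"]
composition = [617.5, 553.1, 269.4]  # min/d, illustrative values

# One coordinate per "part vs remaining" contrast, as in the rows of Table 2.
pivots = {
    name: first_pivot_coordinate(composition, k)
    for k, name in enumerate(parts)
}
for name, z in pivots.items():
    print(f"{name} vs Remaining: {z:.3f}")
```

In a regression, each of these coordinates would be entered together with the other coordinates of its own orthonormal basis, and only the coefficient of the first coordinate is interpreted as the "part vs remaining" effect.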

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Dumuid, D.; Pedišić, Ž.; Palarea-Albaladejo, J.; Martín-Fernández, J.A.; Hron, K.; Olds, T.
Compositional Data Analysis in Time-Use Epidemiology: What, Why, How. *Int. J. Environ. Res. Public Health* **2020**, *17*, 2220.
https://doi.org/10.3390/ijerph17072220
