Slow and Steady, or Hard and Fast? A Systematic Review and Meta-Analysis of Studies Comparing Body Composition Changes between Interval Training and Moderate Intensity Continuous Training

Purpose: To conduct a systematic review and multilevel meta-analysis of the current literature as to the effects of interval training (IT) vs moderate intensity continuous training (MICT) on measures of body composition, both on a whole-body and regional level. Methods: We searched English-language papers on PubMed/MEDLINE, Scopus, CINAHL, and sportrxiv for the following inclusion criteria: (a) randomized controlled trials that directly compared IT vs MICT body composition using a validated measure in healthy children and adults; (b) training was carried out a minimum of once per week for at least four weeks; (c) published in a peer-reviewed English language journal or on a pre-print server. Results: The main model for fat mass effects revealed a trivial standardized point estimate with high precision for the interval estimate, with moderate heterogeneity (−0.016 (95%CI −0.07 to 0.04); I² = 36%). The main model for fat-free mass (FFM) effects revealed a trivial standardized point estimate with high precision for the interval estimate, with negligible heterogeneity (−0.0004 (95%CI −0.05 to 0.05); I² = 16%). The GRADE summary of findings suggested high certainty for both main model effects. Conclusions: Our findings provide compelling evidence that the pattern of intensity of effort and volume during endurance exercise (i.e., IT vs MICT) has minimal influence on longitudinal changes in fat mass and FFM, which are likely to be minimal anyway. Trial registration number: This study was preregistered on the Open Science Framework.


Introduction
The relative components of fat mass and fat-free mass in the body, collectively termed body composition, have important implications for human health. Excessive levels of body fat show a high correlation with a panoply of disease states, including cardiovascular diseases, metabolic disorders, certain cancers, osteoarthritis, and respiratory conditions [1]. Conversely, low levels of fat-free mass are associated with a loss of strength, functional capacity, and reduced bone mineral density [2][3][4], impairing both the quality and quantity of life [1]. There is an interaction between these two components, whereby the combination of low levels of fat-free mass (FFM) and high levels of body fat potentiate each other, maximizing their impact on disability, morbidity, and mortality [5].
Exercise is commonly recommended as an intervention to improve body composition [6,7]. Interventional strategies often employed for this purpose include the following patterns.
Moderate intensity continuous training (MICT), herein operationally defined as moderate intensity of effort exercise (<80% peak heart rate or aerobic capacity) performed over a longer (relative to interval training bouts) single bout.
Interval training (IT), herein operationally defined as exercise performed in multiple shorter (relative to continuous training) bouts interspersed with recovery periods either at lower intensities of effort, or as complete rest.
IT is often subclassified into high intensity interval training (HIIT), herein operationally defined as high intensity of effort exercise (approximately >80% peak heart rate or aerobic capacity) performed in multiple shorter bouts interspersed with recovery periods either at lower intensities of effort or as complete rest, and sprint interval training (SIT), herein operationally defined as maximal intensity of effort exercise ('all out' sprint) performed in multiple shorter bouts interspersed with recovery periods either at lower intensities of effort or as complete rest.
Although both MICT and IT show efficacy in improving body composition, controversy exists as to whether one strategy is superior to the other for this purpose. For example, an earlier meta-analysis by Keating et al. [8] reported little difference between MICT and IT for body fat reduction, highlighting that, over the short term, neither intervention produced clinically meaningful changes. Following this, Viana et al. [9] conducted a meta-analysis with results showing that IT produced a 28.5% greater reduction in fat mass than MICT. However, the paper was criticized for various methodological issues [10], ultimately leading to its retraction. More recently, Sultana et al. [11] carried out a meta-analysis that included a comparison of IT vs MICT. The analysis did not find a benefit to low-volume IT on measures of body composition when compared with MICT. However, they limited their analysis to only single measures per study of the constructs of interest (i.e., total body fat mass, body fat percentage, and lean body mass), whereas many studies report several measures (e.g., regional measures). Furthermore, although several studies have also compared the effects of IT and MICT in younger populations, they limited the analysis to adults. Additionally, it is not clear from their analysis which pre-post test correlations were imputed and used for effect size calculations. The magnitude of pre-post test correlations used in calculations of pre-post control group design effect sizes using pooled baseline standard deviations can impact the heterogeneity determined in the meta-analysis [12]. Thus, although the standardized point estimates from the models of Sultana et al. [11] generally suggested little difference between conditions, the accompanying interval estimates for most outcomes included small effects in favor of either IT or MICT.
Furthermore, their models had essentially no heterogeneity, although this may be the result of imputation of pre-post correlations that were relatively low. Application of multilevel meta-analytic models with robust variance estimation to handle multiple effects per study might yield a greater precision of estimates [13], and thus help to confirm whether small differences do in fact exist, and if so, in which direction. Additionally, extraction of information to permit calculation of pre-post test correlations within groups (i.e., see Higgins et al. [14]) would allow for a better estimate of the population pre-post test correlations and may reveal heterogeneity not identified in previous analyses. Lastly, although Sultana et al. [11] explored 'within-condition' effects for IT in studies that included a non-exercising control condition, they did not similarly explore this outcome for MICT training.
It also has been speculated that specific exercise-induced effects might occur for hypertrophy and regional fat mass. Endurance exercise may have beneficial effects on muscle hypertrophy, similar to that of resistance training [15], and some researchers highlight that IT, in particular, may produce a potent anabolic stimulus [16]. Furthermore, it has been suggested that IT may be more effective than MICT for abdominal fat mass reduction [17]. However, to our knowledge, no previous review has pooled data from research that directly compares changes in FFM between IT and MICT, nor specifically examined regional effects on changes in fat mass.
Lastly, although prior meta-analyses have considered between-conditions comparison of mean intervention effects [11], whether or not differences in the variance of treatment responses are present has been relatively less explored. A recent meta-analysis of aerobic exercise in overweight and obese children and adolescents found no evidence of 'true' inter-individual response variation in fat loss [18]. However, numerous studies have purported that there may be inter-individual response variation to IT and MICT for a range of outcomes [19][20][21], and indeed it has been argued that such variation may mask differences between IT and MICT for fat loss [22]. Thus, we also sought to examine whether there is evidence of 'true' inter-individual response variation for body composition outcomes for both IT and MICT [23,24].
Given the gaps in the current literature, the purpose of this paper was to conduct a systematic review and multilevel meta-analysis of the current literature as to the effects of IT vs MICT on measures of body composition, both on a whole-body and regional level. Secondarily, we sought to determine if intensity of effort influences exercise adherence and/or adverse events, as well as whether inter-individual response to IT and MICT influences changes in body composition.

Material and Methods
This systematic review was conducted in accordance with the guidelines of the "Preferred Reporting Items for Systematic Reviews and Meta-Analyses" (PRISMA) [25]. The study was preregistered on the Open Science Framework (https://osf.io/dq784), where the detailed prespecified methodological protocol can be viewed.

Inclusion/Exclusion Criteria
We included studies that met the following criteria: (a) randomized controlled trials (both within- and between-group designs) that directly compared IT vs MICT (both with and without adjuvant dietary interventions) for body composition using a validated measure (DXA, BodPod, hydrostatic weighing, BIA, skinfolds, ultrasound, magnetic resonance imaging, and computerized tomography) in healthy children and adults; (b) training was carried out a minimum of once per week for at least four weeks; (c) published in a peer-reviewed English language journal or on a pre-print server. We excluded studies that employed: (a) participants with co-morbidities that might impair aerobic capacity (respiratory conditions, musculoskeletal injury); and (b) an unbalanced resistance training component (e.g., one group performs resistance training whereas the other does not). Note that our original pre-registration failed to specify the particular intensity of effort and operationalization of this variable for determination of whether an IT intervention could be considered 'HIIT'. However, a small number of the studies identified employed intensities of >75% of peak heart rate or aerobic capacity for their IT conditions. Given our omission of specificity in pre-registration, we felt that these studies should be included, as there was still a reasonable difference in intensity of effort compared with the MICT conditions (typically <60%).

Search Strategy
We carried out a comprehensive search of the PubMed/MEDLINE, Scopus, CINAHL, and sportrxiv databases using the following Boolean string: (interval training OR intermittent training OR high intensity OR sprint interval training OR aerobic interval training OR HIIT OR HIIE OR high intensity interval training OR high-intensity interval training OR high intensity interval exercise OR high intensity intermittent exercise OR high-intensity intermittent exercise OR high intensity intermittent training OR high-intensity intermittent training) AND (continuous training OR moderate-intensity continuous exercise OR moderate intensity continuous exercise OR moderate-intensity continuous training OR moderate intensity continuous training OR endurance training) AND (body fat OR adiposity OR body composition OR abdominal fat OR visceral fat OR adipose tissue OR fat mass OR fat-free mass OR lean body mass OR lean mass OR muscle mass). Moreover, we screened the reference lists of articles retrieved to uncover any additional studies that might meet inclusion criteria, as described by Greenhalgh and Peacock [26]. The search was finalized on 6 March 2021; Figure 1 illustrates a flow chart of the search process.

Screening/Coding of Studies
Search/screening was carried out separately by two researchers (DP and AR). These researchers read all titles and abstracts and then reviewed full texts for papers deemed relevant based on their title and abstract. Decisions then were made as to whether a study warranted inclusion based on the stated criteria. Any disputes on the inclusion of a given study were settled by a third researcher (MCM).
After determining which studies met inclusion, two researchers (DV and HZ) separately coded the following variables for each study: authors, title and year of publication, sample size, sex, body mass index (BMI), training status, age, description of the training intervention (duration, intensity, frequency, modality), work matched (yes/no), nutrition controlled (yes/no), method for body composition assessment (e.g., DXA, BodPod, BIA, hydrostatic weighing, skinfolds, MRI, CT, ultrasound), number of adverse effects associated with the training intervention, adherence to the given training program, and mean pre- and post-study body composition values in addition to pre-post change scores with the corresponding standard deviation or standard error. Where change score standard deviations were not reported, we extracted information to allow their calculation, including confidence intervals for change scores or within-group pre-post t statistics or p values (where p values were reported only to the studies' level of alpha (e.g., p < 0.05) we took this as a conservative value). In cases where body composition data were not reported numerically, we either extracted the data from graphs when available via online software, or attempted to contact the study's authors. Coding was cross-checked between reviewers, with any discrepancies resolved by mutual consensus. Consistent with the guidelines of Cooper et al. [27], 30% of the included studies were randomly selected for re-coding by a third researcher (BM) to assess for potential coder drift. Agreement was calculated by dividing the number of variables coded the same by the researchers by the total number of variables; acceptance required a mean agreement of 0.90 to avoid re-extraction entirely, and after this was met, only those with differing codes were checked and updated. Extracted data were also double-checked after this process by the lead author (JS) prior to analysis.
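The agreement check described above can be sketched as follows. This is an illustration only, not the procedure's actual implementation: the function names and the example codes are hypothetical, and only the rule itself (matching variables divided by total variables, with a mean of 0.90 required) comes from the text.

```python
def coder_agreement(codes_a, codes_b):
    """Proportion of variables coded identically by two coders for one study."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of variables")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def mean_agreement(studies):
    """Mean agreement across studies; `studies` is a list of (codes_a, codes_b) pairs."""
    scores = [coder_agreement(a, b) for a, b in studies]
    return sum(scores) / len(scores)


# Hypothetical codes for two re-coded studies (illustrative values only).
study_1 = (["DXA", "yes", 12, "cycling"], ["DXA", "yes", 12, "cycling"])
study_2 = (["BIA", "no", 8, "running"], ["BIA", "yes", 8, "running"])

agreement = mean_agreement([study_1, study_2])
# Below the 0.90 threshold stated in the text, full re-extraction would be needed.
needs_full_reextraction = agreement < 0.90
```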

Methodological Quality and Certainty of Evidence
Two of the authors independently evaluated each study (JG and BJS) using the 11-point Physiotherapy Evidence Database (PEDro) scale, which has been validated to assess the methodologic quality of randomized trials [28] with acceptable inter-rater reliability [29]. Any discrepancies in agreement on a given scale item were settled by mutual agreement between the researchers. Given that it is infeasible to blind participants and investigators in supervised exercise interventions, we opted to remove assessment items specific to blinding (numbers 5, 6, and 7 in the scale). After eliminating these items, this created a modified 8-point PEDro scale with a maximum value of 7 (the first item is excluded from the total score). The qualitative methodological ratings were amended, similar to those used in previous exercise-related systematic reviews [30], as follows: "excellent" (6-7 points); "good" (5 points); "moderate" (4 points); and "poor" (0-3 points). We also followed the Grading of Recommendations, Assessment, Development and Evaluations (GRADE) framework [31] for evaluating the certainty of evidence with respect to our primary pre-registered outcomes (absolute fat mass, and absolute lean/fat free mass). We used the GRADEpro online tool [32] for this assessment and generation of the summary of findings table. It should be noted though that we did not pre-register the use of the GRADE approach to evaluate the evidence presented but decided a posteriori that the assessment would enhance the ability to draw practical inferences from the data.
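For illustration, the mapping from a modified PEDro total (0 to 7) to the qualitative ratings listed above can be written as a small function. The function name is hypothetical; the cut-offs are those stated in the text.

```python
def pedro_rating(score):
    """Map a modified PEDro total (0-7) to the qualitative rating used in the text."""
    if not 0 <= score <= 7:
        raise ValueError("modified PEDro scores range from 0 to 7")
    if score >= 6:
        return "excellent"
    if score == 5:
        return "good"
    if score == 4:
        return "moderate"
    return "poor"


# Example with hypothetical scores for four studies.
ratings = [pedro_rating(s) for s in (7, 5, 4, 2)]
```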

Statistical Analyses
Quantitative synthesis of data was performed with the 'metafor' [33] package in R (v 4.0.2; R Core Team, https://www.r-project.org/). All analysis code and data are openly available in the Supplementary Materials (https://osf.io/6karz/). Studies were grouped by design (i.e., within- or between-group) and, depending on the reporting in individual studies, treated as either post or delta comparisons, or pre-post comparison designs [12], for the purposes of appropriate calculation of standardized effects (Hedges' g) using the escalc function in metafor. We used the pooled group baseline standard deviation as the denominator as per Morris [12]. Standardized effect sizes were interpreted as per Cohen's [34] thresholds: trivial (<0.2), small (0.2 to <0.5), moderate (0.5 to <0.8), and large (≥0.8). Standardized effects were calculated in such a manner that a positive effect size value favored the IT conditions.
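As a rough sketch of the pre-post comparison effect size described above, the following computes a bias-corrected standardized difference in mean change using the pooled baseline standard deviation, then labels it with the Cohen thresholds from the text. It is not the authors' code (their analysis used escalc in metafor): the function names and input values are hypothetical, the small-sample correction 1 − 3/(4(n_a + n_b − 2) − 1) is the commonly used form and is assumed here, and orienting the sign so that positive values favor IT for a given outcome is left out.

```python
import math


def pooled_baseline_sd(n_a, sd_a, n_b, sd_b):
    """Pooled baseline (pre-test) standard deviation of two groups."""
    return math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2))


def hedges_g_pre_post(n_a, pre_a, post_a, sd_pre_a, n_b, pre_b, post_b, sd_pre_b):
    """Difference in mean change (group A minus group B), divided by the pooled
    baseline SD and multiplied by a small-sample bias correction (assumed form)."""
    diff_in_change = (post_a - pre_a) - (post_b - pre_b)
    d = diff_in_change / pooled_baseline_sd(n_a, sd_pre_a, n_b, sd_pre_b)
    correction = 1 - 3 / (4 * (n_a + n_b - 2) - 1)
    return correction * d


def cohen_label(g):
    """Qualitative label for the magnitude of a standardized effect."""
    size = abs(g)
    if size < 0.2:
        return "trivial"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "moderate"
    return "large"


# Hypothetical fat mass data (kg): group A = IT, group B = MICT.
g = hedges_g_pre_post(10, 25.0, 24.0, 5.0, 10, 26.0, 25.5, 5.0)
label = cohen_label(g)
```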
As there was a nested structure to the effect sizes calculated from the studies included (i.e., multiple effects nested within groups and nested within studies), multilevel mixed effect meta-analyses with both study and intra-study groups included as random effects in the model were performed. Cluster robust point estimates and the precision of those estimates using 95% compatibility (confidence) intervals (CIs) were produced, weighted by the inverse sampling variance to account for the within- and between-study variance (τ²). Restricted maximum likelihood estimation was used in all models. Two main models were produced for both pre-registered main outcomes (absolute fat mass and FFM), including all standardized effect sizes, to provide a general estimate of the comparative treatment effects. All other models were considered secondary and exploratory analyses.
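To make the idea of inverse-variance weighting concrete, the sketch below pools a handful of effects with weights 1/(v + τ²). It is a deliberate simplification and not the cluster-robust multilevel model described above: it ignores the nesting of effects within groups and studies, takes τ² as a supplied value rather than estimating it by restricted maximum likelihood, and uses a normal-approximation interval. All names and numbers are hypothetical.

```python
import math


def pooled_estimate(effects, variances, tau2=0.0):
    """Inverse-variance weighted mean with a normal-approximation 95% interval.

    Each effect is weighted by 1 / (sampling variance + tau2); tau2 is supplied
    by the caller here, not estimated from the data.
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal in length")
    weights = [1.0 / (v + tau2) for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)
    return estimate, (estimate - 1.96 * se, estimate + 1.96 * se)


# Hypothetical standardized effects and sampling variances from three studies.
estimate, ci = pooled_estimate([0.10, -0.05, 0.00], [0.02, 0.01, 0.04])
```

More precise effects (smaller sampling variance) receive larger weights, and a larger τ² pulls the weights toward equality.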
For all models, we avoided dichotomizing the existence of an effect for the main results and therefore did not employ traditional null hypothesis significance testing, which has been extensively critiqued [35,36]. Instead, we considered the implications of all results compatible with these data, from the lower limit to the upper limit of the interval estimates, with the greatest interpretive emphasis placed on the point estimate. Given the large number of included studies and effects, the main models are visualized here using ordered caterpillar plots to aid interpretation, as opposed to traditional forest plots containing study characteristics. Note that all study characteristics are available in the data file in the Supplementary Materials.
The risk of small study bias was examined visually through contour-enhanced funnel plots. Q and I² statistics were also produced and reported [37]. A significant Q statistic is typically considered indicative of effects likely not being drawn from a common population. I² values indicate the relative degree of heterogeneity in the effects that is not due to sampling variance and are qualitatively interpreted as: 0-40%: not important, 30-60%: moderate heterogeneity, 50-90%: substantial heterogeneity, and 75-100%: considerable heterogeneity [38]. For within-participant effects, pre-post correlations for measures are often not reported in original studies; thus, for those studies where we had standard deviations for pre-, post-, and change scores (or were able to calculate the latter from confidence intervals, t statistics, or p values) we calculated the pre-post correlations directly as:
$$r = \frac{SD_{pre}^{2} + SD_{post}^{2} - SD_{change}^{2}}{2 \times SD_{pre} \times SD_{post}}$$
and imputed the median correlation coefficient to studies as an appropriate estimate of the population parameter.
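A minimal sketch of the correlation step, assuming the standard identity that links the pre, post, and change score standard deviations, r = (SD_pre² + SD_post² − SD_change²) / (2 × SD_pre × SD_post), and the paired t-test relation t = mean change / (SD_change / √n) for recovering a change score SD. The function names and numbers are hypothetical and this is not the authors' code.

```python
import math
from statistics import median


def sd_change_from_t(mean_change, t_stat, n):
    """Change score SD recovered from a within-group paired t statistic."""
    if t_stat == 0:
        raise ValueError("t statistic must be non-zero")
    return abs(mean_change) * math.sqrt(n) / abs(t_stat)


def prepost_correlation(sd_pre, sd_post, sd_change):
    """Pre-post correlation implied by the pre, post, and change score SDs."""
    return (sd_pre**2 + sd_post**2 - sd_change**2) / (2 * sd_pre * sd_post)


# Hypothetical study reporting only a paired t statistic for the change.
sd_change = sd_change_from_t(mean_change=-1.0, t_stat=-2.5, n=25)
r = prepost_correlation(sd_pre=5.0, sd_post=5.0, sd_change=sd_change)

# The median of the directly calculated correlations (hypothetical values)
# is then imputed where a correlation cannot be calculated.
imputed_r = median([r, 0.80, 0.95])
```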
In addition to the main models, we secondarily produced models for relative fat and FFM (i.e., as a percentage of body mass), and refit all models using delta scores (i.e., changes) of outcomes in the raw units of measurement (i.e., kilograms and percentages) to facilitate interpretation in a complementary fashion. We also produced models where studies included a non-training control arm that examined the between-condition treatment effects for both IT vs CON, and MICT vs CON, to determine the 'within-condition' effect estimates on both their standardized and raw scales, i.e., the true treatment effect of performing IT or MICT alone.
We planned to conduct exploratory subgroup and moderation analyses across standardized effects for the following: work matched/unmatched, modality of training (ambulatory, cycling, or other), sex (proportion of sample as males), age (years), BMI (kg·m⁻²), intervention characteristics including level of intensity of effort for IT (i.e., SIT vs HIIT), within-session IT interval number and duration and their interaction, duration of MICT sessions, the difference (i.e., MICT minus IT) in total weekly exercise duration (frequency × duration), and duration of interventions (weeks), method of body composition measurement (DXA, BIA, skinfolds, etc.), body composition region of measurement (upper, lower, trunk), and whether nutrition was controlled or uncontrolled. Note that we originally mentioned exploration of moderators for both standardized and unstandardized effects in our preregistration. However, we ultimately opted to explore only standardized effects for absolute fat mass and FFM outcomes to complement our main models and explore their heterogeneity. Furthermore, we adapted the operationalization of some moderators (e.g., intervention characteristics such as total weekly exercise duration), and some we could not explore fully given the number of effects available for certain sub-groups (these are noted in the analysis code). We also fit further (not pre-registered) models to examine adherence (number of attended sessions as a proportion of number of prescribed sessions) and dropout (number of participants dropped out as a proportion of number of participants randomized) proportions, as well as a Poisson regression model for adverse event count data (per 1000 person-sessions). All exploratory models utilized the same multilevel mixed-effects structure and specifications as the main models.
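The adherence, dropout, and adverse event quantities defined above reduce to simple arithmetic before any modeling. The sketch below shows that arithmetic only, not the multilevel or Poisson models; the function names and counts are hypothetical, and taking person-sessions as participants × prescribed sessions is an assumption.

```python
def proportion(numerator, denominator):
    """Proportion such as attended/prescribed sessions or dropped out/randomized."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


def events_per_1000_person_sessions(events, participants, sessions_per_participant):
    """Adverse event rate scaled to 1000 person-sessions (exposure is an assumption)."""
    person_sessions = participants * sessions_per_participant
    if person_sessions <= 0:
        raise ValueError("exposure must be positive")
    return 1000 * events / person_sessions


# Hypothetical trial arm: 20 randomized, 3 dropped out, 36 prescribed sessions.
adherence = proportion(33, 36)   # sessions attended / sessions prescribed
dropout = proportion(3, 20)      # participants dropped out / randomized
event_rate = events_per_1000_person_sessions(2, 20, 36)
```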
As a final exploratory (not pre-registered) analysis, we examined the variation in responses between both IT and MICT conditions. We sought to identify whether there was evidence of 'true' inter-individual variation from within-participant variability and/or participant-by-treatment interaction in responses to interventions by comparing the standard deviations for change scores with those of non-exercise control conditions [23,39]. We identified a mean-variance (on both the raw and log-transformed scales) relationship across studies for change scores (see https://osf.io/6zb8y/). Thus, we opted to adjust for this by employing a multilevel meta-regression of the log-transformed change score standard deviations, adjusted for the log-change score mean [40], calculated such that positive values showed that intervention condition (i.e., IT and MICT) variation exceeded control condition variation, thus suggesting evidence of 'true' inter-individual response variation. Where studies did not report change score standard deviations, or we were unable to calculate them directly, they were estimated using the imputed median pre-post correlation coefficient noted above as:
$$SD_{change} = \sqrt{SD_{pre}^{2} + SD_{post}^{2} - 2 \times r \times SD_{pre} \times SD_{post}}$$
Note that, given the different measurement devices used in individual studies, we accepted pragmatically the inherent assumptions built into this comparison of a constant Gaussian measurement error (i.e., that measurement error does not scale in a non-linear fashion with measured scores).
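To illustrate the two ingredients of this comparison, the sketch below estimates a change score SD from an imputed pre-post correlation using the standard identity SD_change = √(SD_pre² + SD_post² − 2 × r × SD_pre × SD_post), then takes the log ratio of intervention to control change score SDs. It is a simplification: the analysis described above is a multilevel meta-regression adjusted for the log-change score mean, which is not reproduced here, and the plain log ratio omits any small-sample correction. Names and values are hypothetical.

```python
import math


def sd_change_from_r(sd_pre, sd_post, r):
    """Change score SD implied by pre and post SDs and a pre-post correlation."""
    return math.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)


def log_sd_ratio(sd_change_intervention, sd_change_control):
    """Log ratio of change score SDs; positive values mean the intervention
    arm varied more than the control arm."""
    return math.log(sd_change_intervention / sd_change_control)


# Hypothetical control arm with an imputed median correlation of 0.92.
sd_control = sd_change_from_r(sd_pre=5.0, sd_post=5.0, r=0.92)
# Hypothetical intervention arm that reported its change score SD directly.
sd_intervention = 2.5
log_ratio = log_sd_ratio(sd_intervention, sd_control)
```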

Search Results
From the initially reviewed 2085 search results, a total of 56 studies were determined as meeting the inclusion criteria for our analysis. Two studies stated that body composition measures were performed, but did not report information on this outcome in the manuscript [41,42]. Attempts to obtain the data from the corresponding authors proved unsuccessful. Thus, we analyzed 54 studies that compared the effects of IT and MICT on measures of body composition. Table 1 presents a summary of the methods of the included studies. Table 2 presents descriptive information on the included studies. Figure 2 shows the contour-enhanced funnel plot for all effects from these studies. Inspection of the funnel plot did not reveal any obvious small study bias.

Methodological Quality
Study quality, as assessed by the PEDro scale, had a mean rating of 5.6, indicating that the overall pool of studies is of good quality. A total of 32 studies were rated as being of excellent quality, 21 studies were rated as being of good quality, and 1 study was rated as being of moderate quality; no study in the analysis was deemed to be of poor quality. Individual scoring is available in the online Supplementary Materials (https://osf.io/b28qd/).

Fat Mass
The main model for all fat mass effects (55 across 29 clusters (median = 1, range = 1-6 effects per cluster)) revealed a trivial standardized point estimate with a high precision for the interval estimate (−0.02 (95%CI = −0.07 to 0.04)), with moderate heterogeneity (Q(54) = 79.08, p = 0.015, I² = 36%). Figure 3 presents all standardized effects and interval estimates for fat mass outcomes across studies in an ordered caterpillar plot.

GRADE Summary of Findings for Main Outcomes
For both fat mass and FFM there was a 'high' certainty of evidence with respect to the effects identified. It was deemed that there was no serious risk of bias, inconsistency, indirectness of evidence, or imprecision in estimates, nor were there other clear considerations impacting on certainty of evidence grading. The GRADE summary of findings table for our main outcomes is available in the Supplementary Materials (https://osf.io/pcyvx/).

Secondary Analyses
Between condition treatment effect models on both the raw effect scales, and using relative fat outcomes (relative lean models not run due to limited data), showed similar outcomes to the main models reported. Thus, for brevity, these are presented in the Supplementary Materials along with caterpillar plots (see folder "Outputs and Figures" > "Secondary Outcomes Outputs" > "Additional Between Condition Models" at https://osf.io/6karz/).

Within-Condition Treatment Effects
All within-condition models are also available in the Supplementary Materials (see folder "Outputs and Figures" > "Secondary Outcomes Outputs" > "Within Condition Models" at https://osf.io/6karz/) and here we report just the results for absolute fat and FFM outcomes on standardized and raw scales. In comparison to non-intervention control groups, the IT conditions resulted in small reductions in fat mass (Hedges' g = −0.22 (95%CI = −0.36 to −0.08); kilograms = −0.20 (95%CI = −0.34 to −0.06)), and trivial increases

Sub-Group and Meta-Regression Analyses
Sub-group and meta-regression models were not run for absolute FFM standardized effects, given the negligible heterogeneity in the main model. When exploring subgroup and meta-regression models for absolute fat mass standardized effects, only two moderators appeared to have an effect: sex (proportion of males in sample; β = 0.0015 (95%CI = 0.00 to 0.0029)) and the number of intervals performed per training session by IT (β = −0.0032 (95%CI = −0.0052 to −0.0013)). This effect was relatively small for both covariates. Again, for brevity, all sub-group and meta-regression models are included in the Supplementary Materials (see folder "Outputs and Figures" > "Secondary Outcomes Outputs" > "Sub-group and Meta-regression Models" at https://osf.io/6karz/).

Inter-Individual Response Variation
There was no clear evidence of 'true' inter-individual variation in responses for either IT or MICT conditions. The differences in intercepts when compared with CON conditions were −0.15 (95%CI = −0.35 to 0.05) and −0.02 (95%CI = −0.22 to 0.18) for IT and MICT, respectively (see figure in Supplementary Materials: https://osf.io/3mazj/).

Discussion
This is the most comprehensive meta-analysis to date comparing IT and MICT on changes in measures of fat mass and FFM. Furthermore, GRADE assessment suggests high certainty in the evidence presented. Our findings provide novel insights into the use of different training strategies to bring about changes in body composition. Below, we discuss the results and practical implications of our data for each outcome.

Changes in Fat Mass
It has been speculated that IT may confer superior fat loss benefits compared to MICT, primarily mediated via a greater excess post-exercise oxygen consumption (EPOC) [97]. However, the overall magnitude of additional energy expenditure attributed to EPOC during IT is modest [98], and thus is unlikely to be of practical meaningfulness from a fat loss standpoint. Other proposed benefits of IT on fat reduction include enhancements in appetite suppression, fat oxidation, and circulating catecholamines and lipolytic hormones [98]. Despite this mechanistic rationale, our results do not support a superiority of IT on reductions in fat mass. Analysis of standardized between-group treatment effects showed similar changes for IT and MICT with both absolute fat mass as our primary outcome (Hedges' g = −0.02 (95%CI = −0.07 to 0.04)) and percentage body fat (Hedges' g = −0.04 (95%CI = −0.08 to 0.01)). Raw absolute fat mass changes revealed a trivial point estimate of −0.17 kg favoring MICT, although the interval estimate ranged from −0.66 kg in favor of MICT to 0.31 kg in favor of IT. Comparison of raw relative (%) fat mass changes revealed a small point estimate of −0.30% favoring MICT, but again, the interval estimate was imprecise, ranging from −0.63% in favor of MICT to 0.04% in favor of IT. Taken as a whole, these findings suggest that changes in fat mass are not meaningfully influenced by patterns of intensity of effort and duration (i.e., IT vs MICT) during exercise.
When compared to non-exercising controls, IT and MICT produced small reductions in fat mass, with minimal differences between conditions. The raw absolute fat loss amounted to −0.22 kg for IT and −0.25 kg for MICT, with standardized Hedges' g ES values of 0.22 and 0.20, respectively. Relative changes in fat mass for IT and MICT showed similarly small decreases vs controls, both on a raw (0.30% and 0.25%, respectively) and standardized (0.28 and 0.24, respectively) basis. None of the studies that included control conditions combined exercise with dietary intervention (i.e., caloric deficit) and thus, collectively, these data suggest that exercise alone induces a small magnitude of fat loss regardless of the patterns of intensity of effort and duration, at least under the methods employed in current research. More extreme volumes of exercise may be necessary to induce meaningful changes, irrespective of the intensity of effort. The observed changes in fat mass (~0.2 kg) for the studies and interventions examined are unlikely to be clinically or aesthetically meaningful in most populations. Indeed, these findings concur with earlier results from Keating et al. [8].
The lack of overall fat loss achieved in both IT and MICT can be attributed, at least in part, to the relatively low weekly exercise dose across studies (IT, median = 28 min duration (range = 3 min to 120 min); MICT, median = 120 min duration (range = 48 min to 250 min)), and may be confounded by a corresponding increase in energy intake [99] and/or reduction in non-exercise activity thermogenesis [100]. Tightly controlled research in identical twins shows that prolonged daily aerobic-type exercise can induce marked reductions in fat mass under conditions of constant energy and nutrient intake [101]. However, the time commitment needed to achieve these results (~100 min/day) is infeasible for the majority of the general public and is thus of limited practical relevance. Therefore, our findings underscore the importance of dietary prescription to facilitate weight loss; however, exercise may play an important supplementary role in the process [102].
In contrast to the recent meta-analysis from Sultana et al. [11], we did identify some moderate heterogeneity in our main model, leading us to explore possible moderators. For example, some evidence suggests that IT elicits greater reductions in abdominal adiposity compared to MICT [17]. Given the well-established association between android fat and cardiometabolic disease [103], such an outcome would potentially have major health implications if found to be true. However, our findings refute this contention, demonstrating similar changes in abdominal fat mass between conditions. Moreover, we found that relatively equal, albeit modest, fat loss occurred across the upper body, lower body and trunk regions regardless of condition, indicating that endurance-oriented exercise does not preferentially target specific fat deposits. Indeed, with the exception of sex and the number of intervals performed during IT training sessions, both of which also only had very trivial moderating effects, we did not identify any clear moderators of comparative treatment effects for fat mass.

Changes in Fat-Free Mass
Some researchers have proposed that the performance of aerobic exercise can elicit increases in skeletal muscle hypertrophy that are comparable to resistance exercise training [15]. However, a meta-analysis by Grgic et al. [104] refuted this contention, showing significantly greater hypertrophic adaptations from resistance training vs aerobic training, both at the whole-muscle and myofiber level. It should be noted, though, that Grgic et al. [104] did not subanalyze the effects of endurance exercise intensity on hypertrophy outcomes. A recent review speculated that IT may provide sufficient stimulus to enhance muscle growth, particularly in middle-aged and older adults, as well as clinical populations [16]. Furthermore, some emerging evidence suggests that, although traditional resistance training and aerobic modality interventions may produce differing adaptations, when duration and intensity of effort are matched, similar strength and endurance adaptations may occur, although the impact on hypertrophy is less clear [105].
Our results suggest that endurance exercise intensity and duration may not mediate hypertrophic adaptations. Specifically, analysis of changes in FFM, both on an absolute and relative basis, demonstrated similar effects between IT and MICT. Between-condition treatment standardized effects for absolute changes in FFM were essentially zero (−0.0004 (95%CI = −0.05 to 0.05)), and comparison of effects on the raw scale showed a small point estimate of 0.09 kg favoring IT, yet the interval estimate ranged from −0.18 kg in favor of MICT to 0.35 kg in favor of IT. There were limited data reporting relative changes in FFM, with only three studies directly comparing MICT vs IT. Pooling of these data revealed a moderate magnitude of effect (−0.98%) favoring MICT. However, due to the lack of data, the confidence intervals around the point estimate were wide (−3.39% to 1.43%), and Hedges' g values indicated a trivial standardized mean difference (0.17) with similarly wide interval estimates (−0.69 to 0.35). From a practical standpoint, these findings collectively suggest there may not be a meaningful difference between MICT and IT on absolute changes in FFM.
Compared to non-exercising controls, our findings indicate trivial standardized effects for improvements in FFM for both conditions (IT, Hedges' g = 0.13 (95%CI = 0.04 to 0.22); MICT, Hedges' g = 0.07 (95%CI = −0.01 to 0.16)). IT showed absolute raw increases of 0.11 kg whereas MICT showed increases of 0.07 kg, although the lower bounds of both interval estimates included zero and the upper bounds did not reach particularly meaningful values. These data collectively suggest that neither MICT nor IT meaningfully affects FFM under the methods employed across studies, and call into question the claim that endurance-based exercise is a viable interventional strategy for promoting muscle hypertrophy.

Exercise Adherence and Dropouts
Adherence was essentially identical between conditions, with both groups completing 90% of sessions; dropouts were also similar and relatively low at ~13-17%. It has been argued that the intensity of effort of exercise influences core affective response [106], and that this is predictive of future intentions and behavior in relation to exercise [107]. However, a recent systematic review suggests that affective response may only differ trivially between IT and MICT, and that enjoyment responses may demonstrate a small effect in favor of IT [108]. Despite varying speculative theories regarding the intensity of effort during exercise, and its impact on affect or enjoyment, and subsequent behaviors, the results here suggest that adherence to IT and MICT is largely similar and relatively high, at least over the duration of the studies and under the conditions in which the interventions were employed. Indeed, it should be noted that exercise sessions in the included studies were carried out with the aid of programming from the respective research teams and were generally performed under direct supervision. It is well-established that programming and supervision have positive effects on exercise adherence [109]. Thus, our findings in this regard cannot necessarily be extrapolated to self-directed exercise programs. Given the high interindividual variability observed in the psychological response to endurance exercise [110], it would seem that allowing for a choice of training intensity would likely help to improve long-term adherence. Future research should endeavor to test this hypothesis under ecologically valid conditions.

Adverse Events
Of the studies reporting adverse events, there was essentially no difference between IT and MICT. On the surface, this would seem to suggest that both conditions are similarly safe in the populations studied. However, most studies failed to report incidences of adverse events. Furthermore, some studies lacked clarity as to whether there was a comprehensive attempt to record all possible adverse events associated with the training intervention. Thus, data on the topic are somewhat limited, precluding strong inferences regarding the relative safety of the protocols.
A recent meta-analysis that examined the effects of supervised IT in patients with cardiovascular disease reported only five associated adverse events in approximately 17,000 training sessions: one major cardiovascular event, one minor cardiovascular event, and three incidents of musculoskeletal issues. Although these findings appear to indicate that IT is generally safe, even in populations with non-communicable diseases and other health risks, results may be confounded by underreporting of adverse events in individual studies, and perhaps also by sampling bias for the types of individuals likely to participate in such studies. Researchers are thus encouraged to track and disclose the occurrence of such incidents in future studies on IT and MICT so that we can achieve a greater understanding of the risks associated with each strategy.

Inter-Individual Response Variation
Variance of treatment responses to IT and MICT has been relatively underexplored, despite numerous studies purporting that there may be inter-individual response variation to IT and MICT for a range of outcomes [19][20][21]. Indeed, some have argued that such variations may mask differences between IT and MICT for fat loss [22]. Evidence from the HERITAGE Family Study would lend genetic support to this speculation, given that a putative dominant locus accounting for 31% of variance in fat mass changes was found [111]. However, we found no evidence of 'true' inter-individual variability in responses to either IT or MICT. This is in agreement with findings from a recent meta-analysis of aerobic exercise in overweight individuals and children and adolescents with obesity on fat loss [18]. Given our findings, and the relatively low heterogeneity across the main models for outcomes, the majority of apparent differences in study level results and apparent 'response heterogeneity' are likely attributable to sampling variance and random within-subject variability.
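The logic of testing for 'true' inter-individual response variation can be sketched as a comparison of the spread of change scores in a training arm against the spread in a non-training CON arm: only variation in excess of that seen in CON can be attributed to the intervention. The snippet below shows two commonly used summaries of that comparison, the log variability ratio and the standard deviation of individual responses. It is a hedged illustration, not necessarily the model specification used in our analysis; the function names and numeric inputs are hypothetical.

```python
import math


def log_variability_ratio(sd_t, n_t, sd_c, n_c):
    """Log ratio of change-score SDs (training vs control).

    Includes the usual small-sample bias correction. Returns
    (ln_vr, variance). Values near zero indicate similar variability
    in both arms, i.e., no clear sign of response heterogeneity.
    """
    ln_vr = math.log(sd_t / sd_c) + 1 / (2 * (n_t - 1)) - 1 / (2 * (n_c - 1))
    variance = 1 / (2 * (n_t - 1)) + 1 / (2 * (n_c - 1))
    return ln_vr, variance


def sd_individual_response(sd_t, sd_c):
    """SD of individual responses: sqrt(SD_t^2 - SD_c^2).

    When the control arm varies more than the training arm the
    difference of variances is negative; the sign is preserved so
    that such cases remain visible rather than raising an error.
    """
    diff = sd_t**2 - sd_c**2
    return math.copysign(math.sqrt(abs(diff)), diff)


if __name__ == "__main__":
    # Hypothetical change-score SDs (kg) and group sizes; NOT study data
    ln_vr, var = log_variability_ratio(2.5, 20, 2.0, 20)
    half = 1.96 * math.sqrt(var)
    print(f"lnVR = {ln_vr:.3f} (95% CI {ln_vr - half:.3f} to {ln_vr + half:.3f})")
    print(f"SD_IR = {sd_individual_response(2.5, 2.0):.2f} kg")
```

Even in this hypothetical case, where the training arm appears more variable, the interval for a single small study is wide, illustrating why apparent 'responders' and 'non-responders' within individual trials are weak evidence of genuine response heterogeneity.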

Limitations
The present meta-analysis has several limitations that must be taken into account when attempting to draw practical inferences on the effects of IT vs MICT on measures of body composition. First and foremost, only three studies prescribed dietary energy restrictions for the interventional protocol. Thus, it is not clear whether one exercise strategy may be superior to another when combined with a nutritional intervention. Second, only one study supplemented the exercise intervention with a resistance training component. It is possible that differences in intensity and duration between IT and MICT protocols might alter responses when combined with resistance training. Although recent evidence questions whether there is an interference effect from concurrent training, at least for hypertrophy [112], the specific roles of endurance exercise intensity and duration upon fat mass under these conditions have yet to be elucidated. Third, very few studies involved trained athletes, and the vast majority of subjects would be considered to be overweight/obese. Thus, it remains to be determined how differences in endurance exercise intensity and duration may affect body composition outcomes in lean and athletic populations. Moreover, the majority of included studies examined outcomes in younger to middle-aged adults, limiting our ability to draw conclusions about the effects of IT and MICT on older populations. Fourth, although we were able to separate studies that had included control groups for the purpose of a 'within-condition' analysis of the true treatment effects for IT and MICT, in addition to exploration of interindividual response variability, these were secondary exploratory analyses. Our search strategy and inclusion were not optimized to identify all studies that included either IT or MICT and a non-training CON condition. However, our estimates for within-group IT effects were not dissimilar to those reported by Sultana et al. [11] for IT vs CON, who did include studies with either an MICT or a non-training CON condition. Finally, our analysis is specific to body composition changes and does not take into account the other potential effects of the different interventional exercise strategies. Some evidence indicates that higher intensities of exercise may confer superior health-related benefits such as improvements in glucose control, blood pressure, vascular function, and cardiorespiratory fitness [113]. Thus, the use of a given endurance exercise strategy should consider individual goals in combination with abilities and preferences.

Conclusions
Our findings provide compelling evidence that the patterns of intensity of effort and duration during endurance exercise have minimal influence on longitudinal changes in fat mass and FFM. From a practical standpoint, this implies that individuals can choose the intensity of effort and duration combination (i.e., IT or MICT) that best suits their needs and lifestyle. As a general rule, there is an efficiency/effort tradeoff along the intensity of effort spectrum, whereby IT requires less time but more effort than MICT to promote alterations in body composition. Given that exercise adherence is of paramount concern, personal preference should thus guide prescription.
Our findings also indicate that structured exercise only has minor effects on fat loss regardless of intensity of effort and duration when performed at relatively modest doses; the amount of exercise required to achieve practically meaningful changes in this outcome seems to be unrealistic for most individuals. It is much easier to create an energy deficit from dietary restriction, which, therefore, should be the focus of weight loss interventions. However, exercise may help to preserve FFM and functional performance during periods of energy restriction [114], as well as facilitate sustenance of weight loss in combination with a dietary intervention [115]. Thus, it should be considered an important adjunct to nutritional approaches for those who endeavor to alter their body composition.
Author Contributions: B.J.S. conceived of the study, designed the methods, and conducted the quality assessment; J.S. assisted in the methods design and carried out statistical analyses; J.G. conducted the quality assessment; D.P., A.R. and M.C.-M. carried out the search; D.V.E., H.Z. and B.M. coded the studies; all authors were meaningfully involved in interpreting data, and drafting and critically revising the manuscript for intellectually important content. All authors have read and agreed to the published version of the manuscript.