Positive Effects of Exercise Intervention without Weight Loss and Dietary Changes in NAFLD-Related Clinical Parameters: A Systematic Review and Meta-Analysis

One of the focuses of non-alcoholic fatty liver disease (NAFLD) treatment is exercise. Randomized controlled trials investigating the effects of exercise without dietary changes on NAFLD-related clinical parameters (liver parameters, lipid metabolism, glucose metabolism, gut microbiota, and metabolites) were screened using the PubMed, Scopus, Web of Science, and Cochrane databases on 13 February 2020. Meta-analyses were performed on 10 studies with 316 individuals who had NAFLD across three exercise regimens: aerobic exercise, resistance training, and a combination of both. No studies investigating the role of gut microbiota and exercise in NAFLD were found. A quality assessment via the RoB 2 tool was conducted and potential publication bias, statistical outliers, and influential cases were identified. Overall, exercise without significant weight loss significantly reduced the intrahepatic lipid (IHL) content (SMD: −0.76, 95% CI: −1.04, −0.48) and concentrations of alanine aminotransferase (ALT) (SMD: −0.52, 95% CI: −0.90, −0.14), aspartate aminotransferase (AST) (SMD: −0.68, 95% CI: −1.21, −0.15), low-density lipoprotein cholesterol (SMD: −0.34, 95% CI: −0.66, −0.02), and triglycerides (TG) (SMD: −0.59, 95% CI: −1.16, −0.02). The concentrations of high-density lipoprotein cholesterol, total cholesterol (TC), fasting glucose, fasting insulin, and glycated hemoglobin were not significantly altered. Aerobic exercise alone significantly reduced IHL, ALT, and AST; resistance training alone significantly reduced TC and TG; a combination of both exercise types significantly reduced IHL. To conclude, exercise overall likely had a beneficial effect on alleviating NAFLD without significant weight loss. The study was registered at PROSPERO: CRD42020221168 and funded by the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement no. 813781.


Introduction
Non-alcoholic fatty liver disease (NAFLD) is the most common chronic liver disease, affecting around 25% of the world's population [1] and 23% of the European population [2]. It encompasses liver conditions ranging from benign steatosis to non-alcoholic steatohepatitis (NASH), including inflammation with or without fibrosis [3]. Moreover, NASH can progress to liver cirrhosis and liver cancer [4,5]. The occurrence of NAFLD is rising due [...]

Data Analyses

Data Preparation
To make all values of a given parameter comparable, data reported in different units were converted to a common unit using the appropriate conversion factors. HDL-C, LDL-C, and TC data were converted from milligrams per deciliter to millimoles per liter using the conversion factor 0.02586. For the conversion of TG data from milligrams per deciliter to millimoles per liter, a conversion factor of 0.01129 was used. A conversion factor of 0.0555 was used to convert fasting glucose values from milligrams per deciliter to millimoles per liter. Insulin data in picomoles per liter were converted to micro-international units per milliliter using a conversion factor of 6.945 (1 µIU/mL corresponding to 6.945 pmol/L). HbA1c data in millimoles per mole were converted to percentage values (National Glycohemoglobin Standardization Program) using an online HbA1c calculator (https://www.hba1cnet.com/hba1c-calculator/) (accessed on 11 December 2020). Thereafter, the changes in the mean and standard deviation from before to after the intervention were calculated for every parameter separately according to the Cochrane Handbook for Systematic Reviews of Interventions [29]. Briefly, for all paired analyses, the mean change was calculated as the difference between the post-intervention and baseline means. Missing standard deviations of the change were imputed using a correlation coefficient of 0.8, representing the correlation between the baseline and follow-up scores; the standard deviation of the change was then calculated according to the Cochrane Handbook for Systematic Reviews of Interventions [29]. The original authors of the included studies were contacted when the available data were insufficient for performing the statistical analyses.
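The data preparation steps described above can be sketched as follows. This is a minimal Python illustration, not the authors' code (the analyses were run in R); all function names are hypothetical. The conversion factors are those quoted in the text, the insulin conversion assumes that 6.945 is applied as a divisor, and the change standard deviation uses the Cochrane Handbook formula SD_change = sqrt(SD_baseline² + SD_final² − 2 · r · SD_baseline · SD_final) with r = 0.8.

```python
import math

# Conversion factors quoted in the text (mg/dL -> mmol/L).
MGDL_TO_MMOLL = {
    "cholesterol": 0.02586,    # HDL-C, LDL-C, and TC
    "triglycerides": 0.01129,  # TG
    "glucose": 0.0555,         # fasting glucose
}

# Assumption: 1 µIU/mL of insulin corresponds to 6.945 pmol/L,
# so pmol/L values are divided by this factor.
PMOLL_PER_UIUML = 6.945


def mgdl_to_mmoll(value, analyte):
    """Convert a mg/dL value to mmol/L for the given analyte class."""
    return value * MGDL_TO_MMOLL[analyte]


def insulin_pmoll_to_uiuml(value):
    """Convert insulin from pmol/L to µIU/mL."""
    return value / PMOLL_PER_UIUML


def change_mean(mean_baseline, mean_final):
    """Mean change from baseline (negative values indicate a reduction)."""
    return mean_final - mean_baseline


def change_sd(sd_baseline, sd_final, corr=0.8):
    """Impute the SD of the change from the baseline and final SDs,
    assuming a baseline/follow-up correlation `corr` (0.8 in the text)."""
    return math.sqrt(
        sd_baseline ** 2 + sd_final ** 2 - 2 * corr * sd_baseline * sd_final
    )
```

For instance, with equal baseline and follow-up standard deviations of 10 and r = 0.8, the imputed change standard deviation is sqrt(40), or about 6.32. A higher assumed correlation yields a smaller change standard deviation.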

Statistical Analysis
To investigate the anthropometry changes after the exercise intervention, an unpaired two-tailed t-test was used.

Meta-Analysis
All meta-analyses in this study were performed using R (version 4.0.3) and the packages "dmetar," "meta," "metafor," and "metaviz." Cumulative meta-analyses were used to explore the effects of exercise on NAFLD-related liver parameters (IHL, ALT, and AST), plasma lipid profile parameters (HDL-C, LDL-C, TC, and TG), and glucose metabolism parameters (fasting glucose, fasting insulin, and HbA1c). As all the outcomes were continuous, the standardized mean difference (SMD) was used as the measure of effect size and is presented with its 95% confidence interval (CI). The SMDs of these parameters, from the baseline to the endpoint between groups (exercise vs. controls) in all studies, were calculated and pooled using random-effects or fixed-effect models. The heterogeneity across studies was tested using the I² statistic. The meta-analysis was based on a random-effects model if there was moderate to high variation (I² > 25%); otherwise, a fixed-effect model was used.
The results for each of the parameters were represented as forest plots for overall exercise, along with subgroup analyses. The overall exercise included all the studies that measured the clinical parameter. The subgroup analyses were based on the exercise regimen (aerobic exercise/resistance training/combination of aerobic exercise and resistance training).
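The pooling rule described above can be illustrated with a short Python sketch. It is not the authors' R code. It assumes the SMD is computed as Hedges' g, that studies are pooled by inverse-variance weighting, and that the between-study variance τ² is estimated with the DerSimonian-Laird method; the source does not state which estimators were used. Function names are hypothetical.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """SMD (Hedges' g) between two groups and its sampling variance."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / pooled_sd
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


def pool(effects, variances, i2_threshold=25.0):
    """Inverse-variance pooling; random-effects if I² exceeds the threshold."""
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q and the I² statistic (in percent).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    model = "fixed"
    if i2 > i2_threshold:
        model = "random"
        w = [1 / (v + tau2) for v in variances]

    estimate = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    se = math.sqrt(1 / sum(w))
    return {
        "model": model,
        "estimate": estimate,
        "ci": (estimate - 1.96 * se, estimate + 1.96 * se),
        "i2": i2,
        "tau2": tau2,
    }
```

A subgroup analysis by exercise regimen amounts to calling `pool` separately on the studies of each regimen.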

Detection of Bias
Potential publication bias was assessed using funnel plots. The asymmetry of the funnel plots was assessed by using trim and fill imputation and Egger's regression. Furthermore, a visual inference testing procedure using the line-up protocol with funnel plots was also employed [30]. However, restricting the search to only English-language articles added a language bias to our study.
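As an illustration of Egger's regression, the sketch below uses the classical formulation: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is a Python sketch under that assumption and may differ from the variant implemented in the R packages used in the study; the function name is hypothetical. It returns the t-statistic rather than a p-value, which would additionally require the t-distribution with k − 2 degrees of freedom.

```python
import math


def egger_regression(effects, std_errors):
    """Classical Egger test: OLS of effect/SE on 1/SE.

    Returns (intercept, standard error of the intercept, t-statistic).
    """
    k = len(effects)
    if k < 3:
        raise ValueError("Egger's regression needs at least three studies")

    x = [1 / se for se in std_errors]                    # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect

    x_bar = sum(x) / k
    y_bar = sum(y) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance with k - 2 degrees of freedom.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in residuals) / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + x_bar ** 2 / sxx))

    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat
```

When every study reports the same effect regardless of its precision, the intercept is zero; when smaller studies (with larger standard errors) systematically report larger effects, the intercept moves away from zero.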

Outlier Detection
Potential statistical outliers were detected according to Viechtbauer and Cheung (2010) [31]. A study was considered as an outlier if the study's CI did not overlap with the CI of the pooled effect [31]. Briefly, outliers with extremely small effects were detected by identifying all studies for which the upper bound of the 95% CI was lower than the lower bound of the pooled effect CI. Studies in this meta-analysis with extremely large effects were detected by searching for all studies where the lower bound of the 95% CI was higher than the upper bound of the pooled effect CI.
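The non-overlap rule described above reduces to two interval comparisons per study. The following Python sketch is illustrative only; the function name and the study labels in the usage note are hypothetical.

```python
def find_outliers(study_cis, pooled_ci):
    """Flag studies whose 95% CI does not overlap the pooled-effect CI.

    study_cis: dict mapping a study label to its (lower, upper) CI bounds.
    pooled_ci: (lower, upper) CI bounds of the pooled effect.
    """
    pooled_lower, pooled_upper = pooled_ci
    flagged = {}
    for name, (lower, upper) in study_cis.items():
        if upper < pooled_lower:
            # The whole study CI lies below the pooled CI.
            flagged[name] = "extremely small"
        elif lower > pooled_upper:
            # The whole study CI lies above the pooled CI.
            flagged[name] = "extremely large"
    return flagged
```

With the pooled IHL interval reported in this study (−1.04, −0.48) and made-up study intervals, `find_outliers({"A": (-2.5, -1.2), "B": (-0.9, -0.1), "C": (-0.3, 0.6)}, (-1.04, -0.48))` flags "A" as extremely small and "C" as extremely large, while "B" overlaps the pooled interval and is not flagged.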

Influence Analysis
Studies in this meta-analysis exerting a high influence on the overall results were identified by conducting an influence analysis based on the leave-one-out method [31]. Briefly, the results of the meta-analysis were recalculated k times, each time leaving one of the k studies out. The results were represented as Baujat plots, forest plots (sorted by effect size and heterogeneity, respectively), and influence plots, measuring the standardized residual, DFFITS value, Cook's distance, covariance ratio, tau-squared, Q-value, hat value, and the weight of each individual study.
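The leave-one-out idea can be sketched in a few lines of Python. For brevity, this illustration pools with a fixed-effect inverse-variance model and reports only the pooled estimate and I² for each omission; it does not compute the other diagnostics listed above (DFFITS, Cook's distance, and so on). Function names are hypothetical and this is not the authors' R code.

```python
def pooled_fixed(effects, variances):
    """Fixed-effect inverse-variance pooled estimate and I² (in percent)."""
    w = [1 / v for v in variances]
    estimate = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - estimate) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return estimate, i2


def leave_one_out(names, effects, variances):
    """Re-pool the remaining studies after omitting each study in turn."""
    effects = list(effects)
    variances = list(variances)
    results = {}
    for i, name in enumerate(names):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        results[name] = pooled_fixed(rest_effects, rest_variances)
    return results
```

Sorting the returned dictionary by pooled estimate or by I² gives the ordering used in the leave-one-out forest plots; a study whose omission shifts the estimate or the heterogeneity markedly is a candidate influential case.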

Search Results
The electronic search (until 13 February 2020) yielded 1949 potentially relevant studies. After removing duplicates and screening article titles and abstracts, 73 studies were retained for full-text reading. After excluding 61 ineligible studies based on the exclusion/inclusion criteria, 12 RCTs remained for review. Two more studies were excluded because the authors did not respond to requests for the raw data needed for the meta-analysis [32,33]. Finally, 10 studies were included in the meta-analysis (Table 1). A flow diagram including all stages is presented in Figure 1. One of the 10 studies had two exercise intervention arms with different exercise regimens (aerobic exercise and resistance training) [34], which were counted as separate RCT exercise intervention arms. Each intervention arm was compared with the control arm and assigned to its subgroup (aerobic exercise alone: six RCTs; resistance training alone: three RCTs; a combination of aerobic exercise and resistance training: two RCTs). In total, 316 individuals with a mean age ranging from 39.7 to 62 years (from both exercise and control groups) were included in the analysis. This included 136 and 192 participants in the exercise and control groups, respectively (Table 2). None of the included studies had subjects with ischemic heart disease, whereas two of the included studies recruited participants with type 2 diabetes (T2D) that was treated with metformin [19,35]. However, no information was available about the number of participants with T2D. Details about other concomitant disorders were not available. Some studies had food records, but no diet results were reported [36-40]. All studies stated no diet changes. In addition, the searches for gut microbiota and metabolites and for relevant NAFLD gene variations did not yield any studies for this review. Anthropometry changes after the exercise intervention were analyzed using an unpaired two-tailed t-test.
Of all the anthropometry changes reported in the studies, the body weight, whole-body fat mass, body fat percentage, lean body mass, BMI, and visceral adipose tissue did not change significantly after the different exercise interventions (p > 0.05). Only one study [38] showed that the aerobic exercise intervention significantly reduced the waist circumference (p = 0.01), while in the other studies, it did not change significantly (p > 0.05). There was variability in the number of participants, exercise duration, and settings provided in the interventions. The median (minimum-maximum) sample size was 12 and 11.5 (6-31) for the intervention and control groups, respectively. The studies included in the analysis enrolled between 6 and 33 participants in each arm. Collectively, the exercise intervention length ranged from 8 to 16 weeks, with a median duration of 12 weeks. The average exercise session lasted between 30 and 60 min. The frequency of exercise ranged from 2-3 times per week in eight studies to 4-5 times per week in the other two studies. The major forms of aerobic exercise were walking, cycling, and using a treadmill. For resistance training, full-body workouts were performed. The main study characteristics are shown in Table 2.
The results of the quality assessment for ALT are displayed in Figure 2. Overall, none of the studies were of high risk. Two studies showed some concerns for risk of bias in their randomization process as they did not provide details on the use of a random allocation sequence [19,34]. Due to a high dropout rate after randomization, four studies showed some concerns in their deviation from intended interventions, as well as missing outcome data [36-38,42]. In these studies, no sham exercise was adopted in the control groups and thus, participants, carers, and intervention administrators were not blinded and were aware of the participants' assigned intervention. Despite these concerns for risk of bias, all studies were included in the subsequent meta-analysis.

Meta-Analysis
The meta-analysis was performed for liver parameters (IHL, ALT, AST), plasma lipid profile parameters (HDL-C, LDL-C, TC, TG), and glucose metabolism parameters (fasting glucose, fasting insulin, HbA1c). However, not all the included studies reported all of these clinical parameters; their numbers are shown in Table 1.

Intrahepatic Lipid (IHL) Content
All the studies that measured IHL using magnetic resonance spectroscopy (MRS) (n = 8) were included in the meta-analysis. Two other studies were not considered as they used ultrasound (US), which is not comparable with the results of MRS [34,40]. The results showed that IHL was significantly reduced upon exercise intervention compared with the controls (SMD: −0.76, 95% CI: −1.04, −0.48) (Figure 3a). Heterogeneity in the effect of overall exercise on IHL was not detected (I² = 0%, τ² = 0, p = 0.53). Subgroup analyses of aerobic exercise only (SMD: −0.80, 95% CI: −1.14, −0.46) and a combination of aerobic exercise and resistance training (SMD: −0.80, 95% CI: −1.38, −0.22) showed a significant reduction in IHL compared with the controls. Heterogeneity was non-significant for aerobic exercise (I² = 21%, τ² = 0.0416, p = 0.28) and the exercise combination (I² = 0%, τ² = 0, p = 0.60). A subgroup analysis of resistance training was not possible as only one study was included.

High-Density Lipoprotein Cholesterol (HDL-C)
HDL-C was slightly increased after the exercise intervention compared with the controls (SMD: 0.13, 95% CI: −0.17, 0.43) (Figure 4a). However, this result was not statistically significant. Heterogeneity in the effect of overall exercise on HDL-C was very low and non-significant (I² = 13%, τ² = 0.0194, p = 0.33). The results of the subgroup analysis of aerobic exercise alone (SMD: 0.09, 95% CI: −0.55, 0.73) showed no significant change in HDL-C. The results from the heterogeneity analysis of the aerobic exercise subgroup were also non-significant (I² = 47%, τ² = 0.1523, p = 0.15). No analysis was performed for the resistance training subgroup or the combination of aerobic exercise and resistance training because there was only one study per group.

Fasting Glucose
Fasting glucose did not change after the exercise intervention compared with the controls (

Detection of Bias: Funnel Plots
In meta-analyses, funnel plots are often used to assess potential publication bias. Visual inspection of the funnel plots can help to improve the objectivity and validity of funnel-plot-based conclusions by guarding the meta-analyst from interpreting patterns in the funnel plot that are plausible by chance [30]. In this meta-analysis, visual examination of the funnel plots showed no studies for HDL-C, LDL-C, and HbA1c; one study for IHL, TC, fasting glucose, and fasting insulin; two studies for ALT and TG; and three studies for AST that were indicated as having a potential publication bias (Figure 6a and Figures S1-S9a). However, visual inspections can lead to the drawing of incorrect conclusions [43].
Statistical tests, such as trim and fill and Egger's regression, were suggested for assessing potential publication bias by establishing objectivity while, at the same time, controlling for type I errors [30,44,45]. In this study, based on Egger's test, publication bias was not detected for the results on IHL (p = 0. [...] However, these statistical tests quantify funnel plot asymmetry via the association of study effects with their standard errors [30]. Moreover, the power of the tests is lower when fewer than 10 studies are included [29]. The Cochrane Handbook for Systematic Reviews of Interventions [29] recommends not interpreting the statistical tests in isolation, but along with visual inspection of the funnel plot. Hence, to control for type I errors and to preserve the explorative nature of the visual inspection, funnel plots were simulated alongside that of the observed data [30,46]. A line-up of 20 funnel plots was created, of which 19 showed data simulated under the null hypothesis and 1 showed the observed data, positioned randomly. In this study, the presence of heterogeneity was subject to visual inference. Hence, the fixed-effect model was used as the null. Moreover, to increase the power of the line-up procedure to detect small study effects, Egger's regression line was drawn in each funnel plot, along with the trim-and-fill-imputed studies that were potentially missing due to publication bias. After simulating the funnel plots, the position of the real data funnel plot was identified via visual inspection and confirmed by decrypting it using R code. Based on the selected plot, it was observed that the included studies for HDL-C, LDL-C, and HbA1c did not indicate any potential publication bias, ALT and AST presented two studies, while IHL, TC, TG, fasting glucose, and fasting insulin presented one study each that indicated a potential bias (Figure 6 and Figures S3b-S11b).

Outlier Detection
Statistical outliers were detected by identifying studies with extreme effect sizes [31]. No statistical outliers were detected under either the fixed-effect or the random-effects model for IHL, HDL-C, LDL-C, and HbA1c. One study (Shojaee-Moradie et al. 2016 [39]) was identified as a statistical outlier under both the fixed-effect and random-effects models for AST (Figure S12), TC (Figure S13), TG (Figure S14), and fasting insulin (Figure S15). For the fixed-effect model for ALT, the study by Sullivan et al. (2012) [42] was identified as an outlier (Figure S16). Furthermore, the work by Cheng et al. (2017) [36] was identified as a statistical outlier for the fixed-effect models for AST (Figure S12) and fasting glucose (Figure S17).

Influence Analysis
A study detected as a statistical outlier may not be of much consequence if it exerts little influence on the results. However, some studies in a meta-analysis could exert a high influence on the overall results. For example, even though the overall effect is non-significant, it could be the case that a highly significant effect is found after removing one study or a particular set of studies. Hence, influence analysis was performed on all parameters considered in this study to detect the studies that most influence the overall estimates of the meta-analysis [31].
The influence analysis of IHL identified the study by Cheng et al. [36] as influential. This study, as seen from the Baujat plot, contributed considerably to the overall heterogeneity, as well as being very influential (Figure 7a). The extreme values in the diagnostic tests for influence measures identified this study as an influential case (Figure 7b). No conclusions of influential studies could be drawn based on the heterogeneity, as all I² values remained zero when each of the studies was omitted (Figure 7d). However, the lowest overall effect favoring the exercise group was reached by omitting Cheng et al. [36], which again corroborated the finding that this study could be an influential case (Figure 7c). The study by Cuthbertson et al. [37] also moderately contributed to the overall heterogeneity and influence, and omitting it yielded the highest overall effect favoring the exercise group (Figure 7a,c). Nevertheless, this study did not show any extreme values in the diagnostic tests for influence measures (Figure 7b).
The influence analysis identified the study by Zelberg-Sagi et al. [40] as an influential case for ALT (Figure S16b). This was shown in the Baujat plot (Figure S18a). This plot also showed that the study conducted by Sullivan et al. [42] contributed highly to the overall heterogeneity while having a medium influence on the pooled results (Figure S18a). In addition, the studies from Cuthbertson et al. [37] and Shojaee-Moradie et al. [39] showed moderate influences and moderate contributions to the overall heterogeneity (Figure S18a). The highest effect favoring the exercise group could be reached after omitting the Zelberg-Sagi et al. study [40] (Figure S18c). Omitting the Sullivan et al. study [42] resulted in the least heterogeneity (Figure S18d). However, this study did not have any extreme values in the diagnostic tests for influence measures (Figure S18b).
Out of nine interventions that reported AST levels, four were detected as influential cases: Cheng et al., Cuthbertson et al., Shojaee-Moradie et al., and Zelberg-Sagi et al. [36,37,39,40]. As shown in the Baujat plot, the studies by Cheng et al. [36] and Zelberg-Sagi et al. [40] highly contributed to influencing the pooled results (Figure S19a). The study by Shojaee-Moradie et al. [39] moderately contributed to influencing the pooled results; however, it highly contributed to the heterogeneity (Figure S19a). The study by Cuthbertson et al. [37] had a moderate influence on the overall results. These four studies had extreme values in the diagnostic tests for influence measures, and therefore, they were identified as influential cases (Figure S19b). Furthermore, omitting the study by Zelberg-Sagi et al. [40] resulted in the highest effect favoring the exercise group (Figure S19c) and omitting the study by Shojaee-Moradie et al. [38] resulted in the least heterogeneity (Figure S19d).
The Baujat plot based on the influence analysis of HDL-C showed that Sullivan et al. [43] contributed both to the overall influence and heterogeneity (Figure S20a). However, no study had extreme values in the diagnostic tests for influence measures (Figure S20b). Omitting this study resulted in the highest effect favoring the control and the least heterogeneity (Figure S20d).
For LDL-C, the Baujat plot based on the influence analysis showed that the study by Cuthbertson et al. [37] had a high influence and a high contribution to overall heterogeneity (Figure S21a). This study also had extreme values in the diagnostic tests for influence measures (Figure S21b). Omitting this study resulted in the highest effect favoring the exercise group and the second least heterogeneity (Figure S21c,d).
The Baujat plots for TC identified the study by Shojaee-Moradie et al. [39] as both influential and having contributed to the overall heterogeneity (Figure S22a). In addition, Cuthbertson et al. [37] had extreme values in the diagnostic tests for influence measures (Figure S22b). Omitting the Cuthbertson et al. study [37] resulted in the highest effect favoring exercise (Figure S22c) and omitting Shojaee-Moradie et al. [39] resulted in the lowest effect favoring exercise (Figure S22d). Furthermore, omitting Shojaee-Moradie et al. [39] made the overall results non-significant.
The Baujat plot results of TG showed that two studies, Cheng et al. and Shojaee-Moradie et al., had a high influence (Figure S23a) [36,39]. In addition to being highly influential, the Shojaee-Moradie et al. [39] study also contributed considerably to the overall heterogeneity (Figure S23a). These two studies also had extreme values in the diagnostic tests for influence measures (Figure S23b). Omitting these two studies resulted in the least heterogeneity (Figure S23d). Furthermore, upon calculating the pooled effects after omitting the study by Cheng et al., the highest effect was obtained (Figure S23c).
For fasting glucose, three studies [36,39,40] were detected as influential, as shown in the Baujat plot (Figure S24a). Cheng et al. [36] had the highest influence on the pooled effects and contributed to the highest overall heterogeneity. The study by Shojaee-Moradie et al. [39] had a moderate influence on the pooled effects and a moderate contribution to the overall heterogeneity. Owing to its large sample size, the study by Zelberg-Sagi et al. [40] was also identified as having an influence on the effect size. All three studies had extreme values in the diagnostic tests for influence measures (Figure S24b). Omitting the Cheng et al. [36] study resulted in the least effect favoring the control group. Omitting the studies by Zelberg-Sagi et al. [40] and Shojaee-Moradie et al. [39] resulted in the highest effect favoring the exercise group (Figure S24c). Omitting these three studies resulted in the least heterogeneity (Figure S24d).
Influence analysis of the fasting insulin parameters found that Shojaee-Moradie et al. [39] and Zelberg-Sagi et al. [40] were influential cases (Figure S25a,b). Based on the Baujat plot, Shojaee-Moradie et al. [39] had a high influence and heterogeneity, and Zelberg-Sagi et al. [40] had a considerable influence on the pooled results (Figure S25a). Omitting the Zelberg-Sagi et al. [40] study resulted in the highest effect favoring the exercise group and the second least heterogeneity (Figure S25c,d). Omitting the study by Shojaee-Moradie et al. [39] resulted in the least effect favoring exercise (Figure S25d). Furthermore, omitting this study made the overall results non-significant.
The Baujat plot based on the influence analysis for HbA1c showed that two studies, Cheng et al. and Zelberg-Sagi et al. [36,40], were influential, with the highest influence on the pooled effects (Figure S26a). However, these studies did not show any extreme values in the diagnostic tests for influence measures (Figure S26b). Omitting Cheng et al. [36] resulted in the highest effect favoring the exercise group, while omitting Zelberg-Sagi et al. [40] resulted in the least effect favoring the exercise group (Figure S26c). When each study was omitted, the I² value remained zero; therefore, no conclusions regarding influential studies could be drawn based on the heterogeneity (Figure S26d).

Discussion
This systematic review (SR) and meta-analysis analyzed the effects of different exercise regimens (without diet interventions) on the liver, plasma lipid profile, and glucose metabolism parameters in individuals with NAFLD. To the best of our knowledge, our study is the first SR and meta-analysis to investigate the effects of exercise alone on NAFLD-related clinical parameters, independent of weight loss.
Our findings showed that IHL was significantly reduced after the exercise interventions. Reduced levels of IHL are known to improve systemic inflammation and liver metabolic dysfunctions [47]. In addition, lower levels of IHL during the early stages of NAFLD could eventually prevent disease progression to NASH.
The subgroup analyses also showed IHL reductions for both aerobic exercise alone and the combination of exercises. In addition to our study, many other SRs, which also included non-supervised studies, shorter resistance training sessions, longer aerobic interventions, and higher intensity aerobic interventions, reported similar IHL reductions [16,48-51].
For example, a reduction in IHL after aerobic and resistance training was reported by Baker et al. and Wang et al., respectively [49,50]. However, the conclusions of Baker et al. and Wang et al. were based on only one study that was included in the subgroup analysis. Moreover, unlike our study, which focused on the effects of exercise alone, Wang et al. reported a reduction in IHL in subjects undergoing a combined diet and exercise intervention.
The actual mechanism(s) behind such an improvement remain uncertain. Multifactorial metabolic and molecular pathways have been proposed for the pathogenesis of NAFLD [52,53]. One of the mechanisms involves insulin resistance (IR), which is one of the hallmarks of NAFLD. Systemic IR impairs the suppression of adipose tissue lipolysis [54]. This leads to elevated free fatty acid (FFA) levels in the serum of NAFLD patients, and these FFAs reach the liver [55,56]. In addition, IR in skeletal muscle promotes the conversion of glucose to FFA via de novo lipogenesis in the liver [54]. In the liver, these FFAs can be synthesized into triglycerides and stored as lipid droplets, leading to steatosis [57], excreted in very low-density lipoproteins (VLDL) [58], or oxidized in hepatic mitochondria (β-oxidation) [59]. NAFL and NASH patients have been shown to have increased fatty acid oxidation [60]. This increase in β-oxidation may lead to mitochondrial dysfunction; reactive oxygen species (ROS) are then formed and induce lipid oxidation, contributing to oxidative stress. Moreover, the resulting toxic metabolites can further contribute to mitochondrial damage [54]. Exercise intervenes in this process by improving insulin resistance in adipose and muscle tissue [61], resulting in decreased circulating FFA [62]. In addition, exercise downregulates the expression of several genes and proteins that are involved in lipogenesis, followed by reduced levels of FFA [54,63]. However, these findings have been documented more extensively in rodents than in humans [54].
Previous studies have shown that the duration of exercise intervention also affects the liver's fat metabolism. For instance, a significant decrease in IHL (compared with the controls) was observed in two studies that conducted supervised aerobic exercise interventions for 8 and 12 weeks, respectively [64,65]. In contrast, no change in IHL was reported by Shojaee-Moradie et al., who conducted an exercise intervention for six weeks [66], although this intervention was limited to only three 20-min sessions per week. An eight-week resistance training intervention was also successful in reducing IHL [19]. Moreover, Bacchi et al. compared the effects of aerobic exercise or resistance training on IHL for four months and observed a 25-30% reduction in IHL from the baseline [67]. These results are consistent with the observations in this study, where it was found that both 12- and 16-week exercise interventions significantly reduced IHL (Figure S27). However, no conclusions could be drawn from the 8-week intervention as a meta-analysis could not be performed.
Elevated concentrations of liver enzymes, such as ALT and AST, often reflect non-specific liver damage or inflammation of liver cells [68]. This damage is more pronounced in NAFLD patients, making these enzymes appropriate markers for inflammation [68]. Other reasons for elevated liver enzymes include the use of certain medications and the consumption of a high-fat, high-carbohydrate diet [69,70]; such elevations can normalize without the influence of exercise [69,70]. In this study, the exercise intervention reduced ALT and AST slightly but significantly, which was consistent with the meta-analyses of Wang et al. and Katsagoni et al. [10,50]. In our study, aerobic exercise, but neither resistance training nor the combination of exercises, reduced the concentrations of ALT and AST. Contradictory results were obtained in the meta-analysis by Keating et al., which considered shorter resistance training sessions [16]. They found that IHL, but not ALT, was reduced after overall exercise. The limited inclusion of studies on aerobic exercise by Keating et al. might have contributed to the contradictory results.
The main cause of death in individuals with NAFLD is cardiovascular disease (CVD) [71]. One reason for developing CVD is atherogenic dyslipidemia, which is characterized by a low HDL-C concentration and high LDL-C, TC, and TG concentrations [71]. Our analysis of five studies showed that HDL-C did not change after exercise, which is consistent with another SR. LDL-C was significantly decreased after overall exercise, but no change was observed for aerobic exercise alone. In contrast, Wang et al. (2020) reported a decrease after both overall exercise and aerobic exercise [50]. Furthermore, one paper documented that high-intensity training (HIT) exercises have the greatest positive influence on LDL-C [72]. The same paper found that resistance training with increased repetitions/sets was better at reducing LDL-C levels than resistance training with high weights and a low number of repetitions [72]. These results suggest that LDL-C reduction might depend on the exercise regimen. However, only four papers were included for LDL-C, which is too few to confirm any of the outcomes mentioned above. Thus, more exercise studies in NAFLD measuring LDL-C and HDL-C concentrations are warranted.
NAFLD patients have a disturbed cholesterol metabolism that is characterized by elevated cholesterol concentrations, increased cholesterol synthesis, diminished cholesterol absorption, and changed expression of cholesterol metabolism genes [73-75]. In this study, TC concentrations did not change with overall exercise but were significantly changed in the resistance training subgroup. A similar non-significant change in TC concentration was reported by several other studies, irrespective of the exercise regimen used [62,67,76]. In contrast, an exercise and diet intervention in NAFLD patients showed a >10% reduction of TC after exercise [77]. Interestingly, most of the measured cholesterol concentrations in these studies were already in the normal range. This could explain why the aerobic exercise subgroup showed no changes in TC concentrations.
Significantly reduced TG concentrations were observed for the overall exercise effect and for the resistance training subgroup. The significant reductions in TG concentrations in the resistance training subgroup could be partly explained by exercise-induced production of myokines, such as irisin, which was negatively correlated with TG, TC, and intrahepatic triglycerides [78,79]. Irisin increases total energy expenditure and modulates lipid metabolism by inhibiting sterol regulatory element-binding protein-1c and fatty acid synthase in hepatocytes [80]. Irisin is decreased in NAFLD patients [78] but can increase in obese people after resistance training, though not after aerobic training [81]. This mechanism could plausibly explain the decrease in TG in the resistance training group. However, as irisin was not measured in any of the included studies, further validation was not possible.
In this study, exercise did not have an overall effect on glucose metabolism. The fasting glucose and insulin concentrations did not change after overall exercise or in any subgroup, which is in line with another meta-analysis [50]. Interestingly, Cheng et al., who recruited prediabetic NAFLD patients, reported significantly reduced fasting glucose levels in the exercise group compared with the control group [36]. Cuthbertson et al. and Shojaee-Moradie et al. reported reduced glucose concentrations in both the exercise and control groups [37,39]. In longer interventions, this reduction was no longer significant. This could be explained by a slight increase in physical activity (such as increased walking) in the control group receiving standard lifestyle advice.
Another glucose metabolism parameter, namely, HbA1c, which represents 2-3-month average glucose concentrations, did not change after overall exercise and subgroup exercise interventions. However, only five studies were included in this analysis. Nevertheless, our results were consistent with the previous findings. For example, Cheng et al. found a significant decrease in HbA1c in only the diet plus exercise intervention group but not the exercise-only group [36]. Houghton et al. found no significant decrease after exercise [41]. Additionally, a meta-analysis conducted by Chen et al. [82] also found no significant effect of resistance training on HbA1c. On the other hand, studies that were conducted in T2D subjects showed a significant HbA1c level reduction. One study by Church et al. reported that only a combination of resistance training and aerobic exercise leads to significantly decreased HbA1c levels [83], whereas another study by Yavari et al. found a significant reduction in the aerobic exercise group [84].
Lifestyle interventions aiming for weight loss are currently the focus of the clinical management of NAFLD [13,85], where 5-10% weight loss is recommended for individuals with NAFLD and NASH [86]. In addition, several guidelines for NAFLD treatment, such as those of EASL, the American Association for the Study of Liver Diseases, and the European Society for Clinical Nutrition and Metabolism, recommend weight loss to improve NAFLD-related clinical parameters [9,87,88]. In this study, body weight loss was not significant after the exercise intervention. Additionally, changes in several other anthropometric measures, such as whole-body fat mass, body fat percentage, lean body mass, BMI, and visceral adipose tissue, were non-significant after the exercise intervention for all the included studies. Only one study [38] showed a significant reduction in waist circumference. The beneficial effects of exercise, despite an absence of weight loss and anthropometric changes, could be attributed to several factors, including an increase in muscle strength, a reduction in inflammation and oxidative stress, and changes in organokine concentrations [89].
Recent studies focusing on various risk factors of metabolic diseases have proposed that exercise, in addition to directly influencing metabolic responses, also contributes to changes in the gut microbiota and the composition of the metabolites they produce [90-92]. However, the relationship with NAFLD in humans has not yet been established. Therefore, one of the goals of our meta-analysis was to systematically search the literature for RCTs investigating the effects of exercise on NAFLD and on changes in the gut microbiota. To the best of our knowledge, no such RCTs have been reported, and this research gap demands attention. Furthermore, our study indicated that the impact of exercise on alleviating NAFLD-related risk and symptoms has so far not been addressed with comprehensive metabolite profiling techniques, which have the potential to provide in-depth information on endogenous metabolic differences and the pathways responsible for modulating those factors. To understand the metabolic events and consequences behind NAFLD etiology, and how exercise and other lifestyle factors may affect them, metabolomics techniques will be a useful complement to the traditional clinical parameters. Addressing this research gap, investigations of the relationship between exercise, NAFLD, the gut microbiota, and the metabolites they produce are currently ongoing (ClinicalTrials.gov Identifier: NCT03995056). However, more research is needed in this field.
Our study has several strengths. The use of stringent criteria for selecting studies and data for the meta-analysis ensured more consistent and unbiased results. Only RCTs were included in our study, as they are the gold standard for assessing intervention effects, and randomization reduces selection bias [93], making them desirable for pooling results. Furthermore, all exercise regimens in the RCTs included in our study were performed under supervision. Studies where exercise compliance was only retrospectively reported by participants themselves were not considered. Another strength is the use of multiple methods to detect potential publication bias, statistical outliers, and influential cases, making our study less likely to be subjected to the effects of biases. Regarding data included in the meta-analysis, changes in IHL were only included when MRS was used as the quantification method. This study focused specifically on the effects of exercise (without diet intervention) on NAFLD-related clinical parameters. However, the effects of diet interventions or diet-exercise combinations should not be completely neglected.
Some limitations were also identified in our study. The limited number of studies included made it difficult to conduct meta-analyses on the subgroups. Out of the 10 included studies, not every study measured all the different NAFLD-related parameters considered in this study, and in some cases, some parameters were only available in two studies. Moreover, only two of the included studies conducted a combination of aerobic and resistance training. Therefore, a conclusion based on the meta-analysis in these subgroups with few studies might not reflect the actual effect. Furthermore, the intervention durations were only between 8 and 16 weeks, and thus, the long-term effects of exercise on NAFLD-related clinical parameters could not be determined. Only studies in the adult population were included; therefore, the results cannot be generalized to children or adolescents. Regarding statistical analyses, imputations were made for several studies for calculating the effect size to provide overall effect estimates, which involved making assumptions and may not reflect the actual data. Moreover, only studies in the English language were included. Furthermore, the heterogeneity for most of the outcomes, including ALT, AST, TC, TG, fasting glucose, and fasting insulin, was high [94]. This could have possibly arisen due to clinical differences, methodological issues from the original study, including randomization, use of absolute rather than relative measures of risk, and publication bias [95]. Furthermore, the different subgroup analyses that had a different true effect might also have contributed toward the heterogeneity.

Conclusions
Based on our meta-analysis, we concluded that exercise overall likely had a beneficial effect on alleviating NAFLD without significant weight loss. After overall exercise, IHL, ALT, AST, LDL-C, and TG were reduced. Aerobic exercise seemed to alleviate NAFLD-related liver parameters (IHL, ALT, and AST), while resistance training was more effective at reducing TG and TC. The combination of aerobic and resistance training reduced only IHL. These results could allow for the establishment of exercise (or lifestyle modification) guidelines, where weight loss is not the focus but could be a consequence of the lifestyle modification in some patients. However, it must be noted that our conclusions are based on only a small number of studies with small sample sizes. Therefore, more studies are warranted for a comprehensive and meaningful comparison of different exercise regimens, the long-term effects of exercise, and the contributions of the gut microbiota and metabolites.

Data Availability Statement: Access to the data, R codes, and/or material can be requested by contacting the responsible authors.