Evaluating the Impact of High Intensity Interval Training on Axial Psoriatic Arthritis Based on MR Images

High intensity interval training (HIIT) has been shown to benefit patients with psoriatic arthritis (PsA). However, magnetic resonance (MR) imaging has uncovered bone marrow edema (BME) in healthy volunteers after vigorous exercise. The purpose of this study was to investigate MR images of the spine of PsA patients for changes in BME after HIIT. PsA patients went through 11 weeks of HIIT (N = 19, 4 men, median age 52 years) or no change in physical exercise habits (N = 20, 8 men, median age 45 years). We acquired scores for joint affection and pain and short tau inversion recovery (STIR) and T1-weighted MR images of the spine at baseline and after 11 weeks. MR images were evaluated for BME by a trained radiologist, by SpondyloArthritis Research Consortium of Canada (SPARCC) scoring, and by extraction of textural features. No significant changes of BME were detected in MR images of the spine after HIIT. This was consistent for MR image evaluation by a radiologist, by SPARCC, and by texture analysis. Values of textural features were significantly different in BME compared to healthy bone marrow. In conclusion, BME in spine was not changed after HIIT, supporting that HIIT is safe for PsA patients.


Introduction
Psoriatic arthritis (PsA) is a chronic inflammatory joint disease associated with skin psoriasis that can manifest in the axial skeleton and peripheral joints, and can include dactylitis and enthesitis [1]. It involves systemic inflammation and is associated with several comorbidities and increased cardiovascular mortality and morbidity [2,3]. The prevalence of PsA ranges from 20 to 670 per 100,000 population [4,5], and 30-50% of PsA patients will develop axial PsA involving the spine or the sacroiliac joints [6].
Physical exercise has beneficial effects on inflammation, joint damage, and symptoms [7] and is recommended as a supplementary treatment for patients with arthritis [8]. In addition to positive effects on the joint disease, exercise can reduce the risk of cardiovascular disease (CVD) [9]. High intensity interval training (HIIT) is effective in decreasing CVD risk factors and inflammation [10]. HIIT involves alternating short periods of high intensity exercise with recovery periods or light exercise [11]. Exercise exerts a molecular systemic anti-inflammatory effect related to the intensity and duration of the physical activity [12]. In [13], patients with axial spondyloarthritis experienced reduced disease activity scores and inflammation and improved cardiovascular health after three months of HIIT. Short and long-term beneficial effects of HIIT on disease activity, patient disease perception, and the risk of CVD were recently reported in PsA patients [14,15]. Exercise reduced fatigue and improved cardiovascular risk factors in terms of truncal fat and maximal oxygen uptake, whereas scores for joint affection and pain were comparable to those of the control group. This finding is important, as there was no detectable negative impact on the disease burden, and HIIT can be recommended for PsA patients. It is, however, possible that vigorous exercise can increase the burden of the joint disease. Bone marrow edema (BME) has been found in relation to physically demanding work and sports activities [16][17][18][19][20]. Mechanical strain can drive inflammatory activity in joints and has been suggested to play a role in the induction and further development of spondyloarthritis [21,22].
Magnetic resonance (MR) imaging can portray inflammation in the structures involved [23]. MR-based disease activity scores are reliable and sensitive to change, and they may provide information about disease not given by clinical evaluations [24]. Short tau inversion recovery (STIR) is the recommended MR imaging sequence for axial PsA [25,26], presenting edematous lesions with hyperintense signals [27]. Radiological assessment of edematous lesions is based on hyperintensity in STIR MR images, located in two or more sites and/or two or more slices [28]. Lesions can be confirmed as hypointense signals in T1-weighted MR images. A semi-quantitative scoring system of disease activity, such as the Spondyloarthritis Research Consortium of Canada (SPARCC) scoring system, can be highly sensitive to changes in the spine [26]. Additional information from MR images can be quantified by extracting spatial variations of grey-level intensity in an image [29]. Such a texture analysis may be more sensitive to change than visual inspection and has been applied in studies of BME [30][31][32][33][34][35][36]. It has demonstrated the potential to classify BME from MR images [32][33][34], and to detect changes in bone marrow after physical activity [36].
We hypothesized that BME changes could occur in PsA patients after HIIT in spite of no reported changes in disease activity by clinical examinations. The aim of this study was to assess whether HIIT in PsA patients led to detectable changes in the axial skeleton by investigating MR images of the spine for BME. Additionally, we explored the potential of textural features to detect BME changes.

Patient Cohort
The presented study is part of a randomized controlled trial with HIIT as an intervention, conducted at St. Olavs hospital and NTNU-the Norwegian University of Science and Technology, Trondheim, Norway, from 2013 to 2015 [14,15]. Participants fulfilled the ClASsification for Psoriatic ARthritis (CASPAR) criteria [37] and were between 18 and 65 years of age. Medication histories of participants were collected as described previously [14]. The intervention group (N = 19, 4 men, median age 52 years) performed HIIT three times per week for 11 weeks, whereas the control group (N = 20, 8 men, median age 45 years) made no changes to their pre-study physical exercise habits. In brief, the intervention comprised two weekly supervised stationary bicycling sessions and one weekly self-guided HIIT session. All patients signed informed consent, and the Norwegian Regional Committee for Medical and Health Research Ethics approved the study (Trial registration: NCT02995460).

Disease Activity Scores
Scores for joint affection and pain were assessed at baseline and after 11 weeks, as previously described [14], and included patient global assessment (PGA), high sensitivity C-reactive protein (hs-CRP), Bath Ankylosing Spondylitis Disease Activity Index (BASDAI), and Disease Activity Score in 44 joints (DAS44). There were no significant differences in baseline scores for joint affection and pain between the intervention and control groups (p > 0.05, Wilcoxon signed-rank test).

MR Image Acquisition
MR imaging of the spine was performed in two stations using a STIR and a T1-weighted turbo spin-echo sequence based on a standardized protocol [26] as previously described [38]. For two patients, both in the HIIT group, the second MR imaging was not performed, and both patients were excluded from evaluation by a radiologist and SPARCC scoring, leaving MR images from 37 patients for further analyses. For 10 of the participants, the MR imaging protocol deviated with respect to the spatial resolution in the first (N = 1) or second MR imaging (N = 9), and these images were excluded from analysis by textural features.

• Radiological evaluation
The MR images were evaluated by a radiologist for BME at both time-points. The radiologist was blinded with respect to intervention. BME was identified by hyperintense signals in STIR MR images, supported by hypointense signals in T1-weighted MR images. To be considered positive for BME, the hyperintense signal had to be located in two or more sites and/or two or more slices [28,39]. Images were also assessed with respect to changes from the first to second MR imaging, categorized as stable, increased, or reduced BME.

• SPARCC scoring
The STIR MR images at both time-points were scored by a trained rheumatologist as previously described [26]. In brief, the six most abnormal disco-vertebral levels on the STIR sequence were selected. Three consecutive sagittal slices, which represented the most abnormal slices for each level, were chosen for scoring at that level. The total maximum SPARCC score was 108 for all six levels of the spine. The SPARCC scores were also categorized with respect to changes from the first to second MR imaging, namely stable, increased, or reduced SPARCC score. Categorization to increased or reduced SPARCC score required a minimum score change of five according to the defined minimally important change [40].
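The categorization rule described above can be illustrated with a minimal sketch. It assumes one total SPARCC score per patient and time-point; the function name and the example scores are hypothetical and are not study data.

```python
MIN_IMPORTANT_CHANGE = 5   # minimally important change in SPARCC score [40]
MAX_SPARCC_SPINE = 108     # maximum total score for six levels, as stated in the text


def categorize_sparcc_change(baseline, follow_up, threshold=MIN_IMPORTANT_CHANGE):
    """Return 'increased', 'reduced', or 'stable' for one patient."""
    for score in (baseline, follow_up):
        if not 0 <= score <= MAX_SPARCC_SPINE:
            raise ValueError(f"SPARCC score out of range: {score}")
    change = follow_up - baseline
    if change >= threshold:
        return "increased"
    if change <= -threshold:
        return "reduced"
    return "stable"


# Hypothetical (baseline, 11-week) score pairs, for illustration only
examples = [(4, 4), (4, 10), (12, 6), (3, 7)]
categories = [categorize_sparcc_change(b, f) for b, f in examples]
```

A change of four points, as in the last pair, stays in the stable category because it is below the minimally important change.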

• Image pre-processing and textural feature extraction
Vertebral bone marrow, excluding vascular and neural structures, was manually segmented using 3D Slicer (MIT Artificial Intelligence Lab, USA) in MR images from all patients (N = 37). Images were available from both time-points for 27 patients, from only the first time-point for 9 patients, and from only the last time-point for 1 patient. Images of the spinal column were preprocessed using a customized intensity adjustment procedure based on the nonparametric nonuniform intensity normalization (N4) bias field correction algorithm [41]. Pixel intensity values were normalized by matching the histogram extracted from the spinal column to the histogram of a randomly selected atlas image. Image noise was reduced using an in-house implementation of an anisotropic (Perona-Malik) diffusion smoothing filter (iterations = 15, integration constant = 1/7, time step = 0.01, conductance = 1.0) [42]. The BME in all image sets was manually segmented using 3D Slicer. The segments were verified by a trained radiologist.
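The in-house diffusion filter is not available, so the following is only a generic sketch of Perona-Malik smoothing in Python with NumPy. It assumes a 2D image, four-neighbour differences, an exponential conduction function, and replicated borders. Using the integration constant 1/7 as the update step and the conductance 1.0 as the edge-stopping parameter is an assumption about how the reported parameters were applied; the reported time step of 0.01 is not used here.

```python
import numpy as np


def perona_malik(image, iterations=15, conductance=1.0, step=1 / 7):
    """Generic Perona-Malik diffusion on a 2D array (illustrative sketch)."""
    img = np.asarray(image, dtype=float).copy()
    for _ in range(iterations):
        # Replicate the border so that no intensity flows across the image edge
        padded = np.pad(img, 1, mode="edge")
        differences = (
            padded[:-2, 1:-1] - img,   # neighbour above
            padded[2:, 1:-1] - img,    # neighbour below
            padded[1:-1, :-2] - img,   # neighbour to the left
            padded[1:-1, 2:] - img,    # neighbour to the right
        )
        # Conduction is close to 1 in flat regions and drops at strong edges
        flux = sum(np.exp(-(d / conductance) ** 2) * d for d in differences)
        img += step * flux
    return img


# Hypothetical noisy test image, not study data
rng = np.random.default_rng(0)
noisy = rng.random((16, 16))
smoothed = perona_malik(noisy)
```

With these choices the filter preserves the mean intensity and reduces pixel-to-pixel variation, while large intensity differences relative to the conductance are smoothed less.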
Three types of pixel-wise image textural features were calculated: seven intensity features, ten gradient features, and four grey level co-occurrence matrix (GLCM) textural features, all described in Appendix A. Intensity features were the grey-level intensity value of the central pixel, and the mean, median, standard deviation, minimum, maximum, and semi-interquartile range of the grey-level intensity values. For the extraction of gradient features, 2-dimensional directional gradients for the x-axis (G_x) and y-axis (G_y) were calculated using a Sobel gradient operator in the imgradientxy function in MATLAB (MathWorks, Natick, MA, USA). GLCM features were extracted using the graycomatrix and graycoprops functions in MATLAB at four orientations (0°, 45°, 90°, and 135°) with a distance of 1 pixel. The resulting GLCM feature was the mean of the GLCM feature values in these orientations. Feature maps were created for each feature using a sliding window implementation. In this approach, an orthogonal 3-by-3 box/kernel "slides" in the region of interest, in this case the segmented bone marrow. The features were calculated in each orthogonal kernel position and corresponded to the central pixel of the box.
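The sliding-window GLCM computation can be sketched in Python with NumPy as follows. This is an illustration, not the MATLAB code used in the study. It assumes that the image has already been quantized to integer grey levels 0 to levels - 1, that the co-occurrence matrix is not symmetrized, and that borders are replicated. Energy, contrast, and homogeneity follow the usual definitions (homogeneity with 1 + |i - j| in the denominator); correlation is omitted because it is undefined for constant windows. All function names are hypothetical.

```python
import numpy as np

# (row, column) offsets for 0, 45, 90, and 135 degrees at a distance of 1 pixel
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def glcm(window, levels, offset):
    """Normalized co-occurrence matrix of an integer-valued window."""
    d_row, d_col = offset
    rows, cols = window.shape
    counts = np.zeros((levels, levels))
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[window[r, c], window[r2, c2]] += 1
    total = counts.sum()
    return counts / total if total else counts


def glcm_features(window, levels):
    """Energy, contrast, and homogeneity, averaged over the four orientations."""
    i, j = np.indices((levels, levels))
    values = {"energy": [], "contrast": [], "homogeneity": []}
    for offset in OFFSETS:
        p = glcm(window, levels, offset)
        values["energy"].append((p ** 2).sum())
        values["contrast"].append((p * (i - j) ** 2).sum())
        values["homogeneity"].append((p / (1 + np.abs(i - j))).sum())
    return {name: float(np.mean(v)) for name, v in values.items()}


def feature_map(image, mask, levels, name):
    """Slide a 3-by-3 window over the masked pixels; NaN outside the mask."""
    out = np.full(image.shape, np.nan)
    padded = np.pad(image, 1, mode="edge")
    for r, c in zip(*np.nonzero(mask)):
        # padded[r:r+3, c:c+3] is the 3-by-3 window centred on image[r, c]
        out[r, c] = glcm_features(padded[r:r + 3, c:c + 3], levels)[name]
    return out
```

For example, `feature_map(quantized_image, marrow_mask, 8, "contrast")` would return a contrast map for the segmented region, where `quantized_image` and `marrow_mask` are placeholder inputs.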

Statistical Analysis
Changes in patient characteristics (PGA, hs-CRP, BASDAI, DAS44, and SPARCC score) before and after intervention were analyzed using the Wilcoxon signed-rank test. Changes in BME status and SPARCC score were categorized, and differences between HIIT and the control group were analyzed by Fisher's exact probability test. Analyses were performed in SPSS (IBM SPSS Statistics v 26), and p-values < 0.05 were considered statistically significant.
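For readers who want to see how an exact test on a 2-by-3 table of change categories works, the following sketch implements the Freeman-Halton extension of Fisher's exact test in plain Python. The counts are those listed for Table 2 (no change, increased, reduced). The function name is hypothetical, and the sketch is not claimed to reproduce the SPSS procedure or its reported p-value to the last digit.

```python
from itertools import product
from math import comb


def fisher_exact_2xk(row_a, row_b):
    """Two-sided exact p-value for a 2 x k contingency table."""
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    n_a = sum(row_a)
    denominator = comb(sum(col_totals), n_a)

    def probability(first_row):
        # Multivariate hypergeometric probability with all margins fixed
        numerator = 1
        for count, total in zip(first_row, col_totals):
            numerator *= comb(total, count)
        return numerator / denominator

    p_observed = probability(row_a)
    p_value = 0.0
    # Enumerate every first row that is compatible with the margins
    for candidate in product(*(range(t + 1) for t in col_totals)):
        if sum(candidate) != n_a:
            continue
        p = probability(candidate)
        # Sum tables that are at most as likely as the observed one
        if p <= p_observed * (1 + 1e-9):
            p_value += p
    return p_value


# Change categories from Table 2: no change, increased BME, reduced BME
hiit = [17, 0, 0]
control = [17, 1, 2]
p_bme_change = fisher_exact_2xk(hiit, control)
```

Full enumeration is practical here because the column totals are small; larger tables would call for a statistics package.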
Differences in feature values between voxels in BME and healthy voxels were assessed by linear mixed models. We used patient number and scan as random effects, whether the voxel was healthy or pathological as the fixed effect, and the textural features as response variables. Further, linear mixed models were used to assess changes between the first and second time points and whether these changes differed between the HIIT and control groups. For this analysis, average feature values per individual per time point were used as response variables, as it was not possible to match lesions between the two time points. Fixed effects were time (whether the scan was a baseline or 11 week scan), intervention (whether the patient was in the HIIT or the control group), and the interaction term between time and intervention (time*intervention); and patient number was a random effect. The time variable was reference coded to the baseline measurement, and the intervention variable was sum coded.
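The longitudinal model described above can be written out as follows. The notation is introduced here for illustration and does not appear in the original analysis: y_{ij} is the average feature value of patient i at time point j, t_j is 0 at baseline and 1 at 11 weeks (reference coding), and g_i is the sum-coded intervention variable. Which group receives +1 and which -1 is an assumption.

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 t_j + \beta_2 g_i + \beta_3\, t_j g_i + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{align}
```

Under this coding, \beta_1 is the change from baseline to 11 weeks averaged over the two groups, and the interaction coefficient \beta_3 carries the group difference in that change: the change is \beta_1 + \beta_3 in the group coded +1 and \beta_1 - \beta_3 in the group coded -1, so the groups differ by 2\beta_3. The random intercept u_i accounts for repeated measurements within a patient.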
Bonferroni correction was used to correct p-values for multiple comparisons from all three linear mixed models. The statistical level of significance was set to p < 0.05. Statistical analyses were performed in RStudio: Integrated Development for R (RStudio, PBC, Boston, MA, USA) and MATLAB R2019A.
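As a brief illustration of the correction step, a Bonferroni adjustment multiplies each p-value by the number of comparisons and caps the result at 1. The sketch below assumes that the number of comparisons equals the number of p-values passed in; the p-values are hypothetical and are not study results.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three features
raw = [0.001, 0.02, 0.5]
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
```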

Patient Cohort
Of the 55 patients evaluated for eligibility, 47 were included, and the follow-up at 11 weeks was complete with MR images for 37 patients (Figure 1). Demographic characteristics of study participants in control and HIIT groups, and their scores for joint affection and pain at baseline and after 11 weeks, are shown in Table 1. BASDAI decreased for the HIIT group, and DAS44 decreased for both groups after 11 weeks.
Table 1. Demographic characteristics of study participants in high intensity interval training (HIIT) and control groups, and their scores for joint affection and pain at baseline and after 11 weeks [15]. Mean values of scores for joint affection and pain within groups at baseline and 11 weeks were compared by the Wilcoxon signed-rank test.

Image Analysis
• Radiological evaluation
Examples of acquired MR images are shown in Figure 2, demonstrating the PsA lesions of two patients with low and high disease burden. MR images of the spine from 21 of 37 patients were found negative with respect to BME in the radiological evaluation at both timepoints. Sixteen patients (43%) were identified with BME, consistent with axial manifestation of PsA. The radiologically manifested axial PsA was considered mild to moderate for both groups, and disease burden in terms of BME was stable. The findings are summarized in Table 2. The number of patients with changes of BME after 11 weeks was not significantly different between the HIIT and control group (p-value: 0.50).
Table 2. Bone marrow edema by radiological evaluation of MR images. Results from radiological evaluation for bone marrow edema (BME) in short tau inversion recovery (STIR) and T1-weighted MR images of the participants in high intensity interval training (HIIT) and control groups at baseline and after 11 weeks. Number of participants with detectable changes after 11 weeks were not significantly different for the two groups (Fisher's Exact Probability Test p-value: 0.50).

                        HIIT                     Control
                        Baseline    11 Weeks     Baseline    11 Weeks
Number, n               17                       20
BME detected, n (%)     9 (53)      9 (53)       5 (25)      5 (25)
No change, n (%)                    17 (100)                 17 (85)
Increased BME, n (%)                0 (0)                    1 (5)
Reduced BME, n (%)                  0 (0)                    2 (10)

• SPARCC scoring
MR images for 11 of the 37 patients were found negative with respect to BME by the SPARCC scoring at both time-points (Table 3). Twenty-three of the patients (62%) had a positive SPARCC score at both timepoints. Participants in this study had a median baseline SPARCC score of 4.0. The number of patients with changes in SPARCC scores after 11 weeks was not significantly different between the HIIT and control group (p-value: 1).
Table 3. Results from evaluation of short tau inversion recovery MR images of the participants in high intensity interval training (HIIT) and control groups at baseline and after 11 weeks by SPARCC scoring [26]. If SPARCC score changed by five or more to week 11, it was categorized as increased or decreased according to minimally important changes [40]. Number of participants with detectable changes after 11 weeks were not significantly different for the two groups (Fisher's Exact Probability Test p-value: 1).
The radiological evaluation and SPARCC scoring were consistent for 49 of the MR images at baseline and 11 weeks. Twenty-three MR images with a positive SPARCC score were BME negative by radiological evaluation. Two MR images with a SPARCC score of 0 were identified as BME positive in the radiological evaluation. SPARCC scoring (Table 3) identified changes from baseline to 11 weeks in more patients than did the radiological evaluation for BME (Table 2).

• Textural features
Mean and standard deviation of features extracted from MR images of pathological (BME lesions) and healthy voxels are presented in Table 4. The mean values for all but one (g7) extracted features were significantly different in pathological compared to healthy voxels. With the exception of one of the GLCM features (f1), mean values for textural features were higher in pathological than healthy voxels. We observed no significant differences between the HIIT group and control group in the changes in textural features of PsA lesions from baseline to week 11 (Table 5).
Table 4. Textural features of voxels in bone marrow edema and healthy voxels. Mean values and standard deviation (SD) of features extracted from MR images of voxels in bone marrow edema (BME) and healthy voxels. Differences in feature values for pathological and healthy voxels were examined using linear mixed-effects models.
Table 5. Mean values and standard deviation (SD) of features extracted from BME lesions in MR images of psoriatic arthritis patients. Differences between the intervention and control group, in terms of changes in feature values from the baseline to the 3-month scan, were investigated using a linear mixed effect model (random effects: patient number and lesion number (nested effects), fixed effects: baseline or 3 months scan and control or intervention group). p values are reported.

Discussion
No significant changes were observed in MR images of the spine after 11 weeks of HIIT. This finding was consistent for radiological evaluation, SPARCC scoring, and textural features of MR images. Values for 20 out of 21 textural features were significantly different in voxels of BME compared to voxels of healthy bone marrow. No textural features of PsA lesions were significantly different when comparing changes in values after 11 weeks between the HIIT and control groups.
Of the study participants, 43% and 62% were found positive for BME by radiological evaluation and SPARCC scoring, respectively (Tables 2 and 3). The fraction of BME positive participants by SPARCC scoring is above the reported 30-50% of PsA patients with axial involvement [6]. More positive findings by SPARCC scoring than by the Assessment in SpondyloArthritis international Society criteria have been reported previously [43]. The difference in the number of BME positive participants between the two methods is probably caused by different readers and fundamental differences in the methods. Standardized methods for scoring of axial spondyloarthritis (axSpA) are subject to some variation between readers [44]. Images with little active inflammation are more prone to low inter-reader correlation, and the mild to moderate disease burden in the cohort of this study is thus suspected to contribute to the difference between the two methods. Both methods rely on hyperintensity in STIR images, where edema related to inflammation can be detected as a bright signal on a dark background in subchondral bone marrow [45]. The use of T1-weighted images to support the radiological evaluation can lead to rejection of positive findings in STIR images, which may explain the fewer positive cases by radiological evaluation. The radiological evaluation identified changes in BME from baseline to week 11 for three patients, which agreed with higher and lower SPARCC scores for these patients. SPARCC scoring identified changes from baseline to week 11 for more patients, but these were in general of minor magnitude. Applying a SPARCC score threshold of five for minimally important change [40] reduced the number of patients with changes from baseline to week 11 from twenty-three to seven.
Quantitative methods for analysis of STIR MR images have been shown to discriminate between active therapy and placebo after 12 weeks of treatment in clinical trials of ankylosing spondylitis [46,47]. Changes occurring in the spine due to the HIIT should thus be detectable with the current MR imaging protocol and methods for image analysis. MR imaging has previously identified BME in healthy individuals and in athletes, suggesting that mechanical strain contributes to BME [22,43]. Two studies contradict the concern that HIIT may increase disease burden. For this cohort of PsA patients, it has been shown that HIIT has beneficial effects on fatigue and cardiac risk factors without increased joint affection and pain [14,15]. It has also been shown that Ankylosing Spondylitis Disease Activity Score and BASDAI were significantly reduced after 3 months of HIIT in patients with axSpA [13]. Our current study showed no significant changes in BME in the spine from HIIT by radiological evaluation, SPARCC scoring, or texture analysis of MR images (Tables 2, 3, and 5), which supports that HIIT is safe to recommend to patients with PsA.
Mean values of textural features were different in voxels from BME compared to voxels from healthy bone marrow (Table 4). These observed differences are consistent with a previous study, where textural features of MR images have been applied in machine learning to classify active inflammation in sacroiliac joints [35]. Choices of textural features are also important for successful tissue discrimination [30]. We surveyed intensity, gradient, and GLCM textural features, which partly have been utilized in other studies with classification of BME [33,34,48]. These studies also included histogram and run-length matrix features. In studies of osteoarthritis in the knee [34,49,50], most of the texture, histogram, and run-length matrix features were all significantly different between the patient groups. When discriminating the post-radiation lesions edema, fatty conversion, and hemorrhage, Romanos et al. found that GLCM textural features comprised four out of five features in the optimal design of the classification scheme [48]. Sacroiliitis could be classified based on extracted features from STIR MR images and machine learning [35]. The features' maximum pixel values and LH components from the two-level Haar wavelet decompositions, which depict horizontal traits of the image, were important to discriminate instances. The high intensity gray level in inflammation is a plausible cause for the impact of maximum pixel value for classification of sacroiliitis. Quantitative textural analysis has also been suggested to detect bone structure changes due to exercise [36,51]. This study demonstrated that textural features of BME and non-pathological pixels in STIR MR images were different.
Limitations of this study include low disease burden in patients and few patients with manifested PsA in the spine. However, if vigorous exercise leads to increased BME, this would be induced also in patients with low disease burden. Furthermore, data on textural features were reduced, since 10 MR image series were excluded due to MRI protocol deviations. However, as all methods for analysis of MR images demonstrated no increase in BME due to HIIT, the indications of no structural changes are fairly strong. To assess the sensitivity to changes of MR image textural features, a longitudinal interventional study of a larger cohort with a higher disease burden would be required.
The thorough evaluation of spine MR images demonstrates that PsA is stable regarding BME in the spine during 11 weeks of HIIT. Beneficial effects of HIIT in this patient cohort have been previously reported, with increased maximal oxygen consumption (VO2max), reduced truncal fat, and less fatigue [14,15]. Importantly, joint affection and pain did not differ from the control group. The negative findings also in MR images strongly indicate no structural changes. The evidence of HIIT being safe to conduct for patients with PsA is thus stronger.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Mathematical description of extracted image features. Description of the features that were extracted and quantified from the MR images. Gradient features g1-g10 were computed from the two-dimensional directional gradients for the x-axis ($G_x$) and y-axis ($G_y$); L1 norm = $|G_x| + |G_y|$, and L2 norm = $\sqrt{G_x^2 + G_y^2}$.

Intensity features
i1 Grey-level intensity value of the central pixel
i2 Mean of grey-level intensity values
i3 Median of grey-level intensity values
i4 Standard deviation of grey-level intensity values
i5 Minimum of grey-level intensity values
i6 Maximum of grey-level intensity values
i7 Semi-interquartile range of the grey-level intensity values

Gradient features
g1 Sum of L1 norm
g2 Sum of L2 norm
g3 Mean of L1 norm
g4 Mean of L2 norm
g5 Standard deviation of L1 norm
g6 Standard deviation of L2 norm
g7 Median of L1 norm
g8 Minimum of L1 norm
g9 Maximum of L1 norm
g10 Semi-interquartile range of L2 norm

GLCM features
f1 energy
f2 contrast
f3 correlation
f5 homogeneity (inverse difference moment)

GLCM: grey level co-occurrence matrix
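The gradient features listed above can be illustrated with a short Python sketch using NumPy. It is not the MATLAB implementation used in the study. The unnormalized Sobel kernels, the replicated border, the sample standard deviation (ddof = 1), and NumPy's default percentile interpolation are assumptions, and the function names are hypothetical.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def sobel_gradients(image):
    """Directional gradients G_x and G_y with replicated borders."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    padded = np.pad(image, 1, mode="edge")
    gx = np.zeros((rows, cols))
    gy = np.zeros((rows, cols))
    for dr in range(3):
        for dc in range(3):
            patch = padded[dr:dr + rows, dc:dc + cols]
            gx += SOBEL_X[dr, dc] * patch
            gy += SOBEL_Y[dr, dc] * patch
    return gx, gy


def gradient_features(gx_window, gy_window):
    """Features g1 to g10 for the gradients inside one window."""
    l1 = np.abs(gx_window) + np.abs(gy_window)
    l2 = np.sqrt(gx_window ** 2 + gy_window ** 2)
    q1, q3 = np.percentile(l2, [25, 75])
    return {
        "g1": l1.sum(), "g2": l2.sum(),
        "g3": l1.mean(), "g4": l2.mean(),
        "g5": l1.std(ddof=1), "g6": l2.std(ddof=1),
        "g7": np.median(l1), "g8": l1.min(), "g9": l1.max(),
        "g10": (q3 - q1) / 2,   # semi-interquartile range of the L2 norm
    }


# Hypothetical test image: intensity increases by 1 per column
ramp = np.tile(np.arange(5.0), (5, 1))
gx, gy = sobel_gradients(ramp)
features = gradient_features(gx[1:4, 1:4], gy[1:4, 1:4])  # window centred at (2, 2)
```

On this ramp the gradient is constant inside the window, so the spread features (g5, g6, g10) are zero while the sum and mean features reflect the slope.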