An Application of the Multivariate Linear Mixed Model to the Analysis of Shoulder Complexity in Breast Cancer Patients

In this study, four major muscles acting on the scapula were investigated in patients who had been treated in the last six years for unilateral carcinoma of the breast. Muscle activity was assessed by electromyography during abduction and adduction of the affected and unaffected arms. The main principal aim of the study was to compare shoulder muscle activity in the affected and unaffected shoulder during elevation of the arm. A multivariate linear mixed model was introduced and applied to address the principal aims. The result of fitting this model to the data shows a huge improvement as compared to the alternatives.


Introduction
Wide local excision (WLE) with adjuvant radiotherapy is the standard treatment of breast cancer. However, despite the use of less extensive surgery, there is still morbidity affecting the shoulder [1][2][3]. Local radiotherapy has several known effects on lung parenchyma, vascular and connective tissues [4][5][6]. Findings with respect to the latter suggest that thickening of the tissues may restrict movement of surrounding areolar tissues held within fascial planes. This limited ability to expand together with ischaemia due to changes in the vascular network could have an effect on the efficacy of muscle contraction [7][8][9].
Thus far, the exact nature of shoulder morbidity and its relationship to pain has only been described in terms of glenohumeral movement; ignoring the critical effect of scapulo-thoracic motions. Scapular movements are a product of the delicate inter-play between the rotator cuff muscles; thus, deficiency in one can alter all movements due to associations. The pattern of these associations is not known and has been investigated here. Scapular movements such as abduction, adduction, anterior/posterior tilt, and medial/lateral rotations are a product of the delicate inter-play between the rotator cuff muscles. Deficiency in one may change all other movements. The pattern of these associations is not known and needs to be investigated.
Four muscles (pectoralis major, serratus anterior, upper trapezius and rhomboid major) acting on the scapula were investigated in patients who had been treated in the last six years for unilateral carcinoma of the breast. Muscle activity was assessed by electromyography (EMG) during abduction and adduction of the ipsilateral (affected) and contralateral (unaffected) sides.
The principal aims of the study were: (1) to compare shoulder muscle activity in the affected and unaffected shoulder during elevation of the arm; (2) to explore the relationship between any observed differences in muscle activity and patients report of pain and dysfunction; (3) to explore the relationship between any observed differences in muscle activity and key clinical variables.
Section 2 introduces the collected data set and its procedure. In Section 3, the multivariate linear mixed models are described which address the above principal aims. Section 4 presents the results of the fit of the model to the data. In Section 5, some final conclusions and a discussion of future analyses are given.

Data Set
Two hundred and two patients who had been treated from 2005 to 2010 for unilateral carcinoma of the breast were included in a shoulder morbidity study. The measurement procedure is detailed in our previous work [10]. Patients with any previous history of shoulder or neck problems in either arm were excluded from the study. Humeral elevation in degrees was measured using the Polhemus Fastrak™ motion analysis system, and concurrent EMG readings were taken at 10˝increments of humeral elevation of each shoulder for each patient. The average of three EMG readings at each elevation point was taken as a response (dependent) variable for each patient.
All patients filled in a Shoulder Pain and Disability Index (SPADI) questionnaire immediately prior to data collection. The SPADI questionnaire is a known and valid measure of pain and disability for shoulder dysfunction with high levels of sensitivity and reliability [11], it is scored on a visual analogue scale with 13 items (five for pain and eight for disability). All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by Oxford Brookes University ethics committee (HREC no: A02,064) Pain scores range from a minimum of 0 to a maximum of 500 mm and for disability 0 to 800 mm, where 0 represents no symptoms of pain or disability. Figure 1 represents the distribution of pain and disability scores. Given the skewness of these distributions, each variable is log-transformed in subsequent analyses. (1) to compare shoulder muscle activity in the affected and unaffected shoulder during elevation of the arm; (2) to explore the relationship between any observed differences in muscle activity and patients report of pain and dysfunction; (3) to explore the relationship between any observed differences in muscle activity and key clinical variables.
Section 2 introduces the collected data set and its procedure. In Section 3, the multivariate linear mixed models are described which address the above principal aims. Section 4 presents the results of the fit of the model to the data. In Section 5, some final conclusions and a discussion of future analyses are given.

Data Set
Two hundred and two patients who had been treated from 2005 to 2010 for unilateral carcinoma of the breast were included in a shoulder morbidity study. The measurement procedure is detailed in our previous work [10]. Patients with any previous history of shoulder or neck problems in either arm were excluded from the study. Humeral elevation in degrees was measured using the Polhemus Fastrak TM motion analysis system, and concurrent EMG readings were taken at 10° increments of humeral elevation of each shoulder for each patient. The average of three EMG readings at each elevation point was taken as a response (dependent) variable for each patient.
All patients filled in a Shoulder Pain and Disability Index (SPADI) questionnaire immediately prior to data collection. The SPADI questionnaire is a known and valid measure of pain and disability for shoulder dysfunction with high levels of sensitivity and reliability [11], it is scored on a visual analogue scale with 13 items (five for pain and eight for disability). All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by Oxford Brookes University ethics committee (HREC no: A02,064) Pain scores range from a minimum of 0 to a maximum of 500 mm and for disability 0 to 800 mm, where 0 represents no symptoms of pain or disability. Figure 1 represents the distribution of pain and disability scores. Given the skewness of these distributions, each variable is logtransformed in subsequent analyses. Key clinical variables include: affected side, dominant hand, degree of arm elevation, treatment protocol (Wide Local Excision (WLE) or other), duration after surgery in days, age in years, Key clinical variables include: affected side, dominant hand, degree of arm elevation, treatment protocol (Wide Local Excision (WLE) or other), duration after surgery in days, age in years, physiotherapy and exercise level on affected side, receiving chemotherapy, and the Shoulder Pain and Disability Index [10].
Bilateral EMG measurements of the four shoulder muscles were taken for each patient. Each muscle activity is recorded in millivolts (mV) at 10˝increments of humeral elevation during upward and downward arm movement. At each increment point, for each shoulder and each movement, four EMG measures were obtained (one for each of the four muscles), creating potentially a correlated or multivariate response structure. Incremental measurements at different arm elevations per muscle created a multivariate response structure that can also be regarded as "repeated measures" dimension of the data. Summarised, four correlated, repeatedly measured responses for each shoulder and each movement were generated. In Figure 2, typical observations of a patient are given, for humeral elevation between 10˝and 150˝, this figure represents the observations for an upward arm movement of the left arm. This graph clearly shows that the EMG reading at each given point very much depends on the previous reading within the same movement. Hence, this fact should be implemented in the modelling strategy in Section 3. physiotherapy and exercise level on affected side, receiving chemotherapy, and the Shoulder Pain and Disability Index [10]. Bilateral EMG measurements of the four shoulder muscles were taken for each patient. Each muscle activity is recorded in millivolts (mV) at 10° increments of humeral elevation during upward and downward arm movement. At each increment point, for each shoulder and each movement, four EMG measures were obtained (one for each of the four muscles), creating potentially a correlated or multivariate response structure. Incremental measurements at different arm elevations per muscle created a multivariate response structure that can also be regarded as "repeated measures" dimension of the data. Summarised, four correlated, repeatedly measured responses for each shoulder and each movement were generated. In Figure 2, typical observations of a patient are given, for humeral elevation between 10° and 150°, this figure represents the observations for an upward arm movement of the left arm. This graph clearly shows that the EMG reading at each given point very much depends on the previous reading within the same movement. Hence, this fact should be implemented in the modelling strategy in Section 3.

Statistical Test of Clinical Hypotheses
Shamley et al. previously demonstrated [10] that the activities of muscles controlling the movement of the scapula are linked; if one muscle is compromised, then other muscles might become more active to compensate for the lost movement. However, muscle activity can be influenced by several demographic (age, dominance, gender) and clinical (pain, radiotherapy, chemotherapy, etc.) variables. In this study, we are interested to explore the previous findings in detail and describe the rationale of whether the clinical variables have the same effect on all muscles, or whether the effect of these variables is different for different muscles under different conditions. Hence, an advanced joint multivariate approach must be implemented to take these associations into consideration whilst investigating the following four main null hypotheses.

Statistical Test of Clinical Hypotheses
Shamley et al. previously demonstrated [10] that the activities of muscles controlling the movement of the scapula are linked; if one muscle is compromised, then other muscles might become more active to compensate for the lost movement. However, muscle activity can be influenced by several demographic (age, dominance, gender) and clinical (pain, radiotherapy, chemotherapy, etc.) variables. In this study, we are interested to explore the previous findings in detail and describe the rationale of whether the clinical variables have the same effect on all muscles, or whether the effect of these variables is different for different muscles under different conditions. Hence, an advanced joint multivariate approach must be implemented to take these associations into consideration whilst investigating the following four main null hypotheses.
H 01 : The altered muscle activity is not associated with the patient's report of pain and dysfunction; H 02 : Clinical risk factors have no effect on the activities of the muscles; H 03 : The effect of clinical risk factors is the same on the affected arm as with that on the unaffected arm. H 04 : The activity of the four key muscles acting at the scapula on the affected arm is the same as from those acting on the unaffected arm.

The Multivariate Linear Mixed Model
Let Y iksm denote the n iˆ1 vector of EMG readings at different humeral elevation points from the i th patient (i = 1, . . . , N), n i is the number of elevation points for that patient on the k th muscle (k = 1, . . . , 4), of the affected arm (s = 1) or unaffected arm (s = 2) during upward (m = 1) or downward (m = 2) movements.
We assume for the n i measurements, Y iksm , a multivariate normal distribution over the elevations. It is in fact the usual multivariate multiple regression, i.e.,: where X iksm is a n iˆp matrix of covariates (p is the number of covariates), β ksm is a pˆ1 vector of regression coefficients and Σ k is a n iˆni covariance matrix. Obviously, when the humeral elevations are closer, the correlation should be stronger, therefore, after testing a few other correlation structures, we sensibly assumed an autoregressive structure of order one, i.e., AR(1) for Σ k , i.e., : : : : A simplifying assumption could be to assume that the responses are independent among patients and muscles, that is It is generally unrealistic to assume that all important risk factors (covariates) are measured and included in the model explanatory matrix (X iksm ). The unobserved or unmeasured risk factors of a muscle, which are usually known as the "unobserved muscle's specific effect", causes the measurements of a muscle to be more correlated. To improve Model (3), one could introduce a "muscle specific effect" or random effect into the model. This term usually is known as frailty, so that, conditional on this muscle's specific effect, ν ik , we can assume with, X ik , the stacked design matrix X T ik " ‰ . Coull and Agresti [12] presented similar models for multivariate binomial logit-normal. Fieuws et al. [13] discussed similar models in more detail.
Furthermore, we assumed that the frailty terms of different muscles are correlated. That is, we assumed that the vector of frailty terms ν T i " pν i1 , ν i2 , ν i3 , ν i4 q has the following distribution: We assume that the joint model of with the normal distribution of the frailty vector which adds the covariance structure to the frailties across muscles. The joint model defined by Model (6) is a multivariate linear mixed model with multivariate random effects ν i . As for the univariate linear mixed model with a univariate normal random effect, the marginal likelihood in Model (6) is also analytically tractable. We used SAS™ [14] 9.2, PROC MIXED (SAS, Cary, NC, USA) to estimate the model parameters.
In this model, the correlation structure among the muscles is presented by the correlation structure among their specific frailties given by the variance-covariance matrix (D); if the muscles are not significantly correlated, all off-diagonal entities in the variance-covariance matrix D will be zero. Significant non-zero off-diagonal entities of variance-covariance matrix D are showing the correlations among the muscles. Table 1 presents the result of analyzing the natural log transformation of EMG activities of each muscle using Model (3), i.e., a multivariate normal distribution for ln(emg iksm ) = Y iksm with block diagonal variance-covariance matrix of Σ k . The likelihood ratio test suggests that the effect of arm movement (MOVE_UP/MOVE_DOWN) is best presented by a dummy variable indicating different intercepts for the upward and downward movement. Interactions between arm movements with all other clinical risk factors (covariates) were not collectively significant at the 5% level.

Model Comparison
Contrariwise, the effect of affected shoulder and unaffected shoulder cannot be modelled by considering only different intercepts for affected/unaffected shoulder. This is due to the fact that some of the risk factors are specific to the affected shoulder, such as physiotherapy and exercise. This implies that the effect of clinical risk factors might well be different on the affected shoulder compared to the unaffected shoulder. Therefore, a full interaction model (all main effects and all interactions) initially was employed to assess the effects of the affected shoulder on all clinical risk factors; the highly insignificant interactions were subsequently dropped from the model. The likelihood ratio test suggests that the included interactions are highly significant (change in deviance of 717.2 for 32 degrees of freedom).
The values in bold correspond to significant effects (at the 5% level) for each muscle. The last column ("p-value") indicates whether the clinical risk factor is collectively (the effect on all four muscles collectively with four degrees of freedom) significant or not.
A significant value for the interaction terms shows that, over and above the overall effect of the clinical risk factor on electrical activity of the muscles, the effect of clinical risk factor on the affected side is different to the unaffected side. The autoregressive correlation (ρ k ) and the variance of each muscle (σ 2 k ) are also presented. The result of fitting a multivariate normally distributed model with AR(1) structure for Σ k , presented in Table 1, controls the autoregressive structure of the measurements. The high positive value of ρ shows that the electrical activities are more alike at humeral elevation angles that are closer together.
This model not only ignores associations among the four muscles but also ignores any possible associations between measurements of the same muscle, i.e., affected/unaffected shoulders and downwards/upwards movement. In other words, this model assumes independence between the four muscles and between all observations of the same muscle at the same humeral elevation. This is not a realistic assumption as the four muscles are from the same individual and are likely to be correlated, and all measurements of the same muscle at the same humeral elevation are also likely to be correlated as they are from the same muscle. Table 2 is the result of fitting a univariate linear mixed Model (4) with independent random intercept ν k across muscles and residual block diagonal variance-covariance matrix with blocks Σ k assumed to follow the auto-regressive process of order one. Model (4) controls the association between the measurements of the same muscles at the same humeral elevation point by including a muscle specific random effect into the model. This model suggests that humeral elevation of the arm, the upwards move, affected side and left shoulder will increase the electrical activity generally for all four muscles irrespective of which side is affected, whilst treatment with wide local excision and receiving chemotherapy are associated with generally decreased electrical activity.
Interaction analysis suggests that the electrical activity is significantly different for duration after surgery and SPADI pain. Doing exercise and having physiotherapy on the affected shoulder are significantly associated with electrical activity of the affected muscles. The deviance (´2ˆlog likelihood) for this model is 27,645.1. The change in deviance compared to Model (3) is 1536.1 for four degrees of freedom (p-value < 0.0001), which is highly statistically significant.
Univariate Linear Mixed Model with AR(1) structure for Σ k controls the autoregressive structure of the measurements and accounts for possible associations between measurements of the same muscle at the same elevation point. The disadvantage of Model (4) is that it ignores any possible associations among the four muscles within patients. Table 3 is the result of fitting a full multivariate linear mixed model with AR(1) structure for Σ k given in Model (6). The result of fitting a full multivariate Linear Mixed Model with AR(1) structure for Σ k , as expected, is almost similar to that of Model (4). This model suggests that humeral elevation of the arm, the upwards move, affected side, left shoulder and longer duration from the surgery will increase the electrical activity generally for all four muscles irrespective of the side affected-whilst treatment with wide local excision and receiving chemotherapy are associated with generally decreased electrical activity Interaction analysis suggests that electrical activity is significantly different for duration after surgery and SPADI pain. Once again, doing exercise and having physiotherapy on the affected shoulder are significantly associated with electrical activity of the affected muscles. SPADI pain increases the muscle activity in the affected arm, but the affected arm slightly dilutes the positive effect of duration since surgery on the muscle activity. The deviance for this model is 26833.2. The change in deviance compared to Model (4) is 811.9 for six degrees of freedom, and, compared to Model (3), the change in deviance is 2348 for 10 degrees of freedom. Both suggest statistically significant improvement (both p-values < 0.0001) in model fitting.
Hence, using a multivariate normal distribution for natural log of muscle activities with AR(1) structure for Σ k , for each muscle and ignoring the existing associations between the measurements leads to unrealistic inferences for important clinical variables.
Applying a univariate Linear Mixed Model with AR(1) structure for Σ k (4) while ignoring the associations between muscle specific effects leads to clinically much more sensible parameter estimates. Subsequently, comparing the likelihoods of Models (3) and (4) confirms the presence of strong muscle specific effects.
A full multivariate Linear Mixed Model with AR(1) structure for Σ k (6) assesses the presence of significant association between muscle-specific random effects. A deviance difference of 811.9 for six degrees of freedom suggests the use of a joint multivariate linear mixed model with residual block diagonal variance-covariance matrix, which assumed to following auto-regressive process of order one, is more appropriate.
In this model, the associations between measurements of the same movement at different elevation points is given by ρ for each muscle. The estimated value of ρ is about 0.8, which shows a strong association between measurements of the same movement. However, the associations between muscles where modelled via their unobserved specific random effects. The estimated variance-covariance matrix of this association is given in the following matrixD: A likelihood ratio test comparing Models (4) and (6) shows that the associations between muscles are very important and cannot be ignored. The estimated variance-covariance matrixD indicates that all estimated off-diagonal entities are positive, which demonstrates that all correlations among the muscles are significantly positive.

Statistical Inference of Clinical Hypotheses
The result of Model (6), which is presented in Table 3, can be explored and refitted to assess the main clinical hypotheses.
H 01 : The activity of four key muscles acting at the scapula on the affected arm is not different to those acting on the unaffected arm.
The null hypothesis H 01 should be rejected as the activity of four key muscles acting at the scapula on the affected arm is significantly different to the unaffected arm. The gain in deviance between models with and without affected side interactions is 689.6 for 16 degrees of freedom returning a p-value < 0.0001, highly statistically significant.
H 02 : Altered muscle activity is not associated with patients' report of pain or dysfunction. SPADI pain has a significant effect on activity of the affected arm p-value < 0.0001); however, report of pain and dysfunction had no significant effect in general.
H 03 : Clinical risk factors have no significant effect on the activity of muscles. Clinical risk factors have a significant effect on muscle activities. For instance, WLE treatment (p-value = 0.027) and chemotherapy (p-value = 0.021), compared to their alternatives, have adverse effects on all muscle activities.
H 04 : The effects of clinical risk factors are not different on the affected arm as compared to the unaffected arm. The effects of clinical risk factors on muscle activity are different on the affected arm compared to the unaffected arm. For example, reports of pain and dysfunction (SPADI pain) had no significant effect on the unaffected arm (p-value = 0.422), but increasing SPADI pain is associated with increased muscle activity in the affected arm (p-value < 0.0001). Duration since surgery is associated with decreased muscle activity in the affected arm as compared to the unaffected arm (p-value < 0.0001). Figure 3 is the result of a diagnostic analysis, and it shows the standardised residuals of the full multivariate mixed linear model with AR(1) variance-covariance structure. It is clear that most of the standardised residuals have an absolute value of less than 2.0, which could have easily occurred by chance. It is expected that 5% of standardised residuals should be greater than 2.0 in absolute value according to expected standard normal distribution of the standardised residuals. Hence, Figure 3 does not show any serious diversion from normality for residuals of the implemented multivariate linear mixed model given by Model (6). Figure 3 is the result of a diagnostic analysis, and it shows the standardised residuals of the full multivariate mixed linear model with AR(1) variance-covariance structure. It is clear that most of the standardised residuals have an absolute value of less than 2.0, which could have easily occurred by chance. It is expected that 5% of standardised residuals should be greater than 2.0 in absolute value according to expected standard normal distribution of the standardised residuals. Hence, Figure 3 does not show any serious diversion from normality for residuals of the implemented multivariate linear mixed model given by Model (6).   The comparison between the two figures does not suggest any major influential observation that if removed has significant effect on parameter estimates.

Discussions
Our results demonstrated that the affected side is significantly different in terms of muscle activity than the unaffected side. This is understandable given the nature of the disease and treatment. The results of our model also demonstrate, however, several other important points.
Firstly, with respect to personal factors, only pain appears to play a significant role. Whilst no effect was observed with respect to SPADI pain in the unaffected arm, it was associated with increased muscle activity in the affected arm. This phenomenon has been previously reported and is hypothesised to be a functional adaptation in order to limit movements of the painful muscle, hence reduce overall pain [15].
Hand dominance had no significant effect on muscle activity, regardless of the operated side correlating well with other studies demonstrating either no difference in female subjects [16] or only significance in a few movements, primarily the flexion [17][18][19]. Still independent of the operative side, there is also greater activation and activity in shoulder muscles during abduction of the arm compared to adduction, with the magnitude of activity being proportional to the extent of abduction. The significant difference between abduction and adduction as displayed by the model is naturally expected due to the nature of gravity assisting adduction and thus activating of fewer muscle

Discussions
Our results demonstrated that the affected side is significantly different in terms of muscle activity than the unaffected side. This is understandable given the nature of the disease and treatment.
The results of our model also demonstrate, however, several other important points.
Firstly, with respect to personal factors, only pain appears to play a significant role. Whilst no effect was observed with respect to SPADI pain in the unaffected arm, it was associated with increased muscle activity in the affected arm. This phenomenon has been previously reported and is hypothesised to be a functional adaptation in order to limit movements of the painful muscle, hence reduce overall pain [15].
Hand dominance had no significant effect on muscle activity, regardless of the operated side correlating well with other studies demonstrating either no difference in female subjects [16] or only significance in a few movements, primarily the flexion [17][18][19]. Still independent of the operative side, there is also greater activation and activity in shoulder muscles during abduction of the arm compared to adduction, with the magnitude of activity being proportional to the extent of abduction. The significant difference between abduction and adduction as displayed by the model is naturally expected due to the nature of gravity assisting adduction and thus activating of fewer muscle spindles. We hypothesise that if adduction was performed against resistance such as to simulate the force of gravity which muscles experience during abduction, then activity would be similar. The significance of these findings between healthy volunteers and breast cancer survivors should be assessed in a case-control study.
With respect to clinical factors, previous work had already demonstrated that surgical breast cancer treatment resulted in shoulder morbidity [10], with recent work also showing mastectomy causing greater degrees than WLE [20]. This study of WLE, controlling for chemotherapy and radiotherapy effects, also demonstrated a significant adverse effect on all muscles independent of the affected side, similar to previously reported results.
The effects of various variables on muscles are summarized in Table 4. The effect of time since surgery on muscle activity was not significant per muscle, although, overall, its effect was significant.
The effect of chemotherapy similarly shows significant reduction in overall muscle activity on the ipsilateral side, although, individually, the difference was only significant for the serratus anterior. Chemotherapy induced peripheral neuropathy (CIPN) is a well-known complication of adjuvant therapy, and its incidence is closely linked to the agents used. Although primarily a sensory neuropathy, motor complications have also been reported [21] with greater impact on quality of life [22]. Taxanes, prolific agents used in breast cancer treatment [23], have been implicated in reduced compound muscle action potentials and myopathy, although less frequently than their sensory effects [24]. With respect to the observed difference on the contralateral side with serratus anterior, we could hypothesise that, due to the length of the long thoracic nerve, it may be more at risk from the neurotoxic effects of the chemotherapy agents. This is not without precedence, as the sural nerve, another similarly long nerve, has been reported as being particularly at risk of CIPN with platinum and paclitaxel based compounds [25]. With respect to rehabilitation, our model indicates that pain, physiotherapy and exercise result in an overall significant increase in muscle activity on the affected shoulder (Table 5). Specifically, exercise results in independent increases for the pectoralis major, upper trapezius and serratus anterior whilst physiotherapy results in similar findings for the latter two aforementioned muscles. With respect to the affected side, only our findings are summarised in Table 5.

Conclusions
Shoulder EMG activity and, therefore, muscle activation depends on both clinical and personal variables. Clinically the use of chemotherapy, WLE and time since surgery all decrease muscle activation, whilst physiotherapy increases it. Pain, a personal subjective influence, serves to increase muscle activation and is likely to limit movement and prevent further pain. This work builds on previously published data and offers new insights into the variables that affect shoulder morbidity with an effective model. This study further adds to the understanding of modelling multi-dimensional data and illustrates the risk of ignoring potential associations in the data structure. Several modelling approaches were discussed and compared, and the more suitable analysis method was identified and applied to the data.