Examining and Adapting the Psychometric Properties of the Maslach Burnout Inventory-Health Services Survey (MBI-HSS) among Healthcare Professionals

Burnout is known to negatively impact healthcare providers both physically and mentally and is assessed using the Maslach Burnout Inventory-Human Services Survey (MBI-HSS). Many versions of this tool have been developed for


Introduction
Burnout is a psychological concept that refers to the experience of emotional exhaustion and depersonalization [1]. Burnout syndrome affects individuals' psychological and physical status and has attracted a significant amount of research interest around the world [2]. The growing body of research on this phenomenon, along with its unfavourable consequences, calls for proper estimation of burnout levels [2]. Many tools have been proposed to measure burnout levels [3]. The most common among them is the Maslach Burnout Inventory (MBI). The MBI incorporates Maslach and Jackson's three-dimensional view of burnout syndrome and has been translated into different languages for use in different parts of the world [4]. The back-and-forth translation of the survey, however, is often questioned, as results from previous studies using translated versions are considered to be low in quality, reliability and validity [5]. Higher burnout is linked with higher Emotional Exhaustion (EE) and/or Depersonalization (DP) scores and lower Personal Accomplishment (PA) scores [6]. Among these subscale scores, the emotional exhaustion score is considered especially important for determining burnout levels [7]. Although the MBI is the gold standard tool for burnout estimation, its validity and reliability are often topics of debate. As a result, some studies have assessed the psychometric properties of the MBI [7]. Exploratory factor analysis and confirmatory factor analysis are often used to judge the validity of the proposed models [8]. In addition, the reliability of each subscale can be determined using Cronbach's alpha coefficient [9,10]. In studies examining the internal consistency of the MBI, depersonalization has been found to have lower reliability than the other two subscales. This finding has been consistent across both the original version and translated versions of the MBI [11].
Different alterations to the tool have often been made by changing the number of factors or items. The results of these altered versions have been evaluated against those of the original version [3]. While performing initial studies on the MBI, Maslach discovered that the 16th item, "working with people directly puts too much stress on me", which belongs to the emotional exhaustion subscale, measures depersonalization to an extent, while the 12th item, "I feel very energetic", which belongs to the Personal Accomplishment subscale, tends to instead evaluate aspects of emotional exhaustion. Hence, it was suggested that the removal of certain items such as items 16 and 12 from the tool would improve its functioning, and recent studies have verified this theory based on the improvement of fit indices [3]. Eliminating problematic items of the MBI has now become one of the most frequently employed techniques used by researchers who aim to develop better versions of the tool [3]. Different studies on this topic have been conducted in different areas of the world. A version of the tool with 16 items was tested in Belgium in a study by [12], a 15-item version was suggested by [13] in eastern Asia, and a model with 18 items was tested in northern Europe by [14]. Despite these efforts, however, the results obtained by these versions were not conclusive enough to justify reconstructing the original tool [3]. The 3 dimensions of burnout were previously considered to be independent in their functioning based on Maslach's interpretation, but recent studies have indicated a contradicting view, as the dimensions appear to correlate with one another, with correlations up to 0.73 between depersonalization and emotional exhaustion, 0.49 between personal accomplishment and emotional exhaustion, and 0.62 between depersonalization and personal accomplishment [11]. 
According to several studies, personal accomplishment is the subscale that shows the weakest association with the other dimensions. This has led to multiple disputes regarding the impact that personal accomplishment has on burnout levels [3]. On the other hand, because of the strong correlation between emotional exhaustion and depersonalization, some studies have recommended the use of a two-factor model that combines these two subscales [7]. Other studies have suggested the use of a bifactor model that consists of the three traditional factors and a general/global factor [15]. Alternatively, some studies have suggested an increase in the number of factors. A study by [16] outlined client- and work-associated aspects of depersonalization and suggested the use of a four-factor model, and another author [17] proposed a five-factor model that further divides personal accomplishment into two distinct subscales. Six-factor models have also been suggested, but the results of these models have proven more difficult to interpret than those of the initial three-factor model [15,18,19]. This study aims to use data collected across six different regions in the Gulf Cooperation Council Region to assess the validity and reliability of the MBI-HSS, a version of the MBI designed for use in the healthcare sector, and to develop a version of the MBI-HSS best suited for evaluating burnout levels among the healthcare providers in this region. The analysis of data gathered using the original MBI-HSS was used to develop a revised version of the MBI-HSS model.

Materials and Methods
This cross-sectional study was conducted in six tertiary private hospitals belonging to a large private healthcare provider. The bed capacity of the hospitals varied between 150 and 350 beds. The healthcare providers included in the study were physicians, nurses and other healthcare providers. They were recruited through convenience sampling, and their participation in the study was voluntary. Participants were assured that the data collected would be kept confidential. Prior to data collection, the researchers obtained Institutional Review Board approval from the Dr Sulaiman Al Habib Medical Group IRB (RC18.11.21). Cross-sectional data were collected in the form of surveys. A total of 1100 surveys were distributed to the healthcare providers, and 900 completed surveys were returned, giving a response rate of 81.8%.

Participants
The valid sample collected for this study comprised 888 healthcare professionals, 231 (26.1%) males and 651 (73.9%) females, working at different hospitals in the Gulf Cooperation Council Region. Participants were given the MBI-HSS questionnaire adapted from [1], and the questionnaires were collected upon completion. For the findings to be reliable and generalizable, the study ensured that a substantial number of both male and female healthcare professionals were involved. For the same purpose, the study also ensured that a substantial number of healthcare professionals with different lengths of working experience were included: 1-5 years (short), 6-10 years (medium), and more than 10 years (long). Furthermore, the study required that participants be healthcare workers who have direct patient contact. On this basis, participants represented a sufficiently large and heterogeneous sample of the healthcare professionals working in the region.
Descriptive analysis of the participants' demographic profiles was conducted for the two subsamples (exploratory factor analysis (EFA) and confirmatory factor analysis (CFA)), and the Pearson chi-square test was used to evaluate the differences in the distribution of participants' demographic information. Based on the results summarized in Table 1, all the p values were greater than 0.05, indicating no significant difference in the distribution of participants' demographic information between subsamples. In general, the majority of healthcare professionals in the participating hospitals were female (approximately 70%), and the majority were nurses (approximately 70%). Approximately 90% of the health care providers in this study were observed to be non-Saudi. Furthermore, there was a significant number of participants with different amounts of working experience; 45% had 1 to 5 years of experience, 40% had 6 to 10 years of experience, and approximately 15% had been in this field for more than 10 years. Last, regarding marital status, a significant number of participants were single and married, while only very few of them were divorced.

Instrument
Data were collected via a self-administered questionnaire with two major parts. The first part covered the respondents' demographic profile, which included gender, profession, nationality, working experience, and marital status; this socio-demographic questionnaire was prepared on an ad hoc basis for this study. The second part was the measurement instrument used in this research, the MBI-HSS, adapted from the Maslach Burnout Inventory-Health Services Survey Manual [1] and consisting of 22 items designed for human services. Each item of the MBI-HSS is a statement about healthcare professionals' feelings and attitudes towards their work and their patients. The instrument used a 7-point Likert scale: never (0), a few times a year or less (1), once a month or less (2), a few times per month (3), once per week (4), a few times per week (5), and every day (6). According to the manual, the MBI-HSS comprises three sub-dimensions: (a) Emotional Exhaustion (EE) with nine items (1, 2, 3, 6, 8, 13, 14, 16, and 20), (b) Depersonalization (DP) with five items (5, 10, 11, 15, and 22), and (c) Personal Accomplishment (PA) with eight items (4, 7, 9, 12, 17, 18, 19, and 21).
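The subscale structure described above can be illustrated with a minimal scoring sketch. It assumes that each subscale score is the plain sum of its 0-6 item ratings and that the 22 items are partitioned so that every item belongs to exactly one subscale (item 20 is placed in EE here); the function name and the example respondent are hypothetical and are not taken from the study.

```python
# Item numbers (1-based) per subscale.
# Assumption: item 20 belongs to EE, so each of the 22 items is used exactly once.
SUBSCALE_ITEMS = {
    "EE": [1, 2, 3, 6, 8, 13, 14, 16, 20],
    "DP": [5, 10, 11, 15, 22],
    "PA": [4, 7, 9, 12, 17, 18, 19, 21],
}


def score_subscales(responses):
    """Sum one respondent's 0-6 frequency ratings within each subscale.

    `responses` is a list of 22 integers, where responses[0] is item 1.
    """
    if len(responses) != 22:
        raise ValueError("expected 22 item responses")
    if any(r < 0 or r > 6 for r in responses):
        raise ValueError("ratings must lie between 0 (never) and 6 (every day)")
    return {
        name: sum(responses[i - 1] for i in items)
        for name, items in SUBSCALE_ITEMS.items()
    }


# Hypothetical respondent who answers "a few times per month" (3) to every item.
example_scores = score_subscales([3] * 22)
print(example_scores)  # {'EE': 27, 'DP': 15, 'PA': 24}
```

Keeping the item key in a single dictionary makes it straightforward to re-score the same responses under a revised item allocation.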

Procedure
Six private hospitals were included in the study. Four hospitals were in the Riyadh Region, Saudi Arabia. The fifth hospital was in Al Qassim, Saudi Arabia, and the sixth hospital was in Dubai, UAE. Researchers contacted all research and academic coordinators in the six hospitals to demonstrate the surveys to the staff; all coordinators were aware of the objectives of the study and trained in data collection by their own departments. For this research, data were collected between 2017 and 2018 from hospitals in six regions in the Gulf Cooperation Council Region. This research conformed to all ethical guidelines necessary for conducting human research, including the legal requirements of Saudi Arabia. Furthermore, this research project was granted approval from the Institutional Review Board chosen in this study (RC18.11.21). It is important to state that this study did not require ethical approval, as there was no treatment involved, i.e., medical treatment or invasive procedures that could lead to participants' discomfort, and no patients were included in the data collection. Following ethical approval, chiefs of departments and nurse coordinators from the hospitals were then asked to assist in administering the questionnaire to the healthcare workers in the hospital. Participants who volunteered in this research did not receive any form of reward. Furthermore, the cover page of the questionnaire clearly stated the aim of the research, the voluntary nature of participation, the anonymity of the data collected, and the elaboration of the research findings.

Analysis of the Data
For statistical data analysis, the study employed SPSS statistical package version 23 and AMOS version 22. The distribution of respondents' demographic information was determined through descriptive analysis, i.e., frequency, percentage, and mean values. Then, Cronbach's alpha was used to determine the total and subscale reliabilities, while the contribution of each item to the internal consistency was assessed through item-total and item-subscale correlations [19,20]. Item discrimination was tested through a critical ratio to ascertain if the items could be used to effectively measure the differences among participants. The critical ratio is the t-value estimated when comparing the means between two groups, i.e., the upper 27% and the lower 27% of the sample [19,21,22]. In addition, a t value greater than 1.96 (a significance level of 0.05) indicates that the item is able to measure differences among participants. Next, exploratory factor analysis (EFA) was used to restructure the MBI-HSS based on data collected for healthcare professionals in Saudi Arabia. Furthermore, this study employed confirmatory factor analysis (CFA) to evaluate the model fit to the data. In terms of sample size, a total of 888 respondents is sufficient for both EFA and CFA. For EFA, a study by [23] advised that sample sizes of 50, 100, 200, 300, 500, and 1000 or more are very poor, poor, fair, good, very good, and excellent, respectively. Authors [24,25] suggested sampling at least 100 subjects. Moreover, [26] suggested sampling three to six subjects per variable, [24] suggested at least five subjects per variable, and [27,28] suggested sampling at least 10 subjects per variable. For CFA using AMOS, [29] suggested a minimum of 10 respondents per estimated parameter as a sufficient sample size; [30] indicated 200 to be a sufficient sample size. 
Based on the recommendations above, the data were then randomly allocated to two subsamples using the function to select a random sample of cases in SPSS, resulting in 300 randomly selected cases for EFA and 588 randomly selected cases for CFA. The strategy was to use EFA (n = 300) to identify the underlying factors in the MBI-HSS and then CFA (n = 588) to confirm the factor structure. In the EFA, the Kaiser-Meyer-Olkin test was used to assess the sampling adequacy, while the Bartlett test of sphericity was used to determine whether the data were suitable for factor analysis [31]. Exploratory factor analysis was employed to identify a new structure of the MBI-HSS from the gathered data. The number of factors and the items were retained using the following criteria: (1) eigenvalues >1, (2) a factor structure of at least four items, (3) factor loadings above 0.4, and (4) items that did not load onto more than one factor [15,22,31]. Principal component analysis (PCA) with direct oblimin rotation (an oblique rotation) was used to determine the factor structure of the MBI-HSS, under the assumption that the factors in the model were correlated. With the new factorial structure of the MBI-HSS, CFA was performed on the second subsample to confirm the revised structure among healthcare professionals in the Gulf Region. Maximum likelihood estimation was used to estimate the factor loadings of the items and the model. The model fit to the data was then assessed using the following goodness-of-fit criteria: (1) a non-significant chi-square, (2) a goodness-of-fit index (GFI) >0.90, (3) an adjusted goodness-of-fit index (AGFI) >0.80 and (4) a root mean square error of approximation (RMSEA) <0.08 [15].
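The four retention criteria listed above can be sketched as a small filter over a rotated loading matrix. The sketch below is illustrative only: the study itself used SPSS, and the eigenvalues, item labels, and loadings shown here are hypothetical values, not results from the study.

```python
def count_factors(eigenvalues):
    """Criterion (1): keep factors whose eigenvalue exceeds 1."""
    return sum(1 for value in eigenvalues if value > 1)


def retain_items(loadings, min_loading=0.4, min_items=4):
    """Apply criteria (2)-(4) to a dict {item: [loading on each factor]}.

    Returns (kept, dropped): `kept` maps a factor index to its items and
    `dropped` lists the items that fail any criterion.
    """
    assigned = {}
    dropped = []
    for item, row in loadings.items():
        salient = [f for f, value in enumerate(row) if abs(value) > min_loading]
        if len(salient) == 1:
            assigned.setdefault(salient[0], []).append(item)
        else:
            # Criterion (3): no loading above the cut-off, or
            # criterion (4): loads onto more than one factor.
            dropped.append(item)
    kept = {}
    for factor, items in assigned.items():
        if len(items) >= min_items:
            kept[factor] = items
        else:
            # Criterion (2): a factor needs at least `min_items` items.
            dropped.extend(items)
    return kept, dropped


# Hypothetical eigenvalues and loadings, for illustration only.
eigenvalues = [7.2, 2.9, 1.6, 0.9, 0.7]
loadings = {
    "item_a": [0.72, 0.10, 0.05],
    "item_b": [0.65, 0.12, -0.08],
    "item_c": [0.58, 0.21, 0.02],
    "item_d": [0.61, 0.05, 0.11],
    "item_e": [0.45, 0.48, 0.03],  # cross-loading
    "item_f": [0.22, 0.31, 0.18],  # no salient loading
    "item_g": [0.05, 0.70, 0.10],  # lone item on its factor
}

n_factors = count_factors(eigenvalues)
kept, dropped = retain_items(loadings)
print(n_factors, kept, dropped)
```

In this toy input only the first factor keeps four clean items; the cross-loading item, the item without a salient loading, and the lone item on the second factor are all reported as dropped.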

Item Analysis
Using the data of the current sample, we calculated the internal reliability of the original tool (MBI-HSS) using Cronbach's coefficient alpha, which yielded estimates of 0.87 for the whole scale, 0.85 for Emotional Exhaustion, 0.80 for Depersonalization, and 0.75 for Personal Accomplishment.
Item analysis was conducted for the MBI-HSS, and the results are shown in Table 2. For normality assessment, the study refers to the guidelines of Everitt (1975), which state that skewness and kurtosis must be within ±3 and ±10, respectively, to show univariate normality of the variable [28]. Skewness ranged from −1.06 to 1.30 and kurtosis ranged from −1.28 to 0.53, showing that the items met these normality guidelines. Next, item discrimination was examined to determine whether the items can effectively gauge the differences among participants [32]. The critical ratios (t values) for all items were greater than 1.96, ranging from 21.28 to 46.145. This finding indicates that all items showed discriminant power, i.e., they are able to effectively measure the differences among participants. The item-total correlation values (one factor/dimension) ranged from 0.11 to 0.66, while the item-subscale correlation values (three factors/dimensions) were larger, ranging from 0.38 to 0.77. Nevertheless, some of the item-subscale correlations were low, for instance MBI 16 (0.439), MBI 18 (0.383), and MBI 12 (0.398). These low correlations indicate that some of the MBI-HSS items were inconsistent with their subscales; thus, the study concludes that there is a need to revise the MBI-HSS model for healthcare professionals in hospitals within Saudi Arabia.
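The reliability and item-discrimination statistics reported above can be sketched in a few lines. This is a minimal illustration rather than the SPSS procedure used in the study: the response matrix is hypothetical toy data, and the critical ratio is computed as an independent-samples t statistic between the upper and lower 27% of respondents ranked by total score (with equal group sizes, the pooled and unequal-variance forms of the statistic coincide).

```python
import math
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def critical_ratio(rows, item, fraction=0.27):
    """t statistic comparing one item's mean in the top and bottom groups.

    Respondents are ranked by total score; the top and bottom `fraction`
    of the sample form the two groups (equal sizes by construction).
    """
    ranked = sorted(rows, key=sum)
    n = max(2, round(len(ranked) * fraction))
    low = [row[item] for row in ranked[:n]]
    high = [row[item] for row in ranked[-n:]]
    standard_error = math.sqrt((variance(high) + variance(low)) / n)
    if standard_error == 0:
        raise ValueError("no variance within the extreme groups")
    return (mean(high) - mean(low)) / standard_error


# Hypothetical responses: 10 respondents x 3 items on the 0-6 scale.
TOY_DATA = [
    [0, 1, 0], [1, 1, 2], [1, 2, 1], [2, 2, 3], [3, 2, 2],
    [3, 4, 3], [4, 3, 4], [4, 5, 5], [5, 5, 4], [6, 5, 6],
]

alpha = cronbach_alpha(TOY_DATA)
t_first_item = critical_ratio(TOY_DATA, item=0)
print(round(alpha, 3), round(t_first_item, 2))
```

An item whose critical ratio exceeds 1.96 would be read, as in the text, as discriminating between high and low scorers at the 0.05 level.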

Assessment of Maslach Burnout Inventory-Human Services Survey (MBI-HSS) Constructs
The aim of this study is to determine how well the Gulf Cooperation Council healthcare professional data reproduce the original factorial structure (EE, DP, and PA) offered in the MBI-HSS manual [1]. According to the 2018 MBI-HSS manual, the factors were found to be correlated to some extent; thus, the study assumed correlations between factors. Results from the CFA of the original model showed unsatisfactory model fit, supported by a significant chi-square value of 1897 (p < 0.001), a GFI value of 0.79, an AGFI value of 0.74 and an RMSEA value of 0.09 (>0.08). EFA was then conducted for the 22-item MBI-HSS, and the results are summarized in Table 3. The KMO value of 0.90 indicates that the sampling was adequate, while the significant Bartlett test shows that the sample was fit for exploratory factor analysis. The analysis produced a three-factor model, which accounted for 56.3% of the total variance. Regarding factor loadings, all items loaded onto their original factor (dimension), except for MBI 11, MBI 16, MBI 18, and MBI 20. MBI 18 and MBI 20 were omitted because of low factor loadings and loading onto more than one factor. As for MBI 11, it seems that "Hardened emotionally" is more relevant to Emotional Exhaustion (EE) for healthcare professionals in Saudi Arabia. MBI 16 (Stress with patient), on the other hand, loaded onto Depersonalization (DP) rather than its original factor, EE. Next, a confirmatory factor analysis was conducted for the three-factor MBI-HSS obtained from the EFA of the 300-case subsample. The 588-case subsample isolated earlier was used in this CFA. The results of the CFA on the revised model are shown in Figure 1. The chi-square value of 788 was significant (p < 0.001). The other indices gave an RMSEA value of 0.06 (<0.08), a GFI value of 0.91 (>0.90), a CFI value of 0.911 (>0.90) and an AGFI value of 0.89 (>0.8) for the new measurement model.
This shows that the revised three-factor model fits adequately with the data of healthcare professionals in Saudi Arabian hospitals. In addition, composite reliability (CR) was calculated for each construct. Construct EE showed a CR value of 0.898, construct PA showed a CR value of 0.790, and construct DP exhibited a CR value of 0.790; thus, the study concludes that the measurement model has good reliability (internal consistency). The study holds that the revised model (Figure 1) will be more appropriate than the original measurement model for assessing burnout syndrome among healthcare professionals in hospitals around Saudi Arabia.
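The comparison of fit indices and the composite reliability calculation in this section can be sketched as follows. The fit values are those reported in the text; the cut-offs are the GFI, AGFI, and RMSEA criteria stated in the analysis plan. The composite reliability function assumes the commonly used formula based on standardized loadings, CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of (1 - loading^2)); the source does not state which formula was applied, and the loadings in the example are hypothetical.

```python
# Cut-offs for the fit indices: direction of the comparison and threshold.
CUTOFFS = {"GFI": (">", 0.90), "AGFI": (">", 0.80), "RMSEA": ("<", 0.08)}


def check_fit(indices):
    """Return, per index, whether the reported value meets its cut-off."""
    results = {}
    for name, (direction, cutoff) in CUTOFFS.items():
        value = indices[name]
        results[name] = value > cutoff if direction == ">" else value < cutoff
    return results


def composite_reliability(loadings):
    """Composite reliability from standardized factor loadings.

    Assumes uncorrelated errors, with each error variance equal to 1 - loading^2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - value ** 2 for value in loadings)
    return squared_sum / (squared_sum + error_variance)


# Fit indices as reported in the text for the two models.
original_model = {"GFI": 0.79, "AGFI": 0.74, "RMSEA": 0.09}
revised_model = {"GFI": 0.91, "AGFI": 0.89, "RMSEA": 0.06}

original_fit = check_fit(original_model)
revised_fit = check_fit(revised_model)
print(original_fit)  # {'GFI': False, 'AGFI': False, 'RMSEA': False}
print(revised_fit)   # {'GFI': True, 'AGFI': True, 'RMSEA': True}

# Hypothetical standardized loadings for a four-item construct.
example_cr = composite_reliability([0.7, 0.7, 0.7, 0.7])
print(round(example_cr, 3))
```

The chi-square criterion is left out of the check because both models produced a significant chi-square, which is expected with samples of this size.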

Discussion
The aim of this study was to analyse the psychometric properties of the MBI-HSS scale and create a version best suited for measuring burnout in the Gulf Cooperation Council Region. The critical ratios obtained from item analysis of the original version of the tool indicated that the items were able to measure differences among participants [21]. The item-total correlation values were larger for three factors than for one factor. This finding is in accordance with previous studies [13,15,33], which have shown that the three-factor model of the MBI-HSS is the best fit when assessing burnout [8]. On the other hand, our results contradict some previous studies which suggest that a two-factor model fits better [3,17]. Results from the CFA of the original model were considered unsatisfactory, in accordance with previous literature [13,17]. This indicated that the model needed to be revised in order to be used in determining burnout in the Gulf Cooperation Council Region. EFA was then conducted on the 22-item MBI-HSS, and the number of factors and the items were retained using the following criteria: (1) eigenvalues >1, (2) a factor structure of at least four items, (3) factor loadings above 0.4, and (4) items that did not load onto more than one factor [15,22,31]. In this study, MBI 22 showed a higher factor loading on its original dimension than in previous studies [15,34,35], in which the factor loading of item 22 was found to be less than 0.4. The factor loading of item 14 was less than 0.4 in a study conducted in China [34], which suggested its deletion; in this study, however, a greater value was observed and, therefore, this item was kept unchanged when constructing the new model. Confirmatory factor analysis was performed on the reconstructed model developed in this study.
The significant chi-square value could indicate a lack of fit of the model with the data; however, according to Arbuckle (2003), it is normal for the chi-square to become significant as the sample size grows larger, and hence we also referred to other indices for evaluation of the model [31]. The model satisfied the criteria put forth in previous research and showed a GFI value >0.90, an AGFI value >0.8 and an RMSEA value <0.08. This also indicated that it is a better fit with the data than the original version [13,36,37]. The burnout levels in the sample used in this study indicate higher mean levels of emotional exhaustion and personal accomplishment and a lower mean level of depersonalization in the Gulf Cooperation Council Region; similar results were found in a recent study performed in Saudi Arabia and the United Arab Emirates [20]. This study has several limitations. First, the data are self-reported: the responses of the individuals are subjective and not backed up by clinical evidence. In addition, the study only includes private hospitals, and hence healthcare professionals employed at public hospitals are not represented. This greatly affects the generalizability of the results. Therefore, we would encourage further studies of this nature to be performed within this region with public hospitals incorporated into the sample as well. The current study used a non-probability convenience sampling technique, which might have limited the generalizability of the findings. It is also important to note that certain factors, such as the unequal gender distribution of the participants (more females than males), the differences in their nationalities (more non-Saudis than Saudis), and the fact that the majority of individuals in the sample are nurses, could also have limited the generalizability of the results.

Conclusions
This study involved the participation of 888 healthcare providers employed in the Gulf Cooperation Council Region with direct patient contact. The results from confirmatory factor analysis on the original MBI-HSS model showed unsatisfactory fit with significant Chi square value of 1897 (p < 0.001), GFI value of 0.79, AGFI value of 0.74 and RMSEA value of 0.09 (>0.08). This indicated the need for the creation of a new version. Exploratory factor analysis was conducted to develop a new version of MBI-HSS which retained the original three-factor structure with the omission of items 18 and 20, item 11 moved into EE and item 16 moved into DP. Confirmatory factor analysis of this reconstructed version provided a chi square value of 788 which was significant with p value <0.001. Investigation of other indices gave us an RMSEA value of 0.06 (<0.08), GFI value of 0.91 (>0.90) and an AGFI value of 0.89 (>0.8). These results show a more satisfactory fit to the data than the results produced by the CFA of the original version indicating that the newly constructed version can allow for a more accurate estimation of burnout levels. Due to the high levels of burnout in this region [21] along with the lack of an MBI-HSS version validated specifically for this region, we believe that the model of MBI-HSS produced in this study can be of great benefit to the Gulf Cooperation Council Region.