Measuring the psychosocial work environment in a valid and reliable way is increasingly seen as a necessary part of systematic occupational safety and health management [1]. A widely used research-based non-commercial tool for psychosocial workplace surveys is the Copenhagen Psychosocial Questionnaire (COPSOQ). Originally developed in 2000 for use in research and at workplaces in Denmark, it has today been validated in 18 countries, and research results from even more language versions have been reported in hundreds of peer-reviewed articles [5]. COPSOQ is intended both for workplace measurement, usually comparing work groups, departments or companies, and for research, e.g., investigating effects of work environment on health or labour market attainment. The International COPSOQ Network recently released a revised third version, COPSOQ III [5], which is an update of the two previous versions of the instrument [6]. The changes are primarily based on experiences from practical use of previous versions for workplace assessments and research but have also taken labour market changes and theoretical developments into consideration [5
]. Importantly, the new version is designed to allow flexible adaptation to national and industry-specific contexts without compromising the potential for international comparisons and for comparisons over time. Items labelled as “core”, “middle” and “long” compose the international COPSOQ III structure. While inclusion of core items is mandatory for national versions, it is important to underline that they do not constitute a short version of the instrument. National versions can be established by the national COPSOQ teams of each country based on all “core” items supplemented with enough items labelled as “middle” or “long” to form a reliable and relevant measurement in the given context. Therefore, all future national versions will include the same mandatory core items, while the total number of items in scales and number of scales are allowed to differ [5
The new Swedish standard version of COPSOQ III is based on preceding development, adaptation and testing of COPSOQ II for use at workplaces and research in the Swedish context [8], also taking the new COPSOQ III into account [5]. Several studies have corroborated different aspects of reliability and validity of the Swedish version of COPSOQ II. An iterative process including translation-back-translation procedures and cognitive interviewing methods supported the face and content validity, as well as the cross-cultural equivalency of COPSOQ II and COPSOQ III test items [8]. The nomological validity has been corroborated by using the instrument to operationalize an extended job demands-resources (JD-R) model with aspects of workability as outcome [11] as well as need for recovery [12], and also in relation to the newly introduced COPSOQ III dimensions of Work Engagement, Quality of Work [13] and Cyber Bullying [14]. Studies across different occupations have corroborated the internal consistency reliability and construct validity of the scales [11]. The ability to distinguish different groups (organizations with similar missions, work teams or occupational groups) has been demonstrated [20], as has the relevance of the instrument for multilevel analyses and for intervention and organizational change studies [23].
As part of a research and development project for use in Swedish workplaces, several workplace surveys have been conducted in close collaboration with stakeholders from different organizations. The data and experiences from this process have contributed to the international development of COPSOQ III, e.g., selection of items, changes in wording and inclusion of new dimensions [5
The Swedish standard version of COPSOQ III has now been developed. As it is adapted to the Swedish context, it differs from the international version of COPSOQ III, which showed satisfactory basic psychometric properties in findings from six countries (including data collected at Swedish workplaces) [5]. The factor structure of the mandatory “core” items defined for COPSOQ III has been validated in Canada [31] and the COPSOQ III domain for Social Capital has been validated by qualitative and quantitative methods in Sweden [10].
Aggregated group means for organizations or departments are of high relevance for the assessment, implementation and evaluation of organizational interventions [33]. Although this approach is widely used when COPSOQ is applied for psychosocial risk management in workplaces, the emphasis of validation studies has so far been on the individual level. A validation study is therefore needed to present and evaluate the adapted Swedish national standard version of COPSOQ III and to establish population-based benchmarks for Sweden; in particular, the aggregation to workplace group means has yet to be validated.
A Need for Benchmarks for Use at Workplaces
Benchmarks can provide various kinds of relevant information for use at workplaces. Population-based benchmarks/reference values are the key to interpreting COPSOQ survey results from a risk management perspective [34]. For COPSOQ II, such population-based reference values are established, for example, for the working populations in Denmark, Spain, Canada, and France. For Sweden, the opportunities for comparisons have so far included mean scores from a convenience sample of workplace surveys (www.copsoq.se). Such comparisons can give an idea about the level for each scale for specific occupations but are not representative of the average level in the population. This forces occupational safety and health companies, organizational consultants, HR departments, policy-makers and researchers to interpret results from Swedish surveys with Danish reference values in order to assess psychosocial risks. This is not an ideal situation for several reasons: the data used for establishing the Danish reference values were collected 15 years ago [7]; the Danish labour market and legislation differ from the Swedish; the Danish benchmarks have not been validated for use in the Swedish context or with a Swedish language version; and finally the values relate to COPSOQ II. Introducing COPSOQ III accentuates the need for updated reference values based on the Swedish labour market of today.
The purpose of this study is to present and evaluate aspects of reliability and construct validity at both individual and workplace levels for the Swedish standard version of COPSOQ III, with the aim of establishing benchmarks for the organizational and social work environment for the adult working population in Sweden.
2. Materials and Methods
The present validation study builds on data from a cross-sectional national survey for the establishment of reference values and for psychometric evaluation of scale characteristics at the individual level. Nested data from a convenience sample of 51 workplace surveys is used for evaluation of the appropriateness of aggregating individual-level COPSOQ dimensions to the organizational level.
2.1. Random Sample
A cross-sectional survey was conducted by Statistics Sweden (SCB) at the request of the research group. Data collection took place from September to November 2018 by post, including an information letter, a paper version together with a stamped return envelope, and a personal link to a web questionnaire. Non-respondents received up to two reminders, the last of these included new paper questionnaires and return envelopes.
From the Swedish employment directory, SCB drew a random sample of 11,556 persons from all 4,525,274 inhabitants in Sweden aged 20–65 years and registered as gainfully employed. In total, 3642 responded (30.9%). Of these, 53 declined participation, 374 were not currently in work, and 33 were excluded based on an ID-check comparing register data with self-reported data. Due to a response rate as low as 6% for those aged 20–24 years and the fact that many in this age group were still in education, we decided to exclude this age group (74 cases) from the analyses for this paper. In addition, 185 business owners and 76 respondents stating that they had neither a superior nor colleagues were excluded from all main analyses. For an overview of the sampling process, see Figure 1
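The exclusion steps described above can be tallied as a simple running subtraction. The following is a minimal sketch: the counts are taken from the text, while the dictionary labels and the function name are illustrative assumptions rather than part of the study's own procedure.

```python
from typing import Dict

# Counts as reported in the text; the labels are illustrative.
RESPONDED = 3642
EXCLUSIONS: Dict[str, int] = {
    "declined participation": 53,
    "not currently in work": 374,
    "excluded after ID check": 33,
    "aged 20-24 years": 74,
    "business owners": 185,
    "neither superior nor colleagues": 76,
}


def analytic_sample(start: int, removed: Dict[str, int]) -> int:
    """Subtract each exclusion in turn and return the remaining count."""
    remaining = start
    for reason, n in removed.items():
        remaining -= n
        print(f"after excluding {reason}: {remaining}")
    return remaining


n_analytic = analytic_sample(RESPONDED, EXCLUSIONS)
```

The final count agrees with the size of the analytic sample reported for the study population.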
In general, women, the oldest age group, and those with tertiary education were the most likely to respond. This was also reflected in the differences seen across major occupational groups based on the International Standard Classification of Occupations, ISCO-08. People born in Scandinavia were more likely to respond than those born elsewhere, and those with the highest income responded to a larger extent than others.
The study population is presented in Table 1. Out of the 2847 respondents in the analytic sample, 56% were women, the most frequent major occupational group was Professionals (group 2, 35%), and less than half of the respondents worked in the private sector (47%). Two out of three were in a non-managerial position (67%) and most respondents (81%) reported having direct contact with patients, customers, clients, pupils, etc., at work. More details regarding the study population stratified by major occupational groups (ISCO-08 1-digit) are presented in Table A1.
2.2. Workplace Sample
Cross-sectional data was collected from 2016 to 2019 as part of a validation and development project for the use of COPSOQ at workplaces (Grant: AFA Insurance 130301). All staff members in a convenience sample of 51 workplaces (organizations with max. 200 employees each; 26 public and 25 private) received an email with a link to an online questionnaire and an introduction and information about the research project. Each survey was open for 3–4 weeks and included two reminders. The overall response rate for the workplaces was 77% (ranging from 50% to 100%) and analyses included data from 1818 non-managerial employees. The average number of respondents at the workplaces was 28 (SD 18, range 8–138). For this convenience sample, 28% of the employees were under 35 years of age, 22% were 35–44, 27% were 45–54, and 21% were aged 55 or older and 51% were women. The corresponding distribution for the target population 2017 was according to SCB statistics: 26% below age 35, 26% were 35–44 years old, 28% were 45–54 and 21% were 55 or older and 48% were women. Most employees were Professionals (36% ISCO group 2), Technicians and Associate Professionals (24% ISCO group 3), Clerical Support Workers (11% ISCO group 4) or Services and Support Workers (12% ISCO group 5).
The questionnaire for the national study comprised 132 items in total and a free text field for comments. We included 12 background factors regarding work situation and personal characteristics in addition to register data obtained from Statistics Sweden. From COPSOQ III, 85 items were included in the questionnaire to cover 33 dimensions. Furthermore, 35 items were included for other research purposes. With regard to the COPSOQ III items, the questionnaire administered to employees at the workplaces was similar to the one used for the national survey.
2.4. The National Swedish Standard Version of COPSOQ III
In the present study, we evaluate the national Swedish standard version of COPSOQ III. It includes 76 items (according to the international COPSOQ III structure: 32 mandatory “core” items, 15 additional “middle” items and 29 additional “long” items) to cover 33 work environment dimensions (24 multi-item scales and nine single-item measures, including five items on conflicts and offensive behaviours). Table A2 in Appendix B gives an overview of the Swedish standard version of COPSOQ III and its correspondence with the international middle version of COPSOQ III and with the Swedish middle version of COPSOQ II. A detailed overview, including formulations in Swedish, is available as online Supplementary Materials
. In relation to the previous Swedish version, the present third version includes five new dimensions and six dimensions have changed name, one dimension has changed response options, 16 dimensions have a reduced number of items, two items are replaced and five have changes in wording. Decisions regarding the selection of dimensions were guided by the perceived relevance to the Swedish context, cognitive interviews, pilot tests and dialogue with stakeholders, taking the item level in the international COPSOQ III and item-level ICC(1) values into consideration for not jeopardizing the ability to differentiate workplaces, as recently suggested by Bliese and colleagues [35
Scales were computed as means of items with range 0–100, where the scale score was set to missing if respondents had replied to fewer than half of the items included in the scale [5]. Each scale was scored in the direction indicated by its name [5].
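The scoring rule can be sketched as below. This is a minimal illustration under stated assumptions: item responses are taken to be already coded on the 0–100 range, unanswered items are represented as `None`, and the function name `scale_score` is hypothetical.

```python
from typing import Optional, Sequence


def scale_score(items: Sequence[Optional[float]]) -> Optional[float]:
    """Mean of answered items (0-100), or None if fewer than half were answered."""
    if not items:
        raise ValueError("a scale needs at least one item")
    answered = [x for x in items if x is not None]
    # "Fewer than half answered" means 2 * answered < number of items.
    if 2 * len(answered) < len(items):
        return None
    return sum(answered) / len(answered)


# Two-item scale with one answer: exactly half answered, so a score is returned.
two_item = scale_score([75, None])
# Three-item scale with one answer: fewer than half answered, so missing.
three_item = scale_score([100, None, None])
```

Note that under this rule a two-item scale still receives a score when only one item is answered, because one of two is not fewer than half.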
To draw correct inferences about the target population, two sets of weights were calculated for the nationally representative sample: one based on sex, age, income and educational level for calculating benchmarks for the general population of 25–65-year-old employees in Sweden, and another based on sex and age for calculating representative mean scores for each of the ISCO major occupational groups. The benchmarks for the Swedish standard version of COPSOQ III were computed as mean scores with standard deviations for scales, and as frequencies of conflicts and offensive behaviours such as bullying, harassment and violence, based on weighted data to match the target population of 25–65-year-old employees working in Sweden. Mean scale scores, standard deviations and frequencies of conflicts and offensive behaviours were also computed for each major occupational group, weighted within each group to match the target population (ISCO 1-digit, 25–65 years). Internal consistency reliability was analysed with Cronbach’s alpha for scales with three or more items and the Spearman-Brown coefficient for two-item scales [36]. The proportions of respondents selecting the lowest (floor) and highest (ceiling) response options for all items in a scale were determined for all scales, as well as the proportion of respondents having replied to fewer than half of the items in each scale (scale missing). More than 15% of the respondents choosing the lowest or highest response options was considered evidence of a floor or ceiling effect, respectively [37]. Mean scores and frequencies of conflicts and offensive behaviours were calculated according to sex (men/women), work sector (private/public) and white/blue-collar work (ISCO groups 1–2–3 versus 6–7–8–9). Differences within each group were tested with t-tests and Chi-square tests, and Cohen’s d was calculated for evaluation of the effect of sex, sector, and kind of work. A Cohen’s d value of 0.2 indicates a small effect, 0.5 a medium effect and 0.8 a large effect [38], and a 5–10 point mean score difference is considered a minimum important difference [39].
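The reliability and effect-size statistics named above follow standard textbook formulas, sketched here in plain Python. The function names, the unweighted computation and the toy data are assumptions for illustration only; the study itself worked with weighted survey data and may have used different software and settings.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item columns (one list per item)."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]  # sum score per respondent
    return k / (k - 1) * (1 - sum(variance(col) for col in items) / variance(totals))


def pearson_r(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(x, y):
    """Spearman-Brown coefficient for a two-item scale: 2r / (1 + r)."""
    r = pearson_r(x, y)
    return 2 * r / (1 + r)


def floor_ceiling(scores, low=0, high=100):
    """Shares of respondents at the lowest and highest possible scale score."""
    n = len(scores)
    return sum(s == low for s in scores) / n, sum(s == high for s in scores) / n


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation of two groups."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled


# Toy data (not study results): two items answered by five respondents.
item_1 = [0, 25, 50, 75, 100]
item_2 = [25, 0, 75, 50, 100]
sb = spearman_brown(item_1, item_2)
```

A floor or ceiling share above 0.15 from `floor_ceiling` would correspond to the 15% criterion stated in the text.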
ICC(1) and ICC(2) were calculated for each dimension based on aggregation of individual level data to ISCO major occupational group (national sample) and to workplace (workplace sample). ICC(1) represents the proportion of variance in the employees’ responses that can be explained by their membership of a group (occupation or workplace) [35]. ICC(1) values of 0.05 can be considered as a small to medium effect and higher values indicate stronger effects [42]. ICC(2) is an estimate of the reliability of the aggregated group means [35]. Values <0.5 indicate poor reliability, 0.5–0.75 moderate and >0.75 good reliability of group-level means [43]. Finally, for the sample of workplaces, we calculated the aggregated level mean, standard deviation, minimum, maximum and range for each scale, and compared the mean scores with the benchmark.
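One common way to obtain ICC(1) and ICC(2) is from the mean squares of a one-way ANOVA with group as the factor. The sketch below assumes that approach and, for unequal group sizes, uses the average group size as k; both choices are assumptions, since the text does not state the exact estimator. The function names and toy data are illustrative.

```python
from statistics import mean


def icc_one_way(groups):
    """Return (ICC(1), ICC(2)) from one-way ANOVA mean squares.

    `groups` is a list of lists, one list of individual scores per group.
    """
    n_groups = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)
    k = n_total / n_groups  # assumption: average group size for unbalanced data
    icc1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc2 = (ms_between - ms_within) / ms_between
    return icc1, icc2


def reliability_label(icc2):
    """Classify ICC(2) with the cut-offs given in the text."""
    if icc2 < 0.5:
        return "poor"
    if icc2 <= 0.75:
        return "moderate"
    return "good"


# Toy data (not study results): three workplaces with three employees each.
icc1, icc2 = icc_one_way([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
```

In this toy example most of the variance lies between groups, so both coefficients are high; real survey data would typically give far lower ICC(1) values.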
Bivariate Pearson correlations between scales were calculated for the national sample of 25–65-year-old employees (individual level) and for the convenience sample of workplaces (individual and workplace level) for evaluation of construct validity (distinctiveness of dimensions and concurrent validity).
Informed consent was obtained from all individual participants included in the study. All procedures performed were in accordance with the ethical standards of the national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The Regional Ethical Review Board of Sweden approved the study (Dnr 2015-476; 2018–392; 2019-05904).
3. Results
Benchmarks for the Swedish standard version of COPSOQ III are presented in Table 2, together with the psychometric characteristics of the scales.
The internal consistency reliability was above 0.70 for all scales, except for the two-item scale for Quality of Work (0.69). Most dimensions had low floor and ceiling effects. High floor effects and low mean scores were seen for Job Insecurity (34.8%) and Insecurity over Working Conditions (28.1%). Strong ceiling effects and high mean values were seen for the single item Meaning of Work (40.6%) and for Social Support from Supervisor (30.3%) and Social Support from Colleagues (32.5%). Internal non-response for dimensions was between 0.4% and 1.6%.
The mean scores differed statistically significantly for most scales by sex, work sector and white/blue-collar work (Table 3).
Moderate to large differences in mean scores were found between white- and blue-collar workers, in particular. White-collar workers had higher mean scores than blue-collar workers for Quantitative Demands, Emotional Demands, Influence, Possibilities for Development, Variation and Meaning of Work, and lower mean scores for Job Insecurity. Emotional Demands was the only dimension showing large differences for sex, work sector and kind of work. Women, employees working in the public sector and white-collar workers reported the highest levels of Emotional Demands (scale means 15–19 points higher than for their respective counterparts). We found a corresponding pattern with the same groups most exposed to conflicts and offensive behaviours. An additional comparison revealed that business owners scored statistically significantly higher for the outcome dimensions Work Engagement (77) and Job Satisfaction (72), and lower for Stress (31) and Burnout (31) than the study sample did (results not shown in table).
Table 4 displays psychometric characteristics for major occupational groups based on the ISCO-08 classification. Of the 24 relevant scales, 16 revealed satisfactory reliability values for all major occupational groups. Reliability coefficients below 0.70 were mainly seen among Managers and Elementary Occupations (e.g., Work Pace, Recognition, Role Conflicts and Quality of Work), and only in one case did a reliability coefficient fall below 0.60 (Work Pace/Elementary Occupations). Managers reported the most beneficial scores across occupations (14 out of 28 scales) and the group having the most problematic weighted mean scores was Plant and Machine Operators and Assemblers (13 out of 28 scales). Services and Support Workers was the group most exposed to Threats of Violence, Physical Violence and Sexual Harassment. Clerical Support Workers reported bullying most frequently, while Managers were the group most exposed to Cyber Bullying. The widest range for mean scores across ISCO major occupational groups was found for Emotional Demands, Variation, Quantitative Demands and Influence.
The bivariate intercorrelations between dimensions for the total national sample (individual level data) are presented in Table 5 and for the workplace sample (both individual and organizational level) in Table 6. Excessively strong intercorrelations may indicate that the scales do not measure distinct constructs. For individual level data, only 6 out of the 378 correlations in the national sample and 9 correlations in the workplace sample were above 0.70. The strongest correlations at the individual level were largely those between scales that were also most strongly correlated at the workplace level; for example, the correlation between Stress and Burnout ranged from 0.79 to 0.83. The correlations were in general stronger between scales aggregated to the organizational level than the corresponding correlations at the individual level. Nevertheless, for the scales Role Clarity and Quantitative Demands, most of the correlations with other dimensions were strongest at the individual level. We found differences in the pattern of correlations between individual and workplace level data in relation to a few dimensions, in particular Role Clarity and Job Insecurity. For example, a moderate negative correlation was seen between Job Insecurity and Quantitative Demands (−0.53) at the organizational level, while the corresponding correlation was non-significant at the individual level. Conversely, a moderate positive correlation between Role Clarity and Social Community at Work was significant at the individual level (0.37/0.34) but non-significant at the workplace level.
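The count of 378 correlations equals the number of unique pairs among 28 dimensions, and flagging pairs above the 0.70 threshold is a simple filter over those pairs. In the sketch below, the figure of 28 is inferred from the reported 378 rather than stated for this table, the use of absolute values is an assumption, and all correlation values except the Stress and Burnout figure (the lower end of the reported range) are made up for illustration.

```python
from itertools import combinations

N_DIMENSIONS = 28  # assumption: inferred from the 378 correlations reported
n_pairs = N_DIMENSIONS * (N_DIMENSIONS - 1) // 2


def strong_pairs(corr, names, threshold=0.70):
    """Return (scale_a, scale_b, r) for pairs with |r| above the threshold."""
    return [
        (a, b, corr[(a, b)])
        for a, b in combinations(names, 2)
        if abs(corr[(a, b)]) > threshold
    ]


# Illustrative values only; 0.79 is the lower bound reported for Stress-Burnout.
names = ["Stress", "Burnout", "Influence"]
corr = {
    ("Stress", "Burnout"): 0.79,
    ("Stress", "Influence"): -0.20,
    ("Burnout", "Influence"): -0.25,
}
flagged = strong_pairs(corr, names)
```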
Table 7 displays measures relating to aggregation of data to major occupational groups and to organizational level. The ICC(2) scores indicate a moderate to good reliability of group mean scores for major occupational groups as well as for workplaces. Only aggregation of the individual characteristic Self-Rated Health to workplace level showed poor reliability. A small to medium effect of respondents’ major occupational group was seen for Quantitative Demands, Emotional Demands, Influence, Possibilities for Development, Variation, Meaning of Work, and in addition for Job Insecurity (ICC(1)). In relation to the effect of workplace, the largest explained variance was seen for scales reflecting job demands and aspects of leadership, while small to medium effect sizes were found for all other exposures. The aggregated workplace mean scores ranged from 23 to 54 points.
4. Discussion
In the present study, we have evaluated the reliability and construct validity of a Swedish standard version of COPSOQ III at both individual and organizational level and established national benchmarks for workplace surveys. A trade-off exists between the obvious need for a questionnaire of high relevance for the national context and the need to keep a high degree of correspondence with other national versions for facilitation of valid comparisons. Experiences from previous versions of the instrument have shown that practitioners and researchers to a high extent share a wish for shorter questionnaires. We have chosen to reduce the number of items in many dimensions in order to be able to make room for new dimensions covering Work Engagement, Quality of Work, Job Insecurity, Insecurity over Working Conditions and Cyber Bullying. Scales including only a few items potentially reduce the reliability and validity of the measurement. Nevertheless, our overall findings indicate that the Swedish national standard version of COPSOQ III has good psychometric properties for its intended uses.
4.1. Reliability and Scale Characteristics at Individual Level Based on the National Survey
The internal consistency reliability of the scales was satisfactory for the study population as a whole. This corresponds with findings from the international COPSOQ III validation study [5]. An unacceptably low value for Work Pace was seen for respondents with an Elementary Occupation, and the reliability was questionable for Work Pace, Role Conflicts and Quality of Work for Managers, Craft and Related Trade Workers, and Elementary Occupations. This calls for caution when interpreting results for these specific combinations of scales and major occupational groups. In the future, adding more items to these scales should be considered in the Swedish context.
Compared to findings from the Danish COPSOQ II study [7] and the international COPSOQ III study [5], the internal non-response was low for all scales, especially for Social Support from Supervisor and Vertical Trust. Scales referring to managers and work climate can in some cases be difficult to reply to, for example in complex organizations or among the self-employed [10]. The noticeably lower internal non-response for these scales might be due to stricter inclusion criteria in the present study in combination with the thorough adaptation of formulations based on cognitive interviewing techniques [8].
Floor and ceiling effects were minor for most scales, indicating a good ability of the instrument to discriminate over the full range of the scales. However, for the new dimensions, Job Insecurity and Insecurity over Working Conditions, we found a high floor effect. This finding was not a surprise given the previous findings from the international validation study [5] and from the Sixth European Working Conditions Survey [44]. In contrast, we found large ceiling effects for Meaning of Work, Social Support from Supervisor and Social Support from Colleagues. The finding regarding Meaning of Work is also in accordance with previous findings [5]. Sweden is globally among those countries with the highest proportion of workers employed in service work (2019: 80% [45]), which is typically perceived as more meaningful than manufacturing work. The high levels of reported social support contrast with the levels reported for COPSOQ II for specific occupational contexts in Sweden [11]. This could be a consequence of the COPSOQ III standard version including two rather than three items in each of these scales. The level is also higher than the international results reported for COPSOQ III [5]. This difference can probably be understood in the light of the Swedish workplace culture characterized by shared decision making, avoidance of conflicts and aiming at consensus [46].
4.2. Reliability and Validity of COPSOQ III for Use at Workplaces and for Multilevel Research Design
COPSOQ is a generic instrument intended for research purposes as well as risk management of the psychosocial work environment at workplaces [5]. Accordingly, the ability of scales to distinguish exposures for different occupational groups and across workplaces is of great importance.
Although the instrument collects responses from individual employees, the main intention is to capture workplace and organizational conditions, not individual perceptions. It is thus very important that the aggregated workplace scores refer to something that is shared by the employees in a certain work unit/organization and not just to a mean of largely unrelated individual responses. Our findings corroborated the reliability of such group mean scores regarding psychosocial exposures based on aggregation to occupation and workplace level.
The traditional criterion for the relevance of taking the aggregated level into account is a minimum of 5% explained variance [41]. The amount of variance explained by workplace fulfilled this criterion for all dimensions except Self-Rated Health, which is an individual outcome mainly influenced by non-work-related factors. This underlines the importance of considering the workplace level for research on the psychosocial environment and justifies the relevance of aggregating individual scores to group mean scores when reporting survey results back to workplaces. Our findings corroborate previous research on the COPSOQ II showing that job exposure matrices are of little relevance for psychosocial risk assessment of, e.g., relational factors in workplaces [48]. However, the low amount of variance attributed to the major occupational groups does not imply that occupation is of no relevance, as the ISCO 1-digit grouping comprises many different occupations working in different sectors, etc., within each major group. In a specific context such as public dental services, psychosocial work environment factors have been reported to differ considerably for dentists working in different organizations, while this is not the case for dental nurses and hygienists [22]. Additionally, the traditional criterion has been questioned, as even ICC(1) values as low as 0.01 are in some cases relevant to take into account in multilevel analyses [50].
We found a similar overall pattern of inter-correlations at the individual level across the two samples of the present study and those reported from the international validation study [5]. In general, the strength and direction of correlations supported the concurrent validity of the scales. However, the strength of the inter-correlation between Stress and Burnout and the similarity of the correlations of these two scales with other dimensions call for further clarification of whether they actually represent two separate constructs as measured here.
As one might expect, however, we found differences in the strength of correlations at the individual level when comparing the Swedish with the international findings. In particular, the two new dimensions regarding insecurity showed considerably stronger correlations with other dimensions in the Swedish sample compared to the international average correlations across national samples. A high degree of employment security on the labour market combined with high flexibility decreases the detrimental health effect of individual employees’ perceptions of job insecurity [51]. The Swedish labour market is, however, characterized by high employment security for people in fixed positions, but little flexibility in hiring and firing of workers; this combination may result in especially strong adverse reactions to job insecurity experienced at the individual level [51].
In accordance with what is typically reported [52], the correlations at the aggregated workplace level were in general stronger than at the individual level. We found some interesting differences in the general pattern of correlations between individual and workplace level. This may be due to conceptual differences between aggregated and individual level dimensions [33]. Stronger correlations at the individual level could also indicate individual bias, such as negative affectivity or generalized effects of health, for instance depressive symptoms [53]. Stronger correlations at the organizational level, on the other hand, could indicate generalized effects of managerial practices or financial constraints at the organizational level. For example, the Psychosocial Safety Climate (PSC) of organizations has been shown to act as a precursor to and moderator of job demands and resources in the workplace [54]. This underlines the importance of careful theoretical considerations and the relevance of multilevel study design in work environment research in order to avoid the ecological or the atomistic fallacy.
4.3. Strengths and Limitations
The findings of our study should be seen in the light of some advantages and limitations.
A trade-off exists between the need to optimize the relevance of a generic questionnaire to the local context and the prospects for comparison over time and context. We found it to be possible to reduce the number of items, to maintain a broad coverage and even include new dimensions of high relevance to Swedish regulations (e.g., Work Engagement and Quality of Work). Another advantage is that the Swedish national version of COPSOQ III builds on experiences from COPSOQ II and a careful adaptation process including translation-back-translation, use of cognitive interviews and perceptions from stakeholders of different kinds.
The study design allowed for analyses including individual level data and nested data from workplaces. This adds to the knowledge about the reliability and validity of the instrument for use at workplaces and for integration in multilevel analyses.
The response rate for the workplace sample was a satisfactory 77%, clearly indicating the relevance of the instrument for use in this context. For the national survey the response rate was a less satisfying 31% and for two of the major ISCO 1-digit groups the number of respondents was too low to allow for valid calculation of scale mean scores. However, the strength of this dataset is that it was based on a random sample of wage earners in Sweden and the opportunity of calculating weights for adjustment based on complementary demographic register data. A comparison of weighted and unweighted benchmarks and mean scores (not reported) showed only minor differences in estimates. While the low response rate still is a limitation of the study, we find no indication that selection bias is a major problem for the reported population-based benchmarks and mean values for the major occupational groups, which can thereby be considered representative of the underlying population.
In future studies, it will be relevant to employ a longitudinal multilevel design with integration of self-reported data and register data (e.g., absence, staff turnover and measures of performance). In particular, it will be relevant to evaluate test-retest reliability, responsiveness and predictive criterion validity. Bliese and Jex pointed out that simple analyses of means for people working together may often be appropriate for implementation and evaluation of organizational interventions and are also important to consider in stress research projects [33]. This makes further validation of the multilevel structure of the instrument and evaluation of measurement invariance across different groups and language versions highly relevant.