The Portuguese Third Version of the Copenhagen Psychosocial Questionnaire: Preliminary Validation Studies of the Middle Version among Municipal and Healthcare Workers

A third version of the Copenhagen Psychosocial Questionnaire (COPSOQ III) was developed internationally to respond to new trends in working conditions, theoretical concepts, and international experience. This article presents the preliminary validation studies for the Portuguese middle version of COPSOQ III. This is an exploratory cross-sectional study aimed at the cross-cultural adaptation of COPSOQ III to Portugal, ensuring content and face validity and performing field-testing in order to reduce the number of items and to obtain insight into the data structure, through classical test theory and item response theory approaches. The qualitative study encompassed 29 participants and the quantitative one 659 participants from municipalities and healthcare settings. Content analysis suggested that minor re-wording could improve the face validity of items, while a reduced version, with 85 items, shows psychometric stability, achieving good internal consistency in all subscales. The COPSOQ III Portuguese middle version proved to be a valid preliminary version for future validation studies with various populations, able to be used in correlational studies with other dimensions.


Introduction
In a world where psychosocial factors at work have devastating individual, social, and economic impacts, assessing these factors becomes crucial [1][2][3][4] for individuals, society, and economies all over the world. In this sense, assessing and measuring the psychosocial work environment is increasingly seen as essential, and it is an important asset for organizations that want to add value to their management systems in the scope of continuous improvement and sustainable development.
Following the previous experience with the validation of the COPSOQ II middle version in Portugal, this study aims to present, discuss, and evaluate the cross-cultural adaptation, reliability, and preliminary validity of the Portuguese COPSOQ III Middle Version, with a view to obtaining a research version.

Study Design
This is an exploratory cross-sectional study aimed at the cross-cultural adaptation of COPSOQ III to Portugal, by analyzing the face and content validity (qualitatively), and performing field-testing to reduce the number of items and to obtain insight into the data structure, through classical test theory (CTT) and item response theory (IRT) approaches. The main aim of this study is to achieve a research version for future validation studies. It includes two phases: first, a qualitative pilot study, to analyze face validity and to ensure content validity; second, a quantitative field study, to trim the questionnaire through a reliability approach and to confirm the trimmed solution with construct validity and item response theory analysis. These phases follow the international procedures defined by the International COPSOQ network (https://www.copsoq-network.org/ (accessed on 4 January 2022)), allowing data comparability among different countries and populations.

Participants
For the first phase, 29 participants were invited to take part in the think-aloud procedure and gave informed oral consent. The sample included various geographical regions (48.3% from the north, 34.5% from Lisbon and Tagus Valley, 13.8% from the center and 3.4% from the Algarve), 51.7% were women, the mean age was 42.6 years (±11.3; min = 21; max = 65), and participants had different occupations and education levels (from secondary school (24.1%) to graduation levels (75.9%)).
To trim the version resulting from the think-aloud procedure (first phase), an online version of the questionnaire was sent to a municipality and to healthcare settings. The informed consent was presented on the first page of the questionnaire, which could only be accessed by those who had given their informed consent. The final sample included 659 participants (488 females), aged between 20 and 68 years old, with a mean age of 47.5 years (SD = 9.4). Regarding the professional sectors, 190 participants belonged to the healthcare sector and 469 came from a municipality. This was a convenience sample adequate for this exploratory study.

Instruments
The evolution of the middle version from COPSOQ II to COPSOQ III was mainly based on changes of dimensions and items: the addition of the dimension "control over working time", the split of the "job insecurity" dimension into two dimensions ("job insecurity" and "insecurity over working conditions"), the addition of items in some dimensions (e.g., the dimension "trust regarding management" was relabeled "vertical trust" and a new item was introduced), the relabeling of some dimensions (e.g., the dimension "rewards" was relabeled "recognition"), and the rephrasing of some items (e.g., items included in the dimension "social support from supervisor" were rephrased to stress that support should be asked when needed) [7]. Based on the knowledge of these changes and on the validation project of the Portuguese COPSOQ II middle version [12], with the corresponding cross-cultural adaptation of the instrument to Portuguese, it was decided to use the tool obtained from the previous validation project (maintaining all the dimensions and items but one: "offensive behaviors" is not included in the middle version of COPSOQ III) and to translate only the new items and scales by using a pool of seven experts from different disciplines (Psychology, Ergonomics, Engineering, and Management), diverse professional backgrounds (academic, industrial), and a consensus technique. The Portuguese COPSOQ II middle version included items and dimensions from the long version [12], and this option was maintained for the COPSOQ III middle version. All items were measured using a 5-point Likert scale [5,6,7,12,13], from "never" to "always" and from "none" to "extremely". The dimension "Self-rated Health" is measured with a 5-point scale from "Excellent" to "Poor".

Methods
After participants had completed the questionnaire, the think-aloud method was used [20] to ensure content validity, defined by the COSMIN group as 'the degree to which the content of an instrument is an adequate reflection of the construct to be measured' [21], and to assess face validity, defined as 'the degree to which the items of an instrument indeed look as though they are an adequate reflection of the construct measured' [21]. Participants were asked to complete the questionnaire and, afterwards, to comment on the appropriateness, comprehensibility, relevance, and ambiguity of the items and on problems with response categories. Additionally, they were asked about their own interpretation of the different terms [20]. All the comments and the time taken to complete the questionnaire were recorded. This step was followed by a content analysis and a qualitative analysis of the results, carried out by the pool of seven experts, aimed at selecting the relevant changes to be implemented. Given the participants' widespread complaint that the questionnaire was too long, thus compromising its completion, the version of the questionnaire that came out of the think-aloud method was submitted to a trimming procedure. This trimming consisted of three different stages:

Trimming based on Reliability Analysis
At this stage, all subscales were submitted to a reliability analysis. Considering the need to reduce the number of items in the instrument, it was decided to carry out an "if item deleted" analysis, eliminating the items whose removal improved the reliability of their factor. Considering the ordinal nature of the data and the violation of mathematical continuity, internal consistency was assessed through the calculation of ordinal Cronbach's α [22], based on a polychoric correlation matrix. Raw alpha (traditionally computed from Pearson's correlation matrix) and raw omega values were also provided to allow for comparability between countries, as some countries report these indices as measures of reliability.
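The "if item deleted" logic described above can be illustrated with a minimal Python sketch. It assumes that a correlation matrix for one subscale is already available (a polychoric matrix for ordinal alpha, or a Pearson matrix otherwise) and applies the standardized alpha formula to it. The function names and the 4-item matrix are hypothetical and invented for illustration; they are not the study's code or data, and the source analyses were run in R.

```python
from typing import List

Matrix = List[List[float]]


def standardized_alpha(corr: Matrix) -> float:
    """Alpha computed from a k x k correlation matrix (assumed input)."""
    k = len(corr)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Mean of the off-diagonal correlations.
    off_diag = [corr[i][j] for i in range(k) for j in range(k) if i != j]
    r_bar = sum(off_diag) / len(off_diag)
    return k * r_bar / (1 + (k - 1) * r_bar)


def alpha_if_item_deleted(corr: Matrix) -> List[float]:
    """Alpha of the subscale recomputed with each item removed in turn."""
    k = len(corr)
    results = []
    for drop in range(k):
        keep = [i for i in range(k) if i != drop]
        sub = [[corr[i][j] for j in keep] for i in keep]
        results.append(standardized_alpha(sub))
    return results


# Hypothetical matrix: items 1-3 correlate well, item 4 correlates weakly.
example_corr = [
    [1.0, 0.6, 0.6, 0.1],
    [0.6, 1.0, 0.6, 0.1],
    [0.6, 0.6, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
full_alpha = standardized_alpha(example_corr)
deleted_alphas = alpha_if_item_deleted(example_corr)
```

In this made-up example, removing the weakly correlated fourth item raises alpha, while removing any of the first three lowers it; an item of the first kind is a candidate for trimming under the rule described in the text.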

Agreement on Trimming based on Factorial Validity
The factors that had problematic items in the previous stage were submitted to an exploratory and confirmatory factor analysis to assess whether the elimination of these items would be problematic for the corresponding factor, compromising its factorial validity. Mardia's test was performed to assess the multivariate normality of the sample [23]. Given the violation of the normality assumption and the ordinal nature of the data, estimators based on the asymptotic covariance matrix were used. These estimators are derived from the polychoric correlation matrix estimated from the observed categorical variables [24]. In order to perform an exploratory factor analysis (EFA), the ULS estimator, based on the diagonal form of the asymptotic covariance matrix, was used [25]. Factor retention criteria were applied, namely a parallel analysis (of the factor analysis) with the ULS method and the Kaiser criterion. A confirmatory factor analysis (CFA) using a weighted least-square-mean and variance adjusted estimator (WLSMV) was also conducted. The overall goodness-of-fit was assessed using the following indexes and cut-off points for "good adjustment": Chi-square (χ2); comparative fit index (CFI; 0.90 ≤ CFI ≤ 0.95); Tucker-Lewis index (TLI; 0.90 ≤ TLI ≤ 0.95); root mean square error of approximation (RMSEA; 0.05 ≤ RMSEA ≤ 0.70); P[rmsea ≤ 0.05]; and standardized root-mean-residual (SRMR; SRMR < 0.80) [26]. The average variance extracted (AVE) of each factor was calculated using the following formula [27]:

$$\mathrm{AVE}_j = \frac{\sum_{i=1}^{k} \lambda_{ij}^{2}}{\sum_{i=1}^{k} \lambda_{ij}^{2} + \sum_{i=1}^{k} \varepsilon_{ij}}$$

where $\lambda_{ij}$ are the standardized factor weights and $\varepsilon_{ij} = 1 - R_{ij}^{2} \cong 1 - \lambda_{ij}^{2}$ are the residues of each item. An AVE of 0.5 or greater suggests an adequate convergence between the items of each construct [28].
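As a worked illustration of the AVE calculation, the following Python sketch computes the index from a list of standardized loadings, taking each residual as one minus the squared loading, as in the definition above. The function name and the three loadings are hypothetical values chosen for the example, not results from this study.

```python
from typing import Sequence


def average_variance_extracted(loadings: Sequence[float]) -> float:
    """AVE of one factor from its standardized loadings (lambda_ij)."""
    if not loadings:
        raise ValueError("at least one loading is required")
    squared = [lam ** 2 for lam in loadings]
    # Residual of each item, approximated as 1 - lambda^2.
    residuals = [1.0 - s for s in squared]
    return sum(squared) / (sum(squared) + sum(residuals))


# Hypothetical loadings for a three-item factor (illustrative only).
example_loadings = [0.8, 0.7, 0.6]
ave = average_variance_extracted(example_loadings)
```

With standardized loadings, the denominator equals the number of items, so the AVE reduces to the mean squared loading. For the made-up loadings above this is 1.49/3, just under the 0.5 threshold mentioned in the text.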

Agreement on Trimming Based on the Item Response Theory
In addition to this factorial verification, based on classical test theory, a polytomous item response theory analysis [29] with a partial credit model was also carried out to support the decision to eliminate items based on reliability. This analysis aims to assess whether any of the items that were eliminated using classical test theory methods are more discriminative of the measured latent trait than the items that were not eliminated. The analysis was based on the discrimination parameter, which represents how well the item can differentiate individuals with different levels of the latent trait [29]. If the discrimination value of a retained item is higher than that of an excluded one, the retained item separates individuals with higher and lower levels of the latent trait more sharply. Agreement between the two methods (IRT and CTT) therefore gives more confidence in the elimination of items.
All of the statistical analyses were performed using R.
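The role of the discrimination parameter can be sketched in Python. The source names a partial credit model without giving its parameterization; the sketch below assumes the generalized partial credit model, which carries an item-level discrimination parameter. The function names, thresholds, and discrimination values are hypothetical and serve only to show how a higher discrimination sharpens the contrast between respondents with different latent trait levels.

```python
import math
from typing import List, Sequence


def gpcm_category_probs(theta: float, discrimination: float,
                        thresholds: Sequence[float]) -> List[float]:
    """Category probabilities under an assumed generalized partial credit model."""
    # Cumulative sums of a * (theta - b_v); category 0 has sum 0.
    cumulative = [0.0]
    for b in thresholds:
        cumulative.append(cumulative[-1] + discrimination * (theta - b))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta: float, discrimination: float,
                   thresholds: Sequence[float]) -> float:
    """Expected item score (categories scored 0..m) at a given theta."""
    probs = gpcm_category_probs(theta, discrimination, thresholds)
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical 5-category item with symmetric step thresholds.
thresholds = [-1.5, -0.5, 0.5, 1.5]
probs = gpcm_category_probs(0.0, 1.0, thresholds)

# Gap in expected score between a high-trait and a low-trait respondent.
gap_low = expected_score(1.0, 0.5, thresholds) - expected_score(-1.0, 0.5, thresholds)
gap_high = expected_score(1.0, 2.0, thresholds) - expected_score(-1.0, 2.0, thresholds)
```

Under these assumptions, the item with the higher discrimination value yields a larger gap in expected score between the two respondents, which is the property used above to compare retained and excluded items.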

Results
This section is divided into the two main phases of the study, content validity (including face validity), and the trimming process that allowed the analysis of reliability and construct validity aiming at the elimination of items and achievement of a research version.

Content Validity
The comments regarding comprehensibility focused on the level of ambiguity and on the difficulties in understanding some questions (Q3 (n = 1); Q11 (n = 1); Q12 (n = 1); Q14 (n = 3); Q20 (n = 1); Q40 (n = 1); Q41 (n = 1); Q45 (n = 2); Q64 (n = 1); Q78 (n = 1)), as well as on the perceived similarity between items (Q11 and Q12 (n = 1); Q75 and Q76 (n = 1)). Although the items mostly kept a similar phrasal structure, content analysis suggested that minor re-wording could improve the face validity of these items, which resulted in the version used for the next step (Table 1). Regarding the dimension work-life conflict, concerns about the different mixed organizational models resulting from the pandemic situation, including working from home, led to the rephrasing of the items, aiming at their neutrality, so that they apply both to people working from home and to those in the workplace. A feasibility analysis recommended an interview for those with lower education levels, in order to ensure the appropriate interpretation of the items.

Trimming Process
The results of the analysis for each dimension/factor, as well as the reliability coefficients in each trimmed solution, are shown in Table 2. Looking at the data in Table 2, we can see that only the dimensions "Possibilities for Development", "Control over Working Time", "Vertical Trust", "Work-Life Conflict" and "Job Satisfaction" registered improvements in the reliability coefficients with the elimination of items. Using this methodology, 8 items were eliminated: 1 item in "Possibilities for Development"; 2 items in "Control over Working Time"; 1 item in "Vertical Trust"; 2 items in "Work-Life Conflict"; and 2 items in "Job Satisfaction". After this trimming stage, and in order to confirm the solution obtained, the trimmed and non-trimmed versions of the subscales that were reduced in the previous step were submitted to an exploratory and confirmatory factor analysis. Mardia's test showed that the data are not multivariate normal, g1p = 1732.29, χSkew = 190263.4, p < 0.0001; g2p = 9660.6, ZKurtosis = 79.72, p < 0.0001; χSMSkew = 191148, p < 0.0001. Bartlett's test of sphericity (χ2 = 42294.85, p < 0.001) and KMO (0.934) indicate that the data are suitable for factor analysis.
The EFA (factorial weights, variances, and complexity) and CFA (factorial weights, AVE and goodness-of-fit indexes) results can be seen in Table 3. By analyzing Table 3, it is possible to verify that both the exploratory and confirmatory factor analyses supported the decisions obtained in the trimming based on reliability. Higher loadings and AVEs and better goodness-of-fit indexes were found on the trimmed subscales than on the non-trimmed ones. The results of the polytomous IRT with the partial credit model can be seen in Table 4. Notably, the items eliminated from each factor present a lower discrimination value in this analysis. This result shows a congruence between analytical methods (CTT vs. IRT) and gives us confidence that the elimination of the eight items does not affect the reliability and validity of the factors and the instrument.

Discussion
Worldwide, more than one in four older workers experience job strains related to work organization and the absence of an adequate assessment of psychosocial risk factors [30]. The relevance of assessing and monitoring psychosocial risk factors in workplaces is clear. It means that companies need accurate and updated information to support their decisions, as well as usable tools to assess the psychosocial factors. COPSOQ II has played an important role in the assessment of psychosocial factors all over the world [5,6,8,9], especially in Portugal [12,13,19], with a wide use in academic studies [14,15] and in business environments [19]. With the emergence of COPSOQ III, which includes fundamental dimensions for the psychosocial reality at work, it became essential to validate the Portuguese middle version.
In a first phase, we intended to evaluate the face validity and to ensure the content validity of the new version of this widely used instrument. Meeting and reflecting with experts, along with the think-aloud procedure with study subjects, led to changes in the formulation of some items, in order to make them fit the Portuguese culture and working contexts and to ensure the appropriate and accurate interpretation of the items by every subject, regardless of academic background, work experience, gender or age. This step allowed the instrument to be globally understood, becoming more inclusive and gender- and age-neutral. Moreover, the changes that emerged from the COVID-19 pandemic, regarding mixed organizational models that include working from home [31], were considered in order to obtain a version that could be used in different scenarios. This first phase ended with a version that comprised 31 dimensions and 93 items. The Portuguese COPSOQ III middle version includes items from the long version in addition to those of the middle version. This is a common procedure accepted by the international COPSOQ network [7] and used worldwide in validation studies of the third version [17].
Considering that one of the most frequent complaints recorded during the think-aloud process was the length of the questionnaire, which made completing it very time-consuming or even led to dropouts, a trimming procedure was carried out. This procedure was based on the psychometrics of classical test theory, and the IRT analysis showed the psychometric robustness of the trimmed version. The Portuguese COPSOQ III middle version started with 24 new items, when compared with the Portuguese COPSOQ II middle version. In the trimmed version, eight items were excluded. Mostly, they were part of already existing dimensions (development possibilities, vertical trust, work-life conflict, and job satisfaction), along with one new dimension (control over working time). Interestingly, other validation studies faced similar problems with internal consistency, with almost the same dimensions: development possibilities, work-life conflict, job satisfaction, and control over working time [17]. The international validation studies have already recommended that the selection of items within the scales could be reconsidered, aiming to overcome problems with internal consistency or psychometric robustness [7]. In accordance with other validation studies that used a working population from different activity sectors [17], or those focused on specific populations, such as professional drivers [18], improvements in the reliability coefficients with the elimination of items were obtained.
It is important to stress that, in the present study, the internal consistency was even stronger than the mean values presented in the original evaluation study [7]. These results show that there might be slightly different versions of COPSOQ III depending on the process of adaptation and validation at a national level, or towards specific populations [7,17,18], which, however, always allow data comparability owing to the common and validated methodological approach followed. Nevertheless, in this Portuguese middle version, none of the eliminated items were "core items", thus respecting the concept introduced in the development of COPSOQ III. Core items are considered mandatory to ensure comparability between countries. However, at the same time, national versions can diverge regarding supplementary items [7]. Despite the procedure described, the Portuguese COPSOQ III middle version maintained its coherence and alignment with the original international instrument [7] and its several validations worldwide [1,7,17,18].
This preliminary validation study was carried out with a convenience sample that consisted mainly of workers from the healthcare sector and from municipalities; the response rate could not be calculated. The data were collected in early 2021, a year that was strongly affected by the worldwide COVID-19 pandemic, the volatile country dynamics, and the ever-changing working environments and work regulations. This fact is likely to have had an impact on perceptions or even on responses; however, it is a common scenario shared by every country in the world [32].
It should also be noted that the objective was to adapt and to perform a preliminary assessment of the validity of the scales; therefore, important procedures, such as criterion validity (external criteria), factorial validity with global models, configural, metric, and scalar invariance analysis between professional sectors, and convergent and discriminant validity, were not assessed in this study. Regarding future research, additional studies are needed to analyze the overall structure of the Portuguese COPSOQ III middle version with various populations, based on construct validity, external validity, and predictive validity.
In addition to the need for a rigorous analysis of the construct and criterion validities, there is a need, in future steps of the validation process, to expand the sample to other occupational settings in order to obtain a comprehensive analysis, all the while not compromising the data validity, either for research contexts or working environment assessments and interventions. Despite that, the sample size served the objectives of the study, methodology and study phase.
This Portuguese middle version of COPSOQ III proved to be a valid preliminary version for future validation studies. Although this initial part of research is not frequently reported in the literature, it was considered important to show how the research version was achieved, in the interest of transparency in research and data transferability to organizational settings.

Conclusions
The Portuguese research version of the COPSOQ III Middle Version presented good face validity, ensured content validity, and reached a stable reduced version of 85 items while maintaining good psychometric characteristics. The Portuguese version proved to be a valid preliminary version for future validation studies with various populations, and for use in correlational studies with other dimensions. The instrument also has a high potential for transferring knowledge from academia to industrial and/or occupational scenarios.
It should also be noted that the COPSOQ III Middle version is a useful tool to assess the psychosocial risk factors and to contribute to health and well-being at work, which is considered a universal value by the United Nations (UN), the World Health Organization (WHO) and the International Labour Organization (ILO). This version also has the advantage of contributing to studies in the area of occupational health and safety, thus playing an important role in sustainable development and contributing to achieving the goals of the 2030 Agenda for Sustainable Development.

Institutional Review Board Statement:
All subjects gave their informed consent for inclusion before they participated in the study. All data were obtained in an anonymized form and are not externally accessible. The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Ethics Committee of Faculty of Human Kinetics, University of Lisbon (protocol code 9/2021 from 3 February 2021).

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy and ethical restrictions.