The Survey Measure of Psychological Safety and Its Association with Mental Health and Job Performance: A Validation Study and Cross-Sectional Analysis

Objectives: This study validated the Japanese version of O’Donovan et al.’s (2020) composite measure of the psychological safety scale and examined the associations of psychological safety with mental health and job-related outcomes. Methods: Online surveys were administered twice to Japanese employees in teams of more than three members. Internal consistency and test–retest reliability were tested using Cronbach’s α and intra-class correlation coefficient (ICC), respectively. Structural validity was examined using confirmatory factor analysis (CFA) and exploratory factor analysis (EFA). Convergent validity was tested using Pearson’s correlation coefficients. Multiple linear regression analyses were conducted to examine the relationship between psychological safety and psychological distress, work engagement, job performance, and job satisfaction. Results: Two hundred healthcare workers and 200 non-healthcare workers were analyzed. Internal consistency, test–retest reliability, and convergent validity were acceptable. CFA demonstrated poor fit, and EFA yielded a two-factor structure, with team leader as one factor and peers and team forming the second factor. The total score showed significant and expected associations with all outcomes in the adjusted model for all workers. Conclusions: The Japanese version of the measure of the psychological safety scale presented good reliability and validity. Psychological safety is important for employees’ mental health and performance.


Introduction
Psychosocial factors at work are well-known determinants of workers' health and well-being. Psychological safety (PS) at work has received much attention as an important psychosocial factor in workers' positive mental health and other work-related outcomes, such as work engagement, satisfaction, communication, and performance [1,2]. PS describes workers' perceptions of the consequences of taking interpersonal risks in a particular context, such as a workplace [3,4]. In 1999, Edmondson defined PS as a shared belief that the team is safe for interpersonal risk-taking (i.e., engaging in learning behaviors that may place workers at risk, including seeking feedback, sharing information, asking for help, talking about errors, and experimenting) [3].
Previous review articles have reported three streams of research on PS (i.e., individual-, team-, and organizational-level), with team-level analysis being the largest and most active [1,4]. A meta-analysis has reported that individual- and team-level PS is significantly related to [...] 19 items divided into three sections (i.e., team leader, peers, and team), as introduced earlier.
The Japanese version of the survey measure of PS was developed according to the procedure specified in the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) task force guidelines [20]. The forward translation was conducted independently by two external translators proficient in Japanese and English. We then performed reconciliation, back-translation, back-translation review, harmonization, and cognitive debriefing. NS and YS conducted the reconciliation, and KI chose the appropriate expression for each item. A native English-speaking translator who was unaware of the original scale back-translated the reconciled version. The original developer confirmed and accepted the back-translated measure. Cognitive debriefing sessions were conducted with three Japanese nurses, including HA. Their feedback about difficult wording was used for further modifications. The results from these stages were combined to develop the final measure. The full Japanese version of the survey measure of PS is presented in the Supplementary Materials. The final scale contained 19 items, with nine items for the team leader, seven items for peers, and three items for the team as a whole, measured on a seven-point Likert scale. The scale score was calculated by averaging the items. Higher scores indicated greater PS.
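The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' scoring script. The function name `score_ps` is hypothetical. The sketch also assumes that items 1-9, 10-16, and 17-19 belong to the team leader, peers, and team sections, in that order, and that the total score is the mean of all 19 items; the text does not state either point explicitly.

```python
from statistics import mean

# Assumed item layout (hypothetical; the text gives only the item counts 9/7/3).
SECTIONS = {
    "team_leader": range(0, 9),   # items 1-9
    "peers": range(9, 16),        # items 10-16
    "team": range(16, 19),        # items 17-19
}


def score_ps(responses):
    """Return section means and the overall mean for one respondent.

    `responses` is a sequence of 19 ratings on the seven-point scale (1-7).
    """
    if len(responses) != 19:
        raise ValueError("expected 19 item responses")
    if any(not 1 <= r <= 7 for r in responses):
        raise ValueError("responses must lie on the 1-7 scale")

    scores = {
        name: mean(responses[i] for i in items)
        for name, items in SECTIONS.items()
    }
    # Assumption: the total is the mean of all items, not the mean of section means.
    scores["total"] = mean(responses)
    return scores


if __name__ == "__main__":
    # Invented example respondent, for illustration only.
    print(score_ps([7] * 9 + [4] * 7 + [1] * 3))
```

Because the sections differ in length, the mean of all 19 items weights the team leader section most heavily; averaging the three section means instead would give a different total.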
Online surveys were administered twice, at baseline (January 2022) and at a two-week follow-up (February 2022), to Japanese employees who had not been appointed as leaders of their team. The Research Ethics Committee of the Graduate School of Medicine/Faculty of Medicine, The University of Tokyo, approved the study (No. 2019361NI-(3)). The study was reported according to the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) guideline, which is used to improve the quality of efforts to develop health-related self-report measurement instruments [21].
Participants living in Japan were invited from the registered panel of an Internet research company (Rakuten Insight Inc., Tokyo, Japan). Equal numbers of healthcare workers (HCWs) and non-healthcare workers (non-HCWs) were recruited. The inclusion criteria were as follows: (i) full-time employee aged 20-65 years; (ii) working for a company with more than five employees; (iii) belonging to a team with more than three members; (iv) not a president or manager; and (v) not a team leader.
All participants at baseline were invited to participate in a two-week follow-up. The follow-up survey was closed after 100 answers were collected.

Measurements
To test the convergent validity, the psychological safety scale for workers developed by Liang et al., social support at work, servant leadership, organization-based self-esteem, and organizational justice were measured.
Psychological safety was measured with the PS scale developed by Liang et al. (2012) that reflects Kahn's [22] focus on the workers' speaking out [15]. The Japanese version of the scale was translated by Ochiai et al. [12]. It contained five items measured on a five-point Likert scale. The items asked workers to rate the extent to which they feel free to express their thoughts and feelings. The scale score was calculated by averaging the items. Higher scores indicated greater PS. Cronbach's alpha was 0.71 in this sample.
Social support at work was measured using the Brief Job Stress Questionnaire (BJSQ) [23] containing items assessed on a four-point Likert scale. Social support at work comprises two subscales: supervisor support (three items) and co-workers' support (three items). A higher score indicated higher social support at work. In this sample, Cronbach's alphas were 0.89 for supervisor support and 0.88 for co-workers' support.
Servant leadership was measured with the Japanese short version of the Servant Leadership Survey (SLS-J) [24] evaluating the employees' supervisors. This scale includes six items measuring empowerment (leader side), three items measuring humility (servant side), three items measuring standing back (servant side), three items measuring stewardship (leader side), and three items measuring authenticity (servant side) on a six-point Likert scale. The score for each dimension of the SLS-J-short was calculated by averaging the item scores. A higher score indicated stronger servant leadership. Cronbach's alpha was 0.95 for empowerment, 0.91 for humility, 0.84 for standing back, 0.83 for stewardship, and 0.81 for authenticity.
Organization-based self-esteem was measured using the Japanese version of the Organization-based Self-Esteem Scale [25]. This scale has eight items measured on a five-point Likert scale. The scale score was calculated by averaging the items. A higher score indicated higher organization-based self-esteem. Cronbach's alpha was 0.94.
Organizational justice was measured with the Japanese version of the Organizational Justice Questionnaire (OJQ) [26]. The OJQ consists of two subscales: procedural justice and interactional justice. Seven items assess procedural justice, and six items assess interactional justice on a five-point Likert scale. Each factor score was calculated by averaging the items. A higher score indicated a greater degree of organizational justice. Cronbach's alpha was 0.93 for procedural justice and 0.95 for interactional justice.
To examine the associations of the PS scale with mental health and job-related outcomes, psychological distress, work engagement, job performance, and job satisfaction were measured.
Psychological distress was measured with the Japanese version of the K6 scale [27,28]. This scale has six items (felt nervous, hopeless, restless or fidgety, worthless, depressed, and that everything was an effort in the past four weeks) rated on a five-point Likert scale. The total score was calculated by summing all items. A higher score indicated greater distress. Cronbach's alpha was 0.93.
Work engagement was measured using the Japanese version of the Utrecht Work Engagement Scale (UWES-9) [29]. This scale has nine items rated on a seven-point Likert scale. The scale score was calculated by averaging the items. A higher score indicated greater work engagement. Cronbach's alpha was 0.96.
Job performance was evaluated using one item of the Japanese version of the WHO Health and Work Performance Questionnaire (HPQ) [30]. Participants were asked to rate their work performance over the past four weeks. The item was scored on an 11-point scale ranging from 0 (worst) to 10 (best). A higher score indicated better job performance.
Job satisfaction was measured by one item from the Brief Job Stress Questionnaire (BJSQ) [23] on a four-point Likert scale. A higher score indicated more job satisfaction.
Demographic variables, measured at baseline, were gender, age, educational attainment, working from home, marital status, company size, occupation (e.g., professions, service workers), and job category (e.g., doctor, nurse).

Statistical Analysis
In this study, the HCWs and non-HCWs were analyzed separately. First, the distribution of demographic characteristics as well as means and standard deviations (SDs) for the total scores of the PS scale and its three subscales at baseline and follow-up were calculated. Then, to assess internal consistency and test-retest reliability of the PS scale, Cronbach's α and the intra-class correlation coefficient (ICC) for each of the subscales were calculated, following the COSMIN guidelines [21]. To assess structural validity, a confirmatory factor analysis (CFA) with three factors (i.e., team leader, peers, and team) was conducted to test the goodness of fit of the existing structure of PS. Model fit was assessed using a combination of fit indices including the chi-square (χ²), the comparative fit index (CFI), the Tucker-Lewis index (TLI), the root mean square error of approximation (RMSEA), the standardized root mean square residual (SRMR), the goodness of fit index (GFI), Akaike's information criterion (AIC), and the adjusted goodness of fit index (AGFI). If the CFA showed a poor fit, an exploratory factor analysis (EFA), which assumes no a priori factor structure, was conducted using robust maximum likelihood estimation with Promax rotation. To test the hypotheses (expected relationships with other outcomes), convergent validity was examined using Pearson's correlation coefficients (r), which were calculated between each score of the PS scale and the PS scale for workers developed by Liang et al., social support at work, servant leadership, organization-based self-esteem, and organizational justice. These measures were expected to have moderate to high positive correlations with the PS scale (r > 0.40) [12].
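Two of the fit indices listed above, RMSEA and AIC, are simple functions of χ², the degrees of freedom, and the number of free parameters, so they can be cross-checked by hand. The sketch below does this for the three-factor CFA with 19 items, using the χ² values reported in the Results and n = 200 per group. It rests on stated assumptions rather than on the authors' Amos setup: each item loads on one factor, residuals are uncorrelated, RMSEA is computed as sqrt(max(χ² − df, 0) / (df (n − 1))), and AIC is computed as χ² + 2q, where q is the number of free parameters. The function names are hypothetical.

```python
import math


def cfa_free_parameters(n_items, n_factors):
    """Free parameters of a simple-structure CFA with correlated factors.

    Counts one loading and one residual variance per item plus the factor
    covariances (factor variances fixed to 1; fixing one loading per factor
    and freeing the factor variances gives the same count).
    """
    factor_covariances = n_factors * (n_factors - 1) // 2
    return n_items + n_items + factor_covariances


def cfa_degrees_of_freedom(n_items, n_free):
    """Distinct variances/covariances minus the number of free parameters."""
    return n_items * (n_items + 1) // 2 - n_free


def rmsea(chi2, df, n):
    """RMSEA from chi-square, degrees of freedom, and sample size (assumed form)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def aic(chi2, n_free):
    """AIC in the chi-square-based form (assumed): chi2 + 2 * free parameters."""
    return chi2 + 2 * n_free


if __name__ == "__main__":
    q = cfa_free_parameters(n_items=19, n_factors=3)
    df = cfa_degrees_of_freedom(n_items=19, n_free=q)
    for group, chi2 in {"HCW": 540.001, "non-HCW": 584.778}.items():
        print(group, "df =", df,
              "RMSEA =", round(rmsea(chi2, df, 200), 3),
              "AIC =", round(aic(chi2, q), 3))
```

Under these assumptions the sketch gives df = 149 and reproduces the reported RMSEA and AIC values for both groups. This suggests, but does not prove, that the fitted model had this simple structure.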
Since both independent and dependent variables were continuous, we conducted multiple linear regression (MLR) analyses to examine the relationship between the PS scale and outcomes (i.e., psychological distress, work engagement, job performance, and job satisfaction). After standardizing these variables, we first examined crude associations. Second, we examined adjusted associations, entering gender, age, educational attainment, working from home, marital status, company size, occupation, and job category simultaneously as covariates. Previous studies related to PS have frequently used MLR analysis [31,32], and this study followed traditional formulas [33,34] to estimate the relationship between theoretically and practically related variables. As the literature suggests [1,2], PS can theoretically and conceptually influence the outcomes investigated in this study. In addition to the full scale, we examined the three subscales, entering each subscale into the model individually (Model 1) and all three simultaneously (Model 2).
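The difference between a crude and an adjusted standardized coefficient can be illustrated with a deliberately reduced case. With standardized variables, the crude β of a single predictor equals Pearson's r, and with exactly one covariate the adjusted β has the closed form (r_yx − r_yc · r_xc) / (1 − r_xc²). The sketch below implements only this two-predictor case; the study adjusted for eight covariates at once, which requires a general least-squares solver. All function names and the toy data are invented for illustration.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize a sequence to mean 0 and sample SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pearson_r(xs, ys):
    """Pearson correlation as the average product of z-scores (n - 1 denominator)."""
    zx, zy = zscores(xs), zscores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


def crude_beta(y, x):
    """Standardized slope of y on x alone; identical to Pearson's r."""
    return pearson_r(x, y)


def adjusted_beta(y, x, covariate):
    """Standardized slope of y on x, adjusting for ONE covariate.

    Closed form for two predictors; undefined if x and the covariate
    are perfectly correlated.
    """
    r_yx = pearson_r(y, x)
    r_yc = pearson_r(y, covariate)
    r_xc = pearson_r(x, covariate)
    return (r_yx - r_yc * r_xc) / (1 - r_xc ** 2)


if __name__ == "__main__":
    # Invented toy data: x plays the role of a PS score, c of a covariate.
    x = [1, 2, 3, 4, 5]
    c = [2, 1, 4, 3, 6]
    y = [8, 7, 18, 17, 28]
    print("crude beta:", crude_beta(y, x))
    print("adjusted beta:", adjusted_beta(y, x, c))
```

When the predictor and the covariate are correlated, the crude coefficient absorbs part of the covariate's association with the outcome. This is one reason adjusted coefficients can differ from crude ones.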
Statistical significance was defined as p < 0.05. IBM SPSS Statistics® version 28 (IBM, Armonk, NY, USA) and IBM SPSS Amos® version 28 were used for the analyses.
Internal consistency and test-retest reliability values of the PS scale are presented in Table 2. For HCWs, the Cronbach's alpha of each section ranged from 0.91 to 0.95, ICC ranged from 0.75 to 0.89, the mean total score was 4.96, and Cronbach's alpha was 0.96. For non-HCWs, Cronbach's alpha ranged from 0.93 to 0.96, ICC ranged from 0.84 to 0.92, the mean total score was 4.63, and Cronbach's alpha was 0.92.
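For readers who want to see how the two reliability coefficients are obtained, the following sketch computes Cronbach's α from an item-by-respondent matrix and an ICC from two measurement occasions. It is not the SPSS procedure used in the study. The text does not state which form of the ICC was used, so the two-way random-effects, absolute-agreement, single-measurement form (often written ICC(2,1)) is an assumption here, and the function names and toy data are invented.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance(col) for col in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def icc_2_1(rows):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `rows` holds one tuple per respondent, e.g. (baseline, follow-up).
    """
    n, k = len(rows), len(rows[0])
    grand = mean(x for row in rows for x in row)
    row_means = [mean(row) for row in rows]
    col_means = [mean(col) for col in zip(*rows)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Invented toy data, for illustration only.
    items = [[1, 2], [2, 1], [3, 4], [4, 3]]
    retest = [(1, 2), (2, 3), (3, 4)]
    print("alpha:", cronbach_alpha(items))
    print("ICC(2,1):", icc_2_1(retest))
```

In the toy test-retest data every respondent scores exactly one point higher at follow-up. An absolute-agreement ICC penalizes this systematic shift even though the rank order is perfectly preserved; a consistency-type ICC would not.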
The CFA results for HCWs were χ²(149) = 540.001, CFI = 0.899, TLI = 0.884, RMSEA = 0.115, SRMR = 0.0444, GFI = 0.764, AIC = 622.001, and AGFI = 0.699. For non-HCWs, the values were χ²(149) = 584.778, CFI = 0.903, TLI = 0.888, RMSEA = 0.121, SRMR = 0.0472, GFI = 0.733, AIC = 666.778, and AGFI = 0.659. Factor loadings for each item of PS are presented in Table 3. Because the model fit was poor, we conducted an EFA, which assumes no a priori factor structure, using robust maximum likelihood estimation with Promax rotation. Table 4 shows the results of the EFA, which yielded a two-factor structure. For both HCWs and non-HCWs, Section 2 (peers) and Section 3 (team as a whole) were combined into a single factor.
Table 5 [...] Section 3 (team as a whole) showed a high correlation with empowerment (r = 0.701). HCWs did not achieve high correlations (r < 0.70) but showed a similar trend to non-HCWs.
Table 1. Characteristics of Japanese non-manager employees with more than three team members.

The results of the MLR analyses are shown in Table 6. In HCWs, the full scale showed significant associations with low psychological distress (adjusted β = −0.508, p < 0.001), high work engagement (adjusted β = 0.462, p < 0.001), high job performance (adjusted β = 0.476, p < 0.001), and high job satisfaction (adjusted β = 0.592, p < 0.001). In Model 1 (individually entered), all three subscales of the scale (team leader, peers, and team as a whole) were significantly associated with low psychological distress, high work engagement, high job performance, and high job satisfaction. In Model 2 (simultaneously entered), Section 1 (team leader) was significantly associated with high work engagement, high job performance, and high job satisfaction in the adjusted model. Section 2 (peers) was significantly associated with low psychological distress. Section 3 (team as a whole) was significantly associated with high job satisfaction.
For non-HCWs, the full scale showed significant associations with low psychological distress (adjusted β = −0.424, p < 0.001), high work engagement (adjusted β = 0.510, p < 0.001), high job performance (adjusted β = 0.494, p < 0.001), and high job satisfaction (adjusted β = 0.587, p < 0.001). In Model 1 (individually entered), all three subscales showed significant associations similar to those observed in HCWs. In Model 2 (simultaneously entered), Section 1 was significantly associated with high work engagement, high job performance, and high job satisfaction in the adjusted model. Section 3 (team as a whole) was associated with high work engagement and job satisfaction. No section showed a significant association with low psychological distress in the adjusted model, although Section 1 was significant in the crude model.

Discussion
The Japanese version of the survey measure of PS developed by O'Donovan et al. demonstrated acceptably high internal consistency, test-retest reliability, and convergent validity. Structural validity remained an issue. The full survey measure of PS showed significant associations with low psychological distress, high work engagement, high job performance, and high job satisfaction. These results were found for both HCWs and non-HCWs. Overall, the Japanese version of the survey measure of PS proved to be reliable and valid for use in all working populations.
In terms of internal consistency, Cronbach's alpha of the full scale exceeded the stringent criterion of 0.80 [35]. The ICC for test-retest (two weeks) reliability was acceptable, except for HCWs in Section 3 (team as a whole). Because Section 3 had a small number of items, discrepancies in the evaluation of one item may easily be reflected in a lower ICC.
In the CFA, the theoretically derived three-factor model did not fit the data well. The fit indices indicated a low to moderately acceptable fit for the three-factor model. Instead, the EFA suggested a two-factor structure. Peers and team as a whole were combined into one factor, suggesting that Japanese respondents might imagine colleagues (peers) when they see the word "team". A future study is needed to examine the structure in another sample.
The factor loading pattern was almost identical for factor 1 (peers and team) among both HCWs and non-HCWs. However, the pattern differed slightly for factor 2 (leader), although "speaking up is valued by team leader" (no. 7) loaded highly in both groups. For HCWs, a "sense of trust in team leader" (no. 9) and "support for the new task and learning" (no. 8) had high loadings, while for non-HCWs, "feeling safe discussing personal problems and disagreements" (no. 3) and "communicating about work issues" (no. 2) had high loadings. In clinical settings, patient safety and speaking up are likely to be prioritized regardless of leaders' attitudes. While leaders' behavioral integrity affected the reported treatment errors [36], trust in leaders may influence the PS atmosphere among Japanese HCWs. Support for learning new tasks may characterize leaders who create psychologically safe workplaces in Japanese clinical settings. In non-HCWs, a previous study suggested that being allowed to express opinions and doing so were different experiences among Japanese workers [12]. Leaders' willingness to allow and encourage employees to speak up and employees' perceptions of doing so may both be required to ensure PS among non-HCWs.
Convergent validity was also well supported, as expected. The findings were in line with previous research showing positive associations of PS with supervisor support, co-workers' support, and organizational factors [12]. A supportive work environment may make workers feel safe in taking interpersonal risks. PS is known to mediate the relationship between servant or inclusive leadership and job-related outcomes (e.g., job performance) [5–8]. Concerning servant leadership, the empowerment subscale showed the strongest associations for both HCWs and non-HCWs. Empowerment in leadership has been defined as a motivational concept aimed at fostering a pro-active, self-confident attitude among followers and giving them a sense of personal power by encouraging self-directed decision making, information sharing, and coaching for innovative performance [24]. In Japan, leaders who can empower their team members may also facilitate PS. For non-HCWs, PS was highly correlated (r > 0.70) with supervisor factors, such as supervisor support, leadership (especially empowerment), and interactional justice. For HCWs, no measure achieved high correlations. The leader's supportive attitude, examined in previous research, may correspond with PS for non-HCWs, whereas other workplace factors may be influential in clinical settings. Another reason may be that the scales used to test convergent validity were developed for workers in general (not specifically for HCWs). Overall, the theoretically expected associations suggested good convergent validity for both HCWs and non-HCWs.
The full scale of the survey measure of PS was significantly associated with low psychological distress, high work engagement, high job performance, and high job satisfaction, as expected. This finding empirically supported the theoretical framework stated in the previous literature [1]. Model 2 (simultaneous entry) showed significant associations between Section 1 (team leader) and work engagement, job performance, and job satisfaction for both HCWs and non-HCWs. Given that Japanese corporate culture emphasizes hierarchical relationships [37], team leaders who listen to and respect others may enhance these positive job-related outcomes. At the same time, low psychological distress was significantly associated with Section 2 (peers) only for HCWs. As mentioned earlier, speaking up is especially important in clinical settings to prioritize patient safety [36]; therefore, for HCWs, an environment where they cannot admit their mistakes or point out those of their peers may cause frustration and psychological distress. A previous study reported that the ability of nurses to forgive themselves and others was significantly associated with PS [38]. Lack of PS from peers may increase the risk of mental health deterioration among HCWs. Peers' role may be more essential for mental health in clinical settings than in other workplaces. PS was associated with high work engagement and job performance in this study. A safe atmosphere where workers can ask questions, communicate opinions, raise issues, and suggest new ideas may increase their motivation.
This study had several limitations. It was conducted online, and participants were recruited from a research company's panel, which limits generalizability. In addition, the self-report format could have biased the results; for example, people with high distress may have rated the items differently. Finally, the cross-sectional nature of the analysis precluded the assessment of causal relationships. Future studies could explore the associations of PS with outcomes using a longitudinal design and workers from more diverse backgrounds.

Conclusions
The Japanese version of the survey measure of PS developed by O'Donovan et al. had acceptable reliability and validity for both the HCW and non-HCW groups, while structural validity remained an issue and needs further examination. This measure is the first Japanese scale that can evaluate the multidimensional PS of leaders, peers, and teams in the workplace. The associations with other important factors [2] (e.g., creativity, learning behavior) and the mediator role of PS, which recent studies examined [5–11], were not investigated in this study. Such evidence should be replicated in the future, using this scale in Japan. Despite the limitation of the cross-sectional analysis, PS showed positive associations with good mental health and positive job-related outcomes in this study. Considering the present finding that the associations of PS with employees' mental health differed slightly between HCWs and non-HCWs, future research may be able to develop effective industry-specific interventions to improve PS. Examining multiple aspects of PS may also improve the workplace environment by considering specific issues in each workplace context.

Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijerph19169879/s1, The final version of the Japanese Psychological Safety Scale.
Author Contributions: K.I. was in charge of this study, supervising the process and providing his expert opinion. N.S., A.I., H.A. and K.I. organized the study design and analyzed the data. Collaborators Y.S., D.N. and A.T. ensured that questions related to the accuracy or integrity of any part of the work were appropriately investigated and resolved. All authors participated in conducting the survey. N.S. wrote the first draft of the manuscript, and all other authors critically revised it. All authors have read and agreed to the published version of the manuscript.
Funding: This research was supported by AMED under Grant No. 22de0107006h0002. The sponsors had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; the preparation, review, or approval of the manuscript; and the decision to submit the manuscript for publication.
Informed Consent Statement: Online informed consent was obtained from all participants with full disclosure and explanation of the purpose and procedures of this study. We explained that their participation was voluntary, and they could withdraw consent for any reason simply by not completing the questionnaire.

Data Availability Statement: The data supporting this study's findings are available from the corresponding author, K.I., upon reasonable request.