The Effect of Using Participatory Working Time Scheduling Software on Employee Well-Being and Workability: A Cohort Study Analysed as a Pseudo-Experiment

Shift workers are at increased risk of health problems. Effective preventive measures are needed to reduce the unfavourable effects of shift work. In this study we explored whether use of digital participatory working time scheduling software improves employee well-being and perceived workability by analysing an observational cohort study as a pseudo-experiment. Participants of the Finnish Public Sector cohort study with payroll records available between 2015 and 2019 were included (N = 2427). After estimating the propensity score of using the participatory working time scheduling software on the baseline characteristics using multilevel mixed-effects logistic regression and assigning inverse probability of treatment weights for each participant, we used a generalised linear model to estimate the effect of using the participatory working time scheduling software on employees’ control over scheduling of shifts, perceived workability, self-rated health, work-life conflict, psychological distress and short sleep (≤ 6 h). During a 2-year follow-up, using the participatory working time scheduling software reduced the risk of employees’ low control over scheduling of shifts (risk ratio [RR] 0.34; 95% CI 0.25–0.46), short sleep (RR 0.70; 95% CI 0.52–0.95) and poor workability (RR 0.74; 95% CI 0.55–0.99). The use of the software was not associated with changes in psychological distress, self-rated health or work-life conflict. In this observational study analysed as a pseudo-experiment, the use of participatory working time scheduling software was associated with increased perceived control over scheduling of shifts among employees and with improved sleep and self-rated workability.


Introduction
Work in social and healthcare organisations is organised in several shifts to provide 24-h service [1]. Studies suggest that shift workers are at increased risk of sleep disturbance [1,2], poor mental health [1,3,4] and work-life conflict [5]. Night shift workers have been found to sleep less and to report fatigue, insomnia and mental health problems more often than day workers [6], with more marked differences between shift and day workers observed in younger age groups [7,8]. There is a need to identify practical ways to reduce the unfavourable effects of shift work.
Worktime control is defined as control over the starting and ending times of workdays, length of shifts, taking of breaks and holidays and timing of overtime, and it has a beneficial effect on work-non-work balance [9]. Theoretical groundings for the assumed benefits of increased worktime control stem from several motivational and occupational health theories [10,11].
Worktime control and self-scheduling of shifts have been suggested to improve wellbeing [12] and to have other beneficial health effects [13]. In a recent cross-sectional study adjusted for the propensity score, self-scheduling of work shifts among nurses was associated with better organisational justice and work attitude [14]. In prospective observational cohort studies, poor worktime control has been associated with increased risk of psychological distress [15,16] and depressive symptoms [17,18]. Unpredictable and irregular work schedules have also been linked to poor sleep quality [16,19] and poor mental health [4]. In addition, work-life conflict has been more common in employees who work > 40 h per week [5] or have short intervals (< 11 h) between work shifts [5]. Conversely, employees with a flexible schedule seem to sleep better [20] and to report work-life conflict [21] or poor self-rated health [20] less often than workers without a flexible schedule. In our earlier observational studies, employees using the participatory working time scheduling software reported increased control over scheduling of shifts [22] and reduced rates of excessive sleepiness [22] and sickness absence [23]. However, another study of self-scheduling software found no evidence of decreased stress among elderly care workers [24], and an intervention to increase control over working time did not confirm improvements in sleep quality [25].
Observational studies are prone to selection bias and confounding [26]. However, confounding in prospective cohort studies can be limited with rigorous data analysis. Propensity score methods, for example, are commonly used to analyse prospective cohort studies as pseudo-trials [27], that is, studies intended to assess causal relationships without random assignment. The propensity score estimates the probability of receiving a treatment, an intervention or an exposure conditional on measured baseline characteristics [27]. The estimated propensity score is then used to achieve balance in background characteristics across treatment groups through matching, weighting or stratification [27]. Well-conducted propensity score weighting is more efficient at controlling confounding in observational studies than regression adjustment [27,28].
In the current study, we analysed an observational cohort study as a pseudo-trial to explore whether using the participatory working time scheduling software improves shift workers' well-being and perceived workability. In addition, we examined whether the beneficial health effects of using the participatory working time scheduling software differ between younger and older employees.

Population
This study is part of the ongoing Working Hours in the Finnish Public Sector study [22], which is based on payroll data on working hours and comprises employees who worked in five hospital districts and one division of municipal health services in Finland between 2015 and 2019. A total of 5207 hospital employees with payroll data on working hours were included in the 2015 survey, 6080 in the 2017 survey and 4920 in the 2019 survey (Figure 1). We used two employee cohorts, one with baseline in 2015 and follow-up in 2017 and the other with baseline in 2017 and follow-up in 2019. From both cohorts, we excluded day workers, employees who worked <31 days in the past 3 months or <150 days in the past year, and physicians who worked on-call shifts >90 days in the past 91 days or >90 days in the past year. There were 2224 employees in the 2015-2017 cohort and 1912 employees in the 2017-2019 cohort. In total these provided 4136 observations, of which 1709 had missing data on baseline characteristics and were not included in the propensity score. After pooling the two cohorts, there were 2427 employees at study baseline. Of these, 881 were users and 1546 were non-users of the participatory working time scheduling software. The study outcomes were measured at follow-up, two years after the baseline. A total of 402 participants had missing data on one or more study outcomes at follow-up and were excluded from the main analysis. Thus, the final analytic sample consisted of 2025 employees, including 881 users and 1144 non-users of the participatory working time scheduling software with data on the baseline characteristics and the outcomes of interest. This pseudo-experiment was registered on ClinicalTrials.gov (NCT02775331) before initiation of the intervention.

Outcomes
The study outcomes were control over scheduling of shifts, perceived workability, self-rated health, psychological distress, short sleep and work-life conflict. Control over scheduling of shifts was assessed with a single Likert question: 'How much control do you have over the scheduling of work shifts?' The response alternatives included (1) very much, (2) quite a lot, (3) some, (4) quite a little and (5) very little. We defined intermediate control as having some control over scheduling of shifts (response 3) and defined low control as having quite a little or very little control over scheduling of shifts.
A Likert scale question from the Work Ability Index [29] was used to measure workability. The responses ranged from 0 (unable to work at all) to 10 (ability to work is at its best). The single question is strongly associated with the Work Ability Index and both showed similar patterns of associations with health outcomes [29]. We used a dichotomised outcome and defined poor workability as a score ≤ 6. Self-rated health was assessed with a single item, 'How do you rate your health?', with response alternatives (1) good, (2) fairly good, (3) average, (4) fairly bad and (5) poor. The question is widely used and recommended as an indicator of health in surveys [30]. We dichotomised the responses as fairly bad or poor vs. others. The 12-item General Health Questionnaire (GHQ-12) was used to assess psychological distress. The responses to each item range from 0 to 3 (coded 0-1-2-3). We recoded all items as 0-0-1-1 [31], and defined psychological distress as a score ≥ 3 [32]. Sleep duration was assessed with the question 'How many hours do you usually sleep during a 24-h period?' We defined short sleep as sleep duration ≤ 6 h. Work-life conflict was assessed using a single question: 'How often do you feel that your work takes too much time or energy from your family or life?' The response alternatives were (1) never, (2) rarely, (3) sometimes, (4) often and (5) very often. We dichotomised the responses as often or very often vs. others.
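The dichotomisation rules described above can be sketched in a few lines of Python. This is only an illustration of the stated cut-offs (GHQ-12 items recoded 0-0-1-1 with distress defined as a score ≥ 3, poor workability ≤ 6, short sleep ≤ 6 h); the function names are hypothetical and are not taken from the study's analysis code.

```python
def ghq12_distress(responses, cutoff=3):
    """Return (score, distressed) for twelve GHQ-12 responses coded 0-3.

    Hypothetical helper: applies the 0-0-1-1 recoding described in the text
    and flags psychological distress when the recoded sum reaches the cutoff.
    """
    if len(responses) != 12:
        raise ValueError("GHQ-12 requires exactly 12 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be 0, 1, 2 or 3")
    # 0-0-1-1 recoding: responses 0 and 1 count as 0, responses 2 and 3 as 1.
    score = sum(1 for r in responses if r >= 2)
    return score, score >= cutoff


def poor_workability(score):
    """Poor workability: single-item work ability score (0-10) of 6 or less."""
    return score <= 6


def short_sleep(hours):
    """Short sleep: usual sleep duration of 6 hours or less per 24 h."""
    return hours <= 6


if __name__ == "__main__":
    print(ghq12_distress([2, 3, 2] + [1] * 9))
    print(poor_workability(6), short_sleep(6.5))
```

Keeping each cut-off in its own small function makes the thresholds easy to vary in a sensitivity analysis.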

Use of the Participatory Working Time Scheduling Software
The intervention group consisted of hospital employees who used the participatory working time scheduling software [23]. The participatory scheduling software allows employees to schedule their shifts interactively. After negotiations and alterations, the head nurse accepts the roster for a three-week period. Participatory scheduling enables the employees to influence their working time and enter their desired shifts into their ward's shift schedule following collectively agreed rules on, e.g., the number of night shifts or the proportion of weekends off work. The employees are also able to see which co-workers will be working the same shift. The hospitals decided internally which wards would start using the participatory working time scheduling software and when. The hospitals also made all decisions about the length of the introduction period and training, whereas the rules for shift scheduling were made at ward level.
The control group consisted of hospital employees who used traditional scheduling (from here on, non-users of the software for brevity). In traditional working time scheduling, the head nurse schedules the shifts for a three-week period at least two weeks prior to the start of the period. The employees have limited influence on their working time. The head nurse makes the final decisions on the schedules in all cases. We applied the intention-to-treat principle and included all participants who had been using the software for at least a month.

Baseline Characteristics
To facilitate a rigorous propensity score weighting, a wide range of baseline data on characteristics potentially affecting the study outcomes and selection into the user vs. non-user group of the software were collected. These included personal and work-related factors. Detailed description of some measures is reported in Appendix A Table A1.
Personal factors: Information on age, sex, marital status, consumption of beer, wine or other low-alcohol drinks per week, consumption of spirits per month, smoking status (never, past, current), self-reported height and weight, number of children and history of stressful life events was gathered in both the 2015 and 2017 surveys. Information on education, vocational training and history of medical conditions was collected in the 2015 survey. History of medical conditions included allergy, asthma, bronchitis, hypertension, heart disease, cerebrovascular disease, osteoarthritis, rheumatoid arthritis, low back pain, sciatica, peptic ulcer, migraine, depression, other mental disorders, diabetes, high cholesterol level and sleep apnoea. For the 2017-2019 cohort, we utilised the data on these background characteristics collected in the 2015 survey.
Leisure-time physical activity was assessed with four questions in 2015 and 2017 surveys. Information on the employees' average weekly hours of leisure-time physical activity within the past 12 months was collected regarding four grades of intensity: (1) walking, (2) brisk walking, (3) jogging, and (4) running, or their equivalent activities [33]. The number of hours per week for each activity grade ranged between zero and four hours. A metabolic equivalent (MET) index was calculated for each participant by multiplying the MET-values of each activity intensity by the time spent on them, and summing [34,35]. We used tertile distribution and split the MET index into low, medium and strenuous activity.
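As a rough sketch of the calculation described above, the snippet below multiplies weekly hours in each of the four intensity grades by a MET value, sums the products and splits the resulting index into tertiles. The MET values per grade are not given in the text, so the numbers used in the example call are placeholders for illustration only; the function names and the quantile method are likewise assumptions.

```python
from statistics import quantiles


def met_index(hours_per_week, met_values):
    """Sum of (weekly hours x MET value) over the four intensity grades."""
    if len(hours_per_week) != len(met_values):
        raise ValueError("one MET value is needed per activity grade")
    return sum(h * m for h, m in zip(hours_per_week, met_values))


def tertile_labels(scores, labels=("low", "medium", "strenuous")):
    """Assign each score to a tertile of the observed distribution."""
    lower, upper = quantiles(scores, n=3)  # two cut points for three groups
    result = []
    for s in scores:
        if s <= lower:
            result.append(labels[0])
        elif s <= upper:
            result.append(labels[1])
        else:
            result.append(labels[2])
    return result


if __name__ == "__main__":
    # Placeholder MET values per grade (walking ... running); NOT from the study.
    example_mets = [2, 4, 6, 8]
    print(met_index([1, 0, 2, 0], example_mets))
    print(tertile_labels([1, 2, 3, 4, 5, 6]))
```

Because tertiles are derived from the observed distribution, the cut points depend on the sample at hand rather than on fixed activity thresholds.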
Work-related factors: Information on the types of usual work shifts, type of work time (part-time vs. full-time), number of on-call days in a month and working as a supervisor was gathered in both the 2015 and 2017 surveys. A question on types of usual shift had four alternative responses: (1) shift work without night shifts (two-shift work), (2) shift work with night work (three-shift work), (3) regular night work and (4) other irregular work. The Job Content Questionnaire was used to assess job demands (3 items) and job control (9 items). In all 12 items, response alternatives ranged from 1 (strongly disagree) to 5 (strongly agree), and total scores were computed for job demands and job control after reversing 2 items for job control. To create a job strain variable for each participant, job demands and job control were dichotomised at the median, and job strain was defined as experiencing high demands combined with low control [36]. Procedural justice was assessed with a seven-item scale [37]. The items assess perceived fairness of managerial procedures, and responses ranged from 1 (strongly agree) to 5 (strongly disagree). We used the tertile distribution and split the sum score into high, intermediate and low justice levels.
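The job strain definition above (high demands combined with low control, each dichotomised at the median) might be coded as in the following sketch. How scores exactly equal to the median are classified is not stated in the text; here it is assumed that "high demands" means strictly above the median and "low control" means at or below the median. The function name is hypothetical.

```python
from statistics import median


def job_strain_flags(demands, control):
    """Flag job strain per participant: high demands AND low control.

    Assumption (not from the source): demands strictly above the sample
    median are 'high', control at or below the sample median is 'low'.
    """
    if len(demands) != len(control):
        raise ValueError("demands and control must have the same length")
    demands_median = median(demands)
    control_median = median(control)
    return [
        d > demands_median and c <= control_median
        for d, c in zip(demands, control)
    ]


if __name__ == "__main__":
    # Toy sum scores for four participants (illustrative values only).
    print(job_strain_flags([3, 6, 9, 12], [40, 10, 30, 20]))
```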
The magnitude and significance of the change at work was measured with a single item and responses ranged from 1 (changes have been small and insignificant) to 7 (changes have been large and significant). We used tertile distribution and split the changes into small, medium and large. Employee's involvement in planning changes at work was also assessed with a single item and responses included (1) I can influence change very much, (2) I can influence some and (3) change usually comes unexpectedly without my ability to influence it.
Uncertainty at work was assessed with five Likert scale questions about the threat of termination of some jobs, transfer to other tasks, forced redundancies, dismissal and increase in workload beyond tolerance. Rewards for work in the form of income, benefits, appreciation and satisfaction were assessed with four Likert scale questions. Responses to the questions on uncertainty and rewards at work ranged from 1 (very much) to 5 (very little). We summed the responses and included the sum scores as continuous variables in the propensity score models. The participants were also asked about fatigue during working hours or leisure time, their intention to retire early and discrimination at the workplace on the grounds of age, sex, education, opinion, ethnicity or sexual orientation.

Statistical Analysis
We used the total sample of the 2015-2017 cohort (N = 5539 employees) and assessed the intraclass correlation coefficient for hospital wards. The intraclass correlation coefficient measures the similarity of responses within clusters, and it was 0.206 for control over scheduling of shifts, 0.082 for self-rated health, 0.051 for work-life conflict, 0.038 for workability, 0.024 for psychological distress and 0.013 for short sleep at follow-up.
We estimated the propensity score of using the participatory working time scheduling software on the baseline characteristics (i.e., the 2015 survey for the 2015-2017 cohort and the 2017 survey for the 2017-2019 cohort, Figure 1) using multilevel mixed-effects logistic regression. We used a three-level random-intercept model of using the participatory working time scheduling software on baseline characteristics, with individuals nested within wards and wards nested within hospitals. We included baseline characteristics as well as outcome variables at baseline. Baseline characteristics in the propensity scores were: age (continuous variable), sex, education (three levels), vocational training (three levels), marital status, being a supervisor, type of work time, types of usual shifts, on-call work in a month (dichotomised), changes at work, employee's involvement in changes at work, uncertainty at work, work rewards, intention to retire, discrimination, history of medical conditions (12 conditions), leisure time physical activity (three levels), body mass index (continuous variable), having a child, smoking status (never, past, current), alcohol consumption, history of stressful life events (illness or death), high job demands, job strain, procedural justice (three levels), fatigue during working hours or leisure time, psychological distress (GHQ-12), workability, control over scheduling of shifts, work-life conflict, duration of sleep and self-rated health.
We created an inverse probability of treatment weight for each participant using the propensity score, assigning each employee a weight based on the inverse of their probability of using the participatory working time scheduling software. We stabilised the inverse probability of treatment weights to reduce variability and bias [38]. We then used a generalised linear model with a binomial distribution and a log link function. This approach allows the inverse probability of treatment weights to be supplied as analytical weights (aweights in Stata). Some of the participants in the control group were included in both cohorts, and we controlled for repeated observations using the vce(cluster) option. As a sensitivity analysis, we estimated subgroup balancing propensity scores for employees aged <50 years and those aged ≥50 years [39]. We assessed whether weighting balanced the baseline characteristics between users and non-users of the participatory working time scheduling software [27]. We estimated the unweighted and weighted standardised differences in baseline characteristics to compare prevalences and means between users and non-users. We used Stata, version 17, for the analyses.
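To make the weighting step concrete, the sketch below computes stabilised inverse probability of treatment weights from already-estimated propensity scores and then a weighted risk ratio. It is a simplified illustration in Python, not the Stata workflow used in the study: it assumes the common stabilisation in which the marginal probability of treatment replaces 1 in the numerator, it omits the multilevel propensity model and the cluster-robust variance, and all function names are hypothetical.

```python
def stabilised_iptw(treated, propensity):
    """Stabilised inverse probability of treatment weights.

    treated:    list of 0/1 indicators (1 = software user)
    propensity: estimated probability of being a user, strictly in (0, 1)
    Users get P(Z=1) / e, non-users get (1 - P(Z=1)) / (1 - e).
    """
    p_treated = sum(treated) / len(treated)
    weights = []
    for z, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        weights.append(p_treated / e if z == 1 else (1 - p_treated) / (1 - e))
    return weights


def weighted_risk_ratio(treated, outcome, weights):
    """Ratio of the weighted outcome risk in users to that in non-users."""
    def weighted_risk(group):
        num = sum(w * y for z, y, w in zip(treated, outcome, weights) if z == group)
        den = sum(w for z, w in zip(treated, weights) if z == group)
        return num / den
    return weighted_risk(1) / weighted_risk(0)


if __name__ == "__main__":
    # Toy data (illustrative only): two users, two non-users.
    z = [1, 1, 0, 0]
    e = [0.5, 0.5, 0.5, 0.5]
    y = [1, 0, 1, 1]
    w = stabilised_iptw(z, e)
    print(w, weighted_risk_ratio(z, y, w))
```

With exposure as the only covariate, this ratio of weighted risks should correspond to the point estimate from a weighted log-binomial model; confidence intervals would still require a variance estimator that accounts for the weights and for repeated observations.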

Results
The baseline characteristics for the total study population and by age group are reported in Table 1. Of the 2427 participants at baseline, 90% were women and 55.8% had completed at least upper secondary school education. The job titles with 50 or more participants included nurse (36.2%), practical nurse (9.5%), department secretary (6.8%), supply/instrument technician (4.3%), radiology nurse (4.1%), midwife (3.5%), specialist (2.5%) and mental health nurse (2.1%). Sixty-eight percent of employees aged <50 years and 50% of employees aged ≥50 years worked three shifts (including nights). Low control over scheduling of shifts was more common in employees aged ≥50 years (29.2%) than in employees aged <50 years (18.8%). This was also the case for short sleep (19.5% in employees aged ≥50 years vs. 11.4% in those aged <50 years). Thirty percent of employees aged <50 years and 28.2% of employees aged ≥50 years had psychological distress, and 7.1% of employees aged <50 years and 15.6% of employees aged ≥50 years reported poor workability at baseline.
Table 2 shows the prevalence of the outcomes at follow-up. Intermediate or low control over scheduling of shifts, poor perceived workability, poor self-rated health and short sleep were more prevalent among employees aged ≥50 years, whereas work-life conflict and psychological distress were more prevalent among employees aged <50 years.
Comparison of baseline characteristics between users and non-users. Balance between users and non-users was achieved for most of the baseline characteristics (Table 3). Of the 43 baseline characteristics included in the propensity score, the standardised difference between users and non-users was 10% or higher for 15 variables in the unweighted sample and for 5 variables in the weighted sample. In the weighted sample, the standardised difference was 10% or higher for education, vocational training, part-time job, types of shifts and control over scheduling of shifts. Weighting reduced imbalances between users and non-users at baseline by 31% for education and by 43-52% for vocational training, part-time job, types of shifts and control over scheduling of shifts.
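The balance check above rests on standardised differences. The text does not spell out the formula, so the sketch below assumes the commonly used definition for a binary characteristic, 100 × (p1 − p0) / sqrt((p1(1 − p1) + p0(1 − p0)) / 2), with optional weights so that the same function gives the unweighted and the weighted value. The function names and toy data are hypothetical.

```python
from math import sqrt


def _weighted_proportion(values, weights):
    """Weighted share of ones in a list of 0/1 values."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def standardised_difference_binary(x, treated, weights=None):
    """Standardised difference (in %) of a binary characteristic x
    between users (treated == 1) and non-users (treated == 0)."""
    if weights is None:
        weights = [1.0] * len(x)
    x1 = [v for v, z in zip(x, treated) if z == 1]
    w1 = [w for w, z in zip(weights, treated) if z == 1]
    x0 = [v for v, z in zip(x, treated) if z == 0]
    w0 = [w for w, z in zip(weights, treated) if z == 0]
    p1 = _weighted_proportion(x1, w1)
    p0 = _weighted_proportion(x0, w0)
    pooled_sd = sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
    if pooled_sd == 0:
        return 0.0  # no variation in either group
    return 100 * (p1 - p0) / pooled_sd


if __name__ == "__main__":
    # Toy data (illustrative only): a characteristic more common among users.
    x = [1, 1, 1, 0, 1, 0, 0, 0]
    z = [1, 1, 1, 1, 0, 0, 0, 0]
    print(standardised_difference_binary(x, z))
    print(standardised_difference_binary(x, z, [1, 1, 1, 3, 3, 1, 1, 1]))
```

A value of 10% or higher is the threshold for imbalance used in the text; comparing the unweighted and weighted values per characteristic shows how much of the imbalance the weights remove.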
Associations of using participatory working time scheduling software with the outcomes. Employees who used participatory working time scheduling software had more control over scheduling of their shifts than non-users at follow-up (Table 4). The risk of low control over scheduling of shifts was three-fold lower in users than in non-users (risk ratio [RR] 0.34, 95% CI 0.25-0.46). Moreover, users were at lower risk of poor workability (RR 0.74, 95% CI 0.55-0.99) and short sleep (RR 0.70, 95% CI 0.52-0.95) than non-users. Using participatory working time scheduling software had no significant effects on psychological distress, self-rated health or work-life conflict.
In subgroup analyses, using participatory working time scheduling software was associated with higher control over scheduling of shifts both in employees aged <50 years and in those aged ≥50 years. The association of using this software with lower prevalence of short sleep reached statistical significance (p < 0.05) only in employees aged <50 years. Using the software was not significantly associated with poor workability in younger or older employees. In a sensitivity analysis, excluding 84 employees who slept 9 h or longer did not change the RR for short sleep in the total sample (RR 0.71, 95% CI 0.52-0.97). However, the RR for employees aged <50 years did not remain statistically significant (RR 0.65, 95% CI 0.42-1.01). In another sensitivity analysis, subgroup balancing propensity scores were estimated for employees aged <50 years and those aged ≥50 years. Again, using the software had beneficial effects on control over scheduling of shifts both in employees aged <50 years and in those aged ≥50 years (Table A2). The association of using the software with lower prevalence of perceived poor workability and short sleep was found only for employees aged <50 years, but the estimates did not reach statistical significance.
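As a reading aid for the risk ratios above, the following sketch converts an RR into a relative risk reduction and a "fold lower" figure, and checks whether the 95% CI excludes 1. It only restates arithmetic on the reported estimates; the function name is hypothetical.

```python
def describe_risk_ratio(rr, ci_low, ci_high):
    """Summarise a risk ratio below or above 1 in more familiar terms."""
    return {
        # Percentage by which the risk is lower in users than in non-users.
        "relative_reduction_pct": round((1 - rr) * 100, 1),
        # How many times lower the risk is (reciprocal of the RR).
        "fold_lower": round(1 / rr, 2),
        # A 95% CI that excludes 1 corresponds to p < 0.05.
        "ci_excludes_one": ci_high < 1 or ci_low > 1,
    }


if __name__ == "__main__":
    # Reported estimates: low control, poor workability, short sleep.
    for rr, lo, hi in [(0.34, 0.25, 0.46), (0.74, 0.55, 0.99), (0.70, 0.52, 0.95)]:
        print(rr, describe_risk_ratio(rr, lo, hi))
```

For the RR of 0.34, the reciprocal is about 2.9, which is what the text rounds to "three-fold lower".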

Discussion
In the present study, we used propensity score weighting to balance baseline characteristics between users (N = 881) and non-users (N = 1144) of participatory working time scheduling software and to compare the two groups, both consisting mainly of women, for the risk of low control over scheduling of shifts, short sleep, poor well-being and poor workability at follow-up, with adjustment for measured confounders. Our findings suggest that the use of participatory working time scheduling software increases hospital employees' control over scheduling of shifts and might reduce the risk of short sleep and poor workability. No association was observed for psychological distress, self-rated health or work-life conflict.
The beneficial effects of using participatory working time scheduling software on short sleep and workability are plausible. Using self-scheduling software improves employees' control over working time, and control over one's own working time can reduce the risk of short sleep if work shifts that are optimal for individual sleep needs are selected. Shorter sleep is further linked to poor workability [40], and sufficient sleep can improve well-being and productivity [41]. Previous prospective cohort studies found that low worktime control increases the risk of psychological distress [15] and depressive symptoms [17,18]. Improvement of psychological distress could also directly improve sleep. Work-life imbalance has partially mediated the relation between worktime control and depressive symptoms [42]. However, a quasi-experiment showed that worktime self-scheduling via a computer program improved employees' control over scheduling of working hours, but did not decrease self-perceived stress [24]. Moreover, an intervention study failed to confirm an association between an increase in employees' control over working hours and sleep quality; however, that study recruited a small group of elderly care workers [25]. In the current study, the self-scheduling software was designed to improve employees' control over working hours and, based on a larger number of employees, control over working time was associated with a reduced risk of short sleep and poor workability.
As strengths of the current study, we used propensity score weighting to investigate the effects of a working time intervention on employees' well-being and workability. The study collected data on a large set of confounding factors, the sample size was relatively large and the time span of using the software was long enough. There was a small to medium clustering effect [43], and the intraclass correlation coefficient for hospital wards was above 0.05 for three of the six study outcomes, being 0.206 for control over scheduling of shifts, 0.082 for self-rated health and 0.051 for work-life conflict. Because of unmeasured cluster-level confounders, the multilevel data structure should be considered in estimating the propensity score and/or in the weighting analysis, and ignoring the multilevel data structure in the analysis can lead to biased estimates [44]. To reduce biases, we used a multilevel model to estimate the propensity score.
The study had some limitations. Statistical power was limited, particularly for age-specific subgroup propensity score weighting. The absence of associations of using the participatory working time scheduling software with workability and short sleep in age-specific propensity score weighting might be due to low statistical power and imbalance in baseline characteristics between users and non-users rather than to the absence of an association. Some baseline characteristics, such as the presence of chronic medical conditions, were measured in 2015 but not in 2017. Although for the 2017-2019 cohort we utilised the data on these characteristics measured in 2015, some exposed and unexposed individuals may have been misclassified. Even though we estimated the propensity score using a large set of baseline characteristics, unmeasured and residual confounding cannot be ruled out in observational studies [45]. Propensity score weighting controls only partly for unmeasured characteristics, namely those that are correlated with measured characteristics.

Conclusions
This pseudo-experiment adds to previous research on the effect of worktime control interventions on hospital employees' well-being. Our findings suggest that participatory working time software might provide a practical tool to increase employees' perceived control over shift scheduling and improve sleep and workability. However, randomised controlled studies are needed to confirm the findings and examine the generalisability of the software across other occupational sectors.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the authors.

Conflicts of Interest:
The authors declare that they have no conflicts of interest.

Table A1. References and description of some of the original scales and questions used in the Finnish Public Sector surveys.

Scale or Question; Items; Response Alternatives; Note

Control over scheduling of shifts [46]
How much are you able to influence your working hours? the scheduling of the shifts?
(1) very much (2) fairly much (3) to some extent (4) fairly little (5) very little
One item selected from the Ala-Mursula scale [46]

Workability [47]
Let's assume that your workability at its all-time best would be given 10 points, and 0 points would indicate that you are completely unable to work. What point would you give to your current workability?

Perceived health [48]
How is your health?
(1) good (2) fairly good (3) average (4) fairly poor (5) poor

Work-life conflict [49]
How often you feel that your work takes too much time or energy from your family-life or life?
(1) completely agree (2) somewhat agree (3) do not agree or disagree (4) somewhat disagree (5) completely disagree

Retirement intention [53]
If you had the possibility to choose between continuing at work and retiring, what would you do?
(1) I would continue working (2) I would retire (3) I don't know
Modified from the original questions

Stressful life-events [54]
The following section includes a list of life events anyone can be forced to face. Have any of them ever happened to you? If an event has occurred within the past 12 months, please indicate the month in which the event happened (use a corresponding number of the month).
(1) divorce or separation (2) significant complications in personal finances (3) serious illness of spouse/partner (4) serious illness of own child (5) serious illness of mother or father (6) serious illness of another family member (7) death of spouse/partner (8) death of own child (9) death of mother or father (10) death of another family member (11) psychological violence (12) physical or sexual violence
yes/no
Eight items on serious illnesses and deaths are used in the current study.

Table A2. Effects of using participatory working time scheduling software on the outcomes using subgroup balancing propensity score. RR, risk ratio.