Statistical Analysis of Absenteeism in a University Hospital Center between 2007 and 2019

Objectives: To estimate the evolution of compressible absenteeism in a hospital center and identify the professional and sociodemographic factors that influence absenteeism. Method: All hospital center employees were included over a period of twelve consecutive years (2007 to 2019). Compressible absences and occupational and sociodemographic factors were analyzed using Occupational Health data. Since the data did not follow a normal distribution, the number of days of absence was presented as a median (interquartile range (IQR): 1st quartile–3rd quartile), and comparisons were made using non-parametric tests followed by a zero-inflated negative binomial (ZINB) model. Results: A total of 16,413 employees were included, for a total of 2,828,599 days of absence, of which 2,081,553 were compressible absences (73.6% of total absences). Overall, 42% of employees had at least one absence per year. Absent employees had a median of 15 (IQR 5–53) days of absence per year, with an increase by a factor of 1.9 (CI95 1.8–2.1) between 2007 and 2019 (p < 0.001). Paramedical staff were most at risk of absence (p < 0.001 vs. all other occupational categories). Between 2007 and 2019, the number of days of absence was multiplied by 2.4 (CI95 1.8–3.1) for administrative staff, 2.1 (CI95 1.9–2.3) for tenured staff, 1.7 (CI95 1.5–2.0) for those living more than 12 km from the workplace, 1.8 (CI95 1.6–2.0) among women, 2.1 (CI95 1.8–2.6) among those over 50 years of age, 2.4 (CI95 1.8–3.0) among “separated” workers, and 2.0 (CI95 1.8–2.2) among those with at least one child. Conclusions: Paramedical personnel are most at risk of absenteeism. Meanwhile, absenteeism is increasing steadily, and overall, the increase is particularly marked for administrative staff. The profile of an employee at risk of absenteeism is a tenured employee, living at a distance from work, probably female, over 50 years old, separated, and with children. 
Identifying professionals at risk of absenteeism is essential to propose adapted and personalized preventive measures.


Introduction
Absenteeism at work can be defined as "any unexpected absence of an employee from his or her workstation" [1][2][3]. It is a major economic and public health issue in the hospital sector [4]. A hindrance to productivity [5] and a threat to the balance of work teams [6], it is also likely to affect the quality of care [7,8]. Within healthcare facilities, caregivers have been identified as being particularly at risk for absenteeism [9]. However, studies on the subject are limited [9]. Most of the literature on absenteeism in the hospital sector focuses solely on caregivers [10][11][12][13]. Among them, caregivers' aides are reported to have the highest absenteeism rate [9]. Furthermore, to our knowledge, there is no literature that examines professional status or home-to-work distance as determinants of absenteeism. A few studies have also looked at the sociodemographic determinants of absenteeism [14][15][16][17][18]. Among the factors that have been widely studied are gender and age: absenteeism is thought to be higher among women [14][15][16] and older individuals [17]. The literature has also identified being single as a protective factor against absenteeism [10,18]. On the other hand, the majority of studies that have looked at absenteeism from work have focused on short-term or health-related absences only [9,17]. In addition, publications dealing specifically with absenteeism in healthcare institutions are generally limited to a few years of study only [7][8][9][10]. Thus, there is no literature dealing with a large volume of employees followed over the long term (several consecutive years) and incorporating a minimum of information on occupational characteristics (occupation, status, distance between home and work) and sociodemographic characteristics (age, sex, marital status, children).
The main objective of this study was to describe the evolution of absenteeism among staff at a University Hospital Center over a 12-year period. Our secondary objectives were to identify the professional and sociodemographic factors influencing this absenteeism.

Study Design
Using occupational health data in conjunction with Human Resources, we conducted a longitudinal study on absenteeism among all staff at the Clermont-Ferrand University Hospital. The data analyzed covered a period of twelve consecutive years, from 2007 to 2019. This study was approved by the South-East VI Personal Protection Committee and the French National Commission for Information Technology and Civil Liberties (CNIL). In agreement with the CNIL, a unique 6-digit identifier was assigned to each university hospital center employee to ensure the anonymity of the data.

Study Population: Eligibility Criteria
The criteria for inclusion were to be part of the staff of the University Hospital Center of Clermont-Ferrand between 2007 and 2019, regardless of profession or establishment, and to benefit from annual leave. No exclusion criteria were applied to this study.

Primary Judgement Criteria
So-called "compressible" absences were studied, i.e., those for which mitigation measures could be implemented because they were partly related to working conditions or health status: ordinary sick leave, occupational injury, occupational disease, strike, long-term leave, long-term sick leave, commuting accident and unauthorized absence [19]. Each absence detailed the start date, end date, reason, and duration. Non-compressible absences (annual leave, safety rest, training, union activities, maternity leave, family events, miscellaneous absence authorizations and unpaid absences) were not studied.

Secondary Judgement Criteria
The occupational variables were occupational status (titular, non-titular), home-to-work distance (calculated from each employee's postal code and place of work), and occupation. Occupations were studied independently and were grouped into caregiver/non-caregiver groups and then into categories (medical, paramedical, administrative, technical).
The sociodemographic variables available for each absence were: gender, age of subject, marital status, parental status, and number of children.
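The Statistics section groups age into four quartile-based classes and splits home-to-work distance at the 12 km median. A minimal sketch of that coding is given here for illustration only. The function names and the handling of boundary values (each age class includes its lower bound; exactly 12 km counts as "12 km or less") are assumptions, since the article does not state them.

```python
from bisect import bisect_right

# Cut-offs taken from the Statistics section; boundary handling is an assumption.
AGE_CUTS = [30, 40, 50]
AGE_LABELS = ["<30", "30-40", "40-50", ">50"]
DISTANCE_MEDIAN_KM = 12.0


def age_class(age: float) -> str:
    """Return the age class; a value equal to a cut-off goes to the upper class."""
    return AGE_LABELS[bisect_right(AGE_CUTS, age)]


def distance_class(km: float) -> str:
    """Dichotomize home-to-work distance at the 12 km median."""
    return "more than 12 km" if km > DISTANCE_MEDIAN_KM else "12 km or less"


if __name__ == "__main__":
    # Hypothetical employees, not study data.
    for age, km in [(27, 3.5), (41, 12.0), (58, 25.0)]:
        print(age, age_class(age), km, distance_class(km))
```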

Statistics
Statistical analysis of absenteeism data was conducted using Stata software (v16, College Station, TX, USA). The proportions of absent employees were compared for each factor using a Chi-square test. A Shapiro-Wilk test verified that the number of days of absence did not follow a normal distribution (Supplementary Figure S1). Among the absent employees, the number of days of absence was expressed as a median (interquartile range (IQR): 1st quartile-3rd quartile) and compared according to the different variables (professional and sociodemographic) via the non-parametric tests of Mann-Whitney (2 groups: gender, titular/non-titular, caregiver/non-caregiver, home-work distance, parenthood) and Kruskal-Wallis (>2 groups: occupation, age, marital status). If there were more than two groups, multiple pairwise comparison tests were conducted. Effect sizes were also calculated between each group. Age classes were defined according to quartiles: <30 years, 30-40 years, 40-50 years, and >50 years. Home-work distances were separated into two classes according to the median: less and more than 12 km. The risk of absence as a function of each variable (professional and sociodemographic) was calculated using a negative binomial model with zero inflation (ZINB), considering the high proportion of zeros (zero compressible days of absence for many employees). Vuong tests and likelihood ratio tests of alpha = 0 were automatically performed after each negative binomial model with zero inflation to determine whether the model was appropriate. Furthermore, a sensitivity analysis was performed excluding people with 365 or 366 days of absence per year. These checks support the reliability of the model results. The risk calculations were carried out globally (covering the 12 years of the study as a whole) and by period (2007 to 2010, 2011 to 2014, and 2015 to 2019). 
The three periods were defined statistically based on observed changes in the number of days of absence in the global model. The weight of each factor was studied using multivariate analyses: logistic regressions for the proportions of absentees and the ZINB model for the number of days of absence. The normality of the residuals after the models was checked using Q-Q plots and Shapiro-Wilk tests. Risk calculations were also carried out using classical logistic regression to check the consistency of the results (sensitivity analysis). The results were expressed as: coefficient, 95% confidence interval (95% CI). For all the tests performed, the significance threshold was set at 0.05.
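To make the structure of the zero-inflated negative binomial model more concrete, the sketch below writes out its probability mass function using only the Python standard library. It is an illustration under stated assumptions, not the Stata code used in the study. It assumes the common NB2 parameterization (variance = mu + alpha * mu^2), and the parameter values in the example (mu, alpha, pi) are hypothetical rather than estimates from the article. As alpha approaches 0 the count part reduces to a Poisson distribution, which is the null hypothesis a likelihood ratio test of alpha = 0 examines.

```python
import math


def nb_pmf(k: int, mu: float, alpha: float) -> float:
    """Negative binomial (NB2) probability of k days, mean mu > 0, dispersion alpha > 0."""
    r = 1.0 / alpha  # "size" parameter of the negative binomial
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def zinb_pmf(k: int, mu: float, alpha: float, pi: float) -> float:
    """Zero-inflated NB: with probability pi the count is a structural zero,
    otherwise it is drawn from the negative binomial count process."""
    count_part = (1.0 - pi) * nb_pmf(k, mu, alpha)
    return pi + count_part if k == 0 else count_part


def zinb_mean(mu: float, pi: float) -> float:
    """Expected count under zero inflation."""
    return (1.0 - pi) * mu


if __name__ == "__main__":
    # Hypothetical parameters chosen only to illustrate the excess of zeros.
    mu, alpha, pi = 20.0, 2.0, 0.58
    print("P(0 days), NB only:", round(nb_pmf(0, mu, alpha), 4))
    print("P(0 days), ZINB:   ", round(zinb_pmf(0, mu, alpha, pi), 4))
    print("Expected days:     ", zinb_mean(mu, pi))
```

The zero-inflation term pi lets the model place far more mass at zero than the negative binomial alone, which is the situation described above for employees with no compressible absence in a given year.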

Study of Absenteeism-Overall Model and by Time Period
Out of a total of 2,828,599 days of absence during the twelve years studied, 2,081,553 were compressible absences (73.6% of total absences). Regular sick leave accounted for the largest number of days of absence (52.0%), followed by occupational diseases (20.7%), accidents at work (10.0%), long-term sick leave (9.3%), long-term leave (3.6%), unauthorized absences (2.1%), commuting accidents (1.3%) and strikes (0.9%). Overall, 42% of the University Hospital Center employees had at least one compressible day of absence per year, with a median of 15 (5-53) days of absence/employee/year. The proportion of absent employees remained globally stable over time, and the number of days of absence was multiplied by 1.9 (CI95 1.8-2.1) in twelve years (p < 0.001) (Figures 1-3).
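The headline proportions in this paragraph can be checked with a few lines of arithmetic. The short sketch below recomputes the share of compressible absences from the two day counts reported above and sums the published breakdown by reason. The variable and function names are illustrative only.

```python
# Day counts and percentages as reported in the text above.
TOTAL_DAYS = 2_828_599
COMPRESSIBLE_DAYS = 2_081_553

REASON_SHARES_PCT = {
    "regular sick leave": 52.0,
    "occupational diseases": 20.7,
    "accidents at work": 10.0,
    "long-term sick leave": 9.3,
    "long-term leave": 3.6,
    "unauthorized absences": 2.1,
    "commuting accidents": 1.3,
    "strikes": 0.9,
}


def share_pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


compressible_share = share_pct(COMPRESSIBLE_DAYS, TOTAL_DAYS)
reasons_total = round(sum(REASON_SHARES_PCT.values()), 1)

if __name__ == "__main__":
    print(f"Compressible share of all absence days: {compressible_share}%")  # 73.6%
    print(f"Sum of the reported reason shares: {reasons_total}%")  # 99.9%
```

The reason shares add up to 99.9% rather than 100%, which is consistent with each percentage having been rounded to one decimal.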
Occupation: Hospital services officers was the occupation with the highest number of absent employees (54% vs. 50% for assistant nurses, 46% for administrative officers, or 31% for hospital practitioners, for example, p < 0.001) (Table 1). For example, the risk of absence was increased by 67% (CI95 62-72%) compared to a hospital practitioner. Assistant nurses and hospital services officers were the two professions with the longest absences (20, 7- Figure 2). The first period stands out with higher risks of absence for all occupational categories (Figure 4).
Home-work distance: This variable was reported for 1,497,957 absences (87%). Employees living more than 12 km from work were the most absent (44% vs. 40% of employees living less than 12 km from work, p < 0.001, effect size 0.05) (Table 1), with an increased risk of absence of 24% (CI95 20-28%). The number of days of absence of an employee living more than 12 km away from work was 1.04 (1.00-1.09) times that of an employee living less than 12 km away (17.5-57 vs. 14.4-45 days/employee/year, p < 0.001) (Table 1). Between 2007 and 2019, the number of days of absence was multiplied by 1.7 (CI95 1.5-2.0) among employees living more than 12 km from their place of work (Table 2, Figure 2).
Age: Those over 50 years of age had a higher proportion of absent employees than those under 30 years of age (44% vs. 40%, p < 0.001) (Table 1), with an increased risk of absence of 13% (CI95 8-18%, p < 0.001). Those over 50 years of age were absent the longest (Figures 3 and 4), reaching a number of days of absence 2.89 times higher over the last period (Supplementary Table S1).
Parenting: Employees with at least one child had the highest proportion of absent employees (44% vs. 39% among employees without children, p < 0.001, effect size 0.16) (Table 1), with an increased risk of absence of 18% (CI95 14-22%). The study results by time period were similar to these only between 2007 and 2010. After that, there was no difference between the two groups (Figure 3). The number of days of absence of an employee with at least one child was 1.48 (CI95 1.43-1.54) times that of an employee without children (16, vs. 12, 4-38 d/employee/year) (p < 0.001) (Table 1). Between 2007 and 2019, the number of days of absence for employees with children was multiplied by 2.0 (CI95 1.8-2.2), with an increase in the gap with employees without children over time (Table 2, Figures 3 and 4), reaching 1.74 times higher in the latter period (Supplementary Table S1).
A sensitivity analysis conducted without those with 365 or 366 days of absence showed that the significance of the factors in the ZINB models did not change, except for caregiver status and the distance variable. Indeed, being a caregiver becomes a protective factor in the second period, whereas it was a risk factor before, and living more than 12 km away remains protective but becomes very significant in the overall period as well as in the third period. These are the only variables affected by this sensitivity analysis; all others show similar results.
Study of compressible absenteeism-sensitivity analyses: The results were broadly similar using a classical logistic regression model. The risk of absence increased for caregivers by 11% (CI95 10-13%), paramedics by 10% (7-14%), Hospital Services Officers by 77% (65-90%), and tenured by 33% (31-35%), employees living more than 12 km from work by 11% (9-13%), women by 19% (17-21%), over-50s by 9% (7-11%), separated employees by 18% (13-23%), and employees with at least one child by 14% (12-16%) (Supplementary Figure S2). The results of period logistic regressions were also similar to the results obtained by the ZINB model (Supplementary Table S2).

Discussion
The main results showed:
- A high prevalence of absences and an increase in absenteeism over time for most of the groups studied.
- Paramedical personnel remain particularly at risk of absences even if new absentees emerge (administrative staff).
- The involvement of sociodemographic factors in the occurrence of compressible absences.

Prevalence of Absences and Evolution
We found that over the 12-year study period, the proportion of employees absent at least once a year for a compressible reason was 42%. The median duration of absence was 15 (IQR 5-53) days/employee/year. In addition, we found a 1.9-fold increase (CI95 1.8-2.1) in the number of days of compressible absence between 2007 and 2019. To our knowledge, and as mentioned in the introduction, no publication to date has focused on compressible absences from work, making it impossible to compare our results with data from the literature. In fact, most of the available studies on the subject have focused on short-term absences or absences for health reasons only and have taken place over short periods of time [10,13,18]. In recent years, hospital reforms have multiplied [4]. From activity-based pricing to internal hospital changes, these reforms have contributed to the deterioration of working conditions [4]. In addition, job dissatisfaction, low decision latitude, or lack of time and resources to perform the tasks required have been associated with a high risk of absence [20][21][22]. All these factors could partly explain the observed increase in compressible absenteeism. In addition, absenteeism is associated with a significant loss of resources within healthcare institutions [4], which is itself partly responsible for the lack of available human and material resources. It can then lead to an overload of work and subsequently to new absences [21].

Occupational Risk Factors
We identified paramedical staff as having the highest number of compressible days of absence for the entire study period, with a median of 19 (IQR 6-64) days of absence per employee per year. These results are consistent with those in the literature [9,23]. It is important to note, however, that the available studies cover all reasons for absence and not just compressible absenteeism. These results can be explained in part by the high rate of job dissatisfaction [24][25][26][27][28][29]. A 2008 survey of job satisfaction among caregivers in various countries, including France, revealed that more than 40% of those surveyed were dissatisfied with their physical working conditions and psychological support [25]. At the beginning of our study period, administrative staff were at low risk of absence. We then noted an alarming increase in compressible absence days within this occupational category. Since this appears to be a new phenomenon, we did not identify any available data on the subject. We can, however, try to explain this phenomenon by the growing job dissatisfaction of all hospital staff due to the successive reforms of recent years [4].
Caregivers, in particular, are at very high risk of job stress and burnout [30][31][32][33]. Burnout is indeed increasingly present in hospitals and can even lead to suicide among some professionals [34]. The last major occupational risk factor was living far from the workplace. Repetitive commuting can in fact contribute to increasing fatigue and stress levels. In addition, most trips are made by car, which increases sedentariness and has been identified as one of the factors reducing work capacity [35,36].

Sociodemographic Factors and Absenteeism
We first identified women as having a higher number of compressible days of absence per year. This finding is consistent with the data in the literature [37,38]. Some publications have attempted to find an explanation for this discrepancy and in particular have shown that women have more difficulty balancing work and private life [16]. Another explanation would lie in the health status and personality differences between men and women [38]. Then, we determined that employees in the oldest age group had the highest number of days of absence per year. These results are consistent with those in the literature [17,39]. This finding seems easily explained insofar as the state of health of the individual, identified as a determinant of absenteeism, is closely linked to age. We were able to identify that being single was a protective factor for compressible absences. On the other hand, having at least one child was identified as a risk factor for compressible absences. These results are again similar to the data in the literature [10,16,18]. All of these results can be explained in part by the difficulty of reconciling work and family life [16].

Limitations
The main limitation of our study was the use of imperfect statistical models. We used a negative binomial model with zero inflation, which is usually used to process count data with a high proportion of zeros. Our data did present a high proportion of employees without compressible absences, but also employees with many days of absence (365 days in the year). However, these individuals were too few in number to distort the model results (Supplementary Figure S1): a sensitivity analysis performed without those individuals with a peak of 365 or 366 days of absence gave results that were not different from those found previously. The models used also did not allow the study of absence kinetics. The second limitation of our work was the missing data. However, these only concerned the variable "working distance" and represented only 10% of the data. Thus, we still had access to data on more than one million absences, which underlines the quality of the database on which our work is based. For reasons of anonymity, we were unable to access each employee's work service and details of the reasons for absence. Access to these data as well as to activity indicators (such as patient data) could be relevant for future studies on the subject. Indeed, an increase in the volume of activity combined with a lack of staff (linked to reforms and budget restrictions) could be one of the explanations for the increase in absenteeism in hospitals. Moreover, compressible absences are by definition absences for which mitigation measures can be put in place [40][41][42][43][44][45][46]. It would be relevant to cross-reference the results obtained in our study with information on the preventive measures already adopted by the hospital center to assess their effectiveness. 
Finally, our study was retrospective, and it was therefore impossible to take into account certain professional subjective variables such as job satisfaction, social support or psychological load.

Conclusions
Absenteeism in healthcare institutions is a public health issue insofar as it can lead to an alteration in the quality of care. Compressible absenteeism, which accounts for more than two-thirds of absences from the university hospital center, has been on the rise for several years. While compressible absenteeism affects all the professions at the university hospital center, some remain more at risk than others, particularly hospital services officers and assistant nurses. Some professionals are more exposed to the risk of absenteeism (paramedical and technical fields), while others, who seemed protected ten years ago, are experiencing a worrying increase in the prevalence of absenteeism (administrative staff). Conversely, non-tenured staff are experiencing a decrease in risk compared to tenured staff. While professional characteristics seem to play a major role in the occurrence of absences, particularly in relation to working conditions, it is important to highlight the importance of sociodemographic factors. Indeed, we found that being a woman, being a parent, being over 50 years of age or being separated significantly increases the risk of absenteeism. Reducing compressible absences is now a major challenge for human resources, and actions exist to achieve this. Among the means explored are health promotion actions (physical activity and nutrition) or actions to improve the work environment (communication, conflict management) [42][43][44][45][46]. These measures seem to show promising results and could make it possible to limit absences and the associated costs.

Data Availability Statement: Data cannot be shared publicly due to confidentiality. Data are available from the Institutional Data Access/Ethics Committee of the Clermont-Ferrand University Hospital (contact via the Human Resources Department) for researchers who meet the criteria for access to confidential data.