The Association between Emergency Department Length of Stay and In-Hospital Mortality in Older Patients Using Machine Learning: An Observational Cohort Study

The association of emergency department (ED) length of stay (EDLOS) with in-hospital mortality (IHM) in older patients remains unclear. This retrospective study aims to delineate the relationship between EDLOS and IHM in elderly patients. From the ED patients (n = 383,586) who visited an urban academic tertiary care medical center from January 2010 to December 2016, 78,847 older patients (age ≥60 years) were identified and stratified into three age subgroups: 60–74 (early elderly), 75–89 (late elderly), and ≥90 years (longevous elderly). We applied multiple machine learning approaches to identify the risk correlation trends between EDLOS and IHM, as well as between boarding time (BT) and IHM. The incidence of IHM increased with age: 60–74 (2.7%), 75–89 (4.5%), and ≥90 years (6.3%). The best area under the receiver operating characteristic curve was obtained by the Light Gradient Boosting Machine model: for age groups 60–74, 75–89, and ≥90 years, the values were 0.892 (95% CI, 0.870–0.916), 0.886 (95% CI, 0.861–0.911), and 0.838 (95% CI, 0.782–0.887), respectively. Our study showed that EDLOS and BT were statistically correlated with IHM (p < 0.001), and a significantly higher risk of IHM was found with low EDLOS and high BT. The flagged rate of quality assurance issues was higher for short EDLOS (≤1 h; 9.96%) than for longer EDLOS (7 h < t ≤ 8 h; 1.84%). Special attention should be given to patients admitted after a short stay in the ED and a long BT, and new concepts of ED care processes, including specific areas and teams dedicated to the care of older patients, could be proposed to policymakers.


Introduction
Emergency departments (ED) are the first healthcare settings that patients with acute illness encounter prior to admission to the hospital. The imbalance between the demands of ED patients and the availability of ED resources to provide emergency care has caused overcrowding in the ED, which has been identified as one of the main factors compromising timely and efficient care [1]. ED/hospital crowding has become a significant public health problem across the globe. Boarding and overcrowding have been intensified during the COVID-19 pandemic [2].
The time elapsed between ED arrival and ED discharge is defined as ED length of stay (EDLOS). Prolonged EDLOS is believed to be one of the major factors associated with ED overcrowding and to affect clinical outcomes adversely [3]. However, the definition of prolonged EDLOS varies across countries; for example, prolonged ED visits have been defined as >4 h in the United Kingdom, >6 h in Canada and the US, and >8 h in Australia [4,5]. Boarding time (BT) is defined as the time between the ED decision to admit the patient to the hospital and the ED departure time, and it is considered an important contributor to the EDLOS. Prolonged BT occupies resources in the ED and potentially affects the outcomes of other patients [6,7].
In our recent meta-analysis and systematic review, we found that there was an association between EDLOS and IHM for patients with EDLOS below 3 h in non-ICU-admitted ED patients [8]. Mohr et al. [3] found that prolonged BT in the ED, thus prolonging EDLOS, is associated with worse clinical outcomes including mortality, particularly in critically ill patients. Although there is a significant association between crowding and EDLOS, the relationship between EDLOS and in-hospital mortality (IHM) remains unclear.
Given the lack of evidence, additional studies are needed to examine the association between EDLOS and IHM using real-world data. This study attempts to fill the gap by finding evidence of the relationship between EDLOS and IHM, which could potentially help to improve patient experiences and outcomes, relieve the stress of ED healthcare providers, create a better working environment, and support hospitals' managerial decisions and policy making. The aim of this study was to examine the association of EDLOS with IHM among older patients who were admitted to the hospital from the ED.

Study Population
Data were collected on all patients admitted to the ED (57,000 visits per year, 150 nurses, 64 senior doctors, 39 residents) of an urban academic tertiary care medical center in the US between January 2010 and December 2016. The ED has an observation unit, which is considered an in-hospital unit. The IHM analyzed in our study includes patients who died in the observation unit. The EDLOS defined in Figure 1 does not include the time spent in the observation unit. This study was approved by the institutional review board of Beth Israel Deaconess Medical Center, Boston, MA (Approval Number: 2016P-000439). From a total of 383,586 encounters, we excluded those (a) involving younger patients (age at visit <60 years) (n = 61,765), (b) not resulting in hospital admission after the ED visit (i.e., discharged) (n = 242,865), or (c) with an implausible ordering of the time records used to determine the EDLOS (time of triage registration, time of the start of care, time of the disposition decision, and time of ED departure) (n = 109); for example, an ED entry time later than the ED exit time. The final retrospective cohort contained 78,847 elderly encounters and was stratified into three age groups based on the World Health Organization criteria for the classification of older persons: the early elderly group (age 60-74 years; n = 38,817; 49.2%), the late elderly group (age 75-89 years; n = 32,261; 40.9%), and the longevous elderly group (age ≥90 years; n = 7769; 9.9%) (see Figure 2).
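Exclusion criterion (c) and the interval definitions can be illustrated with a short sketch. It assumes that triage registration serves as the ED arrival time and that the four timestamps must be non-decreasing; the `EDTimes` container, the `derive_intervals` helper, and the example timestamps are hypothetical and are not taken from the study's code or data.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class EDTimes(NamedTuple):
    triage: datetime       # triage registration (assumed to mark ED arrival)
    care_start: datetime   # start of care
    disposition: datetime  # decision to admit
    departure: datetime    # ED departure


def _minutes(delta: timedelta) -> float:
    return delta.total_seconds() / 60


def derive_intervals(t: EDTimes) -> Optional[dict]:
    """Return waiting time, BT and EDLOS in minutes, or None when the
    timestamps are out of order (such an encounter would be excluded)."""
    if not (t.triage <= t.care_start <= t.disposition <= t.departure):
        return None
    return {
        "waiting_min": _minutes(t.care_start - t.triage),
        "boarding_min": _minutes(t.departure - t.disposition),
        "edlos_min": _minutes(t.departure - t.triage),
    }


# Made-up timestamps for illustration only.
arrival = datetime(2014, 3, 1, 10, 0)
ok = EDTimes(arrival, arrival + timedelta(minutes=25),
             arrival + timedelta(minutes=220), arrival + timedelta(minutes=366))
bad = EDTimes(arrival, arrival + timedelta(minutes=25),
              arrival + timedelta(minutes=220), arrival - timedelta(minutes=5))

print(derive_intervals(ok))
print(derive_intervals(bad))
```

The second encounter has a departure time earlier than its arrival time, so the helper returns `None` rather than a negative EDLOS.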

Data Collection and Data Processing
For each ED encounter of the cohort, we extracted demographic and clinical features recorded in the electronic medical records (EMR), including age, gender, race, language (English and non-English), health insurance categories, mode of transport (such as walk-in, ambulance, and helicopter), level of triage acuity score measured using a 5-point scale (i.e., level 1-resuscitation, level 2-emergency, level 3-urgent, level 4-less urgent, and level 5-nonurgent), principal diagnosis codes (i.e., ICD-9 or ICD-10 codes), ED disposition after care (such as ICU and non-ICU), patient medical histories using the Charlson Comorbidity Index, ED waiting time (between triage registration and the start of care), ED boarding time (between the admission decision and departure from the ED), and the EDLOS (from ED arrival until the patient left the ED) (see Figure 1). We also extracted the quality assurance issues (QAI) flagged by any healthcare provider when they suspected a patient safety event (PSE), defined as a negative health outcome suspected to be related to a medical error during ED care [9]. The outcome of interest was death during hospitalization. For each age group, variables missing in more than 99% of the population were excluded to reduce the dimensionality of the EMR variables. The interquartile range (IQR) technique [10] was used to remove outliers. Specifically, the upper and lower limits were set to 5 times the IQR, and any observation beyond the limits was considered a potential outlier. One-hot encoding (or dummy variable processing) was used to turn the categorical variables into a binary vector representation.
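The two preprocessing steps above can be sketched in a few lines. The text does not spell out how the limits are anchored, so this sketch assumes the common form Q1 − 5·IQR and Q3 + 5·IQR; the function names and toy values are illustrative only, and `statistics.quantiles` requires Python 3.8 or later.

```python
import statistics


def iqr_fences(values, k=5.0):
    """Limits at Q1 - k*IQR and Q3 + k*IQR (assumed form of the rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=5.0):
    """Keep only observations inside the IQR-based limits."""
    low, high = iqr_fences(values, k)
    return [v for v in values if low <= v <= high]


def one_hot(categories):
    """Encode each category as a binary vector, one column per level."""
    levels = sorted(set(categories))
    return levels, [[int(c == level) for level in levels] for c in categories]


# Toy EDLOS values in minutes; the last one is an implausible record.
edlos_minutes = [120, 240, 271, 300, 366, 410, 495, 600, 9000]
print(iqr_fences(edlos_minutes))
print(drop_outliers(edlos_minutes))

levels, encoded = one_hot(["walk-in", "ambulance", "helicopter", "ambulance"])
print(levels)
print(encoded)
```

With a multiplier of 5 the limits are deliberately wide, so only extreme values such as the 9000-minute record are dropped.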

Experimental Methodology
This research mainly explored four machine learning prediction models, i.e., Logistic Regression, Random Forest [11], eXtreme Gradient Boosting (XGBoost) [12], and Light Gradient Boosting Machine (LightGBM) [13]. LightGBM contains two novel techniques, Gradient-based One-Side Sampling and Exclusive Feature Bundling, for processing a large number of data samples and features, respectively, which make it a highly efficient gradient boosting decision tree implementation in terms of computational speed and memory consumption. In addition, a popular predictive interpretation technique, the game theory-inspired Shapley Additive exPlanations (SHAP) [14], was applied to explain the predictive model at the individual patient and population levels. A positive SHAP value means that the presence of the variable increases the likelihood of the adverse outcome for a particular patient. A negative SHAP value means that the presence of the variable decreases the likelihood of the adverse outcome for that patient. A SHAP value close to 0 suggests that the model does not consider the variable relevant to estimating the likelihood.
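The sign interpretation of SHAP values can be illustrated with an exact Shapley computation on a toy scoring function. This is only a sketch of the underlying idea: it enumerates feature orderings by brute force rather than using the tree-based algorithm of the "shap" package, and `toy_risk_score` with its coefficients is a hypothetical stand-in, not the fitted LightGBM model.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to baseline.

    Features not yet added to the coalition keep their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]
            value = predict(current)
            phi[i] += value - previous  # marginal contribution of feature i
            previous = value
    return [p / factorial(n) for p in phi]


def toy_risk_score(features):
    """Hypothetical log-odds score with made-up coefficients."""
    icu, short_stay, walk_in = features
    return (-3.0 + 1.5 * icu + 0.4 * short_stay
            - 0.6 * walk_in + 0.3 * icu * short_stay)


baseline = [0, 0, 0]
patient = [1, 1, 0]  # ICU admission, short ED stay, did not walk in
phi = shapley_values(toy_risk_score, patient, baseline)
print(phi)
```

In this toy case the first two features receive positive values (they push the score toward the adverse outcome), the third receives 0 because it does not differ from the baseline, and the values sum to the difference between the patient's score and the baseline score.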

Diagnosis Subgroup Analysis
Due to differences in the distribution of EDLOS and BT among different diagnosis populations, it was necessary to conduct subgroup analyses on different main diagnoses to verify the association between EDLOS/BT and IHM. We extracted the main diagnostic information of each patient, which is represented by ICD-9/ICD-10 codes, with each type further divided into 3-digit/3-char, 4-digit/4-char, and 5-digit/5-char codes. To unify the diagnostic grouping, as shown in Supplementary Figure S1, we first truncated the codes of different lengths to 3 digits/chars, then mapped them to their respective ICD-9/ICD-10 main categories, and finally unified the code categories of ICD-9 and ICD-10.
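The grouping steps can be sketched as follows. The sketch covers only the four categories discussed later in the Results and uses the standard ICD-9 chapter ranges (390-459, 460-519, 780-799, 800-999); the actual mapping in Supplementary Figure S1 covers 21 categories and may differ in detail, so the function names and the "other" fallback are assumptions.

```python
def _category(code: str) -> str:
    """Truncate a code to its 3-digit/3-char category."""
    return code.replace(".", "").strip().upper()[:3]


def icd10_group(code: str) -> str:
    cat = _category(code)
    letter = cat[0]
    if letter == "I":
        return "I00-I99"  # circulatory system
    if letter == "J":
        return "J00-J99"  # respiratory system
    if letter == "R":
        return "R00-R99"  # symptoms, signs, abnormal findings
    if letter == "S" or (letter == "T" and cat <= "T88"):
        return "S00-T88"  # injury, poisoning, external causes
    return "other"


def icd9_group(code: str) -> str:
    cat = _category(code)
    if not cat.isdigit():  # E and V supplementary codes
        return "other"
    number = int(cat)
    if 390 <= number <= 459:
        return "I00-I99"
    if 460 <= number <= 519:
        return "J00-J99"
    if 780 <= number <= 799:
        return "R00-R99"
    if 800 <= number <= 999:
        return "S00-T88"
    return "other"


def unify(code: str, version: int) -> str:
    """Map an ICD-9 or ICD-10 code to a unified ICD-10-style group."""
    return icd9_group(code) if version == 9 else icd10_group(code)


for code, version in [("428.0", 9), ("I50.9", 10), ("486", 9), ("S72001A", 10)]:
    print(code, version, unify(code, version))
```

Truncating before mapping means that codes of any length, with or without the decimal point, land in the same group.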

Statistical Analysis
Continuous variables were presented as the mean (standard deviation, SD) for normal distributions or the median (interquartile range, IQR) for non-normal distributions, whereas categorical data were presented as frequencies (percentages). For missing categorical data, a value of 0 was set as a separate category, while missing numerical values were not imputed because the tree-based machine learning models used (e.g., LightGBM) can handle missing values, learning the optimal split direction for null values automatically from the improvement in training performance. The t test or Kruskal-Wallis test was used to test group differences for continuous variables, and the Chi-square test or Fisher's exact test was used to check associations for categorical variables. Since the dataset was large enough, 10-fold cross-validation (CV) was applied to evaluate the machine learning models: the original samples were randomly divided into 10 subsamples, of which one (i.e., 10%) was retained as the validation data for testing the classifier and the remaining 9 (i.e., 90%) were used as training data. The process was repeated 10 times, with each of the 10 subsamples used as the test data exactly once, and the 10 fold results were averaged to produce a single performance estimate. The relationship between EDLOS/BT and IHM was evaluated by stratified analyses using multivariable logistic regression models [odds ratio (OR) and 95% confidence interval (CI)]. The area under the receiver operator characteristic curve (AUROC) and the 95% bootstrapped CI were used to compare the overall prediction performances. DeLong's test [15] (a nonparametric test) was used to assess the statistical significance of differences between the AUROCs of two or more correlated ROC curves. Two-tailed p < 0.05 denoted statistical significance for all comparisons.
Data processing and analysis were performed using Python 3.7 with open-source Python packages (e.g., "xgboost", "lightgbm" and "shap") and the scikit-learn library.
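The evaluation procedure can be sketched with the standard library alone. This is an illustrative sketch, not the study's code: it assumes a percentile bootstrap for the CI (the text does not name the bootstrap variant), uses randomly generated labels and scores, and does not implement DeLong's test; all function names are hypothetical.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative
    case (ties count one half); equivalent to the Mann-Whitney U statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUROC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


def kfold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and split into k disjoint validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Synthetic labels and scores for illustration only.
rng = random.Random(42)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(100)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

folds = kfold_indices(len(labels), k=10)
print([len(fold) for fold in folds])
print(auroc(labels, scores), bootstrap_ci(labels, scores))
```

In a full pipeline each fold would serve once as validation data while a model is trained on the other nine, and the per-fold AUROCs would then be averaged.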

Results
Of the 78,847 encounters meeting the inclusion criteria, IHM occurred in 2975 (3.8%). As shown in Table 1, the median (Q1, Q3) EDLOS and BT in the elderly population were 366 (271, 495) and 143 (104, 219) minutes, respectively. IHM increased with age: 2.7% in the 60-74 age group, 4.5% in the 75-89 age group, and 6.3% in the ≥90 age group. The proportion of males decreased with age (51.7% in the 60-74 age group vs. 33.8% in the ≥90 age group, p < 0.001). The QAI rate increased with age (p < 0.001): QAIs were observed in 882 (2.3%), 859 (2.7%), and 219 (2.8%) encounters for the 60-74, 75-89, and ≥90 age groups, respectively. Charlson Comorbidity Index scores were higher in early elderly patients than in longevous elderly patients (e.g., for Charlson score >2, 6.0% vs. 1.4%). The walk-in mode of transport was more frequent in the early elderly patients (48.6%), while the ambulance mode was more frequent in the late elderly (59.4%) and longevous elderly patients (70.1%). Table S3 compares the characteristics of survivors and non-survivors in the low EDLOS group (EDLOS < 300 min) and the high EDLOS group (EDLOS ≥ 300 min) in the entire elderly population: the percentage of non-survivors was higher in the low-EDLOS population than in the high-EDLOS population (6.1% vs. 2.6%, p < 0.001), and the incidence of QAI among non-survivors was higher in the low-EDLOS population than in the high-EDLOS population (28.7% vs. 17.9%, p < 0.001). Figure 3 shows the receiver operator characteristic (ROC) curves of four machine learning models in the three age subgroups. From this figure, we observed that the predictability of IHM decreased as age increased (p < 0.001, DeLong's test).
Supplementary Table S4 illustrates the AUROC and corresponding 95% CI for IHM prediction in four age groups based on four machine learning models (i.e., logistic regression, random forest, XGBoost, and LightGBM). For example, AUROCs of the LightGBM model for ages 60-74, 75-89, and ≥90 years were 0.892 (95% CI, 0.870-0.916), 0.886 (95% CI, 0.861-0.911), and 0.838 (95% CI, 0.782-0.887), respectively. Since the tree-based LightGBM model outperformed the other three models, LightGBM was chosen as the final predictive classifier for the SHAP explainer.
Figure 3. The receiver operating characteristic curves of four prediction models (i.e., Logistic Regression, Random Forest, XGBoost, and LightGBM) for three age groups. The dark line represents the mean ROC curves, and the light area represents the corresponding 95% confidence interval.
Figure 4 shows the SHAP values corresponding to the specific EDLOS and BT for each patient in the whole older population (age ≥60 years) and three age subgroups, and the mean SHAP values corresponding to all samples per EDLOS or BT in minutes, respectively. Figure 4(a1,b1) shows the relationship between EDLOS/BT (0 ≤ t ≤ 24 h) and IHM in older patients; when EDLOS < 5 h or BT ≥ 3 h, there was a higher risk of IHM. Furthermore, we defined EDLOS and BT in hours and calculated the median [IQR] of the SHAP values, as shown in Supplementary Figure S2, which shows the effect of varying EDLOS and BT (in hours) on IHM.
The influence of EDLOS on IHM followed a similar trend in all groups: as EDLOS increased, SHAP values moved from the positive range to the negative range, and the trend became more pronounced with increasing age (see Figures 4 and S2). For example, for the age ≥90 group, as EDLOS increased, the median [Q1, Q3] of SHAP values first decreased from a positive contribution (e.g., 0.103 [0.060, 0.171] at 2-4 h) to 0 (at about 5 h) and then became a negative contribution (indicating a decreased risk of IHM, e.g., −0.110 [−0.177, −0.084] at 14-16 h). The effects of EDLOS and BT on IHM are highly volatile, but an obvious trend can still be observed across the large sample: lower EDLOS and higher BT were associated with an increased risk of IHM. Figure 5 illustrates the SHAP dependence plots of EDLOS with the ICU and QAI interaction in the older population, from which we can see that most ICU admissions and QAI patients were concentrated in the low EDLOS range (e.g., <5 h).
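The hourly summary of SHAP values and the point at which the median reaches zero can be sketched as follows. The numbers are synthetic and chosen only to mimic the described trend; the 1-hour bin width and the helper names are assumptions made for illustration.

```python
import statistics
from collections import defaultdict


def median_shap_by_hour(edlos_minutes, shap_values):
    """Group SHAP values into 1-hour EDLOS bins and return {hour: median}."""
    bins = defaultdict(list)
    for minutes, value in zip(edlos_minutes, shap_values):
        bins[int(minutes // 60)].append(value)
    return {hour: statistics.median(vals) for hour, vals in sorted(bins.items())}


def first_nonpositive_hour(medians):
    """First hour bin whose median SHAP value is <= 0, or None if absent."""
    for hour, median in medians.items():
        if median <= 0:
            return hour
    return None


# Synthetic values for illustration only; not data from the study.
edlos = [90, 100, 150, 170, 200, 230, 250, 290, 310, 350, 380, 410]
shap_vals = [0.15, 0.12, 0.11, 0.09, 0.06, 0.05,
             0.02, 0.01, -0.01, -0.03, -0.05, -0.06]

medians = median_shap_by_hour(edlos, shap_vals)
print(medians)
print(first_nonpositive_hour(medians))
```

Here the median contribution stays positive for stays shorter than 5 h and turns negative afterwards, so the zero crossing falls in the 5-6 h bin of the toy data.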
Figure 6 shows the SHAP visualization of the top 9 risk factors for predicting IHM in each age subgroup, from which we can see that ICU admission was the most important risk predictor of IHM. QAI, the triage and acuity score, the Charlson score, EDLOS and BT were all important predictive factors of IHM.
Figure 4. SHAP values of EDLOS and BT (in minutes) for predicting IHM; a higher SHAP value indicates a higher risk of IHM due to this feature value. The dark green line represents the average over samples with a given EDLOS/BT value (in minutes).
Table S5 shows the distribution of 21 major diagnostic categories. Four major diagnostic categories had mortality rates exceeding 10%: R00-R99 (20.7%; symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified), I00-I99 (28.5%; diseases of the circulatory system), S00-T88 (10.7%; injury, poisoning and certain other consequences of external causes), and J00-J99 (13.5%; diseases of the respiratory system). Figure 7 shows the distribution of EDLOS and BT for these four diagnostic subgroups. Figure 8 shows the effect of varying EDLOS and BT (in hours) on IHM for the four major diagnostic populations based on the SHAP method; the risk trends of EDLOS and BT on IHM are similar in the four diagnostic subgroups and confirm that lower EDLOS and higher BT are associated with a higher IHM risk. However, different subgroups have different cutoff values: for the I00-I99 diagnosis subgroup, BT has a significant positive effect on the risk of IHM once it exceeds 2 h, whereas for the S00-T88 diagnosis subgroup, BT shows a higher risk of IHM only after more than 4 h.
Figure 7. Distribution of emergency department length of stay (EDLOS, in minutes) and boarding time (BT, in minutes) of four major diagnostic populations, namely R00-R99 (symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified), I00-I99 (diseases of the circulatory system), S00-T88 (injury, poisoning and certain other consequences of external causes), and J00-J99 (diseases of the respiratory system).

Discussion
Overcrowding can increase EDLOS, which is used by hospital administrators as an indicator of the quality of care delivered in the ED. However, the relationship between the EDLOS and IHM remains unclear and underestimated, and advanced age is an established independent risk factor for IHM. In this study, machine learning methods were used to identify the specific EDLOS and BT associated with increased IHM following ED care of older patients. By quantifying the predictive importance of different EDLOS to IHM, our study showed that lower EDLOS and higher BT were significantly associated with a higher risk of IHM in older patients.
Previous studies mainly used Logistic Regression to analyze the relationship between EDLOS and IHM [8]. Since tree-based models consistently outperform standard deep models on tabular-style datasets in many medical applications [16], three tree-based predictive models (i.e., Random Forest, XGBoost, and LightGBM) were chosen for the comparison. Owing to its faster training speed and better accuracy, the highly efficient LightGBM model was used as the final predictive classifier. In addition, the LightGBM model can handle missing values, learning the optimal split direction for null values automatically from the improvement in training performance. On that basis, the SHAP method [16][17][18] was used to interpret the LightGBM classifier results and analyze the importance of individual features. The SHAP method not only provides local interpretation of inference data, enabling users to analyze key factors that positively or negatively affect the model's decision-making process, but also provides global interpretation, especially from the collective feature importance plots (see Figures 4, 5 and S5).
Although some studies have explored the relationship between EDLOS and IHM, no consistent conclusion has been reached. Some studies conducted in different countries found no specific EDLOS cutoff [19][20][21] [34]. In our meta-analysis, we found that low EDLOS increases IHM in non-ICU admitted patients, whatever the age [8]. For the three age subgroups, the cutoff (i.e., SHAP value = 0) of EDLOS increased with age; for example, as shown in Figure 4, the approximate cutoffs for the early elderly, late elderly, and longevous elderly populations are 3 h, 4 h, and 5 h, respectively, and it seems that high EDLOS is less deleterious in longevous elderly patients than in early and late elderly patients. There is a positive correlation between the risk of IHM and BT, and long BT may significantly increase IHM in the older population. In addition, we further analyzed four diagnostic subgroups with high IHM levels (see Figure 7), in which lower EDLOS and higher BT are significantly associated with the risk of IHM, and the specific cutoff values vary depending on the diagnosis (see Figure 8).
Through the SHAP interpreter method, each patient's EDLOS or BT feature (in minutes) corresponds to a SHAP value (analogous to the logarithm of an estimated odds ratio). Because of the complexity of patients admitted from the ED, the SHAP trends are highly volatile as EDLOS and BT increase in each age population. Lower EDLOS (e.g., <4 h) was associated with a more pronounced increase in IHM (see Figures 4 and 8), and the flagged rate of QAI was higher than for high EDLOS (e.g., 9.96% in ≤1 h vs. 1.84% in 7 h < t ≤ 8 h). Therefore, we can speculate that older patients may benefit more from longer ED care than from an accelerated admission. In addition, a clear trend was observed regarding higher BTs having a stronger positive correlation with the risk of IHM in both subgroups (see Figures 4 and 8). Singer et al. [35] have already demonstrated that mortality is increased in patients boarding in the ED, whatever the age. That is, under similar EDLOS, longer BT would adversely affect patient outcomes.
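If the SHAP values are read on the log-odds scale, as the analogy above suggests, exponentiating a value gives the multiplicative change in the odds of IHM attributed to that feature value. The sketch below applies this reading to the two median SHAP values reported for the age ≥90 group; treating them as exact log-odds contributions is an assumption, and `shap_to_odds_ratio` is a hypothetical helper.

```python
import math


def shap_to_odds_ratio(shap_value: float) -> float:
    """Read a log-odds contribution as a multiplicative change in the odds."""
    return math.exp(shap_value)


# Median SHAP values reported for the age >=90 group in the Results.
for label, value in [("EDLOS 2-4 h", 0.103), ("EDLOS 14-16 h", -0.110)]:
    ratio = shap_to_odds_ratio(value)
    print(f"{label}: SHAP {value:+.3f} -> odds multiplied by {ratio:.3f}")
```

Under this reading, a positive value corresponds to a ratio above 1 (higher odds of IHM), a negative value to a ratio below 1, and a value of 0 to no change.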
The top-ranking risk predictors of IHM differed among the age groups 60-74, 75-89, and ≥90 years (see Figure 6). The role of the risk factors obtained by the SHAP method was consistent with medical common sense. The Charlson comorbidity score has a strong predictive effect: the higher the Charlson score, the greater the corresponding SHAP value, indicating an increased risk of IHM. Patients with severe triage acuity scores (levels one to three) showed the highest SHAP values, corresponding to a higher risk of IHM. Negative SHAP values, indicating a decreased risk of IHM, were obtained for patients who arrived at the ED as walk-ins. Moreover, patients experiencing QAI had a higher risk of IHM. Hospital policymakers should take our results into account and consider redesigning EDs to include specific areas that allow adequate monitoring of patients, with specific teams focused on their care management, to ensure that the best care is delivered when patients stay longer in the ED before admission to the hospital. Moreover, ED healthcare teams must be cautious when deciding to admit a patient to the wards after a short EDLOS in any age group. By using artificial intelligence, our study minimizes heterogeneity and allows a precise understanding of the role of the time spent in the ED in the quality and safety of the care delivered to a heterogeneous population managed in the ED.
Many guidelines emphasize the importance of time during the care process of acute diseases from the prehospital to the ED setting (e.g., "time is brain", "time is heart", ...).
This study has some limitations. First, although we included a large ED cohort observed for seven years (2010-2016), these data do not include recent years, especially the COVID-19 pandemic period. Second, we mainly used the timestamp information related to the length of stay in the ED (such as triage registration time, start of care time, admission decision time, and ED exit time) and did not include the entire EMR (e.g., lab tests and treatments), which might further increase the predictive performance. Third, this is only a retrospective study. A prospective study needs to be designed in the future, in which the evaluation of the ED throughput process receives more attention. Fourth, based on the World Health Organization criteria for the classification of older persons (age ≥60 years), the older population was divided into the early elderly (60-74 years), the late elderly (75-89 years), and the longevous elderly (≥90 years). However, certain developed countries in the West have chosen age 65 as the cutoff point for older patients. Finally, although our results were statistically significant, they only reflect the population of one academic medical center. A multicenter comparative study involving hospitals in different countries or different types of hospitals (such as community hospitals) is needed to demonstrate the robustness and relevance of our results, which would provide strong suggestions for policymakers to propose new ED care processes for older patients.

Conclusions
This is the first study analyzing a large EMR dataset using machine learning methods to determine the relationship between EDLOS and IHM in older patients. Our study confirms that lower EDLOS and higher BT are correlated with IHM in older patients. ED healthcare providers can improve the care process for patients who will stay a short time in the ED, but they do not have a significant impact on hospital bed availability. Policymakers, administrators, and ED leaders should propose new procedures to reduce BT and provide dedicated, well-trained ED professionals, including special elderly patient care areas in the ED.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jcm12144750/s1. Table S1. Characteristics of the study emergency department participants according to IHM (survivors and non-survivors). Table S2. Characteristics of the non-ICU and ICU admitted older (≥60) ED patients. Table S3. Characteristics of survivors and non-survivors in the older ED participants according to EDLOS. Table S4. The area under the receiver operator characteristic curve (AUROC) and 95% confidence intervals in predicting in-hospital mortality (IHM). Table S5. Distribution of 21 major diagnostic categories. Figure S1. Flow chart of population classification based on diagnostic codes. Figure S2. Effect of varying emergency department length of stay (EDLOS, in hours) and boarding time (BT, in hours) on death-in-hospital (IHM) for three age groups based on the SHAP method. The box plots report the median and the interquartile range of the SHAP values of patients within the range of EDLOS.

Institutional Review Board Statement:
The protocol of this study was approved by the Committee on Clinical Investigations (CCI), the appropriately authorized Institutional Review Board (IRB), and Privacy Board appointed to review research involving human subjects. This action was reviewed via Expedited review. This study has been reviewed under the number 2015P000113 and approved for continuation with a waiver of informed consent and authorization under expedited category #5.

Informed Consent Statement:
Patient consent was waived due to the study's retrospective design and anonymous characteristics.

Data Availability Statement:
The clinical data used in this study are not publicly available, and restrictions apply to their use. Upon reasonable request, the corresponding author (AB) can be asked to share the necessary data.

Conflicts of Interest:
The authors declare no conflict of interest.