Machine Learning for Predictive Modelling of Ambulance Calls

Abstract: A novel machine learning approach is presented in this paper, based on extracting latent information and using it to assist decision making on ambulance attendance and conveyance to a hospital. The approach includes two steps: in the first, a forward model analyzes the clinical and, possibly, non-clinical factors (explanatory variables), predicting whether positive decisions (response variables) should be given to the ambulance call, or not; in the second, a backward model analyzes the latent variables extracted from the forward model to infer the decision making procedure. The forward model is implemented through a machine learning or deep learning technique, whilst the backward model is implemented through unsupervised learning. An experimental study is presented, which illustrates the obtained results, by investigating emergency ambulance calls to people in nursing and residential care homes, over a one-year period, using an anonymized data set provided by East Midlands Ambulance Service in the United Kingdom.


Introduction
Most care home residents attending the Emergency Department (ED) are transported by ambulance. Ambulance conveyance to the ED is important, because it often leads to hospitalisation, and is associated with significant costs to both patient and health care system.
Many factors affect a decision to transfer a care home resident to hospital including patient, family, provider, care home, and contextual factors.
In the United States, for example, it has been estimated from a nationally representative sample that older (aged 65 years or above) nursing home residents alone accounted for 13.97 million ED visits (1.8 ED visits annually per resident) [1].
In one study, over half were not admitted to hospital and of those discharged from the ED, most had normal vital signs, nearly 1 in 5 did not have any diagnostic tests and those with injuries were 1.78 times more likely to be discharged than admitted, suggesting that these were features of preventable ED visits [1].
Over half are deemed potentially preventable using international criteria [2]. Transfers to ED are deemed potentially preventable when they "may have been avoided if optimal management of an existing condition was available at an earlier stage". Definitions usually include conditions classed as 'low acuity' (standard or non-urgent on an ED triage system such as the Manchester Triage System) and not requiring in-patient management resulting in direct discharge from ED [2].
A significant minority of patients conveyed to ED (15% in [3]) have ambulatory care sensitive conditions (ACSCs), and nearly half of these resulted in hospital admission. Conditions with a higher risk of admission include chronic obstructive pulmonary disease, congestive heart failure, kidney/urinary tract infection, and dehydration. Various symptoms trigger transfer to ED, the most common in one study being fatigue, lethargy or weakness, shortness of breath, and change in level of consciousness [4].
Relatives also perceive themselves to be involved in the decision to transfer as advocates for their family member [5]. Reasons for relatives encouraging conveyance include worry about nursing home care, lack of advance care planning or preparation for end of life, and poor communication or agreement between family members on goals of care [6].
Provider factors include unclear expectations, staffing capacity and capability, and limited access to multidisciplinary support or problems communicating with other decision makers [7]. Frontline staff faced with a worsening of a resident's condition, insistence by family members or recommendation by a physician may deem such transfers unavoidable, whatever subsequently happens [8].
Some studies have shown that nursing home characteristics, such as staff-resident ratio and skills, are associated with transfer to ED [9]. Advance care planning and support from local health services may also reduce decisions to transfer patients to hospital [10]. Contextual factors might include time of day or day of week. Most transfers from nursing homes to EDs occur during weekday working hours [11]. Many patients transferred to the ED are admitted (almost half in one study) and those admitted are more likely to die compared with those resident in community dwellings [12].
Our aim is to explore predictors of ambulance attendance and conveyance to hospital for people residing in care homes. Preliminary results in applying machine learning to emergency medical transport datasets have been reported in [13]. This is the first data-driven approach designed to assist decision making on whether or not an ambulance should convey a person to a hospital, taking into account related information and features over a dataset of about 25,000 cases. A former six-year study [14], led by Prof. Siriwardena, referred to diabetes-related emergencies in ambulance calls, calculating a statistical model for predicting costs of the health service and associated hospital care, over a sample of about 12,000 cases. Another study [15], led by Prof. Siriwardena, aimed to identify, through regression analysis, whether patient (age, sex, condition) and paramedic (sex, role) factors were associated with reduction of pain in about 9500 subjects requiring primary emergency transport to a hospital.
In this paper we present a new methodology, composed of two steps, a forward and a backward one, in which two respective data models are generated and used to predict ambulance attendance and conveyance to hospital.
The forward step includes training of a machine learning or deep learning scheme, such as a neural network, a deep convolutional network (CNN) or a convolutional recurrent network (CNN-RNN) [16,17], to analyze clinical and/or demographic information and suggest conveyance, or not, to a hospital of the person for whom the call is made. We have developed CNN and CNN-RNN models for prediction in healthcare [18,19], for diagnosis of neurodegenerative diseases, such as Parkinson's or Alzheimer's, based on MRI and DaTscans, as well as of Covid-19 based on chest computed tomography (CT) scans and x-rays.
The backward step is based on extraction of appropriate internal features, say features v, from the forward model. We then generate a model of concise representations, say c, through clustering of these features [17,20]. Using this model and the nearest neighbour criterion, the backward step can provide, in an efficient and transparent way, the decision on conveyance, while supporting inference on the decision making process.
Recent research has focused on extracting trained network representations and using them for classification purposes, either by an auto-encoder methodology, or by monitoring neuron outputs in the convolutional or/and fully connected network layers [20,21]. Surveys about combining clustering with learning can be found in [22,23].
The novel contributions of this paper are: (i) We develop a two step approach, including a forward and a backward model, which are able to analyze data aggregated in ambulance calls in real life environments; (ii) The backward model is based on extraction of latent variables from the trained forward model, followed by appropriate clustering; thus, providing concise representations that can be analyzed in an efficient and transparent way for suggesting patient's conveyance to a hospital, or not; (iii) Each concise representation set is linked to the respective input data being, therefore, able to illustrate the main (similar) cases and respective conditions on which the provided decision was based.
The rest of the paper is organized as follows. Section 2 presents the components and methods used in the proposed approach, which is also summarized in that section. The experimental study is presented in Section 3. Discussion on the obtained results and future work are described in Section 4.

The Ambulance Call Data Set
Ambulance calls by care home residents for transport to the emergency department are common and involve significant costs to both the patients and the health care systems. Arguably, over half of calls may not be necessary according to some studies, which implies that an intelligent call system could significantly reduce these costs. There are two decisions to be made for each ambulance call: firstly, whether the patient should be attended by the ambulance; secondly, whether the patient should be conveyed to hospital.
For the purpose of developing a system that assists the above-mentioned decision making, we use emergency ambulance calls to people in nursing and residential care homes over a one-year period (January-December 2018) using an anonymized data set from East Midlands Ambulance Service in the United Kingdom, comprising Call and Dispatch and Clinical Records data. The data contain call categories, timings, geographical locations, clinical recordings, physiological data, treatments and outcomes of treatment, which are considered as factors informing decisions made. We have established a database for organising and manipulating the data, which is convenient to access when using machine learning tools.
In particular, the database includes, on the one hand, information regarding the subject for which the call was made, including their ethnicity, gender, age, as well as any existing disability or impairment, such as hearing, visual, motor, walking, longstanding illness, intellectual, mental health disorder. It also includes subject's medical, surgical, family, social and complaint historical information, as well as contextual information regarding the incident date and time and the receiving hospital.
On the other hand, it includes information about the subject's clinical measurements, including the impression of the subject's condition that the ambulance team formed on arrival at the destination, possible loss of consciousness, heart rate, blood pressure (systolic and diastolic), SpO2, glucose and respiratory rate.
In addition, the database includes, for each subject, the outcome, that is, whether they were transported or not to hospital. It further differentiates these two categories, by providing more information on the outcome, that is, whether the subject: was treated and transported; refused care; was treated and discharged; deceased and was not transported; did not require any treatment; was referred to primary care.
In the current study we focus on analyzing subjects' clinical information, ignoring personal and contextual information. Moreover, we target the prediction of the binary output problem, that is, subject's transportation, or not, to a hospital. This is a database using an anonymized existing data set from East Midlands Ambulance Service NHS Trust (EMAS). We do not use data taken directly from care homes, primary care or hospitals in our experimental study. We have R&D permissions from EMAS for use of the data under existing protocols for our evaluation study and a data sharing agreement between EMAS and the University of Lincoln, allowing the anonymized data set to be transferred. Data, although anonymized, are stored at the University of Lincoln on secure, suitably encrypted and password protected computer systems.

Data Pre-Processing
In the database, there are clinical and non-clinical data for each patient and the decision outcomes. Data pre-processing has to be carried out before employing machine learning methods. Data cleansing has to be done, as the data is very noisy. For example, there is a large number of data instances containing missing values. In this paper, we have removed instances with missing values, at least when they refer to features which are kept within the current set of experiments.
Moreover, the values of some factors are represented as text. These data have been transformed to numerical values. Data reduction was also performed, as some factors have not been deemed necessary to be included in the current analysis.
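As an illustration of the three pre-processing operations described above (data reduction, removal of instances with missing values, and transformation of text to numerical values), a minimal sketch is given below. The column names, the toy records and the text-to-number code books are hypothetical, since the actual database schema and encodings are not listed here; only the convention that Category 0 denotes conveyed cases is taken from the experimental study.

```python
import pandas as pd

# Toy records with hypothetical column names (the real schema is not given).
raw = pd.DataFrame({
    "consciousness": ["alert", "unresponsive", "alert", None],
    "heart_rate": [82, 110, None, 75],
    "spo2": [97, 88, 95, 99],
    "free_text_notes": ["a", "b", "c", "d"],   # a factor not needed in the analysis
    "conveyed": ["no", "yes", "no", "no"],
})

kept = ["consciousness", "heart_rate", "spo2", "conveyed"]

# Data reduction: keep only the factors used in the current experiments.
data = raw[kept].copy()

# Data cleansing: remove instances with a missing value in any kept feature.
data = data.dropna(subset=kept).reset_index(drop=True)

# Text-to-number transformation using explicit (assumed) code books.
data["consciousness"] = data["consciousness"].map({"unresponsive": 0.0, "alert": 1.0})
data["conveyed"] = data["conveyed"].map({"yes": 0, "no": 1})  # Category 0 = conveyed

print(data)
```

Restricting the missing-value check to the kept columns mirrors the statement that instances are removed only when the missing values refer to features retained in the current set of experiments.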
Following this procedure, we formed a data set containing around 25,500 data instances. About 20,000 of them refer to cases that were not finally transferred to a hospital, and about 5,000 of them to cases that were conveyed. Our approach has been to split the data into training (about 17,000) data, validation (about 3400) data and test (about 5100) data. We perform 10-fold cross validation on the data set, that is, repeat the training and testing procedure, based on different splits to training, validation and testing data.
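The repeated splitting procedure can be sketched as follows. This is only one possible reading of the text: ten independent random splits with the approximate proportions quoted above (two thirds training, two fifteenths validation, one fifth test). The function name, the fractions and the seeds are assumptions introduced for illustration.

```python
import numpy as np

def split_indices(n, seed, train_frac=2 / 3, val_frac=2 / 15):
    """Shuffle n indices and cut them into train / validation / test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    n_train = int(round(train_frac * n))
    n_val = int(round(val_frac * n))
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])

n_instances = 25_500  # approximate size of the cleaned data set
# Ten repetitions, each with a different random split (assumed interpretation).
splits = [split_indices(n_instances, seed) for seed in range(10)]

train_idx, val_idx, test_idx = splits[0]
print(len(train_idx), len(val_idx), len(test_idx))
```

Performance measures would then be averaged over the ten repetitions.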

The Forward Model
In the forward model, a system is designed and delivered to guide the emergency ambulance attendance team to make a decision. The system uses the explanatory variables as input and produces the likelihood of the possible decisions, as shown in Figure 1. To develop such a system, machine learning or deep learning approaches can be employed, applied to the provided data set. The inputs include the plausible factors and the output takes a binary value, indicating whether or not to convey the patient(s) to hospital. Different machine learning techniques can be used in the forward model estimation step, where a system is trained to predict the category to which the input data correspond. In this paper, we consider Random Forests (RF), Logistic Regression (LR), Support Vector Machines (SVM), Artificial Neural Network (ANN) and Convolutional Neural Network (CNN) models.
RF belongs to the category of ensemble learning algorithms [24]. As the base learner of the ensemble, RF uses decision trees, constructing independent trees, with a bootstrap sample of the training data being chosen for each tree. The final prediction is a weighted average of all tree predictions. On the other hand, logistic regression estimates the parameters of a logistic model [25].
SVM arises from a nonlinear generalization of the Generalized Portrait algorithm [26]. SVM uses a kernel function, which projects the original data into a higher dimensional space in which they become linearly separable. The radial basis function is an efficient kernel choice, which has been used in our work.
Multilayer perceptrons (MLP) have been the main Artificial Neural Network (ANN) architecture used for supervised learning and classification tasks in the past [27]; they consist of multiple fully connected layers of neurons, with feedforward spread of information. Their training is performed with the backpropagation algorithm. Convolutional neural networks (CNN) have been the main Deep Neural Network (DNN) architecture in the last decade and a variety of related literature can be found in [16].
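For the three classical techniques (RF, logistic regression and SVM with a radial basis function kernel), a minimal sketch is shown below. It assumes scikit-learn as the implementation library and uses synthetic data with eight inputs in [0, 1]; the hyperparameter values and the synthetic labelling rule are illustrative assumptions, not the tuned settings or the data of this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((400, 8))                    # 8 synthetic explanatory variables
y = (X[:, 0] + X[:, 4] > 1.0).astype(int)   # synthetic binary response

models = {
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(kernel="rbf", gamma="scale"),  # radial basis function kernel
}

scores = {}
for name, model in models.items():
    model.fit(X[:300], y[:300])                      # train on the first 300 rows
    scores[name] = model.score(X[300:], y[300:])     # accuracy on the held-out rows

print(scores)
```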

The Backward Model
The backward model is based on the designed forward model; the explanatory variables which significantly informed the response are inferred from the data, as shown in Figure 2. To achieve this goal, we choose to extract and further analyse the, say, M outputs of the last fully connected layer, or last hidden layer, of the trained system, for example, an artificial neural network (ANN) or a convolutional neural network (CNN). This is because these outputs constitute high level, semantic extracts, based on which the trained network provides its final predictions. Other choices involve features extracted not only from high level, but also from mid and lower level layers.
In the following we present the extraction of concise semantic information from these representations, using unsupervised learning. This information provides the explanatory variables that significantly informed the network decision.

The Proposed Approach
Ambulance calls by care home residents for transportation to an emergency department are common and involve significant costs to both patients and health care systems. Arguably, over half of calls may not be necessary according to some studies, which implies that an intelligent call system could significantly reduce these costs. Our aim is to develop a machine learning system that assists ambulance paramedics in deciding whether there is need of conveyance or not.
Let us assume that an ANN, or DNN is trained over aggregated data set S, to predict the need of conveyance of the resident of a care home to hospital.
Let also T denote the respective test set used to evaluate the performance of the trained network. In our forward model step, we train the network using the data in S, with cardinality, say, $N_s$. After training, for each input k, we collect the M values of the outputs of neurons in the last fully connected layer, generating a vector $v_s(k)$. A similar vector $v_t(k)$ is generated when applying the trained network to each input k of the test set, with cardinality $N_t$. Thus, for all k, we get:

\[ V_s = \{ v_s(k), \; k = 1, \ldots, N_s \} \]

and

\[ V_t = \{ v_t(k), \; k = 1, \ldots, N_t \} \]

In the backward model step, we derive a concise representation of $V_s$, by using a clustering procedure, based on the k-means++ algorithm [28], to generate, say, L clusters $Q = \{q_1, \ldots, q_L\}$ through minimization of the following function:

\[ \hat{Q} = \arg\min_{Q} \sum_{i=1}^{L} \sum_{v \in q_i} \| v - \mu_i \|^2 \]

in which $\mu_i$ denotes the mean of v values belonging to cluster i. For each cluster i, we then compute the corresponding cluster center c(i), thus defining the set of cluster centers C, generating a concise representation and conveyance prediction model.
The procedure of using data set S to generate the set of cluster centers C, through clustering of the latent variables extracted from the trained system (for example, a DNN), is illustrated in Figure 3.
The derived representation will generally consist of a small number of cluster centers, for which we can examine the respective input experimental variables.
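The clustering step can be sketched as follows, assuming scikit-learn's KMeans with k-means++ initialisation. Clustering each category separately, with 10 clusters per category, follows the configuration reported in the experimental study; the function name, the random latent vectors and the seed are hypothetical and used only to make the sketch runnable.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_centres(latent, labels, clusters_per_class=10, seed=0):
    """Cluster the latent vectors of each category and collect the centres."""
    centres, centre_labels = [], []
    for cls in np.unique(labels):
        km = KMeans(n_clusters=clusters_per_class, init="k-means++",
                    n_init=10, random_state=seed)
        km.fit(latent[labels == cls])
        centres.append(km.cluster_centers_)                 # (clusters, M)
        centre_labels.append(np.full(clusters_per_class, cls))
    return np.vstack(centres), np.concatenate(centre_labels)

rng = np.random.default_rng(0)
V_s = rng.random((500, 16))          # stand-in for latent vectors, M = 16
y_s = rng.integers(0, 2, size=500)   # stand-in for the two categories

C, C_labels = build_centres(V_s, y_s)
print(C.shape, C_labels)
```

Each row of C is one cluster centre, and C_labels records the category from which it was derived, so that a centre can later be used to assign a category to a new case.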
Let us now focus on using the set C for predicting the conveyance status in new subject cases, for example, those included in the test data set T. For each input in T, we compute the corresponding latent vector $v_t$. We then calculate the Euclidean distance of this vector from each cluster center in C and classify the test input to the category of the closest cluster center. As a result, we classify each test input to a respective category, thus predicting the subject's conveyance status. It should be mentioned that, using this approach, we can predict a new subject's status in a rather efficient and transparent way. First, only L distances between M-dimensional vectors have to be computed and the minimum of them selected. Then, it can be inferred why the specific conveyance decision was taken, through correlation with the explanatory variables corresponding to the selected cluster center.
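The nearest cluster center rule described above amounts to computing L Euclidean distances per input and taking the minimum. A minimal sketch with toy two-dimensional vectors follows; the function name and the numerical values are hypothetical.

```python
import numpy as np

def predict_nearest_centre(v, centres, centre_labels):
    """Return the predicted category and the index of the closest centre per row of v."""
    # Pairwise Euclidean distances, shape (number of inputs, L).
    d = np.linalg.norm(v[:, None, :] - centres[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    return centre_labels[nearest], nearest

centres = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])   # L = 3 toy centres
centre_labels = np.array([0, 1, 1])                         # category of each centre
v_t = np.array([[0.1, 0.2], [0.9, 0.8], [0.2, 0.9]])        # toy latent vectors

pred, nearest = predict_nearest_centre(v_t, centres, centre_labels)
print(pred, nearest)   # categories [0 1 1], closest centres [0 1 2]
```

The returned centre index is what makes the decision inspectable: it identifies which representative case the new input was matched to.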

The Final Data Set
The experimental study was based on the ambulance call data set which was described in the previous section. In particular, we focused our research on eight explanatory variables, which are listed in Table 1, including subject's consciousness, heart rate, systolic blood pressure, respiratory rate, subject's (destination) condition, oxygen saturation (SpO2), glucose and diastolic blood pressure. Focusing on these variables, we selected a large part of the data set, consisting of about 3300 cases in which the subject was conveyed to hospital (Category 0) and 21,300 cases in which no conveyance was made (Category 1). We split the data so that 2034 and 13,474 cases were selected as training data, respectively, in each category. Accordingly, 700 and 4200 cases were selected as test data, with the rest being used for validation purposes.
As has been mentioned, in the forward model step we used 10-fold cross validation, computing average values for the selected performance measures, which were accuracy and F1 score (per class and total) [16]. The values of all variables were normalized in the interval [0, 1] before performing the modelling steps.
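The normalization to [0, 1] and the two performance measures can be illustrated with the following sketch. Min-max scaling is assumed as the normalization method, since the text states only the target interval; the helper names and the toy values are hypothetical.

```python
import numpy as np

def min_max_scale(X, lo=None, hi=None):
    """Scale each column to [0, 1]; lo/hi can be reused from the training data."""
    lo = X.min(axis=0) if lo is None else lo
    hi = X.max(axis=0) if hi is None else hi
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return (X - lo) / span, lo, hi

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

def f1_per_class(y_true, y_pred, cls):
    tp = np.sum((y_pred == cls) & (y_true == cls))
    fp = np.sum((y_pred == cls) & (y_true != cls))
    fn = np.sum((y_pred != cls) & (y_true == cls))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

X = np.array([[60.0, 90.0], [120.0, 100.0], [90.0, 95.0]])  # toy measurements
X_scaled, lo, hi = min_max_scale(X)

y_true = np.array([0, 0, 1, 1, 1, 1])
y_pred = np.array([0, 1, 1, 1, 1, 0])
print(X_scaled)
print(accuracy(y_true, y_pred), f1_per_class(y_true, y_pred, 0), f1_per_class(y_true, y_pred, 1))
```

In this toy example the accuracy is 4/6, while the F1 scores are 0.5 for category 0 and 0.75 for category 1, which shows how a per-class F1 score exposes weaker performance on the minority category that overall accuracy can hide.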

Developing the Forward Model
To develop the forward model, we trained and compared different classifiers, including Random Forests (RF) and Logistic Regression, Support Vector Machines (SVM), Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN), to determine whether an ambulance needs to be called out, or not, based on each subject's information.
We have also applied CNN-RNN instead of CNN models, without any significant performance improvement. This was expected, because consecutive input data samples do not share any temporal continuity (they refer to different subjects and to different ambulance call cases). This differs from cases (e.g., [15]) where CNN-RNN models take advantage of temporal correlation in the input data to produce improved performance.
Since the provided data set was imbalanced, in all experiments we performed data augmentation by oversampling category 0. Each batch of the learning algorithms consisted of an equal number of samples from both categories (0 and 1).
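One way to form such balanced batches is sketched below: the minority category 0 is sampled with replacement, which oversamples it, while category 1 is sampled without replacement. The function name, the toy class sizes and the batch size are hypothetical, and the exact oversampling scheme used in the experiments may differ.

```python
import numpy as np

def balanced_batches(y, batch_size, n_batches, seed=0):
    """Yield index arrays holding equal numbers of category 0 and category 1 samples."""
    rng = np.random.default_rng(seed)
    idx0 = np.flatnonzero(y == 0)   # minority category (conveyed)
    idx1 = np.flatnonzero(y == 1)   # majority category (not conveyed)
    half = batch_size // 2
    for _ in range(n_batches):
        pick0 = rng.choice(idx0, size=half, replace=True)    # oversampling
        pick1 = rng.choice(idx1, size=half, replace=False)
        batch = np.concatenate([pick0, pick1])
        rng.shuffle(batch)
        yield batch

y = np.array([0] * 20 + [1] * 130)   # toy imbalanced labels
batches = list(balanced_batches(y, batch_size=40, n_batches=3))
print([int((y[b] == 0).sum()) for b in batches])
```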
We used [29] for standard implementation of RF, LR and SVM models, tuning their parameters for best results. The performance of the RF and LR methods (using a threshold of 1,5) was almost the same, so the results reported for RF also hold for LR.
The selected ANN architecture was a fully connected multilayer feedforward network. In the architecture that provided best results, the first layer contained 16 (or 50) neurons and the second layer contained 16 neurons, all with ReLU activation functions, followed by the output layer composed of two neurons corresponding to the two categories. Other network architectures have also been tried, but with no significant performance improvement.
The CNN architecture consisted of multiple convolutional layers, followed by fully connected layers. Best results have been obtained when using two convolutional layers, each containing 64 activation maps, followed by a fully connected layer containing 16 neurons and the output layer. Since our input data are one-dimensional (consisting of 8 variables), we used 1-D convolutions with filter length varying from 2 to 7; a length of 3 was found to be the best.
All implementations were made in Python and TensorFlow. We used the Adam optimiser with mini-batches, using a learning rate of 0.001 and a batch size of 200-500 (400 was found to be the best choice).
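A minimal Keras sketch of a CNN with the stated dimensions (two 1-D convolutional layers with 64 activation maps and filter length 3, a 16-neuron fully connected layer, a two-neuron output layer, Adam with learning rate 0.001 and batch size 400) is given below. Details not stated in the text are assumptions made for illustration: 'same' padding, ReLU in the convolutional layers, the absence of pooling, the softmax output, the loss function and the layer names. The data are random stand-ins.

```python
import numpy as np
import tensorflow as tf

def build_cnn(n_features=8, kernel_size=3):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features, 1)),   # 8 variables as a 1-D signal
        tf.keras.layers.Conv1D(64, kernel_size, padding="same", activation="relu"),
        tf.keras.layers.Conv1D(64, kernel_size, padding="same", activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(16, activation="relu", name="fc16"),
        tf.keras.layers.Dense(2, activation="softmax", name="output"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
        loss="sparse_categorical_crossentropy",   # assumed loss
        metrics=["accuracy"],
    )
    return model

model = build_cnn()
X = np.random.default_rng(0).random((400, 8, 1)).astype("float32")  # stand-in inputs
y = np.random.default_rng(1).integers(0, 2, size=400)               # stand-in labels
model.fit(X, y, batch_size=400, epochs=1, verbose=0)
probs = model.predict(X[:5], verbose=0)   # one probability per category
print(probs.shape)
```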
We have trained the above architectures with the data set described above. The obtained prediction accuracy in all cases is shown in Table 2. It can be seen that the CNN model has a superior performance and has been selected as forward model in our approach.

Developing the Backward Model
We followed the procedure described in the former section and extracted, for each input, the vector whose elements are the 16 neuron outputs of the CNN fully connected layer, to be used as latent information in the backward model. We then applied the clustering procedure to the training data set and used the validation/test data set to select the optimal number of clusters.
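Extraction of the 16 latent values can be sketched in Keras by defining a second model that shares the weights of the forward model but stops at the fully connected layer. The small network below is an untrained stand-in with an assumed layer name ("fc16") and random inputs; in practice the extractor would be built from the trained forward model.

```python
import numpy as np
import tensorflow as tf

# Stand-in forward model (functional API), with an assumed name for the FC layer.
inputs = tf.keras.Input(shape=(8, 1))
x = tf.keras.layers.Conv1D(64, 3, padding="same", activation="relu")(inputs)
x = tf.keras.layers.Flatten()(x)
latent = tf.keras.layers.Dense(16, activation="relu", name="fc16")(x)
outputs = tf.keras.layers.Dense(2, activation="softmax")(latent)
forward_model = tf.keras.Model(inputs, outputs)

# Sub-model that returns the outputs of the 16-neuron layer for each input.
extractor = tf.keras.Model(inputs, forward_model.get_layer("fc16").output)

X = np.random.default_rng(0).random((32, 8, 1)).astype("float32")
V = extractor.predict(X, verbose=0)   # one 16-dimensional latent vector per input
print(V.shape)
```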
The generation of 20 clusters, 10 per category, was the outcome of this part of our approach. We then computed the respective cluster centers, each being represented by a vector in a 16-dimensional space. The backward model is composed of these 20 vectors, together with the respective 8-dimensional explanatory variable input vector. Tables 3 and 4 show the 8 explanatory variables for each cluster center respectively for the conveyance and non-conveyance categories. To illustrate how the training input data were split in these clusters, Tables 5 and 6 show the number of training data in each of the 10 clusters, respectively for each category.

Table 6. Non-conveyance clusters: number of elements.

Cluster Number    Number of Elements
0                 1148
1                 3351
2                 752
3                 563
4                 1405
5                 852
6                 2812
7                 1128
8                 937
9                 526
As a result, we have generated a backward model, which combines: (a) the (data-driven) 16-dimensional DNN representation vector, belonging to one of 20 cluster centres (10 from each category) and (b) the 8-dimensional (semantic) respective input vector of explanatory variables. Whenever a new case appears, the explanatory variables are collected and provided as input to the forward model, implemented through a trained DNN. The latter predicts whether the subject needs to be conveyed.
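One way to attach an 8-dimensional explanatory vector to each cluster centre is to take the training input whose latent vector lies closest to that centre. This pairing rule is an assumption made for illustration (another option would be the mean input of each cluster); the function name and the toy two-dimensional values are hypothetical.

```python
import numpy as np

def representative_inputs(latent, inputs, centres):
    """For each centre, return the input row whose latent vector is closest to it."""
    # Distances between every training latent vector and every centre, shape (N, L).
    d = np.linalg.norm(latent[:, None, :] - centres[None, :, :], axis=2)
    closest_sample = d.argmin(axis=0)            # one training index per centre
    return inputs[closest_sample], closest_sample

latent = np.array([[0.0, 0.0], [1.0, 1.0], [0.9, 1.0]])     # toy latent vectors
inputs = np.array([[0.0, 0.3], [1.0, 0.1], [0.5, 0.2]])     # toy explanatory variables
centres = np.array([[0.1, 0.0], [1.0, 1.0]])                # toy cluster centres

rep_inputs, rep_idx = representative_inputs(latent, inputs, centres)
print(rep_idx)      # training cases [0 1] represent the two centres
print(rep_inputs)
```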
Moreover, the backward model classifies the respective latent variables extracted from the DNN to the nearest cluster center, thus verifying the DNN-based prediction in an efficient and transparent way. We have validated the ability of the generated backward model, consisting of the 20 cluster centers, to provide the same prediction accuracy as the DNN over the test data set. At the same time, the backward model presents the explanatory variables of the closest cluster center and all related semantic information.

Discussion
We have designed convolutional neural networks that were able to outperform other machine learning methods (by approximately 5%), generating a forward model that achieved an accuracy of 80%, ranging from 67% for the conveyance class to 82% for the non-conveyance one. The difference in class prediction accuracy is due to the imbalance in the number of cases between the two classes (about 5 times higher in non-conveyance cases). Moreover, we designed a backward model based on clustering latent variables extracted from the trained CNNs. This model has been able to measure correlation of new input data cases to a representative set of 'cluster centres', thus explaining the way through which the network provides its decision in the new cases.
The CNN-based forward model has been successful, because it applied multiple filters to the data, detecting significant (mid-level and high-level) features' correlations; the proposed clustering of high-level features, in the form of extracted latent variables, provided significant insight analysis and successfully implemented the backward model for our analysis.
The derived backward model has been able to provide the same prediction accuracy as the DNN model from which it emerged in a computationally efficient, one shot classification approach. In addition, it informs the decision makers of the way that similar cases have been treated in the past, by linking the current case to one of the cases corresponding to the representative cluster centers, combining data-driven representations and semantic information related to the explanatory variables.
By examining the values of explanatory variables corresponding to the 20 cluster centres in Tables 3 and 4, we can draw various conclusions. In the conveyance category, the majority of cases refer to subjects with low consciousness (a value of 0) and/or a high value of destination condition (equal to or higher than 0.2). Two clusters with such characteristics include more than 60% of the conveyance cases. Some other cases refer to problems in vital sign measurements, such as the systolic and diastolic BP values. One cluster, with 7 cases, refers to a lack of vital sign values. On the other hand, in the non-conveyance category, we see clusters with high values of consciousness (0.5 or 1.0), and/or low values (such as 0) of the destination condition.
Apart from the above cases, where semantic justification can be inferred, there are many other cases, with non-significant differences among the respective values; it is in such cases, where data-driven knowledge, for example, how former cases have been treated, can be supportive for future decision making.
In future work we will increase the size of the data set, especially with reference to conveyance cases. Instead of using only data collected within one year, we will obtain and use data from the last five years.
We also plan to increase the number of explanatory variables, adding all contextual information, including subject profiling and geographical information. Moreover, we will extend the number of response variables. The need for an ambulance will be split into "treated and transported", "treated and discharged", "patient refused transport", "deceased and transported"; the no-need case will be split into "no treatment required", "referred to primary care", "referred to other", "own transport".
To face the problem of missing variables, we will apply Generative Adversarial Network (GAN) models [16]. We will validate the data obtained through GANs in collaboration with our medical experts, so as to extend the derived forward and backward models with such data as well.
In future work we also plan to formalize the extracted semantic information through a knowledge representation methodology [30], interweaving it with the machine learning framework [31][32][33] and attention models [34], extending former work of ours.
Our findings will inform development of future interventions and/or strategies to reduce unnecessary hospital admissions following emergencies in care homes, for example through better responses, or pathways following acute episodes of care.

Data Availability Statement:
The data presented in this study are available on request from the United Kingdom East Midlands Ambulance Service NHS Trust. Requests should be addressed to Prof. Niro Siriwardena.