Healthcare is conventionally regarded as the provision of preventative or necessary medical procedures to improve a person's well-being. Such procedures are typically offered through a healthcare system made up of hospitals and professionals (general practitioners, nurses, doctors, etc.) working in a multidisciplinary environment with complex decision-making responsibilities.
With the advent of advanced health information technology (HIT) and electronic health records (EHR) in the mid-2000s [1], hospitals started to manage and share patient information electronically rather than through paper records. This has led to a growing usage of handwriting-capable mobile technologies and devices able to sync up with EHR systems, thus allowing doctors to access patient records from remote locations and supporting them in the delivery of care procedures. Consequently, it is not unusual that a doctor visits a patient while interacting with several mobile devices at the same time.
Notwithstanding the benefits of EHR systems and mobile technologies towards improving the delivery of care procedures [2], there are also indications that their use may have a negative impact on patient-centeredness [6]. This often results in higher physical and cognitive efforts for doctors while visiting patients, making them more inclined to make medical mistakes [7] and to lose rapport with their patients [8]. However, as pointed out in [10], multi-tasking and information transfers through EHR systems have become necessary aspects of healthcare environments, which cannot be avoided entirely.
A technological solution that supports doctors in the enactment of care procedures through mobile devices requiring limited physical/cognitive effort, and that ensures the continuity of the information flow through EHR systems, would therefore be desirable. To date, however, most existing solutions focus exclusively on one aspect of the foregoing requirements, or on a partial combination of them [11].
On the one hand, the Human–Computer Interaction (HCI) community has investigated how the use of multimodal interfaces has the potential to reduce the cognitive effort of users who manage complex activities such as clinical ones. For example, in [12], the authors state that "multimodal interface users spontaneously respond to dynamic changes in their own cognitive load by shifting to multimodal communication as load increases with task difficulty and communicative complexity". Furthermore, recent research by Pieh et al. [13] has shown that multimodal approaches to healthcare deliver the most effective results, compared to a single modality on its own.
On the other hand, the Business Process Management (BPM) community has studied how to organize clinical activities into well-structured healthcare processes and automate their execution through dedicated Process-Aware Information Systems (PAISs). PAISs are able to interpret such processes and to deliver to doctors and medical staff (e.g., nurses, general practitioners) relevant information, documents and clinical tasks to be enacted, by invoking (when needed) external tools and applications [14]. Nonetheless, current BPM solutions, which are driven by predefined rule lists, have proven suitable to manage just the lower-level administrative processes, such as appointment making, but have made little progress into the core care procedures [15].
Based on the foregoing, in this paper, we present the main findings of the Italian project TESTMED (TESTMED was a 24-month Italian project, and stands for "meTodi e tEcniche per la geSTione dei processi nella MEdicina D'urgenza", in English: "methods and techniques for process management in emergency healthcare"), whose purpose was to design and develop a clinical PAIS, referred to as the TESTMED system. The project investigated touch and vocal interfaces as a potential solution to reduce the cognitive load of doctors interacting with (clinical) mobile devices during the patient's visit, together with a process-aware approach for the automation of a specific class of care procedures, called clinical guidelines (CGs). CGs are recommendations on how to diagnose and treat specific medical conditions, presented in the form of "best practices". They are based upon the best available research and practice experience [16].
The objective of the project was not to automate clinical decision-making, but to support doctors in the enactment of CGs by delivering them the relevant clinical information (such as the impact of certain medications) to reduce the risk arising from a decision. The system exploits concepts from BPM on how to organize CGs and how to support their execution, in whole or in part. In addition, the system supports vocal and multi-touch interaction with the clinical mobile devices. This allows the doctor to switch between different modes of interaction, selecting the most suitable (and least distracting) one during a patient's visit.
The TESTMED system has been designed through the User-Centered Design (UCD) methodology [20] and evaluated in the emergency room of DEA ("Dipartimento di Emergenza ed Accettazione", i.e., Department of Emergency and Admissions) of Policlinico Umberto I, the main hospital in Rome (Italy). The target was to demonstrate that the adoption of mobile devices providing multimodal user interfaces, coupled with a process-oriented execution of clinical tasks, represents a valuable solution to support doctors in the execution of CGs.
This paper extends our previous works [21] in several directions by including many new elements that were previously neglected. More specifically:
The introduction has been partially rewritten and extended;
A refined background section describing the characteristics of healthcare processes under different perspectives has been provided;
The description of clinical guidelines is more detailed and complete;
The section describing related works has been extended significantly with a new contribution describing the state-of-the-art of vocal interfaces;
An improved user evaluation section discussing the complete flow of experiments to evaluate the effectiveness and the usability of the system has been provided, also measuring the statistical significance of the collected results;
All other sections of the paper have been edited and refined to present the material more thoroughly.
The rest of the paper is organized as follows: Section 2 provides relevant background knowledge about healthcare processes and CGs, and introduces a concrete CG that will be used to explain the approach underlying the TESTMED system. Section 3 describes the general approach used for dealing with the enactment of CGs, while Section 4 presents the architecture of the system, introducing technical details of its software components. Then, Section 5 presents the outcomes of the user evaluation of the system and some performance tests. Finally, Section 6 discusses relevant works and Section 7 concludes the paper by outlining future work.
3. Enactment of Clinical Guidelines with TESTMED
The main challenge tackled by the TESTMED project was to reduce the gap between the fully automated solutions provided by the BPM community and the clear difficulties of applying a traditional process management approach in the healthcare context. To realize this vision, the major outcome of the project was the development of a clinical PAIS, referred to as the TESTMED system, enabling the interpretation and execution of CGs and their presentation to doctors and medical staff through multimodal user interfaces.
The TESTMED system is intended to be used when a patient suffering from a medical condition (amenable to a CG) asks for a visit. Doctors are provided with a tablet PC (supporting touch and vocal interaction) that runs the TESTMED system. Thus, the doctor is enabled to select, instantiate, and carry out specific CGs.
For example, in the case of chest pain, the doctor starts filling in a survey for determining the severity of the patient's medical condition, which is expressed through a chest pain score (cf. also Section 2.3). The survey is presented to the doctor on the graphical user interface (GUI) of the tablet PC where the TESTMED system is installed (see Figure 4a). The interaction can be performed by exploiting the touch features of the tablet or, vocally, through integrated speech synthesis and recognition. The grey microphone icon located at the top-right of the GUI in Figure 4a is shown only when the interaction shifts from touch to vocal: it serves as visual feedback that the vocal interaction is working properly.
The vocal interaction requires that the doctor wears a headset with a microphone linked to the tablet (the use of a headset guarantees both a higher quality of the vocal interaction in a noisy environment such as a hospital ward, and that the privacy of the visited patient is preserved); s/he can listen to the questions related to the survey and reply vocally by choosing one of the speech-synthesized possible answers. Each answer is associated with a specific characteristic and provides an associated rate, as sketched below.
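As an illustration, a survey step could be represented as a question with rated answers. The following minimal Java sketch uses hypothetical names (SurveyAnswer, SurveyStep, rate) that are our own assumptions, not taken from the actual TESTMED code base:

    import java.util.List;

    // Hypothetical data model for a survey step; names and structure are
    // illustrative assumptions, not the actual TESTMED implementation.
    record SurveyAnswer(String text, String characteristic, int rate) {}

    record SurveyStep(String question, List<SurveyAnswer> answers) {}

    class ChestPainSurvey {
        // The chest pain score is assumed here to be the sum of the rates
        // of the answers chosen by the doctor across all survey steps.
        static int score(List<SurveyAnswer> chosen) {
            return chosen.stream().mapToInt(SurveyAnswer::rate).sum();
        }
    }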
Once the survey is completed, the TESTMED system elaborates a dedicated therapy including a sequence of clinical treatments and analyses prescribed to the patient. The therapy is structured in the form of a care pathway. For instance, when the chest pain score is greater than 4, the suggested care pathway is the one shown in Figure 5. For the sake of readability, we have modeled the care pathway in the Business Process Modeling Notation (BPMN is the ISO/IEC 19510:2013 standard to model business processes, cf. https://www.iso.org/standard/62652.html). The reader should notice that BPMN is not the notation employed to concretely represent and encode a CG in the TESTMED system (to this aim, we used the PROforma language [34], as explained in Section 4), but it is used here to show (in a comprehensible way) what care pathways usually look like.
The care pathway in Figure 5 includes, first of all, that the patient is subjected to some general blood analysis, which must be repeated a second time after four hours. Once the analysis results are ready, the doctors assess them to decide whether the patient should be hospitalized or not. Specifically, if the results are not good, the care pathway in Figure 5 provides instructions on performing further tests on the patient (in this case, a hemodynamics consulting and a coronary catheterization) and, based on the results obtained, on activating a further procedure concerning the hospitalization of the patient. On the other hand, if the analysis results are good, after 8–12 h the patient is subjected (again) to some general blood analysis, whose results drive the next clinical steps to be performed on the patient, according to the care pathway in Figure 5. A sketch of this decision logic is given below.
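To make the control flow explicit, the following pseudo-Java sketch condenses the pathway of Figure 5; all types and helper methods are hypothetical stubs that simplify the actual PROforma encoding:

    // Illustrative simplification of the chest-pain care pathway of Figure 5;
    // types and helper methods are invented stubs, not TESTMED code.
    class ChestPainPathway {
        record LabResults(boolean good) {}

        void enact(String patientId) {
            LabResults first = bloodAnalysis(patientId);
            waitHours(4);
            LabResults second = bloodAnalysis(patientId);    // repeated after four hours
            if (!first.good() || !second.good()) {
                hemodynamicsConsulting(patientId);           // further tests
                coronaryCatheterization(patientId);
                hospitalizationProcedure(patientId);         // activated based on the test results
            } else {
                waitHours(8);                                // 8-12 h in the guideline
                LabResults third = bloodAnalysis(patientId); // drives the next clinical steps
                // ... continue according to Figure 5
            }
        }

        // Stubs standing in for tasks dispatched to the medical staff.
        LabResults bloodAnalysis(String p) { return new LabResults(true); }
        void waitHours(int h) {}
        void hemodynamicsConsulting(String p) {}
        void coronaryCatheterization(String p) {}
        void hospitalizationProcedure(String p) {}
    }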
The enactment of the various clinical tasks takes place at different moments of the therapy. Furthermore, collaboration between doctors and medical staff is crucial to enact the proper medical treatments for each patient. The members of the medical staff (i.e., nurses and general practitioners) are equipped with Android-based mobile devices and are notified of the progress of care pathways and of the clinical tasks that have to be enacted for supporting doctors (e.g., to perform a blood analysis on the patient). Figure 6 shows two screenshots of the GUI provided to the medical staff, which only allows for tactile interaction.
The TESTMED system provides the ability to properly orchestrate the clinical tasks, assigning them to (available) doctors or members of the medical staff, and to keep track of the status of the care pathway, by recording the results of the analyses and the doctors' decisions. Reminders and notifications alert doctors and the medical staff when new data (e.g., the results of some analysis ready to be examined—see Figure 4b) are available for some patient. In this case, the doctor can decide to visualize further details about the analysis results and the execution status of the care pathway, or simply accept the notification. It is worth noticing that the doctor can abort the enactment of the care pathway at any moment.
4. The Architecture of the TESTMED System
The TESTMED system is based on three main architectural components: a graphical user interface, a back-end engine, and a task handling server. Figure 7 shows an overall view of the system architecture.
The system is implemented employing a multimodal user interface. On the one hand, doctors interact with a GUI (see Figure 4) that is specifically designed to be executed on large mobile devices (e.g., tablets), and allows for tactile or vocal interaction. In particular, vocal interaction enables doctors to work flexibly in clinical scenarios where their visual and haptic attention (i.e., eyes and hands) is mainly busy with the patient's visit. On the other hand, the GUI provided to members of the medical staff is designed to be visualized on small mobile devices (e.g., smartphones) and provides only tactile interaction (see Figure 6).
The back-end engine with its services provides the ability to interrupt, activate, execute and monitor CGs, and to route relevant data between doctors and the medical staff at run-time. In TESTMED, a combination of languages is used to define each CG. First, the PROforma language [34] is utilized to model each CG as a set of clinical activities, data items, and the control flow between them. Then, starting from the resulting PROforma model, a configuration file is semi-automatically built in XML to define all the settings that enable the multimodal interaction functionality and the integration of the different system components. As a result, a CG is finally represented as a guideline bean, which is deployed into the system and ready for execution.
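For illustration, such a configuration file might look like the following sketch; element and attribute names are hypothetical assumptions of ours, not the actual TESTMED schema:

    <!-- Hypothetical configuration accompanying the PROforma model of a CG;
         the real schema used by TESTMED may differ substantially. -->
    <guideline id="chest-pain" proforma-model="chestpain.pf">
      <interaction>
        <modality type="touch" enabled="true"/>
        <modality type="vocal" enabled="true" wake-command="customizable"/>
      </interaction>
      <integration>
        <queue name="gui.notifications" protocol="JMS"/>
        <endpoint name="emr" protocol="HL7"/>
      </integration>
    </guideline>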
The execution of CGs relies on a precise routing of data, events and clinical activities, which follows a well-defined process-aware and content-based approach in which activities are scheduled and messages are dispatched in an event- and data-driven way. The back-end engine manages and controls the routing of all clinical activities, related data, and produced events between the corresponding parties, including actors, services, and applications. Therefore, it guarantees a successful interaction among all participating units and services. Moreover, any software module that communicates with the engine for completing a defined activity can be viewed as an external service to be invoked when needed.
Basically, services are considered wrappers over pre-existing legacy systems, such as the Electronic Medical Record (EMR) systems employed in hospitals.
The routing engine relies on a primary scheduling unit, which handles the accomplishment of activities subject to temporal constraints (e.g., examinations and laboratory tests that must be conducted in a timely manner), and communicates with the EMR system in order to (i) search and query medical and administrative patient information, (ii) organize and plan examinations, laboratory tests, medicine receipts, etc., according to the related medical procedure, and (iii) get notified about events and laboratory test results, so that they can be forwarded to the assigned doctors. This interoperability with the EMR system is realized by employing the Health Level 7 (HL7) standard protocol (HL7 is a set of international standards for the transfer of clinical and administrative data between hospital information systems, http://www.hl7.org/). The analysis, processing and creation of HL7 messaging packets is organized and controlled by a specific HL7 processing unit. A schematic example of such a message is shown below.
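As an example of the kind of message exchanged with the EMR, the following is a schematic HL7 v2 observation result (ORU^R01); all identifiers and values are invented for illustration and do not come from the TESTMED deployment:

    MSH|^~\&|LAB|POLICLINICO|TESTMED|DEA|20140321093000||ORU^R01|MSG0001|P|2.5
    PID|1||PAT12345||ROSSI^MARIO||19600101|M
    OBR|1||LAB998877|TROPI^Troponin I
    OBX|1|NM|TROPI^Troponin I||0.02|ng/mL|0.00-0.04|N|||F

The MSH segment identifies sender, receiver and message type, while the OBX segment carries the actual laboratory result to be forwarded to the assigned doctor.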
It is worth noticing that all the activities conducted while performing a CG should be stored and registered, in order to keep track of, and retrieve accordingly, all the events, activities and data that relate to the medical case and its decision-making scenarios. This registered information might be potentially utilized for: (i) creating medical and analytical reports; (ii) documenting the suggested and followed care plan assigned to each patient, so that it can be used as a legal reference in the future; (iii) providing a database platform that maintains all the medical records and chosen treatment scenarios for all patients, which can be exploited to introduce a better and improved version of the documented CG after running further analysis on all these collected data; and (iv) providing valuable support for forensic analysis [35].
From a technical perspective, the multimodal interaction feature is developed with the utilization of different technologies, including Text-To-Speech (TTS) engines, the Microsoft Automatic Speech Recognition (ASR) engine and the Multi-touch for Java framework (MT4j, http://www.mt4j.org/). As for the back-end, it is implemented using the Tallis engine (http://archive.cossac.org/tallis/Tallis_Engine.htm), which is able to handle and manage the compatibility with the legacy systems installed in the hospital Policlinico Umberto I. This compatibility is facilitated and deployed using HL7 messages over Mirth (http://www.mirthcorp.com/products/mirth-connect). Each of the above-mentioned software parts is J2EE-based and hosted on a TomEE (http://tomee.apache.org/apache-tomee.html) application server. In particular, the communication between the back-end and the GUI of the assigned doctor is performed by a JMS-based notification engine, RabbitMQ (http://www.rabbitmq.com/). Finally, Apache Camel (https://camel.apache.org/) is used as rule and mediation engine in order to orchestrate all the above technologies and components.
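To give a flavor of how Camel can glue these components together, the following is a minimal route sketch in Camel's Java DSL; the queue names and the header key are hypothetical, not the actual TESTMED configuration:

    import org.apache.camel.builder.RouteBuilder;

    // Hypothetical mediation route: lab results relayed by Mirth are forwarded
    // to the notification queue consumed by the assigned doctor's GUI.
    public class GuidelineRoutes extends RouteBuilder {
        @Override
        public void configure() {
            from("jms:queue:emr.hl7.results")                       // inbound HL7-derived messages
                .choice()
                    .when(header("hl7.messageType").isEqualTo("ORU^R01"))
                        .to("jms:queue:gui.notifications")          // notify the doctor's GUI
                    .otherwise()
                        .to("jms:queue:unhandled");                 // park unknown message types
        }
    }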
Finally, a task handling server is in charge of communicating with both the back-end and the existing legacy systems via HL7 and RESTful messages. This server has the important role of informing the medical staff when a clinical activity needs to be performed in the context of a specific CG. On their side, medical staff members, as previously discussed, are provided with a dedicated front-end Android application that employs RESTful services to interact with the task handling server.
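As an illustration, this interaction could be exposed through a JAX-RS resource such as the following; the paths, types and behavior are our own assumptions, not the documented TESTMED API:

    import javax.ws.rs.*;
    import javax.ws.rs.core.MediaType;
    import java.util.List;

    // Hypothetical REST resource of the task handling server; names and paths
    // are illustrative assumptions, not the actual TESTMED interface.
    @Path("/tasks")
    public class TaskResource {

        public record ClinicalTask(String id, String description) {}

        @GET
        @Path("/{staffId}")
        @Produces(MediaType.APPLICATION_JSON)
        public List<ClinicalTask> pendingTasks(@PathParam("staffId") String staffId) {
            // e.g., "perform a blood analysis on patient PAT12345"
            return List.of(new ClinicalTask("t-01", "blood analysis"));
        }

        @POST
        @Path("/{taskId}/complete")
        public void complete(@PathParam("taskId") String taskId) {
            // would update the status of the care pathway in the back-end engine
        }
    }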
All the logic of a CG is therefore coded via the PROforma model and various XML-based configuration files. Those files describe the data to be provided by the user interfaces, the queues and messages (JMS and HL7) to be exchanged, the routing and the scheduling of the different interactions, etc. (in particular, all those files instruct the routing and mediation rules enacted by the Apache Camel framework). All these files are bundled in a guideline bean—a zipped archive that is then disassembled by the system and used to instruct the different components. For each new guideline to be deployed and enacted by the system, a new PROforma model and the related XML-based files must be produced by the system engineer, on the basis of the BPMN process describing the care pathway (such as the one in Figure 5 for chest pain). Therefore, analogously to many middleware technologies and process-aware tools, the TESTMED system is general-purpose (no new code needs to be written when deploying a new CG) but requires technical configuration by system engineers, who, on the basis of the requirements of the specific care pathway to be implemented, design and deploy the guideline bean.
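A guideline bean could thus be pictured as follows; the file names are invented for illustration, and the actual archive layout may differ:

    chest-pain-bean.zip
     ├── chestpain.pf        PROforma model (activities, data items, control flow)
     ├── interaction.xml     multimodal GUI settings (touch/vocal)
     ├── routing.xml         Camel routing and mediation rules
     └── messaging.xml       JMS queues and HL7 endpoints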
5. User Evaluation
The TESTMED system is intended to be used in hospital wards for supporting doctors in the execution of CGs. In this context, medical staff and doctors must work in collaboration and coordination to perform the appropriate clinical activities on the patients. Hence, providing a satisfactory mobile interaction is crucial, as it allows for:
supporting the mobility of doctors for visiting the patients;
facilitating the information flow continuity by supporting instant and mobile access;
speeding up doctors’ work while executing CGs and performing clinical decision-making.
The latter point is also confirmed by a survey carried out by the PricewaterhouseCoopers' Health Research Institute (HRI) [36], which reported that 56% of doctors—over a large sample—were able to improve and speed up their decision-making thanks to the use of mobile technologies.
Although the utilization of mobile devices and applications may significantly empower the ability of doctors and medical staff to collaborate and coordinate themselves, there are still some key challenges to be addressed. Among them, one of the most relevant consists of realizing a GUI that is able to represent in a compact yet understandable way the description underlying a clinical activity and, at the same time, does not distract the doctors from visiting the patients [37].

To achieve this objective, we realized the TESTMED system leveraging the user-centered design (UCD) methodology [20], which places the end users at the center of any design and development activity. To this end, we initially developed two mockups of the system (during months 4 and 9 of the project, respectively). Various usability studies (including thinking-aloud techniques, focus groups, etc.) were conducted on each mockup with real doctors, and the results of each user study were used for incrementally improving the design of the GUI of the system. One of the main effects of applying the UCD methodology was the introduction of the vocal interface in the second mockup, alongside the basic touch interface that was the only one present in the first mockup. This was due to the fact that the users' feedback on the first mockup indicated the need for doctors to have their hands free while visiting a patient; this is why we introduced the possibility to (also) interact vocally with the GUI.
On the basis of the outcomes of the above usability studies, we have iteratively produced two working prototypes of the system in months 12 and 18 of the project, respectively. We assessed them employing well-established evaluation methods involving the target users (i.e., real doctors). Results and findings of the user evaluation performed over the working prototypes are discussed in the next sections.
5.1. Evaluation Setting and Results of the First User Study
The two developed working prototypes have been tested with patients suffering from chest pain (cf. Section 2.3), and therefore the related CG has been modeled, configured and deployed on the system to perform the testing.
The initial user study was performed in the Policlinico Umberto I hospital in Rome with the help of the Department of Emergency and Admissions (DEA). In Figure 8, a doctor is shown using the TESTMED system to enact the CG on a patient simulator. Five postgraduate medical students and two doctors participated in the study. Given a patient simulator that was supposed to reflect a real patient with chest pain symptoms, the participants were asked to use the TESTMED system to visit the patient according to the related CG (see Figure 8).

After the completion of the user testing, a questionnaire was provided to the participants with the aim of gathering their background information and collecting data about how they perceived the interaction with the system. Specifically, the questionnaire consisted of the following 11 statements, covering aspects like ease of use of the GUI, quality of the multimodal interaction, etc. The answers were given on a 5-point Likert scale, ranging from 1—strongly disagree to 5—strongly agree, to reflect how participants agreed/disagreed with the defined statements:
Q1. I have a good experience in the use of mobile devices.
Q2. The interaction with the system does not require any special learning ability.
Q3. I judge the interaction with the touch interface very satisfying.
Q4. I judge the interaction with the vocal interface very satisfying.
Q5. I think that the ability of interacting with the system through the touch interface or through the vocal interface is very useful.
Q6. The system can be used by users who are not expert in the use of mobile devices.
Q7. The system allows for constantly monitoring the status of clinical activities.
Q8. The system correctly drives the clinicians in the performance of clinical activities.
Q9. The doctor may—at any time—access data and information relevant to a specific clinical activity.
Q10. The system is robust with respect to errors.
Q11. I think that the use of the system could facilitate the work of a doctor in the execution of his/her activities.
Table 1 summarizes the results of the first user study. From such results, we can infer that the general attitude of the participants towards our system was positive. The results highlight that participants considered the system effective in the enactment of CGs, since it was able to concretely support doctors in executing the clinical activities included in the CG (cf. results of Q8). Furthermore, the system allowed doctors to constantly monitor the status of each clinical activity (cf. results of Q7) and to easily access information relevant to the specific activity under execution (cf. results of Q9). Participants also showed a fair amount of satisfaction with how the system behaves with respect to error handling (cf. results of Q10), learnability (cf. results of Q2), and ease of use for non-expert users (cf. results of Q6).
On the negative side, the interaction with the vocal interface was considered quite unsatisfactory (cf. results of Q4), while high satisfaction was reported for the touch interface (cf. results of Q3). Nonetheless, the participants agreed that a multimodal interaction involving both the touch and the vocal interface could be useful to facilitate the work of doctors (cf. results of Q5). It is worth noticing that the questionnaire also allowed participants to add feedback and comments in free text. Using this feature, five out of seven participants explicitly asked us to develop an improved vocal interaction for the system, enabling doctors to dynamically switch the interaction modality (from vocal to touch, or vice versa) when needed.
The responsiveness of the GUI is another aspect that was investigated with the first working prototype. In this direction, further tests were performed to measure the time required by doctors to perform a single step of the survey associated with the CG deployed into the system (see also Section 2.3). Specifically, we assumed that a doctor completed a single step in the survey when s/he passed from one scene of the GUI to the next one by answering the corresponding question of the survey (in TESTMED, as discussed in Section 4, MT4j is exploited to build the GUI frames of the system, referred to as scenes, which handle (multi)touch input events). We monitored the time required by doctors to complete each scene associated with the CG's survey, until its completion.
We ran this test twice: first using the touch interface, and then using the vocal interface. A summary of the collected results is shown in Figure 9, where each scene transition is represented on the x-axis, and the corresponding time required for generating the new scene and displaying it on the screen is on the y-axis. In our case study, which was focused on the chest pain CG, three scene transitions were needed before generating the final chest pain score.
The tests were performed using an ACER Iconia Tab W500 (ACER Inc., Xizhi, Taiwan) with a 1 GHz AMD CPU and 2 GB of RAM, running Windows 7 OS (Microsoft, Redmond, WA, USA). With the exclusive use of the touch interface, the analysis shows an average time of 400 ms for completing a scene transition, compared to the 600–700 ms required when just the vocal interface is used. The delay introduced by the vocal interface is due to the (extra) time needed by the system to contact the ASR engine (usually around 200–250 ms). Nonetheless, from the user's point of view, this delay has a low impact on the overall responsiveness of the system, since response times that do not exceed 700 ms for performing a scene transition are usually considered acceptable by users [38].
5.2. Evaluation Setting and Results of the Second User Study
The results and findings of the first user study were leveraged to refine the weak aspects of the first prototype, in order to realize a (more) robust second working prototype of the system. Compared with the first prototype, the second one provided a more elaborate design of the GUI, together with a redefinition of the interaction principles underlying the vocal interface. For example, in the first prototype, the vocal features of the system were always active during the enactment of a CG. This resulted in many "false positives", i.e., the system wrongly recognized as proper vocal commands (consequently activating unwanted functionalities) some words pronounced by the doctor during the patient's visit. In the second prototype, to prevent false triggers, we decided to activate the vocal interface only after a specific (and customizable) key vocal instruction pronounced by the doctor, as sketched below.
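The following minimal Java sketch illustrates such a keyword-gated activation; the class, the wake phrase and the re-arming policy are illustrative assumptions of ours, not the actual TESTMED implementation:

    // Hypothetical keyword gate in front of the ASR output; illustrative only.
    public class VocalGate {
        private static final String WAKE_PHRASE = "testmed"; // customizable key instruction
        private boolean armed = false;

        /** Invoked for every phrase returned by the ASR engine. */
        public void onRecognized(String phrase) {
            if (!armed) {
                // Ignore everything until the doctor pronounces the wake phrase,
                // preventing the "false positives" observed with the first prototype.
                armed = phrase.equalsIgnoreCase(WAKE_PHRASE);
                return;
            }
            dispatchVocalCommand(phrase); // forward to the GUI command handler
            armed = false;                // require the wake phrase again
        }

        private void dispatchVocalCommand(String phrase) { /* ... */ }
    }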
Leveraging the second working prototype, we performed a second user study employing the same chest pain CG used in the first user study. The second user study also took place at the DEA of Policlinico Umberto I in Rome. Seven users (different from those involved in the first user study) participated, including six postgraduate medical students and one doctor. As in the first user study, participants attended the patient simulator and were asked to use the second prototype of the system to enact the CG (see Figure 8). They also completed the same questionnaire employed in the first user study to assess the effectiveness of the system. The results of this second user study are shown in Table 2.
The analysis of the results makes it evident that the positive impressions obtained in the first user study were confirmed by this second user study. Referring to Figure 10, we can note that participants' ratings in the second user study increased for all statements compared with the first user study. Moreover, the results highlight that the design of the second prototype made considerable progress, in particular because its development precisely followed the traditional design guidelines for building multimodal GUIs [10].
One critical aspect was the interaction with the vocal interface (cf. statement Q4), which was rated as quite unsatisfactory in the first user study, with an average rating of 2.7. Conversely, the improved vocal interface employed in the second working prototype was really appreciated by the participants in the study, with an average rating of 4. To confirm that the improvement of the vocal interface was not the result of a coincidence, we analyzed the ratings for statement Q4 collected in the first and second user studies leveraging the 2-sample t-test. This statistical test is applied to assess whether the difference between two population means is statistically significant or is due instead to random chance. The results of the 2-sample t-test are summarized in Figure 11. Statistical significance of the results is determined by looking at the p-value, which gives the probability of obtaining the collected results by chance. If the p-value is 0.05 or less, it is possible to conclude that the collected data are not due to a chance occurrence, which is the case for our data (the p-value is 0.0265099). This allows us to conclude that the improvement of the vocal interface was not the result of a coincidence.
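For reproducibility, such a test can be computed, for instance, with the Apache Commons Math library; note that the per-participant ratings below are invented placeholders consistent with the reported averages of 2.7 and 4, not the actual collected data:

    import org.apache.commons.math3.stat.inference.TTest;

    public class Q4Significance {
        public static void main(String[] args) {
            // Hypothetical per-participant Q4 ratings for the two studies;
            // only the averages (2.7 and 4) are reported in the paper.
            double[] firstStudy  = {3, 2, 3, 3, 2, 3, 3};
            double[] secondStudy = {4, 4, 5, 4, 3, 4, 4};
            // Two-sample, two-tailed t-test: returns the p-value.
            double p = new TTest().tTest(firstStudy, secondStudy);
            System.out.printf("p-value = %.4f (significant at 0.05: %b)%n", p, p < 0.05);
        }
    }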
Finally, we also administered a traditional System Usability Scale (SUS) questionnaire to precisely measure the usability of the second working prototype. SUS is one of the most widely used methodologies for post-test data collection (43% of post-tests are of the SUS type [39]). It consists of a 10-item questionnaire; each item is evaluated on a 5-point Likert scale that ranges from 1—strongly disagree to 5—strongly agree. Once completed, an overall score is assigned to the questionnaire. Such a score can be compared with several benchmarks presented in the research literature in order to determine the level of usability of the GUI being evaluated. In our test, we made use of the benchmark presented in [39] and shown in Figure 12. From the analysis of the SUS questionnaires completed by the seven participants of the second user study, the final average ratings were 77.5 and 78.4 for the GUIs used by the doctors and the medical staff, respectively, which correspond to a rank of 'B+' in the benchmark presented in [39]. This means that the GUI of TESTMED has, on average, a good usability rate, even if there is still room for improvement.
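As a reference for how such scores are derived, the standard SUS scoring scheme (generic, not TESTMED-specific code) can be sketched as follows:

    // Standard SUS scoring: odd-numbered items contribute (rating - 1),
    // even-numbered items contribute (5 - rating); the sum is scaled by 2.5,
    // yielding a score in the 0-100 range (e.g., the 77.5 reported above).
    public class SusScore {
        public static double score(int[] ratings) { // ratings[0..9], each in 1..5
            int sum = 0;
            for (int i = 0; i < 10; i++) {
                sum += (i % 2 == 0) ? ratings[i] - 1  // items 1, 3, 5, 7, 9
                                    : 5 - ratings[i]; // items 2, 4, 6, 8, 10
            }
            return sum * 2.5;
        }
    }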