Use of an Electronic Medication Management Support System in Patients with Polypharmacy in General Practice: A Quantitative Process Evaluation of the AdAM Trial

Polypharmacy is associated with a risk of negative health outcomes. Potentially inappropriate medications, interactions resulting from contradicting medical guidelines, and inappropriate monitoring all increase the risk. This process evaluation (PE) of the AdAM study investigates implementation and use of a computerized decision-support system (CDSS). The CDSS analyzes medication appropriateness by including claims data, and hence provides general practitioners (GPs) with full access to patients’ medical treatments. We based our PE on pseudonymized logbook entries into the CDSS and used the four dimensions of the Medical Research Council PE framework: Reach, which examines the extent to which the intended study population was included, and Dose, Fidelity, and Tailoring, which examine how the software was actually used by GPs. The PE was explorative and descriptive. Study participants were representative of the target population, except for patients receiving a high level of nursing care, as they were treated less frequently. GPs identified and corrected inappropriate prescriptions flagged by the CDSS. The frequency and intensity of interventions documented in the form of logbook entries lagged behind expectations, raising questions about implementation barriers to the intervention and the limitations of the PE. The impossibility of connecting the CDSS to GPs’ electronic medical records (EMR) due to technical conditions in the German healthcare system may have hindered the implementation of the intervention. Data logged in the CDSS may underestimate medication changes in patients, as documentation was voluntary and changes were already recorded in the EMR.


Introduction
Life expectancy around the world has risen as a result of improvements in the diagnosis and treatment of chronic and acute diseases, and better living conditions and hygiene [1]. Longer lives increase the likelihood of developing chronic diseases; if more than one disease occurs in one person at the same time, the condition is known as multimorbidity [2]. As increasingly specialized clinical experts use increasingly complex pharmacotherapies to treat individual diseases, while insufficiently taking into account a patient's multimorbidity, polypharmacy (usually defined as the concurrent intake of at least five different chronic medications [3]) is becoming ever more common [4,5]. There is growing awareness that polypharmacy should itself be treated as a risk factor, since the parallel treatment of different diseases with pharmacotherapy can have contradicting or reinforcing effects that are potentially life-threatening [6]. Polypharmacy is, for example, associated with higher rates of hospitalization [7][8][9] and death [9], as well as decreased quality of life and higher symptom burden [10].
The use of computerized decision-support systems (CDSS) to prevent or manage problematic polypharmacy has been evaluated in previous studies, and been shown to improve prescribing quality and reduce the prescription of potentially inappropriate medication [11,12]. However, the results have not always been significant [13], and their impact has frequently only been considered independently of patient-relevant outcomes. When they have been linked to patient-relevant outcomes, results have been inconsistent and have lacked robustness [14,15]. Since such interventions are generally complex, knowledge is lacking on what parameters lead to what outcomes, and on whether interventions have actually been implemented as intended. To gain a better understanding of the processes underlying complex interventions, it is recommended that they are accompanied by a preplanned process evaluation [16][17][18]. A comprehensive process evaluation is also very helpful when complex interventions do not show the anticipated effects. In these cases, the aim of process evaluations is to find the reason(s) for the observed lack of effectiveness. However, few trials follow this advice [19,20].
To circumvent these problems in the AdAM study, a process evaluation based on data logged by a CDSS named "eMMa" (a detailed explanation of the functioning can be found in the Methods section) was carried out with the aim of gaining an insight into how and why the AdAM intervention works the way it does. The study protocol was published a priori elsewhere [21]. This paper presents the results of the process evaluation of the AdAM intervention, whereby the recommendations of the Medical Research Council (MRC) framework for process evaluations of complex interventions [18] helped us decide on which results to present and on how to structure our report. They also enabled us to provide a multidimensional view of the actual interventions. We ultimately settled on the following research questions: (1) How many GPs and patients took part in the intervention, and what were their typical characteristics (i.e., the Intervention "Reach")? (2) What proportion of medication alerts were handled by GPs, and which were considered high priority (Intervention "Dose", not to be confused with the dosage of a specific drug)? (3) What proportion of participants received the intervention as intended, thus increasing the likelihood of success (Intervention "Fidelity")? (4) How did GPs integrate the intervention into their daily routines (Intervention "Tailoring")?

Intervention Reach
The "Reach" intervention dimension refers to the participants that were actually reached by the intervention. Overall, 42,719 patients were considered potentially eligible in the main intention-to-treat analysis, of whom 9268 patients enrolled and 9261 (22%) showed up in the software (here called the "active population": AP). The remaining 33,451 patients (78%) did not enroll in the study and made up the non-enrolled potentials (NEP), along with their respective GPs (n = 351, 34%) and practices (n = 248, 37%), if they agreed to participate in the trial (see Figure 1). Table 1 shows a comparison between the AP and NEP groups for patients. As only 7 of the enrolled patients did not appear in the software ("inactive population": IP), statistical analysis was not feasible for them. Of 925 eligible GPs from 676 practices, 574 GPs (62%) from 428 practices (63%) agreed to participate in the AdAM study. Of these, 465 GPs (50%) from 347 practices (51%) were identified as having actively used the software program and thus comprised the AP. The remaining 109 GPs (12%) and 81 practices (12%) belonged to the IP. Since this analysis focuses on the data gathered from the CDSS, no comparison with potentially eligible GPs and practices in the study region could be conducted. Tables 2 and 3 compare the AP to the IP and NEP for GPs and practices.
Compared to the NEP patient population, the AP patient group contains more men and is slightly older. The nursing care level is defined in German social security laws and specifies the need for nursing services and the welfare payment a patient is entitled to. A higher nursing care level indicates a greater need for nursing services and a higher welfare payment. With an increasing nursing care level, it was less likely that patients would receive the intervention.
On average, GPs that actively used the software were younger than those that had only inactive or non-enrolled patients. Group practices and practices that were randomized to the intervention group from the beginning were also more frequent users than practices that switched to the intervention group in later waves. No other characteristics had a significant impact on CDSS usage.

Intervention Dose
The "Dose" intervention dimension provides an insight into the extent to which the intervention was adopted and implemented, i.e., it refers to the "dose" of the intervention the participants actually received. Figures 2 and 3 show the distribution of the number of alerts per patient and GP, as tracked by eMMa before and after the intervention. Overall, the numbers remained constant, indicating no improvement in prescribed medications. An analysis of cases in which a complete anamnesis had been performed and confirmed by GPs showed a modest reduction in the median number of alerts per patient.
After adjusting the number of alerts per GP to take account of the number of treated patients (Figure 4), the alert count remained virtually constant (mostly a maximum of +/− 1 alert per patient), but there was an inverse correlation between the overall number of patients treated by a GP and the change in the number of alerts. Stratifying by alert category gives an insight into the kinds of potential inappropriateness that were assessed with a higher priority (Table 4). Since the software did not generate alerts for Dear Doctor Letters, no analysis was conducted in this category. The number of alerts warning of an inappropriate dosage or the unsuitability of a medication in view of a patient's kidney function declined most. GPs did not appear to pay much attention to alerts relating to potential allergies or duplicate prescriptions, and these actually increased.
Table 4. Overall number of alerts per category at T 0 and change over the course of the study. Columns: Analysis; Alert Category; Number of Alerts at T 0 (Proportion of Total Alerts); Number of Alerts of Severity 1 at T 0 (Proportion of Total Alerts); Change at T 1 (%).
The reduction was greatest when only severe alerts and patients that had received a complete anamnesis were taken into consideration. GPs reacted much more frequently to potential drug-drug interactions when the CDSS rated them as "severity level 1".
Poisson regression analysis resulted in the incidence rate ratios shown in Table 5. Sensitivity analysis for justified alerts is depicted in Table 6. Significant differences in incidence rate ratios are shown in bold (*** p < 0.001, ** p < 0.01, * p < 0.05).
Overall, there were only a few significant reductions in alerts, and these were solely in the dosage category (Table 5). However, when alerts flagged as "justified" were left out of the analysis, the picture changed, and a significant reduction in dosage alerts could be detected in all subgroups (Table 6). The same is true for almost all kidney alert subgroups. Point estimates for severity level 1 alerts are equal to or lower than those for alerts overall, but the significance of the reduction was limited by low case numbers.

Intervention Fidelity
The "Fidelity" intervention dimension evaluates how frequently the intervention was implemented in such a way that it was actually possible to achieve the aim of the intervention. Table 7 shows how many patients had severe alerts at T 0 and how many of these cases were satisfactorily dealt with according to the criteria defined in the Methods section (confirmed completed anamnesis and zero unjustified alerts of severity level 1 at T 1). The alerts were stratified by category. On a GP level, no participant fulfilled the Fidelity criteria in any given category for all their patients. As in the case of the Dose dimension findings, duplicate prescriptions and allergies received little attention in comparison to the other categories.
As far as the more highly prioritized categories are concerned, the GPs acknowledged and dealt with all the severe alerts in fewer than 30% of patients, indicating that the intervention goal was only moderately fulfilled.
Summarizing over all categories, severe alerts were only fully resolved or justified in 889 patients. Figure 5 compares this number to the number of potentially eligible patients, enrolled patients, patients receiving the intervention from their GPs, as well as the number of patients that were treated with an intensity that would have made complete fidelity possible. Figure 5 indicates the steps that had to be accomplished to fulfil Fidelity criteria and shows that many patients were lost on the way.

Intervention Tailoring
The "Tailoring" intervention dimension describes how participating GPs handled the intervention, and gives an indication of how the intervention could be adapted to fit better into daily routines. Figure 6 shows on which days of the week T 0 was triggered for patients, i.e., when the first GP assessment appeared in the software. In Figure 7, the same distribution is shown for the months of the year and in Figure 8, for the whole intervention period.
These analyses show when GPs preferred or had the chance to conduct medication reviews. The vast majority of cases (89%) were initiated between Monday and Friday, with peaks on Tuesdays and Thursdays. Only a small portion (11%) were initiated on Saturdays, Sundays, and public holidays, when practices are usually closed. In terms of months, the CDSS was used more often in the second half of the year, and especially in September and December. This pattern remains similar when the whole intervention period is examined, except for 2020, when the rise in patient cases was dampened by the COVID-19 pandemic.
Since the intervention software was in a test phase and on several occasions had to be updated, we initially planned to conduct sensitivity analyses to account for major software releases and lengthy software inaccessibility due to technical problems. However, as no error logbook was available, these analyses could not be conducted.

Main Findings
Our process evaluation showed no relevant selection bias at the patient, GP, or practice level in the participants included in the AdAM trial, compared to the eligible population in the study region (Intervention Reach). The reduction in the number of alerts was minimal (Intervention Dose) and all severe alerts were dealt with in only a modest number of participating patients, which was the final measure of a successful intervention (Intervention Fidelity). An analysis showed that the CDSS was most frequently used on days with long practice hours (Intervention Tailoring).

Our Findings in the Context of Existing Research
The inclusion criteria for our study were broad and the patient group correspondingly heterogeneous. Unlike other similar studies, the group was not preselected according to medication, disease, or age group [23,24]. As a result, the focus of GP training was on dealing with polypharmacy in general, and did not attempt to provide in-depth instruction on how to optimize specific cases, as is the case in polypharmacy trials with narrower inclusion criteria [25,26]. Moreover, some of the included patients did not profit from the intervention because there were very few or sometimes even no alerts that could be reacted to. This problem has also occurred in previous interventions [27].
Existing research has shown that time efficiency is crucial for GPs, and time-consuming documentation was an implementation barrier in our study [28]. It is therefore plausible that a significant number of participating GPs failed to document all changes in the software. Previous studies have also indicated that integrating CDSS into the electronic health records of patients would improve the usefulness of medication reviews [23].
Furthermore, a recent systematic review found that physicians considered most alerts generated by CDSS to be unhelpful or inappropriate and therefore ignored them [29], which is consistent with the low rate of documentation in eMMa. Moreover, a significant number of alerts required monitoring certain parameters such as blood potassium levels and cardiac rhythm, or changing drug intake schedules, at the same time as using the CDSS [30,31]. The software only registered the use of such strategies when the alert under consideration was marked as justified. The sensitivity analysis indicated that some GPs made use of this possibility, albeit only a few, which was probably because the additional documentation time was not reflected in any improvement in the patient's medication.
Difficulties in making medication changes arise when specialists are involved, as GPs are unwilling to interfere with their colleagues' decisions [32], not least because patients like their specialists to be consulted before such decisions are made [33]. One solution may be to improve interdisciplinary cooperation so that complicated medication regimens, which require both time and practice, can be jointly assessed and improved. The involvement of pharmacists was not part of this intervention, but their support in conducting medication reviews would appear to be plausible, especially in view of the discussed implementation barriers and the existing literature, which indicates benefits in terms of both medication appropriateness [34] and patient-relevant outcomes such as quality of life [35]. Qualitative studies conducted in the AdAM trial suggest that the expertise of pharmacists is also appreciated by GPs [36] and patients [37]. In addition, as the AdAM intervention comprised only one voluntary two-hour education session, with accompanying online videos and FAQs, it is quite possible that better results could have been generated if training in polypharmacy and use of the software had been more intensive, as can be seen in comparable trials [32,38,39].

Strengths and Limitations
The fact that the design of the AdAM study included an underlying process evaluation that had been planned and published beforehand improves the methodological quality of the trial. Furthermore, this process evaluation addresses each step of the CDSS application process and responds to the urgent need for a deeper understanding of CDSS uptake reported in a recently published systematic review [40]. We were able to show that GPs attached more importance to severe alerts, and that alerts relating to medication dosage and kidney function were more frequently dealt with than those concerning, e.g., drug-drug interactions or possible unsuitability due to a patient's age (Intervention Dose). However, it should be taken into account that the software also generated alerts when a drug was entered into the system without additional information on the daily dosage or a patient's renal function. The management of these alerts would not necessarily have resulted in any improvement in medication but would simply have indicated that missing information had been entered into eMMa.
Our analyses also help understand the characteristics of participating GPs, and the kind of patients whose medication reviews they prioritized. Participating patients had a slightly lower average level of nursing care, indicating a barrier to the use of the intervention tools in nursing home patients and for home visits.
Overall, the documented changes were rather small, and all severe alerts were removed or justified in only a few patients (Intervention Fidelity). However, it was not possible to distinguish between a patient's medication having been left unchanged and a change not having been documented in the software. As both the intended intensity of the intervention (Intervention Dose) and the desired intervention goal (Intervention Fidelity) were rarely fulfilled, conclusions about the potential risk reduction attributable to the intervention can only be drawn to a limited extent. Since the CDSS could not be linked with the practice management systems, GPs had to document all changes twice, which time constraints may have prevented, resulting in an underestimation of the use of eMMa.
Results in the Intervention Tailoring dimension showed that at the beginning of the intervention period, when updates and technical difficulties frequently occurred, the enrollment of patients in eMMa was low. GPs that were involved in the early stages of the intervention may have given up on the software after encountering technical problems early on. Furthermore, only a few patients enrolled in the early months of each year, which coincided with the flu season. To the best of our knowledge, little research has been conducted into the impact of seasonal fluctuations on implementing a real-world study, so further analysis of the data gathered in the AdAM study may help in the planning of future interventions in clinical settings. Additionally, the COVID-19 pandemic struck during the intervention period. This unforeseeable event was a major, but certainly not the only, disruption to the daily care of patients with polypharmacy that was observed during the course of this intervention, which makes the interpretation of results even more difficult.

Recommendations for Research and Clinical Practice
It is necessary to conduct more in-depth training before beginning such an intervention. An integration of the intervention tool in the practice management system and further measures to increase time efficiency would also facilitate adaptation and implementation for GPs and generate more robust data for scientific analysis. It is necessary to investigate whether the integration of further healthcare professionals, such as specialized physicians and pharmacists, would result in more effective medication reviews, especially in complex cases.

Background Information on the AdAM Study
The approach of the AdAM intervention ("Anwendung für ein digital gestütztes Arzneimitteltherapie- und Versorgungsmanagement", or "application of digitally supported drug-therapy and care management") is described in detail elsewhere [41]. In short, the intervention foresees that GPs perform at least one medication review in adult patients receiving five or more chronic medications with the help of a CDSS (the software was developed under the name "eMMa", an abbreviation for electronic medication management) that has been fed with all relevant medical information in the form of claims data from the statutory health insurance company BARMER. The primary aim is to decrease hospitalization and death rates among polypharmacy patients compared to a patient group receiving usual care.

eMMa
The AdAM intervention involved the application of a CDSS that examined the medication of patients with polypharmacy after claims data provided by the patients' statutory health insurance company had been entered into the system, and after GPs had confirmed the claims data and fed additional relevant information into the system themselves. The underlying software then generated alerts that were categorized by severity (of which only the two highest of four levels overall are analyzed here, since the two lower levels are not clinically significant) and type of potential inappropriateness. The alerts that could be seen by GPs and that were analyzed in this study are displayed in Figure 9. GPs then had the possibility to make medication changes and to discuss them with their patients. A detailed breakdown of the steps conducted by GPs is depicted in Figure 10. Both figures were previously published in our study protocol [21].

Theoretical Background of the Process Evaluation
The process evaluation is based on consensus recommendations in accordance with MRC guidance and the MRC process evaluation framework [18] and assesses four dimensions (Intervention Reach, Dose, Fidelity, Tailoring) of the implementation and application process. The defined dimensions and their adaptation to suit the implementation of the AdAM software are briefly explained below. A more detailed description can be found elsewhere [21].

Inclusion and Exclusion Criteria for Log Data Analysis
All patients to whom one of the following criteria applied were included in the analysis of data extracted from the AdAM software:
1. The GP confirmed in the software that an anamnesis had been completed (referred to as "completed anamnesis").
2. The software was used to print a medication plan.
3. At least five medications were entered into the software.
The inclusion criteria were prioritized in descending order: The first day on which criterion 1 was met was defined as T 0. If this was never the case, criterion 2 and, if necessary, criterion 3 were treated analogously. Duplicates, i.e., patients whose pseudonym was included twice, were excluded after verification. In addition, patients were excluded if their GP only participated in the piloting test phase or had ceased to participate in the study before randomization (e.g., had retired). Patients that enrolled in eMMa after completion of the project were also excluded.
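The prioritization of the three criteria can be illustrated with a minimal sketch. The record layout (`PatientLog` and its field names) is hypothetical and is not the eMMa log schema; each field is assumed to hold the first date on which the respective criterion was met, or `None` if it never was.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PatientLog:
    """Hypothetical per-patient summary of logbook events (not the eMMa schema)."""
    anamnesis_confirmed_on: Optional[date] = None  # criterion 1
    plan_printed_on: Optional[date] = None         # criterion 2
    five_meds_entered_on: Optional[date] = None    # criterion 3


def determine_t0(log: PatientLog) -> Optional[date]:
    """Return T0 as the first date of the highest-priority criterion that was met."""
    for first_date in (
        log.anamnesis_confirmed_on,
        log.plan_printed_on,
        log.five_meds_entered_on,
    ):
        if first_date is not None:
            return first_date
    return None  # no criterion met: patient is not part of the log data analysis


# Criterion 2 outranks criterion 3, even though criterion 3 was met earlier.
example = PatientLog(
    plan_printed_on=date(2019, 3, 5),
    five_meds_entered_on=date(2019, 2, 1),
)
print(determine_t0(example))
```

Note that under this reading a lower-priority criterion never defines T 0 once a higher-priority one has been met, regardless of which event occurred first.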
The Intervention Reach compares participants that fulfil the criteria to those that do not. All potentially eligible participants are therefore included in the analyses for that dimension.

(a) Intervention Reach
This dimension deals with the "reach" of the intervention, i.e., whether the selection and inclusion of study participants was carried out as foreseen in the study protocol, and how the study population differed from the defined population in terms of the variables given in Appendix A, Table A1. These comparisons were conducted at the level of patients, physicians, and practices, and were used to determine structural similarity between the groups, and whether, for example, any particular group of patients was prioritized in the intervention.
For this purpose, all patients receiving the eMMa intervention (=active population, AP) were compared to:
• Study participants that had enrolled but did not receive the intervention (=inactive population, IP);
• Persons that fulfilled the entry criteria for the intention-to-treat population and were on the list of patients provided to participating GPs but did not take part in the intervention (=non-enrolled potentials, NEP).
Analogously, all GPs and practices enrolled in the AdAM study that cared for at least one AP patient, irrespective of whether they also treated patients in the IP or NEP groups, were compared to:
• Enrolled GPs and practices that cared for at least one IP patient, but no AP patient;
• Enrolled GPs and practices that cared for at least one NEP patient, but no AP or IP patient.
An overview can be found in Figure 11. Pseudonymized data for patient comparisons originated from BARMER's data warehouse (a database in which all claims data are stored pseudonymously). Comparisons at GP and practice level were carried out using pseudonymized data from the association of statutory healthcare physicians in the study region (KVWL). Group comparisons were carried out using logistic regression and two-sided tests with a significance level of alpha = 5%. Group membership was defined as the dependent variable.

(b) Intervention Dose
This dimension applies to the AP group only and assesses the extent of reductions in alerts in patients that received the AdAM intervention two months after patient data were originally entered into the software. In addition, prioritization associated with the severity and categories of alerts was analyzed (Figure 9). The alerts were divided into justified (marked as processed or commented on by the GP) and unjustified alerts (not marked as processed or commented on by the GP). The number of alerts was measured at two points in time: the first timestamp (date) was set when T 0 was triggered, and the second was set automatically two months later (referred to as T 1).
In this dimension, the main analysis is of the reduction in alerts between T 0 and T 1, stratified by severity and the category of alerts. In addition, sensitivity analyses were performed that only included unjustified alerts at T 1.
Further sensitivity analyses only included the population for which GPs had confirmed that the anamnesis of the patient had been completed and that all medication had been entered into the software. This was the original plan, whereby T 0 was to be triggered in the software by pressing a button, and only then was it possible to deal with alerts. Before release, the software developers decided against making this process compulsory.
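The difference between the main analysis and the sensitivity analysis that counts only unjustified alerts at T 1 can be sketched as follows. The `Alert` record, its field names, and the example data are illustrative assumptions and do not reflect the eMMa data model or any study results.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record; field names are assumptions."""
    patient_id: str
    category: str    # e.g. "dosage", "kidney", "allergy"
    timepoint: str   # "T0" or "T1"
    justified: bool  # marked as processed or commented on by the GP


def alerts_per_category(alerts, timepoint, unjustified_only=False):
    """Count alerts per category at one timepoint, optionally skipping justified ones."""
    counts = Counter()
    for a in alerts:
        if a.timepoint != timepoint:
            continue
        if unjustified_only and a.justified:
            continue
        counts[a.category] += 1
    return counts


def percent_change(alerts, unjustified_only_at_t1=False):
    """Percentage change from T0 to T1 for every category present at T0."""
    t0 = alerts_per_category(alerts, "T0")
    t1 = alerts_per_category(alerts, "T1", unjustified_only=unjustified_only_at_t1)
    return {cat: 100.0 * (t1[cat] - n0) / n0 for cat, n0 in t0.items()}


alerts = [
    Alert("p1", "dosage", "T0", False),
    Alert("p2", "dosage", "T0", False),
    Alert("p1", "dosage", "T1", True),   # still present, but justified by the GP
    Alert("p2", "dosage", "T1", False),
    Alert("p1", "allergy", "T0", False),
    Alert("p1", "allergy", "T1", False),
]
print(percent_change(alerts))                               # main analysis
print(percent_change(alerts, unjustified_only_at_t1=True))  # sensitivity analysis
```

The restriction to patients with a confirmed completed anamnesis would correspond to filtering the alert list by patient before calling `percent_change`.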
In order to adjust for clustering at a GP level, a multilevel Poisson model was calculated using the pseudonymized GP ID as a random effect. The total number of alerts was the dependent variable in the model, and T 0 and T 1 were the predictors. All models were stratified by age and sex.
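To make the meaning of an incidence rate ratio concrete, the following sketch computes a crude, unadjusted ratio of alert counts at T 1 versus T 0 for the same group of patients. It is a deliberate simplification and not the multilevel model described above: it ignores clustering by GP and the stratification by age and sex, and its Wald interval assumes independent Poisson counts. The function name and the example counts are hypothetical.

```python
import math


def crude_irr(alerts_t0: int, alerts_t1: int, z: float = 1.96):
    """Crude IRR (T1 vs. T0) with a Wald interval on the log scale.

    Assumes the same patients are observed at both timepoints, so the
    ratio of rates reduces to the ratio of counts, and treats the two
    counts as independent Poisson variables (a simplifying assumption).
    """
    if alerts_t0 <= 0 or alerts_t1 <= 0:
        raise ValueError("both counts must be positive")
    irr = alerts_t1 / alerts_t0
    se_log = math.sqrt(1.0 / alerts_t0 + 1.0 / alerts_t1)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


irr, lower, upper = crude_irr(alerts_t0=200, alerts_t1=180)
print(f"IRR = {irr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An IRR below 1 indicates fewer alerts at T 1 than at T 0; an interval that includes 1 corresponds to a reduction that is not statistically significant at the chosen level.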

(c) Intervention Fidelity
This dimension applies only to the AP group and examines the trustworthiness of the intervention, i.e., whether the software was used in such a way that a successful intervention (reduction in hospitalization and death) was possible. Alerts rated at the highest severity level were used to operationalize this dimension, as they were considered strongly indicative of a need for action. Furthermore, the GP had to have completed the anamnesis to show the software was being used as intended.
As long as all alerts at this level had been resolved or justified at T 1, they were considered to have been successfully dealt with in terms of Fidelity.
For this dimension, we reported the proportion of patients whose serious warnings were completely resolved at T 1. In addition, the analyses were stratified according to alert category.
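The Fidelity criteria can be expressed as a simple per-patient check. This is a minimal sketch under the assumption that each patient is summarized by three hypothetical fields; it is not taken from the study's analysis code.

```python
from dataclasses import dataclass


@dataclass
class PatientStatus:
    """Hypothetical per-patient summary; field names are assumptions."""
    severe_alerts_t0: int              # severity level 1 alerts at T0
    unjustified_severe_alerts_t1: int  # severity level 1 alerts neither resolved nor justified at T1
    anamnesis_confirmed: bool          # GP confirmed a completed anamnesis


def fulfils_fidelity(p: PatientStatus) -> bool:
    """Completed anamnesis and zero unjustified severity level 1 alerts at T1."""
    return p.anamnesis_confirmed and p.unjustified_severe_alerts_t1 == 0


def fidelity_proportion(patients) -> float:
    """Share of patients with severe alerts at T0 who fulfil the Fidelity criteria."""
    with_severe = [p for p in patients if p.severe_alerts_t0 > 0]
    if not with_severe:
        return float("nan")
    return sum(fulfils_fidelity(p) for p in with_severe) / len(with_severe)


patients = [
    PatientStatus(2, 0, True),   # all severe alerts resolved or justified
    PatientStatus(1, 1, True),   # one severe alert still open
    PatientStatus(3, 0, False),  # no confirmed anamnesis
    PatientStatus(0, 0, True),   # no severe alert at T0, not in the denominator
]
print(fidelity_proportion(patients))
```

Stratification by alert category would apply the same check separately to the severe alerts of each category.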

(d) Intervention Tailoring
In contrast to the other dimensions, the focus here was on the individual adjustments GPs made in order to better integrate the intervention into daily practice routines. For this purpose, the temporal dimension of software use was investigated. Consequently, data are only analyzed for the AP group. Specifically, we looked at the number of patients whose data were called up for the first time on particular days of the week or in particular months, and looked for a concentration of such events during certain periods (e.g., at the weekend), or seasonal dependencies.
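The temporal tallies described above can be sketched as follows, assuming a plain list of T 0 dates as input. The example dates are invented, and public holidays are not handled here because that would require a regional holiday calendar.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def tally_t0(t0_dates):
    """Tally T0 dates by weekday and month and return the weekend share."""
    if not t0_dates:
        raise ValueError("at least one T0 date is required")
    by_weekday = Counter(WEEKDAYS[d.weekday()] for d in t0_dates)
    by_month = Counter(d.month for d in t0_dates)
    weekend_share = (by_weekday["Sat"] + by_weekday["Sun"]) / len(t0_dates)
    return by_weekday, by_month, weekend_share


t0_dates = [
    date(2019, 9, 3),    # Tuesday
    date(2019, 9, 5),    # Thursday
    date(2019, 12, 7),   # Saturday
    date(2019, 12, 10),  # Tuesday
]
by_weekday, by_month, weekend_share = tally_t0(t0_dates)
print(by_weekday, by_month, weekend_share)
```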

Conclusions
There are indications that the CDSS helped participating physicians prescribe fewer high-risk medications by encouraging them to adjust dosages, and to modify prescriptions to take account of renal function impairment. However, the intervention does not appear to have been used intensively, whereby it should be taken into consideration that utilization may have been under-reported in the log data. Overall, the results of the process evaluation indicate that the extent of the implementation of the AdAM intervention was weaker than anticipated.