Detecting Medication Risks among People in Need of Care: Performance of Six Instruments

Introduction: Numerous tools exist to detect potentially inappropriate medication (PIM) and potential prescribing omissions (PPO) in older people, but it remains unclear which tools may be most relevant in which setting. Objectives: This cross-sectional study compares six validated tools in terms of PIM and PPO detection. Methods: We examined the PIM/PPO prevalence for all tools combined and the sensitivity of each tool. The pairwise agreement between tools was determined using Cohen's Kappa. Results: We included 226 patients in need of care (median (IQR) age 84 (80–89) years). The overall PIM prevalence was 91.6% (95% CI, 87.2–94.9%) and the overall PPO prevalence was 63.7% (57.1–69.9%). The detected PIM prevalence ranged from 76.5% for FORTA-C/D to 6.6% for anticholinergic drugs (German-ACB). The PPO prevalences for START (63.7%) and FORTA-A (62.8%) were similar. The pairwise agreement between tools was poor to moderate. The sensitivity of PIM detection was highest for FORTA-C/D (55.1%), and increased to 79.2% when distinct items from STOPP were added. Conclusion: A single screening tool may not have sufficient sensitivity to detect PIMs and PPOs. Further research is required to optimize the composition of PIM and PPO tools in different settings.


Introduction
Older people are more frequently affected by polypharmacy, and more susceptible to adverse drug reactions (ADR), than younger people due to multimorbidity and physiological aging processes [1][2][3][4][5]. As guidance for clinicians, a number of consensus-based instruments have been developed that list potentially inappropriate medication (PIM) to be avoided or used with caution in older people. Instruments alerting physicians to potential prescribing omissions (PPO) have also been developed [6][7][8].
Internationally prominent examples include START/STOPP criteria, and EU(7)-PIM, while the PRISCUS and FORTA lists are German developments [9][10][11][12]. More recently, the STOPPFall list has been developed by a European geriatrics society task force, which alerts prescribers to fall risk increasing

Data Source and Study Population
BaCoM is a multicenter prospective registry study of patients in need of care with three study centers in Bavaria, Germany (LMU Munich, UK Würzburg and FAU Erlangen, registered in the German Clinical Trials Register: DRKS 26039).
The analyzed BaCoM participants included people with and without a prior history of COVID-19. To be at risk of PIM or PPO, they had to be above 65 years of age and take one or more long-term medications. They were in need of care or support and were enrolled by their respective GP or study physicians. Need of care or support was defined as receipt of financial support by public care insurance according to an officially assessed care level ("Pflegegrad"), or a score of ≥5 on the 7-point Clinical Frailty Scale (CFS) [16,17]. Exclusion criteria were an estimated life expectancy of <6 months (as judged by the recruiting physician), unclear legal residency status, and lack of health insurance.
Data were collected by trained study assistants, including sociodemographic and health status data to describe the study population. Apart from clinical frailty, the health status also included data on cognitive function, assessed by a Six-Item Screening Tool and, in those with fewer than three errors in the Six-Item Screening Tool, by a Montreal Cognitive Assessment Test Blind (MoCA-BLIND) [18][19][20]. Medical diagnoses, medications taken, and vital signs such as blood pressure, heart rate, and forced expiratory volume in 1 s (FEV1) were documented to apply PIM and PPO instruments. Medication schedules and diagnosis lists were either provided by the GP or collected by the study team at the site at which the participant received care, e.g., nursing homes or, in the case of outpatient care, at the participant's home. The data source therefore partly comprised diagnosis lists coded according to the International Statistical Classification of Diseases and Related Health Problems (ICD) and standardized medication schedules, but also handwritten lists extracted from nursing records.

Definition of PIMs and PPOs
We included a total of six different instruments designed to detect PIMs or PPOs or both. A brief description of each tool, highlighting the structure, number of items, and data categories required for their application, is provided in Table 1. All PIM instruments included in this study were applicable to patients aged 65 years or older (without restrictions), and comprised the FORTA list, STOPP, EU(7)-PIM, PRISCUS, German-ACB, and STOPPFall [9][10][11][12][13][14]. From the FORTA list, we only considered medications listed as "C = questionable" and "D = avoid", according to the authors' recommendations [12]. FORTA, STOPP, EU(7)-PIM, and PRISCUS are generic tools, in the sense that they were designed to cover medication risks across all drug groups, whereas German-ACB [14] and STOPPFall [13] were specifically developed to identify anticholinergic and fall risk increasing drugs (FRIDs), respectively.
In the German-ACB, we only classified medications with an ACB score of ≥3 as PIMs [14]. For STOPPFall, we considered all 14 drug groups classified as FRIDs, but only defined them as PIMs when participants' risk of falls was increased by one or more of the conditions listed in the accompanying STOPPFall deprescribing tool (e.g., diuretics in the case of hypotension). As PPO tools we included START [9] and FORTA-A (i.e., medications listed as "A = indispensable").
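The two conditional rules described above (a score threshold for German-ACB and a condition-dependent rule for STOPPFall) can be illustrated with a minimal sketch. It is not the study's code: the drug names, ACB scores, and the FRID-to-condition mapping are hypothetical placeholders, and only the ≥3 threshold and the diuretics-in-hypotension example are taken from the text.

```python
# Hypothetical placeholder data: NOT copied from the published
# German-ACB or STOPPFall lists.
ACB_SCORES = {"drug_a": 3, "drug_b": 1}          # drug -> ACB score
FRID_CONDITIONS = {"diuretic": {"hypotension"}}  # FRID group -> fall-risk conditions

ACB_PIM_THRESHOLD = 3  # from the text: only ACB scores >= 3 count as PIM


def is_acb_pim(drug: str) -> bool:
    """A drug counts as a German-ACB PIM only if its score reaches the threshold."""
    return ACB_SCORES.get(drug, 0) >= ACB_PIM_THRESHOLD


def is_stoppfall_pim(drug_group: str, diagnoses: set) -> bool:
    """A FRID counts as a PIM only if a listed fall-risk condition is documented."""
    risk_conditions = FRID_CONDITIONS.get(drug_group, set())
    return bool(risk_conditions & diagnoses)
```

Under these placeholder values, a diuretic is flagged for a participant with documented hypotension but not for one without any listed condition.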

Measurement of PIMs and PPOs
All medications were coded using the Anatomical Therapeutic Chemical (ATC) classification and the diagnoses were coded using ICD-10 [21,22]. Where medication doses were required to apply the included PIM/PPO instruments, daily doses were calculated from the instructions provided. When dosage information was missing, these medications were not included in criteria that considered dose. In cases where dosing instructions were "as required", these were not taken into account in criteria considering only long-term medication.
Criteria that explicitly considered the duration of intake (e.g., longer than six weeks) were not considered in any patients because this information was not commonly available. Where medical diagnoses were required to apply the respective PIM or PPO instruments, we only considered explicitly documented diagnoses (i.e., did not assume diagnoses based on medication profiles).
The PIM-defining criteria from each tool were transcribed into a programming language and applied to the data using RStudio V.2022.07.2.
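The handling of missing doses, "as required" instructions, and duration-based criteria described in this section could be encoded roughly as follows. This is a sketch in Python rather than the R code used in the study; the `Medication` and `Criterion` classes, their field names, and the idea of a dose criterion as an upper daily-dose limit are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Medication:
    atc_code: str
    daily_dose_mg: Optional[float]  # None when dosage information is missing
    as_required: bool               # True for "as required" dosing instructions


@dataclass
class Criterion:
    atc_prefix: str                            # ATC code prefix the criterion targets
    max_daily_dose_mg: Optional[float] = None  # set only if the criterion considers dose
    long_term_only: bool = False               # criterion applies to long-term medication only
    needs_duration: bool = False               # e.g. "longer than six weeks"


def criterion_flags(criterion: Criterion, med: Medication) -> bool:
    """Return True if the medication is flagged by the criterion."""
    if criterion.needs_duration:
        return False  # duration of intake was not available, so never applied
    if not med.atc_code.startswith(criterion.atc_prefix):
        return False
    if criterion.long_term_only and med.as_required:
        return False  # "as required" drugs are ignored by long-term criteria
    if criterion.max_daily_dose_mg is not None:
        if med.daily_dose_mg is None:
            return False  # missing dose: excluded from dose-based criteria
        return med.daily_dose_mg > criterion.max_daily_dose_mg
    return True
```

Returning `False` whenever required information is missing mirrors the conservative approach in the text, which can only lower the detected prevalence.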

Data Analysis
To examine the prevalence for each PIM and PPO instrument, all instruments were first applied separately, and the prevalence was calculated as the proportion of patients (with 95% confidence interval) with one or more respective PIM or PPO. As a result, each medication taken by each patient was classified as a PIM (or not) or a PPO (or not) according to each tool. To examine the sensitivity of each PIM and PPO instrument, we defined PIMs and PPOs identified by any of the respective instruments as the gold standard. The sensitivity of each tool was then calculated as the proportion (with 95% confidence interval) of all PIMs/PPOs detected by the respective instrument. Similarly, we calculated the proportion of PIMs/PPOs uniquely detected by each instrument, i.e., not by any of the others. The concordance among the different tools was determined by an analysis of interrater reliability using Cohen's Kappa, and overlaps between tools were visualized using Venn diagrams [23]. To determine which proportions of PIMs/PPOs would be detected by which combination of PIM/PPO tools, we started with the instrument with the highest PIM/PPO prevalence. We then considered which other tool would detect the most additional PIMs/PPOs not detected by the first tool, and so on. The findings were visualized using a Pareto chart. All confidence intervals were calculated using the exact binomial test [24].
Table 2 shows the characteristics of the study population, comprising 226 participants with a median (IQR) age of 84 (80 to 89) years, with most (76.6%) aged ≥ 80 years and about one fifth (22.6%) being ≥ 90 years old. The majority (71.2%) of participants were female, and three quarters (74.6%) were residents of long-term care facilities.
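The measures described under Data Analysis (exact binomial confidence intervals, sensitivity against an any-tool gold standard, uniquely detected PIMs, and Cohen's Kappa) can be sketched with the Python standard library. This is an illustration under stated assumptions, not the study's R code: the interval is computed as a Clopper-Pearson interval by bisection, the unit of analysis for sensitivity is assumed to be a (patient, medication) pair, the uniquely detected share uses all gold-standard PIMs as denominator, and Kappa is computed on two binary ratings of the same units. All function names are hypothetical.

```python
import math


def _binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def _root(func, iterations: int = 60) -> float:
    """Bisection on (0, 1) for a function increasing from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(x: int, n: int, alpha: float = 0.05):
    """Clopper-Pearson interval for x events among n patients."""
    lower = 0.0 if x == 0 else _root(lambda p: (1 - _binom_cdf(x - 1, n, p)) - alpha / 2)
    upper = 1.0 if x == n else _root(lambda p: alpha / 2 - _binom_cdf(x, n, p))
    return lower, upper


def sensitivity_and_unique(flags_by_tool: dict) -> dict:
    """Map tool -> (sensitivity, uniquely detected share) against the any-tool gold standard.

    flags_by_tool maps a tool name to the set of (patient_id, medication) pairs it flags.
    """
    gold = set().union(*flags_by_tool.values())
    if not gold:
        return {}
    result = {}
    for tool, flagged in flags_by_tool.items():
        others = set().union(*(s for t, s in flags_by_tool.items() if t != tool))
        result[tool] = (len(flagged) / len(gold), len(flagged - others) / len(gold))
    return result


def cohens_kappa(a: list, b: list) -> float:
    """Cohen's Kappa for two equally long lists of binary (True/False) ratings.

    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)
```

For example, `exact_binomial_ci(207, 226)` gives the interval for 207 of 226 patients with at least one PIM, the count that corresponds to a prevalence of 91.6% in a sample of 226.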
The median (IQR) score on the CFS was 6 (5 to 7), consistent with moderate frailty, and over half (53.3%) of participants achieved less than 18 points on the MoCA Blind Assessment, consistent with mild cognitive impairment [17][18][19]. The median (IQR) on the Charlson Comorbidity Index (CCI) was 3 (1 to 5), corresponding to moderate severity of comorbidities, and a quarter (26.1%) suffered from severe comorbidities (CCI score ≥ 5) [25].

Characteristics of the Study Population
Three quarters (75.2%) of patients had a documented diagnosis of hypertension, almost a third (31.9%) had atrial fibrillation, and more than 20% were affected by diabetes (27.0%), dyslipidemia (27.9%), and heart failure (23.5%). In addition, 21.2% had a documented diagnosis of depression.

Application of PIM and PPO Instruments to Available Data
Of the total 114 criteria of the START/STOPP tool, 91 items (30 items for START and 61 items for STOPP) could be applied. For the remaining 23 criteria, the data required were not available for the sample [27]. The missing data comprised laboratory data, metrics from medical exams, past vital signs, and the dates on which a diagnosis was made or a medication was prescribed. For FORTA, we excluded vaccination- and cancer-related sections because vaccinations and ongoing chemotherapy were not consistently documented in the available data. For all other instruments, data were available to apply all items.
Table 3 shows that, considering all PIM tools together, the PIM prevalence (proportion (95% CI) of patients with ≥ 1 PIM) was 91.6 (87.2-94.9)%, 79.6% had two or more PIMs, and more than half (57.1%) had four or more PIMs. However, the PIM prevalence varied considerably by tool, and was highest for FORTA C/D (76.5 (70.5-81.9)%), followed by STOPP (65.9 (59.4-72.1)%), EU (7)

Cumulative Sensitivity of Combining PIM Instruments
In the Pareto chart in Figure 2a, the first bar shows the percentage of all PIMs detected by FORTA-C/D, while the remaining bars show the percentage of new PIMs additionally detected by each tool after application of the previous tool(s). The line shows the cumulative sensitivity (i.e., the percentage of PIMs detected) resulting from the addition of each tool. Since PRISCUS did not identify any PIMs exclusively, this tool was not considered in this analysis. Starting with FORTA C/D (which had the highest sensitivity, of 55.1%), adding STOPP achieves a cumulative sensitivity of 79.2%, and further adding EU(7)-PIM achieves a sensitivity of 94.1%. Figure 2b shows that after application of FORTA-C/D and STOPP, adding PIM criteria for four drugs or drug groups (apixaban, rivaroxaban, and sodium picosulfate from the EU(7)-PIM list; diuretics from STOPPFall) increases the sensitivity by 10.6 percentage points to 89.8%. The addition of criteria relating to opioids, antiepileptics and antipsychotics (from STOPPFall), and metoclopramide (from EU(7)-PIM), increases the sensitivity further by 3.7 percentage points to 93.5%.
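The stepwise ordering behind such a Pareto chart can be sketched as a greedy selection. The sketch below is an illustration, not the study's code: it assumes each tool is represented by the set of PIMs it detects, and it starts with the tool that detects the most PIMs, whereas the text describes starting with the tool with the highest prevalence (for FORTA-C/D both coincide). The function name is hypothetical.

```python
def greedy_cumulative_sensitivity(flags_by_tool: dict) -> list:
    """Order tools by additional PIMs detected and track cumulative sensitivity.

    flags_by_tool maps a tool name to the set of PIMs it detects. Returns a list
    of (tool, added share, cumulative sensitivity) tuples, where shares are
    relative to all PIMs detected by any tool.
    """
    gold = set().union(*flags_by_tool.values())
    covered = set()
    remaining = dict(flags_by_tool)
    steps = []
    while remaining:
        # Pick the tool that adds the most PIMs not yet covered.
        best = max(remaining, key=lambda tool: len(remaining[tool] - covered))
        gain = remaining.pop(best) - covered
        if not gain:
            break  # no remaining tool detects anything new
        covered |= gain
        steps.append((best, len(gain) / len(gold), len(covered) / len(gold)))
    return steps
```

A tool whose detections are all covered by earlier tools never enters the sequence, which corresponds to the exclusion of PRISCUS described above.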


Summary of Findings
This cross-sectional study of a convenience sample of 226 people in need of care, aged ≥ 65 years, in Bavaria (Germany) shows that the vast majority of participants received polypharmacy (92.5%). The vast majority (91.6%) also received at least one PIM after the application of six PIM tools together, with 79.6% receiving two or more PIMs, and over half (57.1%) receiving four or more PIMs. Similarly, most (82.7%) participants had at least one PPO considering FORTA-A and START together, and 50.0% had two or more PPOs. More than three quarters of the analyzed patients (76.1%) were affected by both PIMs and PPOs.
No single PIM instrument reached full PIM coverage, and the detected PIM prevalence varied considerably by tool, ranging from 76.5% for FORTA C/D to 6.6% for German-ACB ≥ 3. Pairwise agreement between the PIM tools was poor to moderate and highest between PRISCUS and German-ACB (Cohen's Kappa 0.42 (0.23-0.59)). FORTA C/D had the highest sensitivity of PIM detection (it identified 55.1% of all PIMs), and it also detected the most PIMs not identified by any other tool. However, stratification by drug group revealed that while FORTA-C/D had a high sensitivity for the detection of benzodiazepine, other psycholeptic, spironolactone, psychoanaleptic, and betablocker PIMs, it only detected a minority of low dose aspirin, opioid, and non-opioid analgesic PIMs. We found that combining items included in FORTA C/D and STOPP achieved a cumulative sensitivity of PIM detection of 79.2%, which could be further increased to 89.8% by additionally considering criteria relating to apixaban, rivaroxaban, and sodium picosulfate from the EU(7)-PIM list, and diuretics from STOPPFall.
The PPO prevalence was similar for both instruments used (63.7% for START and 62.8% for FORTA A), but considerably lower than for their combined use (82.7%), consistent with each tool also identifying unique PPOs. While FORTA-A detected all hypertension and diabetes PPOs, START detected no hypertension PPOs (0.0%) and very few diabetes PPOs (3.9%), but substantially more PPOs than FORTA-A for heart failure (100.0% vs 53.3%), depression (100.0% vs 0.0%), and atrial fibrillation (80.0% vs 30.3%).

Comparison to Literature
Numerous previous studies have used several of the PIM and PPO tools used in this study to examine the PIM and/or PPO prevalence in different settings. According to a recent review of PIM prevalence studies [6], the proportion of study participants affected by PIMs was 44.3% for FORTA (vs. 76.5% in this study) and ranged from 26.7% to 67.3% for STOPP (vs. 65.9% in this study), from 37.5% to 90.6% for EU(7)-PIM (vs. 61.9% in this study) and from 13.7% to 68.5% for PRISCUS (vs. 12.8% in this study). Campbell et al. (2010) found that 10.8% of a sample of African American adults aged ≥ 70 years were exposed to at least one drug with strong anticholinergic properties (vs. 6.6% in this study) [29,30]. The prevalence of PIMs according to STOPPFall was 85.4% in one study of hospitalized patients (vs. 36.3% in this study). According to the same review [6], the proportion of study participants affected by PPOs ranged from 19.8% to 64.2% for START (vs. 63.7% in this study). Compared to these data, this study of patients in need of care found the PIM prevalence to be at the high end for FORTA and STOPP/START, in the middle for EU(7)-PIM, and at the low end for PRISCUS, German-ACB and STOPPFall. This may reflect that PRISCUS is a German development, was published in 2010, and contributed to the EU(7)-PIM list, while FORTA is a more recent development, and START/STOPP is less well known in the German setting. The discrepancy in the results for STOPPFall, however, is explained by differing measurement methods. While Damoiseaux-Volman et al. (2022) considered any use of STOPPFall medications as PIMs, we considered them as PIMs only if their users also had risk factors for falls specified in the STOPPFall deprescribing tool [31].
In contrast to prevalence studies using one tool, comparisons of two or more PIM or PPO tools in the same study population are much less common. In a Norwegian population of geriatric ward patients aged 65 or older taking one or more medications, the PIM prevalence was 62.4-69.2% for EU(7)-PIM, which is comparable to our findings (61.9%) [32]. In a Kuwaiti population of primary care patients aged 65 years or older, the PIM prevalence was lower for FORTA (44.3%) than for STOPP (55.7%), which is in contrast to our findings (76.5% vs 65.9%, respectively) [33]. In a German population of 3189 subjects, the PIM prevalence was highest for EU(7)-PIM (70.1%), followed by FORTA (55.9%), and PRISCUS (24.7%), whereas in this study, FORTA-C/D detected more PIMs (76.5%) than STOPP (65.9%) [34]. These findings highlight that the study population may not only influence the prevalence of polypharmacy, but also the relative performance of different instruments.

Strengths and Limitations
To our knowledge, this is the first study to examine the sensitivity of PIM and PPO detection considering PIM and PPO instruments alone and in combination, which we considered most relevant to the German setting. Our analysis sheds light on the prevalence of PIMs and PPOs in a vulnerable population in need of care, which is often underrepresented in clinical research. We were able to collect a comprehensive data set, which enabled us to apply the vast majority of items included in each tool. However, a small number of items (19 items from the STOPP tool) could not be applied due to missing data, implying that the detected prevalence may be an underestimation. The main limitations of this study are its relatively small sample size and the potential selection bias resulting from convenience sampling. Nevertheless, study participants were included from a variety of settings, and our sample included study participants irrespective of their physical or mental health, or their cognitive abilities.

Implications for Clinical Practice and Research
Our findings demonstrate a very high prevalence of PIMs and PPOs among this vulnerable sample of patients in need of care, with the vast majority of study participants affected by PIM, PPO, or both. These findings alone reinforce the need to regularly and comprehensively review all medications these patients are taking. Our findings suggest that using a single tool may leave a substantial number of PIMs and PPOs undetected, but that by combining FORTA-C/D and STOPP, as well as FORTA-A and START, into comprehensive tools, the proportion of detectable PIMs and PPOs can be considerably increased. Nevertheless, it is clear that any combination of PIM tools applied without computerized support may not comprehensively detect all medication risks associated with polypharmacy, given the vast number of possible drug-drug and drug-disease interactions.
It is also clear that detection of PIMs and PPOs alone does not suffice to improve patient outcomes, which additionally requires clinical judgment to identify actually inappropriate medication, as well as effective interventions to overcome barriers to PIM deprescribing. This study has examined how the sensitivity of PIM and PPO detection can be enhanced in older people in need of care by combining prominent PIM and PPO instruments, but our findings should be confirmed in other settings. In addition, our findings should be supplemented by research characterizing the extent to which PIM and PPO tools identify medication that actually requires medication changes (i.e., deprescribing or initiation of drugs), which interventions may overcome pertinent barriers to which medication changes, as well as the effects of such changes on outcomes that matter to patients.

Conclusions
Instruments which explicitly highlight common and clinically relevant potentially inappropriate medication (PIM) and/or potential prescribing omissions (PPOs) may support clinicians in identifying targets for medicines optimization among older people with polypharmacy. However, this study shows that PIM and PPO instruments differ considerably, both in terms of the quantity and nature of medication-related problems they detect, and that it therefore matters which tool is used in which setting. Our study also demonstrates that using a single existing tool may not have sufficient sensitivity to detect PIMs and PPOs, and that combining distinct items from two or more instruments may considerably increase the sensitivity. Further research is required to optimize the composition of PIM and PPO screening instruments in terms of both the sensitivity and specificity in different settings.