Medical error has been identified as a leading cause of avoidable harm [1], and patients outside hospitals have been shown to be at particular risk of drug-related medical errors [3]. Failure to prescribe necessary drugs, patients’ failure to adhere to drug treatment, and intoxication caused by drug interactions are all examples of such errors. Many of these errors are preventable, and nearly half of all preventable adverse drug effects have been shown to be serious enough to cause hospitalization. The errors frequently occur in the phases of prescribing or monitoring drug use. This highlights the importance of safety interventions in these stages of outpatient care [2].
Drug interactions occur when the concurrent use of multiple drugs, a drug and a food, or a drug and a beverage changes the effects of a drug. Such changes may consist of the appearance of side effects or the suppression of desirable effects. The likelihood of drug interactions increases with the number of drugs taken, which in turn often correlates with age. Up to 7% of hospitalizations have been shown to be attributable to drug interactions [4].
Polypharmacy means the concurrent use of multiple drugs, including taking more than one drug for a single condition. Increased use of drugs has important health benefits, but polypharmacy increases the risk of drug interactions. Åstrand et al. [5] conducted a cross-sectional study of drug prescriptions in Jämtland, Sweden, over a period of 30 years. They found a pronounced 61% increase in polypharmacy, and found the risk of potentially interacting drugs to be strongly correlated with this increase. Hovstadius et al. [6] looked at the development of polypharmacy in the 4-year period between 2005 and 2008. Using the Swedish Prescribed Drug Register, they were able to analyze data for the entire Swedish population according to drugs prescribed per individual. Applying a definition of polypharmacy as receiving five or more prescription drugs during a three-month period, and of excessive polypharmacy as receiving ten or more drugs within the same period, they found an 8% increase in polypharmacy and a 16% increase in excessive polypharmacy. The study showed a steady increase in polypharmacy, excessive polypharmacy, and drug use in general, in spite of the well-known risks to patient health [6].
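The thresholds in the definition used by Hovstadius et al. can be expressed as a small classification rule. The following is a minimal sketch only: the function name and labels are hypothetical, and the input is assumed to be the number of distinct prescription drugs an individual received within one three-month period.

```python
# Thresholds as described in the text above (drugs per three-month period).
POLYPHARMACY_MIN = 5
EXCESSIVE_POLYPHARMACY_MIN = 10


def classify_polypharmacy(distinct_drug_count: int) -> str:
    """Classify one individual by drugs received in a three-month period.

    Hypothetical helper for illustration; not taken from the cited study.
    """
    if distinct_drug_count < 0:
        raise ValueError("drug count cannot be negative")
    if distinct_drug_count >= EXCESSIVE_POLYPHARMACY_MIN:
        return "excessive polypharmacy"
    if distinct_drug_count >= POLYPHARMACY_MIN:
        return "polypharmacy"
    return "no polypharmacy"
```

Under this reading, an individual with ten or more drugs is reported in the excessive category; whether that individual is also counted in the broader polypharmacy figures is a reporting choice that the sketch does not settle.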
Lack of adherence to treatment is a frequent source of drug-related medical error. Non-hospitalized patients have greater influence over their adherence to drug treatment, making them central in the prevention of medical errors [3]. Patient-centered care and patient participation have been recognized as important means of improving adherence, as well as of increasing health literacy and improving patient safety and satisfaction. When patients are able to self-manage their symptoms, healthcare becomes more efficient and quality of life improves. Participation is achieved by facilitating active engagement in decision-making. This requires that caregivers respect patients’ knowledge of their own bodies and regard this knowledge as complementary to their professional knowledge. Participation also requires empowering patients with the information and resources that enable them to participate [7].
Assuming a patient-centered perspective on drug treatment suggests exploring patient needs for information regarding drugs. Kusch, Haefeli and Seidling [8] investigated patients’ desires for drug information through a systematic literature review sourcing studies on patients’ drug information needs as well as studies on patient inquiries to drug information services. Topics were identified, and their frequency of occurrence was calculated. The results consistently showed adverse drug reactions and drug–drug interactions as the most desired topics. Discussing their findings, Kusch et al. [8] indicated the need for making drug information more accessible to patients. To do so, they suggested building information databases based on patient-oriented topics, such as those identified in their review.
In Scandinavia, much work has been undertaken to develop information systems for drug information and drug interactions. Such systems help healthcare professionals navigate large and constantly revised bodies of drug information. Böttiger et al. [9] discussed the Swedish Finnish Interaction X-referencing database (SFINX), a predecessor of the current Janusmed [10], describing the integration of its database with Swedish and Finnish healthcare systems to serve thousands of physicians and pharmacists. A subsequent study [11] looked at the impact of SFINX in primary healthcare centers. The study compared 15 centers where the SFINX system had been implemented with 5 centers where it had not. The centers where the system had been implemented showed a significant reduction in prescriptions leading to serious drug interactions [11].
Janusmed is today freely available as a web-based service. Although the service was developed for a professional audience, rumors that pregnant women had also adopted the site caught the attention of Nörby et al. [12]. This prompted them to investigate pregnant women’s experiences of using the Drugs and Birth Defects section of the service. The findings showed that 11% of the women were already familiar with the service before participating in the study. The women also reported using several other information resources about drugs related to pregnancy. Some became more anxious from accessing the content, but the vast majority found the information valuable and easy to understand. They also found that it confirmed and supported information provided by healthcare professionals. Nörby et al. went on to suggest that communication and patient compliance might improve if patients and professionals referred to the same sources [12].
mHealth is meanwhile emerging as a promising means of delivering patient-oriented healthcare outside hospitals. mHealth is short for mobile healthcare, meaning the delivery of healthcare services and information through mobile devices. The mobile platform allows flexible and ubiquitous access to online and offline services and information. It therefore lends itself in particular to outpatient needs, and to individualized and patient-oriented services [13]. mHealth applications have been found to be implemented predominantly through experimentation with technology rather than through strategic planning [14]. At the same time, these systems impact human lives on a large scale, potentially reaching millions of users and directly affecting their health. Just as it is imperative that information be accurate, it is also imperative that mHealth services not lead to user error, and thereby to adverse health effects [13].
Usability in critical systems can be a matter of life or death. Nielsen [15] commented on Koppel et al. [16], who identified 22 ways in which automated hospital systems could result in the wrong medication being dispensed to patients. Nielsen [15] identified most of the problems as well-known usability issues that had been understood for decades. In a similar vein, Reolon et al. [13] showed that studies on healthcare systems have found significant usability problems inviting a range of human errors. These were found to have the potential to lead to injury or even death. Within mHealth they likewise found a range of usability problems as well as a lack of user-centered design. They argued that patient-oriented systems would likely be in particular need of usability due to users’ lack of training as well as a significant proportion of elderly users.
Usability studies have begun to appear within eHealth and mHealth [13]. More recent work [17] has developed and validated instruments for self-reported usability. To some extent, these have also been adopted in practice [19]. The usability testing needed to collect self-reported metrics is, however, known to be resource intensive. Heuristic evaluations have meanwhile proven to be particularly cost-efficient in driving design improvement [21]. Papers on the use of heuristic evaluations within mHealth are nevertheless sparse [13]. Adam and Vang [22] conducted an expert-based checklist evaluation of drug–drug interaction websites, measuring information capacity, patient usability and readability. No studies appear to have been conducted specifically on the quality of user interfaces in terms of usability for patient-oriented drug interaction checkers. The objective of this paper has therefore been to explore the prevalence and characteristics of usability issues facing patients using publicly available drug interaction checkers in the Scandinavian market. This was done to facilitate future improvements to this emerging market, and to raise awareness of the usability issues users face today.
Two research questions have guided the study:
We delimited the context of exploration to the Scandinavian market in order to facilitate relevant and actionable insights. We also delimited it to drug interaction checkers accessible on mobile platforms in order to appreciate the promise of mHealth for patient-oriented and outpatient uses. We did so also to stress usability requirements, seeing that mobile platforms place additional demands on usability.
2. Related Work
Patient-centered healthcare aims to empower patients and involve them in decision-making. This has been shown to lead to better engagement and adherence to treatment, resulting in better health outcomes. Allowing patients to participate requires informing them according to their own abilities and needs. As a means of promoting patient-centered healthcare, Kusch, Haefeli, and Seidling [8] conducted a scoping review to investigate patients’ desires for drug information beyond the most basic and mandatory information needed to carry out drug treatment. This was done through a literature review sourcing studies on patient inquiries to drug information services, in addition to qualitative studies on patients’ needs for drug information. Topics were independently extracted from the two sets of studies and subsequently integrated into a coherent set of topics. Frequency of occurrence was determined for each topic, and further analyses were conducted to assess the potential for individualization of the most frequent topics. Both sets of studies showed predominant interest in safety-related topics, with adverse drug reactions and drug–drug interactions figuring as the most frequent topics regardless of country, setting, or study design. Discussing the results, Kusch et al. [8] highlighted the diversity and nuances of information about adverse drug reactions and drug–drug interactions and how this suggests the need for individualization. There are, for example, patients whose adherence to drug therapy declines with the mere mention of adverse drug reactions, but there are also patients demanding every piece of information regardless of likelihood of occurrence. Kusch et al. [8] suggested the need for making drug information accessible and individualized for patients. Generating drug information databases based on patient-oriented topics was suggested as an initial step. They added that extensive research has been done on the system quality of professional drug–drug interaction checkers, but that little has been done on similar checkers directed at patients [8].
Adam and Vang [22] conducted evaluations of content and patient usability in patient-oriented drug–drug interaction websites. This was done to investigate whether information on publicly available websites was valid and accurate, and to evaluate the capacity of the websites to address laypersons. Information capacity, patient usability and readability were assessed using an expert-based approach with grading scales. Websites with drug–drug interaction checkers were sampled using the Google search engine supplemented by suggestions from clinical practitioners. Sites had to be freely available without payment, and had to provide drug–drug interaction checking functionality capable of identifying the drugs applied in testing. Sites in languages other than English were excluded. Information capacity was assessed using a grading scale awarding one point for each of the following features: alerts for drug–drug, drug–food, and drug–alcohol interactions, severity grading, and therapeutic duplication. Patient usability was assessed based on the presence of alert icons, color coding, severity rating scale, medication pick list, and the ability to store a profile account. Patient readability was assessed using the Flesch–Kincaid grading model, as well as the Flesch Reading Ease score [22]. The results showed that the majority of websites were not optimally patient-oriented due to poor information capacity and limited readability. The authors found many features missing in many of the checkers, and found information difficult to interpret. Half of the websites did not provide any severity ratings, while those that did demonstrated substantial variations [22]. Adam and Vang [22] recognized a lack of validation of the grading scales applied. They also recognized the lack of patient involvement, pointing to practical problems of involving patients with their actual drug regimens. In conclusion, they found that further research is necessary to improve patient-oriented drug–drug interaction information systems [22].
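The two readability measures named above follow standard published formulas based on average sentence length and average syllables per word. The sketch below shows those formulas only; the function names are hypothetical, the word, sentence, and syllable counts are assumed to be supplied by the caller, and nothing here reproduces how Adam and Vang obtained their counts.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level: approximate US school grade of the text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Illustrative counts only (not data from any cited study):
# a 100-word passage with 5 sentences and 150 syllables.
ease = flesch_reading_ease(100, 5, 150)
grade = flesch_kincaid_grade(100, 5, 150)
```

Both measures depend only on surface features of the text, which is one reason they cannot capture difficulties such as unexplained acronyms.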
Highlighting heuristic evaluations for their cost-efficiency, Reolon et al. [13] investigated the state of the art of heuristics-based usability studies in mHealth. They argued the need for adapting sets of heuristics both to the mobile platform and to the domain of healthcare, and aimed to investigate to what extent such adaptations have been made. They conducted a systematic literature review of articles published between 2007 and 2014 mentioning heuristic evaluation of mHealth applications. Applying their selection criteria to title, keywords and abstract, and then to the full texts, reduced the results to only seven articles, showing a need for more research on the topic. Contrary to expectations, the results showed studies predominantly applying traditional heuristics, such as those of Nielsen [23]. Four out of seven studies applied them as-is or only lightly modified. Only one study applied heuristics adapted specifically for the mHealth context [24], while another two applied Nielsen’s heuristics adapted for the mobile context [25]. Analyses of the heuristics applied showed considerable similarities converging on Nielsen’s heuristics, suggesting these to be sufficiently generic to cover usability independently of device or application. At the same time, a tendency was identified towards supplementing Nielsen’s heuristics with domain-specific or platform-specific ones [13].
3. Materials and Methods
To attain reliable and valid results, we referred to frameworks for planning and conducting usability studies [26]. The scope of this paper was to investigate user interface-related usability as experienced by a general public of laypersons. Following from the research objective, our aim was not to assess drug interaction checkers according to any benchmark or predefined goal, nor to generate any kind of ranking. Rather, we wished to provide an explorative account indicating actionable areas for improvement. Even if the checkers evaluated were publicly released, the market as a whole was considered immature. Evaluations were as such formative, with the goal of driving design improvement. Issue-based metrics, meaning qualitative and quantitative data based on the detection of usability problems, were applied to support problem discovery and comparison of alternative designs [26]. Performance metrics and self-reported metrics were ruled out because these depend on user testing. User testing across multiple systems would require significant recruitment to avoid the risk of learning effects. Involving representative users with their personal drug regimens would also pose both practical and ethical challenges beyond the scope of this paper.
As is commonly recommended in usability studies, we chose a mixed-methods approach, collecting both quantitative and qualitative data based on heuristic evaluations. These were integrated through analyses for triangulation and enhanced understanding of usability problems faced by users [29]. The specific design chosen was that of convergent mixed methods, gathering both qualitative and quantitative data for independent analysis and subsequent side-by-side interpretation. Single-case analyses were conducted to identify themes of usability issues within each checker, and cross-case analyses were done to shed light on themes common to the Scandinavian context of drug interaction checkers. Both served the purpose of generating actionable insights for developers and decision makers. The study design is illustrated in Figure 1.
3.1. Sampling of Drug Interaction Checkers
We performed a sampling to answer the first research question of which interaction checkers are available to the Scandinavian public today. To identify publicly available drug information systems likely to be encountered by patients, we identified the current leading health category websites for each of Denmark, Norway, and Sweden. These were used to find popular matches using the Similarsites.com website similarity service [30]. Websites providing a match of more than 80% similarity to the sites based on category, content, word usage, and search terms were selected. The results were supplemented by websites suggested by clinical practitioners in these three countries. We then applied several selection criteria. The drug interaction checkers had to be freely available to the public and had to be aimed at Scandinavian audiences. To determine which websites met the criteria, we examined each website’s main page for information about drug interactions. We then queried the website’s internal search for the terms “interaction” and “drug interaction” in the local language. Results were then reviewed for resources on drug–drug interactions.
3.2. Heuristic Evaluations
Having identified available drug interaction checkers, we evaluated these according to the method of heuristic evaluation [28]. To accommodate different task flows across checkers, we defined tasks in terms of end-goals rather than specified steps. The three tasks applied are presented in Table 1.
The drug pairings applied for the tasks were adapted from Adam and Vang [22], who identified five clinically significant drug–drug interaction pairs with a broad coverage of key therapeutic areas. We consulted a medical practitioner to confirm the relevance of the drug pairings for Scandinavian contexts, as well as to provide an additional, clinically non-significant pairing to allow testing of use cases with non-interacting drugs. Task 1 was to be performed with each of the drug pairings. Task 2 was performed with only the drugs from the first column (Drug X). Task 3 was performed using all drugs in combination, as well as using only Pair 6 to test the scenario of no interaction. There were thus six variations each for Tasks 1 and 2, and two variations for Task 3, adding up to 14 tasks. The pairings are presented in Table 2.
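The count of 14 task variations follows directly from the description above and can be checked with a short enumeration. This is an illustrative sketch: the pair labels are placeholders for the pairings in the paper's Table 2, and the function name is hypothetical.

```python
from collections import Counter

# Placeholder labels; the actual drug pairings are those listed in Table 2.
PAIRS = [f"Pair {i}" for i in range(1, 7)]


def build_task_variations(pairs):
    """Enumerate (task, input) variations as described in the text."""
    variations = []
    # Task 1: performed once with each drug pairing.
    variations += [("Task 1", pair) for pair in pairs]
    # Task 2: performed with only the first-column drug (Drug X) of each pair.
    variations += [("Task 2", f"Drug X of {pair}") for pair in pairs]
    # Task 3: all drugs combined, plus the non-interacting pair on its own.
    variations.append(("Task 3", "all drugs in combination"))
    variations.append(("Task 3", f"{pairs[-1]} only (no interaction)"))
    return variations


variations = build_task_variations(PAIRS)
per_task = Counter(task for task, _ in variations)
```

Here `per_task` holds six variations each for Tasks 1 and 2 and two for Task 3, so `len(variations)` is 14.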
Although tailoring the set of heuristics applied is often recommended [27], we found the mHealth heuristics developed by Monkman and Kushniruk [24] too restrictive and specific for efficient application as mnemonic devices. We preferred instead the heuristics of Nielsen presented in Table 3.
The drug interaction checkers were evaluated independently by each evaluator in randomized order. The tasks were executed while inspecting the interface step by step. Evaluations were conducted using the personal smartphones of the evaluators, one of which was an iPhone XS, and the other an iPhone XR. One evaluator used the native Safari browser, while the other used the Chrome browser. Sessions were limited to a daily maximum of four hours to prevent results being affected by fatigue. Usability issues were recorded in a table, along with screenshots identifying the location of the issue. Severity ratings were given upon completion of all tasks for a given system, rather than issue by issue, to promote coherent ratings. The scheme for severity ratings followed Nielsen [32], as described in Table 4.
Upon having evaluated each system by each task, the evaluators independently reviewed the data across all systems to ensure consistency of records, mappings to heuristics, and severity ratings. By way of researcher triangulation, the evaluators then met over two sessions to discuss the issues identified to consolidate duplicates and to come to agreement on description, severity ratings and mappings to heuristics for each issue. An explicit evaluation protocol was shared between the evaluators to be kept at hand during evaluations in order to ensure uniform procedures. The evaluation protocol included the set of heuristics applied along with a short description of these as well as a table for recording usability issues according to predefined variables such as system name, task, description, applicable heuristics, screenshot, and severity assessment.
3.3. Quantitative Analysis
Initial analyses were performed using quantitative descriptive statistics for detection of patterns in the issue-based data. The analyses were based on the number of issues and their association with heuristics. We calculated the number and proportion of issues by severity rating for each system individually, as well as across all systems. The number and proportion of issues by heuristic applied were likewise calculated for single-case and cross-case analysis. Finally, the likelihood of an evaluator detecting an individual issue was calculated as a means of indicating the reliability of the findings. We used bar chart visualizations to facilitate identification of emergent patterns in the data.
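The descriptive statistics described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the record layout, function names, and sample data are hypothetical, and the detection likelihood is assumed here to be the mean, over evaluators, of the share of consolidated issues each evaluator found.

```python
from collections import Counter


def severity_distribution(issues):
    """Return {severity: (count, proportion)} for a list of issue records."""
    if not issues:
        raise ValueError("at least one issue is required")
    counts = Counter(issue["severity"] for issue in issues)
    total = len(issues)
    return {sev: (n, n / total) for sev, n in sorted(counts.items())}


def mean_detection_likelihood(issues):
    """Mean share of the consolidated issues found by a single evaluator.

    Assumed definition for illustration; the paper does not give a formula.
    """
    if not issues:
        raise ValueError("at least one issue is required")
    evaluators = set().union(*(issue["found_by"] for issue in issues))
    if not evaluators:
        raise ValueError("no evaluator recorded for any issue")
    total = len(issues)
    shares = [
        sum(e in issue["found_by"] for issue in issues) / total
        for e in evaluators
    ]
    return sum(shares) / len(shares)


# Made-up records for illustration only (not data from the study).
ILLUSTRATIVE_ISSUES = [
    {"id": 1, "severity": 1, "found_by": {"A", "B"}},
    {"id": 2, "severity": 1, "found_by": {"A"}},
    {"id": 3, "severity": 2, "found_by": {"B"}},
    {"id": 4, "severity": 3, "found_by": {"A"}},
]

distribution = severity_distribution(ILLUSTRATIVE_ISSUES)
likelihood = mean_detection_likelihood(ILLUSTRATIVE_ISSUES)
```

The same two functions apply unchanged to single-case analysis (issues of one system) and cross-case analysis (issues of all systems pooled), which keeps the figures comparable.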
3.4. Qualitative Analysis
The data were prepared through cleaning, coding, and sorting usability issues for qualitative analyses. We coded issues by reading through all issues pertaining to each system, before reviewing them again to categorize them according to the heuristic applied, as well as inductively by theme. Themes were finally revised and merged, if necessary. Emergent patterns were identified, and results were compiled as a narrative usability review for each system. Upon completion of single-case analyses, we performed cross-case analyses by reviewing the findings of all systems. This was done once for immersion and a second time for inductive coding and thematization. Categories were revised for merging and emergent patterns across systems were identified. As with single-case analyses, results were compiled as a cross-case narrative review.
Allowing patients to act as decision-makers regarding their own health requires empowering them with information. The systems conveying this information must be usable in order for empowerment to take place. In this paper, we identified six drug interaction checkers publicly available to Scandinavian audiences: three in Norway, two in Denmark, and one in Sweden. The Norwegian checkers all utilized the same professionally oriented database. The two Danish checkers, on the other hand, utilized the same interface to serve content from different databases—one professionally oriented and one patient-oriented. The Swedish checker served only professionally oriented content and did so using multiple topically oriented databases. The service providers were national authorities, drug industry associations and private eHealth businesses. All of them served content from national authoritative sources, however. All except one of the checkers primarily targeted professional audiences as a means of clinical decision support, while at the same time allowing public access. Only Medicinkombination.dk targeted patient audiences with highly readable patient-oriented content. This constitutes an important contribution to patient-oriented healthcare in Denmark.
Although this paper has not aimed to investigate content specifically, it was noted that checkers targeting professional audiences frequently used acronyms and technical terminology. Acronyms are particularly difficult for laypersons to interpret, as web searches for their meaning will often yield incorrect or poorly targeted results. Content will consequently not only be hard to read, but sometimes even inaccessible. As Nörby et al. [12] have shown, this does not stop patients from using these checkers, but it may cause anxiety for some users. It also undermines the benefits of empowering and engaging patients in regard to their health.
The heuristic evaluations showed all of the checkers applying patterns of progressive disclosure, from summarized listings of drug interactions to detailed descriptions and even external links for further reading. This seems an appropriate pattern to preserve in order to accommodate patients’ varying attitudes towards supplementary drug information, as identified by Kusch, Haefeli and Seidling [8]. Although the content of Medicinkombination.dk was very readable, it was also very brief. This may represent a cautious approach to providing content without causing anxiety. It may, however, also limit the positive effects of improved communication with professionals and improved compliance, as suggested by Nörby et al. [12].
6.1. Prevalence and Characteristics of Usability Issues
In terms of usability issues and their characteristics, a large number of issues were detected across all of the drug interaction checkers. These were predominantly minor issues, but a considerable number of major issues were found in all but one of the checkers. Numbers of issues cannot be compared across studies, but we regard these as high numbers in view of the limited extent of the tasks applied. Qualitative analyses identified a positive correlation between the number of usability issues and the complexity of a checker. Care should be taken not to interpret this as a causal relation justifying more usability issues as a tradeoff for providing functionality. One would rather expect the basics to precede the extras. Considering their relation to basic design principles, an argument could be made that the high number of usability issues may indicate lack of systematic user-centered development. It could also result from losing track of user goals over time through incremental addition of features. Addition of features will at some point warrant redesign of a system, but limited resources may instead lead to shortsighted solutions being implemented. These can then accumulate to reduce usability and system quality. We suggest these as the most likely culprits worth examining for those drug interaction checkers identified in this paper with high numbers of usability issues.
Analyses, moreover, showed a very low degree of mobile-adaptive design. Three out of five of the checkers were essentially non-supportive of mobile devices. Adaptive design may have been a premium in the past, but is today an essential component of system quality for the Web. The proportion of mobile users on the Web is steadily increasing, at this point matching desktop users in Sweden [33]. Intermittent tasks, such as checking drug interactions, lend themselves in particular to mobile use, making it readily apparent that service providers cannot afford to ignore this segment of users. Lack of adaptive design forces users to handle zooming and repositioning of the viewport. This causes a constant strain on efficiency, effectiveness, and satisfaction of use, rapidly convincing users to leave and never return.
Many of the checkers also showed poor leverage of basic design principles such as visual and typographic hierarchies to guide users through the interface. Minimalist and aesthetic design uses consistency as a baseline, and highlights important information by purposefully breaking this consistency. This is a way of guiding users through complex information environments. Lack of consistency generates noise, and this noise prevents anything from standing out to the user. Usability issues relating to a lack of minimalist and aesthetic design thus cause unnecessary cognitive effort to be spent on interpreting the interface rather than on attaining user goals. This leads to user dissatisfaction.
Supporting users through search suggestions, thesauri and means of managing search results was also identified as particularly important for this context. All but one of the checkers already provided search suggestions. In the only case where this was not provided, we noted this as a major usability issue in view of users having to type unfamiliar and hard-to-spell drug names without any help. Without search suggestions, users would also not get early feedback that their search terms would not be recognized, and would as such not yield any results. Our findings support Adam and Vang’s [22] inclusion of search suggestions (medication pick list) as a criterion for patient usability. A natural extension of helping users type their intended queries is to help them attain the intended results even when they provide inaccurate input. In this respect we noted a lack of thesauri in all of the systems. Thesauri would be particularly useful due to the need for spelling unfamiliar words with multiple plausible spellings. As an example, the term “thyroxine” may reasonably be spelled as “thyroxin”, “tyroxin”, or “tyroksin” in Scandinavian languages. In addition, it would be reasonable to accept English spellings as well as popular terms such as “vitamin d” for technical terms such as “kolekalsiferol”. Thesauri were not considered by Adam and Vang [22]. These means of user support are particularly important as they affect not only long-term satisfaction, but even short-term ability to retrieve information at all.
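A thesaurus of the kind discussed above can be as simple as a lookup table that maps spelling variants and popular terms to the term used in the database. The sketch below is hypothetical: the table contains only the examples from the text, the choice of “thyroxine” as the canonical entry is an assumption, and none of the evaluated checkers is claimed to work this way.

```python
# Variant -> canonical search term. Entries are the examples from the text;
# which spelling a given database treats as canonical is an assumption here.
THESAURUS = {
    "thyroxine": "thyroxine",
    "thyroxin": "thyroxine",
    "tyroxin": "thyroxine",
    "tyroksin": "thyroxine",
    "vitamin d": "kolekalsiferol",
}


def normalize_query(query: str) -> str:
    """Map a user's search term to its canonical form, if one is known."""
    # Ignore case and collapse stray whitespace before the lookup.
    key = " ".join(query.casefold().split())
    # Unknown terms pass through unchanged so the normal search still runs.
    return THESAURUS.get(key, key)
```

With this table, `normalize_query("  Tyroksin ")` yields `"thyroxine"` and `normalize_query("Vitamin D")` yields `"kolekalsiferol"`, so inaccurate but plausible input still reaches the intended results.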
In terms of single-case analyses, nearly twice the mean number of usability issues were identified for the Swedish checker Janusmed. Even catastrophic issues were identified, related to basic design principles such as providing appropriate feedback to users. The structure of the service was especially complex and incoherent. This may suggest organic growth insufficiently supported by goal-oriented and user-centered approaches. Janusmed did, on the other hand, provide more utility in terms of both content and functionality, employing promising interaction idioms for constructing queries. The Norwegian Interaksjoner.no took a minimalist approach, providing core functionality accompanied by clear and coherent feedback and visual hierarchy. This resulted in a particularly low number of usability issues. It failed, however, to provide patients with much-needed support in spelling difficult drug names, resulting in major usability issues in that area. Felleskatalogen.no represented a middle ground of functionality and coherence. It was one of only two services providing mobile-adaptive design. It also provided the necessary user support, resulting in an average number of usability issues, with a notable distribution towards the lower end of the severity spectrum. Legemiddelsok.no had particular issues with adaptive design, causing excessive viewport handling. The most notable trend beyond that was a complete lack of basic information welcoming users, explaining the purpose of the service, or explaining how to use it. Medicinkombination.dk distinguished itself by providing highly accessible patient-oriented content. Its complete lack of adaptive design did not, however, help in providing accessibility for public audiences on mobile devices. A catastrophic issue was also identified where users might be presented with results for the previous query rather than the current one. This could potentially lead to serious medical error.
6.2. Practical Implications
Our findings indicate the need for developing patient-oriented drug information databases where these are currently unavailable. They also show the importance of making these databases accessible to patients through usable and user-centered interfaces. This echoes recommendations by Kusch, Haefeli, and Seidling [8
]. In this paper, we also identify a cost-efficient example of how this can be done, namely Medicinkombination.dk. Once such databases have been developed, they can be implemented through technical frameworks already available. As shown by the Norwegian FEST database, such databases may be served openly through multiple providers. They may also be delivered to mobile applications or other innovative technologies. Beyond this, the significance of our findings lies in bringing awareness to usability problems preventing patients from attaining goals in terms of health needs and quality of life. We have done so by specifying the problems and the drug interaction checkers to which they belong, in order to help decision-makers appreciate the importance of addressing these needs as well as to facilitate action. Single-case and cross-case analyses identify themes of issues within and across the checkers evaluated. These themes inform more general areas of focus for development. They may also indicate organizational or methodological issues causing these themes to occur, such as lack of user-centered approaches. Once recognized, such issues may be addressed.
6.3. Reliability and Validity
We have sought to ensure the reliability of the study by making the procedures for data collection, analysis and interpretation transparent. In terms of heuristic evaluations, there is an ongoing discussion about the number of evaluators needed for reliable results. Some suggest five evaluators are enough to detect 80% of all usability issues, while others claim this is not nearly enough [26
]. Nielsen [35
] and Molich [36
] appear to agree, however, that five evaluators are sufficient to drive useful design iteration. Nielsen adds to this that more evaluations with fewer evaluators are more efficient than fewer evaluations with more evaluators [35
]. The circumstances of this study did not permit external recruitment, and evaluations were performed by the paper’s first two authors. Both were in their last term of a three-year Bachelor program in Interaction Design. Both additionally had more than four years of professional experience with user interface design. While this number of evaluators is lower than what is commonly recommended, it is not without precedent [37].
Calculations were performed to estimate the reliability of the findings. The number of issues detected by each evaluator was divided by the total number of issues detected in the study. This individual detection rate was then averaged across the two evaluators to give a mean probability of detection. This probability was used as a basis for estimating the proportion of usability issues likely detected for the systems [26
] (pp. 116–117). The calculations indicated an individual detection rate of 65%, suggesting a total coverage of 87% of all usability issues for the systems. This considerably exceeds Nielsen and Landauer’s [34
] estimates of about 50% coverage by two evaluators, and also exceeds the 80% commonly sought in usability evaluations. We suggest that this high reliability is a result of the limited scope of tasks applied across the five systems evaluated.
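The calculation described above can be illustrated with a minimal Python sketch. It assumes that the coverage estimate follows the commonly used model 1 - (1 - p)^n for n independent evaluators with mean detection probability p; the paper cites the procedure but does not spell out the formula, so this is an assumption. The function names and the issue counts are hypothetical and were chosen only so that the mean rate matches the reported 65%.

```python
def mean_detection_rate(issues_per_evaluator, total_issues):
    """Average, across evaluators, of issues found divided by total unique issues."""
    if total_issues <= 0:
        raise ValueError("total_issues must be positive")
    rates = [found / total_issues for found in issues_per_evaluator]
    return sum(rates) / len(rates)


def estimated_coverage(detection_rate, n_evaluators):
    """Assumed model: share of issues found by n independent evaluators, 1 - (1 - p)^n."""
    return 1 - (1 - detection_rate) ** n_evaluators


# Hypothetical counts (not taken from the study): 66 and 64 issues out of 100 in total.
p = mean_detection_rate([66, 64], total_issues=100)
coverage = estimated_coverage(p, n_evaluators=2)
print(f"mean detection rate: {p:.2%}, estimated coverage: {coverage:.2%}")
```

With a detection rate of exactly 0.65, this model gives a coverage of about 0.8775. The reported figure of 87% presumably reflects an unrounded detection rate slightly below 65%, or truncation rather than rounding.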
Other measures taken to ensure reliability were triangulation of qualitative and quantitative data sources, as well as having the researchers work independently and check each other’s findings for researcher triangulation. Transparency of procedures was sought through an accurate description of the methodology, and of the reasons for choosing it, in the methodology section.
We have sought to ensure validity by providing rich, thick description of the findings, as well as by triangulating data sources. Usability is defined as contingent upon users’ contextualized and lived experience. Expert-based methods such as heuristic evaluations carry an inherent risk of disconnect from this experience. On the other hand, Molich [36
] found that usability issues identified through heuristic evaluations rarely turn out to be false positives. Nonetheless, the evaluations must be understood as conditioned by presumptions about behaviors of the target audience. These were based on general populations rather than grounded in user-centered research. Audiences for these drug interaction checkers are known to be broad, but also to contain higher proportions of elderly users and other segments with particular need for drug information [11
]. Potential variations in behavior based on this factor have not been accounted for in this paper. Addressing this could be done through triangulation by contextually situated and user-centered methods, such as usability testing with targeted participants or involving targeted participants as lead users. Presumptions about the audience also affected severity assessments in terms of the anticipated frequency of occurrence. This could potentially affect their validity. The absence of user account functionality might for example have been a more prominent theme, had evaluations proceeded from an assumption of regularly returning users.
Adam and Vang [22
] assessed patient usability in terms of presence or absence of alert icons, color coding, severity rating scale, medication pick list and the ability to save a user profile. Whereas their grading scale would have awarded the checkers sampled in this paper full scores on the factors of alert icons, color coding, and severity rating scales, our heuristic evaluations identified even catastrophic usability issues arising from the very same features. We found that usability requires not only medication pick lists, but also handling of alternate spellings and misspellings. We also found that requiring a login to a user profile can be detrimental to usability depending on assumptions about the target audience and the specific implementation. This highlights limitations of the approach chosen by Adam and Vang [22
] compared to the use of heuristic evaluations to show a more nuanced picture of usability.
This paper applied issue-based metrics for their potential for open-ended exploration of usability within and across multiple services available in the Scandinavian market. They were also chosen for their capacity to inform actionable insights. The exclusion of performance metrics and self-reported satisfaction limits the extent to which overall usability could be assessed and quantified. This was a choice made for delimiting the scope of the study, eliminating the need for extensive recruitment and user testing.
Although we have adopted a patient-oriented perspective, findings were rarely found to pertain to patients to the exclusion of professionals. Even content-related findings, such as use of acronyms and technical terminology, could be argued to be relevant to professionals in training, or to those less familiar with certain medical topics. A patient-oriented approach, as such, might be regarded as a kind of lead user approach, highlighting overlooked needs whose resolution nonetheless benefits primary audiences.
This paper has explored the availability of patient-oriented drug interaction checkers in the Scandinavian countries of Denmark, Norway, and Sweden. It has also explored the prevalence and characteristics of usability problems in these checkers which prevent patients from benefiting from them. In our study, we identified six such checkers: three Norwegian, two Danish, and one Swedish. The service providers were national authorities, drug industry associations and privately owned eHealth businesses. All of the checkers served content from national authoritative sources. All except one of the checkers primarily targeted professional audiences for clinical decision support, while at the same time allowing public access. Only Medicinkombination.dk targeted patient audiences with patient-oriented and highly readable content.
A large number of usability issues were detected across all of the drug interaction checkers evaluated, and a considerable number of major issues were identified in all but one of the checkers. Three catastrophic usability issues were detected, each deemed catastrophic for its potential to lead to serious medical errors. Although the numbers of usability issues cannot be compared across studies, these should be considered high in view of the limited extent of the tasks. Catastrophic issues signify issues unacceptable in publicly released systems, but the large number of minor issues also suggests insufficient systematic development and testing.
The qualitative analyses identified four cross-case themes of usability issues: there was a notable lack of mobile-adaptive design, there was a general lack of patient-oriented content, there was a general lack of adherence to basic design principles, and there was a clear positive correlation between system complexity and number of usability issues.
Our findings show a beginning towards accommodating patient needs, but also show Scandinavian audiences faced with a limited number of drug interaction checkers primarily targeting medical professionals. These are known to be used by patients for their utility but fail to accommodate them in terms of information and system quality. Empowering patients to participate in decision-making affecting their personal health and quality of life calls for developing patient-oriented drug information databases where these do not currently exist. These need to be presented through clear, coherent and supportive user interfaces, acknowledging the behaviors of the patients interacting with them. Developers and decision-makers should base future development initiatives on systematic goal-oriented and user-centered design approaches and research, to ensure that drug interaction checkers inform patients according to their own needs and abilities rather than implementation-centric or arbitrarily emergent mental models.
In accordance with the findings of this paper, we suggest that further research should focus on exploring patient needs and behaviors in accessing drug-related information. Whereas Kusch, Haefeli and Seidling [8
] approach this in terms of topics desired by patients, future research should focus more on qualitative accounts of how this information is actually accessed and used. This would provide strong foundations for future development of patient-oriented information databases. It could also inform the development of mobile applications taking advantage of the individualization, mobility and multi-functionality of mobile devices. We also suggest future studies involving representative users to validate and complement the findings of both this paper and previous research on this topic.