Usability Evaluation Ecological Validity: Is More Always Better?
Abstract
1. Introduction
2. Models of Ecological Validity
2.1. User Roles
2.2. Environment
2.3. Training
2.4. Test Scenario
2.5. Patient Involvement
2.6. Software
2.7. Hardware
3. Materials and Methods
3.1. Health Information Technology
3.2. Risk Analysis to Inform Scenario Development
3.3. Participants
3.4. Study Design and Test Conditions
3.5. Measurements
3.6. Procedures
3.7. Statistical Analyses
3.8. Ethical Considerations
4. Results
5. Discussion
5.1. Principal Findings
5.2. Comparison to the Literature and Implications of Findings
5.3. Strengths and Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Type No. | Error Type | Description of the Error | Level of Severity
---|---|---|---
1 | ANI-ECG * confusion | The participant confuses the ECG with the ANI | Mild
2 | No detection of overdosage | The participant does not recognize that an ANIm value over 80 for an unconscious patient represents a medication overdose | Moderate
3 | Focus on ANI | The participant uses only the ANI index and neglects other data sources to evaluate the patient's discomfort level | Moderate
4 | Considering poor-quality data | The participant does not consider the quality of signal acquisition and bases her/his decision on poor-quality data | Moderate
5 | Considering out-of-date data | The participant does not reset the ECG signal and bases her/his decision on out-of-date or erroneous data | Moderate
6 | High ANI misunderstanding | The participant erroneously interprets the meaning of a high ANI on the screen | Severe
7 | Low ANI misunderstanding | The participant erroneously interprets the meaning of a low ANI on the screen | Severe
8 | Considering other patient data | The participant does not reset the values from the previous patient and bases her/his decisions on erroneous data | Severe

* ANI: Analgesia Nociception Index; ECG: electrocardiogram.
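For readers planning a similar summative test, the taxonomy above translates directly into a small data structure for logging and tallying observations. The following is a minimal sketch, not taken from the study; the types, names, and the `tally_by_severity` helper are all illustrative.

```python
# Minimal sketch (not from the study): encoding the eight predefined use
# errors and their severity levels so that errors logged during test
# sessions can be tallied automatically. All names are illustrative.
from dataclasses import dataclass
from enum import Enum

class Severity(Enum):
    MILD = "mild"
    MODERATE = "moderate"
    SEVERE = "severe"

@dataclass(frozen=True)
class UseError:
    type_no: int
    label: str
    severity: Severity

ERROR_CATALOG = {e.type_no: e for e in [
    UseError(1, "ANI-ECG confusion", Severity.MILD),
    UseError(2, "No detection of overdosage", Severity.MODERATE),
    UseError(3, "Focus on ANI", Severity.MODERATE),
    UseError(4, "Considering poor-quality data", Severity.MODERATE),
    UseError(5, "Considering out-of-date data", Severity.MODERATE),
    UseError(6, "High ANI misunderstanding", Severity.SEVERE),
    UseError(7, "Low ANI misunderstanding", Severity.SEVERE),
    UseError(8, "Considering other patient data", Severity.SEVERE),
]}

def tally_by_severity(observed_type_nos: list[int]) -> dict[Severity, int]:
    """Count observed use errors (given as type numbers) per severity level."""
    counts = {s: 0 for s in Severity}
    for type_no in observed_type_nos:
        counts[ERROR_CATALOG[type_no].severity] += 1
    return counts

# Example: a session in which a participant commits error types 2 and 7.
print(tally_by_severity([2, 7]))  # {MILD: 0, MODERATE: 1, SEVERE: 1}
```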
Ecological Validity Dimension | Low-Fidelity Condition | High-Fidelity Condition
---|---|---
1. User roles | Physicians and nurses specialized in intensive care. Participants had a minimum of two months of experience in intensive care and received training on the pain monitor two days before the test. | Same as the low-fidelity condition.
2. Environment | An administrative room without other medical equipment or devices. | A simulated resuscitation room similar to an actual resuscitation room. The room was equipped with furniture and real medical technology and devices (infusion pumps, an ECG * monitor, etc.) and mimicked a real resuscitation room in terms of temperature, ambient sounds, interruptive alarms, and disinfectant smell.
3. User training | All participants attended a training session at least two days before the test, matching current training protocols for new equipment. | Same as the low-fidelity condition.
4a. Scenarios, breadth | We did not set off monitor alarms to interrupt the participants. | Alarms from the monitors interrupted the participants, just as in real-life resuscitation rooms.
4b. Scenarios, depth | Five goal-based scenarios tested all eight identified use errors. Each scenario was performed twice, once in each condition. We furnished a summary of the patient's case, including a description of the patient (e.g., age, gender, conditions), the clinical course, and a list of medications taken. | Same as the low-fidelity condition.
4c. Scenarios, behavior | Participants were asked to verbalize how they would respond and what actions they would take. | Participants were asked to act on the mannequin as they would in real life.
5. Patient involvement | We did not include a patient or a representation of one (i.e., a mannequin). Instead, the test moderator described the patient's status, and we provided screenshots of the patient parameters required for medical decision-making. | We used a mannequin capable of reproducing physiologically realistic reactions of the human body.
6. Hardware | Participants were shown screenshots printed on paper and a video on a computer screen, with no possibility of interacting with the computer. | The ANI * pain monitor: an interactive screen framed by a plastic shell. Users interact by pressing buttons directly on the screen; depending on which buttons are pressed, parameters are modified or windows are opened on the interface.
7. Software | The test was primarily performed using screenshots of the pain monitor. Scenario 4 (testing error 8) required the participant to see blinking ANI curves, so a video of the blinking screen was shown instead of a screenshot. Participants could see screenshots of the data typically rendered on ECG, respiratory, and ANI pain monitors. | The test was performed with an actual ANI pain monitor. Participants could see the patient's data typically rendered on ECG, respiratory, and ANI pain monitors in a live setting.

* ANI: Analgesia Nociception Index; ECG: electrocardiogram.
Profile | Number (Females; Males) | Mean Age in Years (SD) |
---|---|---|
Anesthesiologists | 15 (9; 6) | 28.26 (2.54) |
Nurses | 15 (9; 6) | 31.93 (6.14) |
Total | 30 (18; 12) | 30.01 (5.05) |
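As a quick arithmetic check, the Total row can be approximately reconstructed from the two subgroup rows. The sketch below (ours, not the authors') computes the combined mean and SD; small differences from the published totals are expected because the subgroup statistics are themselves rounded.

```python
# Reconstructing the combined mean and SD (n = 30) from the subgroup rows
# above. Differences from the published Total row (30.01, SD 5.05) are
# expected because the subgroup means and SDs are rounded to two decimals.
import math

n1, m1, s1 = 15, 28.26, 2.54   # anesthesiologists
n2, m2, s2 = 15, 31.93, 6.14   # nurses

n = n1 + n2
mean = (n1 * m1 + n2 * m2) / n
# Combined sum of squares, treating each group's mean and sample SD as exact.
ss = (n1 - 1) * s1**2 + n1 * m1**2 + (n2 - 1) * s2**2 + n2 * m2**2
sd = math.sqrt((ss - n * mean**2) / (n - 1))
print(f"n = {n}, mean = {mean:.2f}, SD = {sd:.2f}")  # n = 30, mean = 30.10, SD = 4.98
```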
Type No. | Error Type | Description of the Error | Severity Level | Low-Fidelity, n (%) | High-Fidelity, n (%)
---|---|---|---|---|---
1 | ANI-ECG * confusion | The participant confuses the ECG with the ANI | Mild | 0 (0%) | 0 (0%)
2 | No detection of overdosage | The participant does not recognize that an ANIm value over 80 for an unconscious patient represents a medication overdose | Moderate | 4 (13%) | 2 (7%)
3 | Focus on ANI | The participant uses only the ANI index and neglects other data sources to evaluate the patient's discomfort level | Moderate | 0 (0%) | 0 (0%)
4 | Considering poor-quality data | The participant does not consider the quality of signal acquisition and bases her/his decision on poor-quality data | Moderate | 1 (3%) | 0 (0%)
5 | Considering out-of-date data | The participant does not reset the ECG signal and bases her/his decision on out-of-date or erroneous data | Moderate | 0 (0%) | 0 (0%)
6 | High ANI misunderstanding | The participant erroneously interprets the meaning of a high ANI on the screen | Severe | 3 (10%) | 1 (3%)
7 | Low ANI misunderstanding | The participant erroneously interprets the meaning of a low ANI on the screen | Severe | 6 (20%) | 7 (23%)
8 | Considering other patient data | The participant does not reset the values from the previous patient and bases her/his decisions on erroneous data | Severe | 3 (10%) | 4 (13%)

* ANI: Analgesia Nociception Index; ECG: electrocardiogram. Percentages are out of 30 participants per condition.
Error Severity | Low-Fidelity | High-Fidelity |
---|---|---|
Mild (1 possible error type) | 0 | 0 |
Moderate (4 possible error types) | 5 | 2 |
Severe (3 possible error types) | 12 | 12 |
Total (8 possible error types) | 17 | 14 |
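The authors' statistical methods are given in Section 3.7 and are not reproduced here. Purely to illustrate how such aggregate counts could be compared, the hedged sketch below runs Fisher's exact test on the condition totals, assuming each of the eight error types could occur at most once per participant (8 × 30 = 240 opportunities per condition). Because every participant completed both conditions, a paired analysis on the raw per-participant data would be more appropriate than this pooled comparison.

```python
# Illustration only (not the authors' analysis; see Section 3.7): comparing
# total error counts between conditions with Fisher's exact test, assuming
# 8 error types x 30 participants = 240 error opportunities per condition.
# This pooled 2x2 comparison ignores the paired, within-subject design.
from scipy.stats import fisher_exact

opportunities = 8 * 30                  # per condition
low_errors, high_errors = 17, 14        # totals from the table above

table = [
    [low_errors, opportunities - low_errors],    # low-fidelity: errors vs. none
    [high_errors, opportunities - high_errors],  # high-fidelity: errors vs. none
]
odds_ratio, p_value = fisher_exact(table)
print(f"OR = {odds_ratio:.2f}, p = {p_value:.3f}")
```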