HEUXIVA: A Set of Heuristics for Evaluating User eXperience with Voice Assistants
Abstract
1. Introduction
2. Background
2.1. Voice Assistants
- Effective Communication: The interaction between the user and the voice assistant is bidirectional, involving a continuous exchange of information and roles (sender and receiver) [13].
- Multi-user: The device can recognize the voice of other people, making it usable by everyone present in the same location [17].
- Security and Privacy: The device has a privacy policy that specifies what data is collected, why it is collected, and how the user can update, manage, export, and delete it.
- Multi-linkable: The device allows linking/integrating other devices and external services/apps and controlling their use, such as smart home devices (lights, appliances, temperature, air conditioning, etc.) and music services.
- Culturizable/Adaptable: The device recognizes/generates expressions and sets of words that cannot be deduced from the meanings of the words forming them, all according to the geographical location of the user.
- Voice Interface: The device provides the corresponding information through the voice interface.
- Guidance and Assistance: The device guides and assists the user with problems related to its use and installation/configuration.
2.2. User Experience
- Useful: The product must be useful and satisfy a user’s need.
- Usable: The system must be easy to use and quick to learn.
- Desirable: The design elements should be attractive and interesting to the user to cause appreciation and emotion.
- Findable/locatable: Information should be navigable and easy to find within and outside a system.
- Credible: The image of the company or system must be trustworthy.
- Valuable: The system must provide added value or contribute to satisfying the user’s needs.
- Accessible: The system must be able to adapt to users with some type of disability.
2.3. User Experience Evaluation
- Heuristic evaluation: A method in which a group of 3 to 5 evaluators analyze an interface, identifying positive and negative aspects according to a set of rules called heuristics [1].
- User testing (thinking aloud): A method involving representative users who navigate and interact with a system while performing predefined tasks, verbalizing their thoughts and actions aloud [28].
3. Related Work
4. Material and Methods
4.1. Methodology Applied for Developing HEUXIVA
4.2. First Iteration: Development Process for HVA
- 10 voice assistants’ features: effective communication, effective, activity management, customizable, multi-user, security and privacy, multi-linkable, culturizable/adaptable [9], voice interface, guidance and assistance.
- 3 usability attributes: effectiveness, efficiency, and satisfaction [35].
- Voice Assistant Usability Issues: formal inspection made in Stage 2, Infrequent Users’ Experiences of Intelligent Personal Assistants by Cowan [9].
- 12 heuristics need to be refined, mainly by improving their checklists and definitions.
- A new heuristic needs to be added to cover the user response aspects.
- It was decided to carry out a second iteration repeating the last three steps of the methodology.
4.3. Second Iteration: Development Process for HEUXIVA
5. Results
5.1. Results Obtained in Iteration 1: Validation Through Heuristic Evaluation
- Number of correct and incorrect associations of problems to heuristics
- Number of usability/UX problems identified
- Number of specific usability/UX problems identified
- Number of identified usability/UX problems that qualify as more severe (how catastrophic the usability/UX problem detected is)
- Number of identified usability/UX problems that qualify as more critical (how severe and frequent the problem detected is)
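The five criteria above are counts over the problems reported in a heuristic evaluation, so they can be tallied mechanically once each problem is annotated. The following is a minimal sketch, not the authors' procedure: the `Problem` fields, the 0 to 4 rating scales, the definition of criticality as severity plus frequency, and the two cut-off values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    severity: int               # assumed 0-4 scale
    frequency: int              # assumed 0-4 scale
    specific: bool              # True if specific to voice assistants
    correctly_associated: bool  # True if linked to the right heuristic

    @property
    def criticality(self) -> int:
        # Assumption: criticality combines severity and frequency by addition.
        return self.severity + self.frequency


def summarize(problems, severe_above=2, critical_above=4):
    """Tally the five criteria; both thresholds are hypothetical."""
    total = len(problems)
    correct = sum(p.correctly_associated for p in problems)
    return {
        "total": total,
        "correct_associations": correct,
        "incorrect_associations": total - correct,
        "specific": sum(p.specific for p in problems),
        "severe": sum(p.severity > severe_above for p in problems),
        "critical": sum(p.criticality > critical_above for p in problems),
    }


# Made-up sample data, not results from the study.
sample = [
    Problem(severity=4, frequency=3, specific=True, correctly_associated=True),
    Problem(severity=2, frequency=1, specific=False, correctly_associated=True),
    Problem(severity=3, frequency=2, specific=True, correctly_associated=False),
]
summary = summarize(sample)
print(summary)
```

Running `summarize` once for the control group and once for the experimental group would give two comparable summaries per criterion.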
5.2. Results Obtained in Iteration 1: Validation Through Expert Judgment
5.3. Results Obtained in Iteration 2: Validation Through Expert Judgment
5.4. Results Obtained in Iteration 2: Validation Through User Testing
5.4.1. User Test Design
5.4.2. Participant Selection
5.4.3. Results Obtained
6. HEUXIVA: Heuristics for Evaluating User eXperience with Voice Assistants
7. Discussions
7.1. About the Results Obtained in Validation Stage (First and Second Iteration)
7.2. Comparative Analysis with Existing Heuristics and Evaluation Methods
7.3. Novel Contributions and Creation of New Heuristics
8. Limitations
9. Conclusions and Future Work
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Inputs, Outputs, and Activities for Each Step Performed in Iteration 1
| Step | Input | Output | Activities Performed | 
|---|---|---|---|
| Step 1: Exploratory Stage | - | ① Information about voice assistant devices (three definitions, ten features, and the necessity and taxonomy of voice assistants); ② one proposal for usability attributes and one proposal for UX attributes; and ③ five sets of related heuristics | Conduct a literature review about voice assistants (definitions and features), usability/UX attributes, related existing sets of usability/UX heuristics, and other relevant information. |
| Step 2: Experimental Stage | ① Information about voice assistant devices; ② one proposal for usability attributes and one proposal for UX attributes; and ③ five sets of related heuristics | ④ Voice assistant usability issues | Conduct a formal inspection made by two researchers. Identify usability issues during the formal inspection of the device. | 
| Step 3: Descriptive Stage | ① Information about voice assistant devices; ② one proposal for usability attributes and one proposal for UX attributes; and ③ five sets of related heuristics; ④ Voice assistant usability issues | ⑤ Selected information about voice assistants; ⑥ ten features of voice assistants; ⑦ three UX attributes from one proposal; ⑧ usability issues found; and ⑨ five selected sets of heuristics | Group all the information collected. Sort and prioritize the information using a three-level scale (3: highly important; 2: somewhat important; 1: not important). Select the relevant information to develop the set of heuristics. | 
| Step 4: Correlational Stage | ⑤ Selected information about voice assistants; ⑥ ten features of voice assistants; ⑦ three UX attributes from one proposal; ⑧ usability issues found; and ⑨ five selected sets of heuristics | ⑩ Matched all features, attributes, existing heuristics, and other related elements together | Match the ten voice assistant features with the three UX attributes, and the five sets of heuristics [10,11,31,33,37]; and the usability issues. | 
| Step 5: Selection Stage | ⑩ Matched features, attributes, existing heuristics, and other related elements | ⑪ Classified heuristics (1 heuristic to keep; 39 heuristics to adapt; and 11 heuristics to eliminate) | Review Nielsen heuristics [31], conversational agents heuristics [10], heuristics for evaluating chatbots [11], and the ergonomic criteria for voice user interfaces [37]. Determine what heuristics to: keep, adapt, and eliminate. | 
| Step 6: Specification Stage | ⑩ Matched features, attributes, existing heuristics, and other related elements; ⑪ Classified heuristics (1 heuristic to keep; 39 heuristics to adapt; and 11 heuristics to eliminate) | ⑫ Set of 12 voice assistant heuristics, HVA (first iteration) | Specify 12 UX heuristics for voice assistants (HVA), including: id, name, definition, explanation, voice assistant feature, examples, UX attribute, and related existing heuristics. |
| Step 7: Validation Stage | ⑫ Set of 12 voice assistant heuristics, HVA (first iteration) | ⑬ Heuristic evaluation results: effectiveness of HVA; ⑭ Expert judgment results (survey) | Perform a heuristic evaluation with six evaluators (three evaluators for the control group, and three evaluators for the experimental group). Perform a survey for experts to review the heuristics. |
| Step 8: Refinement Stage | ⑬ Heuristic evaluation results: effectiveness of HVA; ⑭ Expert judgment results (survey) | ⑮ Refining document: (1) 12 heuristics to refine, 1 heuristic to add; (2) repeat steps 5–8 | Document the improvements to be performed in the specification of HVA. It is decided to repeat stages 5–8. | 
Appendix B. Inputs, Outputs, and Activities for Each Step Performed in Iteration 2
| Step | Input | Output | Activities Performed | 
|---|---|---|---|
| Step 6: Specification Stage | ⑮ Refining document: 12 heuristics to refine, 1 heuristic to add; ⑩ Matched features, attributes, existing heuristics, and other related elements; ⑫ Set of 12 voice assistant heuristics, HVA (first iteration) | ① Set of 12 voice assistant heuristics, HEUXIVA (second iteration) | Refine the specification of the 13 UX heuristics for voice assistants (HVA), including: id, name, definition, explanation, voice assistant feature, examples, UX attribute, and related existing heuristics. |
| Step 7: Validation Stage | ① Set of 12 voice assistant heuristics, HEUXIVA (second iteration) | ② Heuristic evaluation results: effectiveness of HEUXIVA; ③ Expert judgment results (survey); ④ User test results | Perform a heuristic evaluation with X evaluators (X evaluators for the control group, and X evaluators for the experimental group). Perform a survey for eight experts to review the heuristics. Perform a thinking-aloud test to evaluate a case study with twelve users. |
| Step 8: Refinement Stage | ① Set of 12 voice assistant heuristics, HEUXIVA (second iteration); ② Heuristic evaluation results: effectiveness of HEUXIVA; ③ Expert judgment results (survey); ④ User test results | ⑤ Set of 13 voice assistant heuristics, HEUXIVA (second iteration) | Refine and improve the final specification of 13 UX heuristics for voice assistants (HEUXIVA). |
Appendix C. Set of Heuristics for Voice Assistants Developed at Each Iteration
| First Iteration (HVA) | Second Iteration (HEUXIVA) | 
|---|---|
| HVA1: System Status Visibility | HEUXIVA1: System Status Visibility | 
| HVA2: Feedback and Help Users Prevent Errors | HEUXIVA2: System Guidance and Capabilities | 
| HVA3: Brevity and Relevance of Information | HEUXIVA3: Effective and Fluid Communication | 
| HVA4: Natural Communication | HEUXIVA4: Environment Match Between Assistant and User Language | 
| HVA5: Match Between the System and the Real World | HEUXIVA5: Information Accuracy | 
| HVA6: Consistent Voice Interface | HEUXIVA6: User Control and Freedom | 
| HVA7: User Control and Freedom | HEUXIVA7: Consistent Voice Interface | 
| HVA8: Flexibility and Personalization | HEUXIVA8: Voice Shortcuts, Flexibility and Personalization | 
| HVA9: Help Users Recognize, Diagnose, and Fix Errors | HEUXIVA9: Error Prevention | 
| HVA10: System Guidance and Capabilities | HEUXIVA10: Help Users Recognize, Diagnose, and Fix Errors | 
| HVA11: Reliability and Data Privacy | HEUXIVA11: Data Privacy | 
| HVA12: Guides and Documentation | HEUXIVA12: Voice Assistant Reliability | 
| - | HEUXIVA13: Guides and Documentation |
Appendix D. First Iteration, Step 2: “Experimental Stage”
| ID | Problem | Occurrence Example | Explanation (Why It Affects the User) | Severity | 
|---|---|---|---|---|
| P1 | Device ignores user | When the user makes a request, action and/or question, the device sometimes “wakes up” (performs the listening action) and then ignores the user without providing feedback as to why it will not perform the requested action. | When ignored, the user feels uncertain about what the problem is and why the device does not work. | 4 (catastrophic problem) |
| P2 | Difficulty initializing device | When connecting the device for the first time, the pairing process is difficult for the user because the device does not provide feedback until it is fully configured. Also, when it is reconnected in the same location, the device presents connection errors and must be manually reset. | When the device is difficult to initialize (the user’s first impression), the person is discouraged from using it. | 4 (catastrophic problem) |
| P3 | Lack of manual or instructions | There is no guide to reset the device. | Without a complete user guide or manual, the person must manually search the Internet for external explanations. | 3 (major problem) | 
| P4 | Device does not understand language and jargon | When the user expresses themselves using language and technical terms, the device ends the conversation early and/or says, “I’m sorry, I didn’t understand”. | Since the device does not know the language of the place in which it is located, the user has to change the way they speak, and the conversation does not flow. | 3 (major problem) |
| P5 | Device provides incoherent responses | When asking the device about certain topics (e.g., the user’s mood), it provides incoherent answers and changes the context of the conversation. | By providing incoherent and/or unrelated responses to the topic, the device generates uncertainty in the user about the device’s capabilities (the limits that the device has). | 3 (major problem) | 
| P6 | The device does not recognize the user’s voice when in a noisy environment | When the device is in a noisy environment (e.g., television on), it does not distinguish the user’s voice despite having voice recognition. | If the user cannot be detected by the device, the user must increase the volume of their voice, raise the pitch and/or turn off the device that is providing noise near the device. | 3 (major problem) | 
| P7 | Device does not provide useful information to user | When asking about the weather in the city of Punta Arenas, Chile, the device gives the weather in the city of Puntarenas, Costa Rica. | It is annoying for the user that it gives different results since it is supposed to know their location when connected to their home network and provide information accordingly. | 3 (major problem) | 
| P8 | The device has limited memory | When the user asks the device about a topic that was discussed less than 30 s ago, it does not remember what was discussed. | If the device does not remember what the user told it in the previous request, it gives the impression that the user is not being listened to and/or paid attention to. | 2 (minor problem) |
| P9 | The device does not have orientation on the volume up and down buttons | When trying to manually adjust the volume of the device, the user cannot tell which button increases and which decreases the volume. | The user may be confused as they must press the buttons at random to find out which button they wanted to select. | 2 (minor problem) |
| P10 | Device ends conversations prematurely | When interacting with the device, it stops talking after less than 1 s, causing the user to have to start the conversation again with the activation phrase. | As the device ends conversations at its discretion, it makes the user realize that they are talking to a machine/robot. | 2 (minor problem) | 
| P11 | Inconsistent language | When the device is playing music on Spotify and the user disconnects it using their phone, the device displays a message in English despite being set to Spanish. | A message in another language causes confusion for the user because they may not understand what the device is communicating. | 2 (minor problem) |
| P12 | The device does not understand search requests | When the user asks the device to “Search Barso”, it responds “Sorry, I didn’t understand”, even though the device can perform Google searches. | By not understanding search queries, the user may become uncertain about whether the device works or can be useful. | 2 (minor problem) | 
| P13 | Device does not manage voice pairings with external devices | The process of linking the device to external devices must be manual, using the mobile phone application (Google Home). | Since the action of managing links is not performed by voice, the user is forced to do them manually using the device’s mobile application. | 2 (minor problem) | 
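The severity ratings in the table above can be summarized with a simple tally. The sketch below transcribes the Severity column for P1 to P13 and counts the problems per level; the variable names are illustrative only.

```python
from collections import Counter

# Severity per problem, transcribed from the table above.
severity_by_problem = {
    "P1": 4, "P2": 4,
    "P3": 3, "P4": 3, "P5": 3, "P6": 3, "P7": 3,
    "P8": 2, "P9": 2, "P10": 2, "P11": 2, "P12": 2, "P13": 2,
}
labels = {4: "catastrophic", 3: "major", 2: "minor"}

# Count how many problems fall into each severity level.
counts = Counter(severity_by_problem.values())
for level in sorted(counts, reverse=True):
    print(f"{level} ({labels[level]} problem): {counts[level]}")
```

This gives 2 catastrophic, 5 major, and 6 minor problems, 13 in total.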
Appendix E. First Iteration, Step 3: “Descriptive Stage”
| Topic | Value According to Relevance: 3 (Highly Important) | Value According to Relevance: 2 (Somewhat Important) | Value According to Relevance: 1 (Not Important) | Explanation |
|---|---|---|---|---|
| Voice assistant information | Name and definition of voice assistant [4]; Name and definition of voice assistant [7]; Name and definition of voice assistant [8]; Need to create a UX evaluation method for voice assistants [12]. | Taxonomy of voice assistants [40] | - | The different definitions of voice assistants and the need to create a UX evaluation method for them were deemed highly relevant and their taxonomy was somewhat relevant. | 
| Voice assistant features | Effective Communication; Effective; Activity Management; Customizable; Multi-user; Security and Privacy; Multi-linkable; Culturizable/adaptable; Voice Interface; Guidance and Assistance | - | - | All features were considered highly relevant. | 
| UX attributes | Useful; Usable; Desirable; Findable/locatable; Credible; Valuable; Learning Capacity; Effectiveness; Efficiency; Satisfaction | - | Accessibility | Out of the three proposals for UX attributes collected in Stage 1, only Accessibility was not considered due to its complexity. | 
| Sets of heuristics | 11 R. Langevin’s heuristics [10]; 10 Nielsen’s heuristics [31] | 5 L. M. Sanchez-Adame’s heuristics [11]; 8 C. Nowacki and A. Gordeeva’s heuristics [37] | - | Two sets of heuristics were deemed highly important, and 3 sets were considered somewhat relevant. | 
| Usability/UX problems | Formal inspection by researchers (see Appendix D) | R. Cowan’s problems with the experience of people who use IPAs occasionally [9] | - | Two sets of usability/UX problems were considered relevant enough. | 
| Other related elements | - | Zwakman’s VUS questionnaire [33] | - | One related element was selected. | 
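The sort-and-select step summarized in the table above can be sketched as a filter over rated items. This is a minimal illustration, assuming that items rated 2 or 3 are kept and items rated 1 are dropped, which matches how Accessibility was excluded; the `prioritize` helper and the shortened item names are hypothetical.

```python
# Three-level relevance scale used in the Descriptive Stage.
RELEVANCE = {3: "highly important", 2: "somewhat important", 1: "not important"}

# A few rated items taken from the table above (names shortened).
collected = [
    ("Taxonomy of voice assistants", 2),
    ("Nielsen's heuristics", 3),
    ("Accessibility", 1),
    ("VUS questionnaire", 2),
    ("Voice assistant features", 3),
]


def prioritize(items, min_relevance=2):
    """Keep items at or above the cut-off, most relevant first.

    The cut-off of 2 is an assumption inferred from the table.
    """
    kept = [item for item in items if item[1] >= min_relevance]
    # sorted() is stable, so equally rated items keep their input order.
    return sorted(kept, key=lambda item: item[1], reverse=True)


selected = prioritize(collected)
for name, score in selected:
    print(f"{name}: {score} ({RELEVANCE[score]})")
```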
Appendix F. First Iteration, Step 4: “Correlational Stage”
| Feature | Usability/UX Attribute | Heuristic Related | Usability/UX Problems (Obtained from Formal Inspection and R. Cowan’s Problems [9]) | VUS Items |
|---|---|---|---|---|
| Effective communication | Effectiveness; Efficiency; Useful | H2: Context (partially covered feature); H3: Naturalness (partially covered feature); C1: Visibility of system status (slightly covered feature); C5: Error prevention (fully covered feature); C8: Aesthetic, minimalist and engaging design (partially covered feature); C9: Help users recognize, diagnose and recover from errors (fully covered feature); C10: Context preservation (partially covered feature); N1: Visibility of system status (slightly covered feature); N5: Error prevention (partially covered feature); N9: Help users recognize, diagnose, and recover from errors (slightly covered feature); E1.2: Immediate feedback (partially covered feature); E5: Error management (slightly covered feature); E5.2: Quality of error messages (partially covered feature); E7.1: Short and long-term memory (partially covered feature) | P1: Device ignores user; P5: Device provides incoherent responses; P10: Device ends conversations prematurely; P11: Inconsistent language | I1: I thought the response from the voice assistant was easy to understand |
| Effective | Effectiveness; Efficiency; Useful | H1: Complexity (slightly covered feature); H2: Context (slightly covered feature); C6: Help and guidance (partially covered feature); E5: Error management (slightly covered feature) | P7: Device does not provide useful information to user; P8: The device has limited memory; P12: The device does not understand search requests | I2: I thought the information provided by the voice assistant was not relevant to what I asked; I10: I found the voice assistant difficult to use |
| Activity management | Useful; Credible; Valuable; Satisfaction; Learning capacity | C3: User control and freedom (slightly covered feature); C7: Flexibility and efficiency of use (partially covered feature); N3: User control and freedom (slightly covered feature); N7: Flexibility and efficiency of use (slightly covered feature); E2.1: Brevity (slightly covered feature); E2.2: Information density (slightly covered feature); E3.1: Explicit user action (partially covered feature); E3.2: User control (slightly covered feature) | P2: Device ignores user; P3: Difficulty initializing device; PP: Trust issues when assigning activities to the device | I5: I felt the voice assistant enabled me to successfully complete my tasks when I required help; I7: The voice assistant had all the functions and capabilities that I expected it to have |
| Customizable | Satisfaction; Useful; Desirable | C7: Flexibility and efficiency of use (partially covered feature); N3: User control and freedom (slightly covered feature); N7: Flexibility and efficiency of use (slightly covered feature); E4.1: Flexibility (partially covered feature); E4.2: User’s experience level (partially covered feature); E7.2: Environment (partially covered feature); E8.2: Behavior (partially covered feature) | No associated problem found/detected | I6: I found it frustrating to use the voice assistant in a noisy and loud environment; I8: I found it difficult to customize the voice assistant according to my needs and preferences |
| Multi-user | Effectiveness; Useful | H2: Context (slightly covered feature); C10: Context preservation (partially covered feature); E4.3: Multi-user (partially covered feature) | P6: The device does not recognize the user’s voice when in a noisy environment | An associated item was not found/detected |
| Security and privacy | Credible; Satisfaction; Findable/locatable | C11: Trustworthiness (partially covered feature); E8.2: Behavior (slightly covered feature) | PP: Trust, data privacy, transparency and data ownership issues | An associated item was not found/detected |
| Multi-linkable | Useful; Valuable; Effectiveness | C9: Help users recognize, diagnose and recover from errors (slightly covered feature); N9: Help users recognize, diagnose and recover from errors (slightly covered feature) | P13: Device does not manage voice pairings with external devices; PP: Problems with integration with apps, platforms and systems | An associated item was not found/detected |
| Culturizable/adaptable | Efficiency; Satisfaction; Desirable | H2: Context (partially covered feature); H3: Naturalness (partially covered feature); C2: Match between system and the real world (partially covered feature); C4: Consistency and standards (partially covered feature); N2: Match between system and the real world (slightly covered feature); N4: Consistency and standards (partially covered feature); N8: Aesthetic and minimalist design (slightly covered feature); E4: Adaptability (slightly covered feature); E4.1: Flexibility (partially covered feature); E4.3: Multi-user (partially covered feature) | P4: Device does not understand idioms and jargon | An associated item was not found/detected |
| Voice interface | Effectiveness; Efficiency; Useful | H3: Naturalness (partially covered feature); C1: Visibility of system status (slightly covered feature); C6: Help and guidance (partially covered feature); N1: Visibility of system status (partially covered feature); E6: Consistency (slightly covered feature); E6.2: External consistency; E8.1: Identity | PP: Hands-free interaction support issues | An associated item was not found/detected |
| Guidance and assistance | Effectiveness; Useful; Valuable; Satisfaction; Findable/locatable | N10: Help and documentation (partially covered feature) | No associated problem found/detected | An associated item was not found/detected |
Appendix G. First Iteration, Step 5: “Selection Stage”
| ID | Name | Action | References | Voice Assistant Feature Covered | Applicability | 
|---|---|---|---|---|---|
| H1 | Complexity | Adapt | [11] | Effectiveness | (1) Useful | 
| H2 | Context | Adapt | [11] | Effective communication; Effectiveness; Multi-user; Culturizable/adaptable | (1) Useful | 
| H3 | Naturalness | Adapt | [11] | Effective communication; Culturizable/adaptable; Voice interface | (2) Important | 
| C1 | Visibility of system status | Adapt | [10] | Effective communication; Voice interface | (3) Critical | 
| C2 | Match between system and the real world | Adapt | [10] | Culturizable/adaptable | (2) Important | 
| C3 | User control and freedom | Adapt | [10] | Activity management | (3) Critical | 
| C4 | Consistency and standards | Adapt | [10] | Culturizable/adaptable | (2) Important | 
| C5 | Error prevention | Adapt | [10] | Effective communication | (3) Critical | 
| C6 | Help and guidance | Adapt | [10] | Effectiveness; Voice interface | (3) Critical | 
| C7 | Flexibility and efficiency of use | Adapt | [10] | Activity management; Customizable | (2) Important | 
| C8 | Aesthetic, minimalist and engaging design | Adapt | [10] | Effective communication | (3) Critical | 
| C9 | Help users recognize, diagnose and recover from errors | Adapt | [10] | Effective communication; Multi-linkable | (3) Critical | 
| C10 | Context preservation | Adapt | [10] | Effective communication; Multi-user | (3) Critical | 
| C11 | Trustworthiness | Adapt | [10] | Security and privacy | (3) Critical | 
| N1 | Visibility of system status | Adapt | [31] | Effective communication; Voice interface | (2) Important | 
| N3 | User control and freedom | Adapt | [31] | Activity management; Customizable | (3) Critical | 
| N4 | Consistency and standards | Adapt | [31] | Culturizable/adaptable | (2) Important | 
| N5 | Error prevention | Adapt | [31] | Effective communication | (2) Important | 
| N6 | Recognition rather than recall | Adapt | [31] | Activity management | (1) Useful | 
| N7 | Flexibility and efficiency of use | Adapt | [31] | Activity management; Customizable | (2) Important | 
| N8 | Aesthetic and minimalist design | Adapt | [31] | Culturizable/adaptable | (1) Useful | 
| N9 | Help users recognize, diagnose and recover from errors | Adapt | [31] | Effective communication; Multi-linkable | (2) Important | 
| N10 | Help and documentation | Adapt | [31] | Guidance and assistance | (3) Critical | 
| E1.2 | Immediate feedback | Adapt | [37] | Effective communication | (3) Critical | 
| E2.1 | Brevity | Adapt | [37] | Activity management | (3) Critical | 
| E2.2 | Information density | Adapt | [37] | Activity management | (3) Critical | 
| E3.1 | Explicit user action | Adapt | [37] | Activity management | (3) Critical | 
| E3.2 | User control | Adapt | [37] | Activity management | (3) Critical | 
| E4 | Adaptability | Adapt | [37] | Culturizable/adaptable | (2) Important | 
| E4.1 | Flexibility | Adapt | [37] | Customizable; Culturizable/adaptable | (1) Useful | 
| E4.2 | User’s experience level | Adapt | [37] | Customizable | (1) Useful | 
| E4.3 | Multi-user | Adapt | [37] | Multi-user; Culturizable/adaptable | (3) Critical | 
| E5 | Error management | Adapt | [37] | Effective communication; Effectiveness | (1) Useful | 
| E5.2 | Quality of error messages | Adapt | [37] | Effective communication | (1) Useful | 
| E6 | Consistency | Adapt | [37] | Voice interface | (1) Useful | 
| E6.2 | External consistency | Adapt | [37] | Voice interface | (1) Useful | 
| E7.1 | Short and long-term memory | Adapt | [37] | Effective communication | (1) Useful |
| E7.2 | Environment | Adapt | [37] | Customizable | (2) Important | 
| E8.1 | Identity | Adapt | [37] | Voice interface | (3) Critical | 
| E8.2 | Behavior | Adapt | [37] | Customizable; Security and privacy | (1) Useful | 
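As a quick consistency check on the table above, the listed heuristics can be grouped by the source set they come from. The sketch below transcribes the ID column and maps each ID prefix to the reference shown in the References column; the names `SOURCE_BY_PREFIX` and `per_source` are illustrative only.

```python
from collections import Counter

# ID prefix -> reference, as shown in the References column above.
SOURCE_BY_PREFIX = {"H": "[11]", "C": "[10]", "N": "[31]", "E": "[37]"}

# IDs transcribed from the table above.
heuristic_ids = (
    ["H1", "H2", "H3"]
    + [f"C{i}" for i in range(1, 12)]  # C1 to C11
    + ["N1", "N3", "N4", "N5", "N6", "N7", "N8", "N9", "N10"]
    + ["E1.2", "E2.1", "E2.2", "E3.1", "E3.2", "E4", "E4.1", "E4.2", "E4.3",
       "E5", "E5.2", "E6", "E6.2", "E7.1", "E7.2", "E8.1", "E8.2"]
)

# Count the listed heuristics per source set.
per_source = Counter(SOURCE_BY_PREFIX[hid[0]] for hid in heuristic_ids)
for source, count in per_source.items():
    print(f"{source}: {count} heuristics")
print("total:", len(heuristic_ids))
```

The table lists 40 heuristics in total: 3 from [11], 11 from [10], 9 from [31], and 17 from [37].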
Appendix H. First Iteration, Step 8: “Refinement Stage”
| ID | Refinement Section | Description | Action | Source | 
|---|---|---|---|---|
| HVA1 | Definition | Include “illumination aspects”. | Add | Heuristic evaluation |
| HVA1 | Definition | Reduction for better understanding. | Modify | Expert judgment |
| HVA1 | Checklist | Include the following elements: | Add | Heuristic evaluation |
| HVA2 | Name, Definition, Explanation | Modify to make them more comprehensible and representative. | Modify | Expert judgment |
| HVA2 | Checklist | Include the following elements: | Add | Heuristic evaluation |
| HVA2 | Checklist | Remove the following item: The device warns of possible situations when carrying out a particular action. | Remove | Expert judgment |
| HVA3 | Definition, Explanation | Remove the description related to “short or minimal activation command”. | Remove | Expert judgment |
| HVA3 | Specification table | Include the concept of “coherence”. | Add | Expert judgment |
| HVA3 | Specification table | Review the ease of use of heuristic. | Analyze | Expert judgment |
| HVA3 | Checklist | Include the following elements: | Add | Expert judgment |
| HVA3 | Checklist | Remove the following elements: | Remove | Expert judgment |
| HVA4 | Name, Definition | Modify to make them more comprehensible and representative. | Modify | Expert judgment |
| HVA4 | Specification table | Remove the concept of “coherence”. | Remove | Expert judgment |
| HVA4 | Checklist | Include the following elements: | Add | Heuristic evaluation |
| HVA5 | Name, Explanation | Specify for better understanding. | Modify | Expert judgment |
| HVA5 | Definition | Include the concept of “idiolect”. | Add | Heuristic evaluation |
| HVA5 | Checklist | Include the following element: The artifact recognizes the user’s particular way of speaking in requests. | Add | Heuristic evaluation |
| HVA5 | Specification table | Analyze why HVA5 obtained 50% of correct associations. | Analyze | Heuristic evaluation |
| HVA6 | Checklist | Include the following element: The device maintains its formal language even in error situations. | Add | Expert judgment |
| HVA6 | Checklist | Expand checklist listing. | Analyze, Add | Expert judgment |
| HVA8 | Name, Definition | Incorporate concepts: voice shortcut, customization/adaptation. | Add | Expert judgment |
| HVA8 | Checklist | Include the following elements: | Add | Expert judgment |
| HVA9 | Checklist | Include the following element: The device clearly indicates the possible causes of errors. | Add | Expert judgment |
| HVA10 | Checklist | Include the following elements: | Add | Heuristic evaluation |
| HVA11 | Specification table | Review the ease of use of heuristic. | Analyze | Expert judgment |
Appendix I. Criteria Used to Evaluate the Effectiveness of a New Set of Usability/UX Heuristics (From [2,41])
[Table: Criterion Description and Formula for each effectiveness criterion; table body not recoverable.]
Appendix J. Full HEUXIVA Specification, Using the Template Proposed in the Methodology Applied
| ID | HEUXIVA1 | 
|---|---|
| Name | System Status Visibility | 
| Definition | The device must indicate to the user via voice, sound and/or illumination every action that is performed. | 
| Explanation | The device must communicate in a way that is sufficiently intuitive for the user, using the intonation of the assistant’s voice and emphasis at the beginning and end of the conversation to signal when the user can continue the dialog with the device. Likewise, to convey the system status, the assistant must communicate every action that has been performed, is being performed, or will be performed in the same context/situation or request. |
| Priority | (3) Critical | 
| UX/Usability attribute | Usability: Effectiveness, Efficiency, Satisfaction UX: Useful, Valuable | 
| Voice Assistant Feature | Effective conversation, Voice interface, Activity management | 
| Set of heuristics related | |
| Checklist | |
| Examples | Compliance: User: “Ok Google, play music”. Assistant: “Ok, playing *song name* on Spotify”. *Music starts playing* |
| Examples | Non-compliance: User: “Ok Google, tell me about my reminders for today”. *Device lights turn on* *Silence* |
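The specification template used in this appendix (ID, name, definition, priority, attributes, features, checklist, examples) maps naturally onto a small record type. The sketch below is one possible encoding, not part of HEUXIVA itself: the `HeuristicSpec` class and its field names are hypothetical, and the values are copied from the HEUXIVA1 table above.

```python
from dataclasses import dataclass, field


@dataclass
class HeuristicSpec:
    """One heuristic, following the template fields used in this appendix."""
    id: str
    name: str
    definition: str
    priority: int  # 3 = Critical in the HEUXIVA1 table
    usability_attributes: list = field(default_factory=list)
    ux_attributes: list = field(default_factory=list)
    features: list = field(default_factory=list)
    checklist: list = field(default_factory=list)


heuxiva1 = HeuristicSpec(
    id="HEUXIVA1",
    name="System Status Visibility",
    definition=(
        "The device must indicate to the user via voice, sound and/or "
        "illumination every action that is performed."
    ),
    priority=3,
    usability_attributes=["Effectiveness", "Efficiency", "Satisfaction"],
    ux_attributes=["Useful", "Valuable"],
    features=["Effective conversation", "Voice interface", "Activity management"],
    # The checklist items are not available here, so the list stays empty.
)

print(f"{heuxiva1.id}: {heuxiva1.name} (priority {heuxiva1.priority})")
```

Storing all 13 heuristics this way would make it straightforward to filter them by priority or by the voice assistant feature they cover.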
| ID | HEUXIVA2 | 
|---|---|
| Name | System Guidance and Capabilities | 
| Definition | The device must guide the user through dialog and activities using words that the user recognizes (and that do not increase their cognitive load). It should also state its capabilities in a simple way. |
| Explanation | The device must be capable of establishing a conversation with the user in which it guides and orients the user throughout the dialog, so that the device can function correctly and the user does not get lost in the process. In turn, if the device does not have a feature and/or cannot carry out a user’s request, it must explain in simple, natural language why it does not have the feature and/or cannot execute the action. |
| Priority | (3) Critical | 
| UX/Usability attribute | Usability: Effectiveness, Satisfaction UX: Useful, Desirable, Usable, Valuable | 
| Voice Assistant Feature | Culturizable/adaptable, Voice interface, Effective communication, Activity management | 
| Set of heuristics related | |
| Checklist | |
| Examples | Compliance: User: Ok Google, read my email. Device: My version does not allow me to perform this action; however, update 1.2 allows it. | 
| | Non-compliance: User: Ok Google, read my email. Device: I’m sorry, I didn’t understand you. | 
| ID | HEUXIVA3 | 
|---|---|
| Name | Effective and Fluid Communication | 
| Definition | The device must adapt to the context and situations that arise in the conversation, and must remember previous requests and conversations with the user. | 
| Explanation | The device must communicate as effectively as possible with the user, respecting the context of the conversation and handling pauses, conversation fillers, interruptions, dialog failures, and detours. The device must also be able to remember previous conversations with, and requests from, the user. | 
| Priority | (3) Critical | 
| UX/Usability attribute | Usability: Effectiveness, Efficiency UX: Useful, Usable | 
| Voice Assistant Feature | Effective communication, Effective, Multi-user | 
| Set of heuristics related | |
| Checklist | |
| Examples | Compliance: *User whispers* User: It’s time to sleep. *GA whispers* Device: I will play music to sleep, may you rest. | 
| | Non-compliance: *GA playing music* User: OK Google, pause. Device: According to the RAE, “pause” means brief interruption of an action or movement. | 
| ID | HEUXIVA4 | 
|---|---|
| Name | Environment Match Between Assistant and User Language | 
| Definition | The device must understand the user’s particular way of speaking and interact in their language, with words, phrases, and concepts familiar to the user. | 
| Explanation | The device must adapt its speech to the geographical location in which it is used, enabling conversations in the language, concepts, and expressions that the user employs daily. | 
| Priority | (2) Important | 
| UX/Usability attribute | Usability: Effectiveness UX: Useful, Valuable, Desirable | 
| Voice Assistant Feature | Culturizable/adaptable, Multi-user, Voice interface | 
| Set of heuristics related | |
| Checklist | |
| Examples | Compliance: User: Ok Google. How are you? Device: I feel very well. *2 s later* User: Ok Google, reproduce música. Device: Reproduciendo *name of song* en Spotify. | 
| | Non-compliance: *User speaking English* User: Ok Google, what time is it? *GA answers in Spanish* Device: Son las 8 de la mañana. | 
| ID | HEUXIVA5 | 
|---|---|
| Name | Information Accuracy | 
| Definition | The responses delivered by the device must be relevant, brief and according to what is requested by the user. Similarly, the device must provide truthful information during interaction with the user. | 
| Explanation | For actions/requests to be more efficient and effective, the device’s responses must be coherent and truthful, that is, the information provided must be logical, realistic and true. In turn, to capture the user’s attention, the responses must be brief and contain the most essential and/or important part of what is requested. | 
| Priority | (3) Critical | 
| UX/Usability attribute | Usability: Effectiveness, Efficiency, Satisfaction UX: Useful, Valuable | 
| Voice Assistant Feature | Effective communication, Effective, Voice interface | 
| Set of heuristics related | |
| Checklist | |
| Examples | Compliance: User: Ok Google, when did World War II start? Device: The Second World War began on 1 September 1939. | 
| | Non-compliance: User: Ok Google, what is the temperature? Device: The current temperature in Valparaíso is 11 °C, for tomorrow a temperature of 16° is expected with a maximum of 15° and a minimum of 7° and a probability of rain of 20%. | 
| ID | HEUXIVA6 | 
|---|---|
| Name | User Control and Freedom | 
| Definition | The device allows the user to perform, redo, and undo actions or requests. | 
| Explanation | The device must carry out the actions the user requests, and must let the user redo and undo those requests whenever the user deems it necessary. | 
| Priority | (3) Critical | 
| UX/Usability attribute | Usability: Satisfaction UX: Credible, Valuable, Learning capacity, Useful | 
| Voice Assistant Feature | Activity management, Effective communication, Customizable | 
| Set of heuristics related | |
| Checklist | |
| Examples | Compliance: User: Ok Google, delete my 7 p.m. alarm. Device: Ok, alarm deleted. | 
| | Non-compliance: User: Ok Google, delete my 7 p.m. alarm. Device: I can’t delete the alarm. | 
| ID | HEUXIVA7 | 
|---|---|
| Name | Consistent Voice Interface | 
| Definition | The device must be able to provide information through voice and be consistent in its personality. | 
| Explanation | The device should provide information and answers, ideally through the voice interface. When interacting with the user, it should keep its personality consistent, that is, maintain a consistent voice and tone, language style, and sounds, so as not to confuse the user. | 
| Priority | (2) Important | 
| UX/Usability attribute | Usability: Satisfaction UX: Credible, Desirable, Useful, Valuable | 
| Voice Assistant Feature | Effective communication, Voice interface, Culturizable/adaptable, Activity management | 
| Set of heuristics related | |
| Checklist | |
| Examples | Compliance: *It is 1:00 p.m. on 28 July* User: Ok Google, read my reminders. Device: Today you have 2 reminders, one at 2:30 p.m. “Take pills” and another at 6:00 p.m. “Walk”. Do you want me to mention the week’s reminders? | 
| | Non-compliance: *GA is playing music on Spotify* User: Ok Google, how are you? *GA answers in a feminine voice* Device: I feel great today. *User unlinks the GA connection with Spotify* *GA answers in a masculine voice* Device: Error when playing Spotify. | 
| ID | HEUXIVA8 | 
|---|---|
| Name | Voice Shortcuts, Flexibility and Personalization | 
| Definition | The device must answer depending on the environment in which the user is located, providing shortcuts according to the context, allowing customization and adapting according to the needs of the user. | 
| Explanation | The device must be flexible enough to adapt to the needs and capabilities of its users, taking into account the type of user (novice or expert), the physical environment, and device customization options. It must also provide voice shortcuts that let the user perform an action more quickly. | 
| Priority | (2) Important | 
| UX/Usability attribute | Usability: Effectiveness, Efficiency, Satisfaction UX: Usable, Learning capacity | 
| Voice Assistant Feature | Customizable, Multi-user, Multi-linkable | 
| Sets of heuristics related | |
| Checklist | |
| Examples | Compliance: User: Ok Google, music. Device: Ok, playing *song name* on Spotify. (The user can say “music” instead of “play music”.) | 
| | Non-compliance: User: Ok Google, music. Device: I’m sorry, I didn’t understand. | 
| ID | HEUXIVA9 | 
|---|---|
| Name | Error Prevention | 
| Definition | The device must provide the necessary information to warn the user when an error is about to occur. | 
| Explanation | When the user requests an action that could change the context of the interaction and/or an error is about to be triggered, the system must warn the user, communicating the consequences of the action that is about to be performed. | 
| Priority | (2) Important | 
| UX/Usability attribute | Usability: Effectiveness, Efficiency UX: Useful | 
| Voice Assistant Feature | Effective communication, Voice interface | 
| Sets of heuristics related | |
| Checklist | |
| Examples | Compliance: User: Ok Google, play music. Device: Ok, playing music. User: Ok Google, call mom. Device: When calling, the music will stop, do you still want to call? | 
| | Non-compliance: User: Ok Google, read me today’s news. Device: Here you have today’s news. *Reads the news* User: Ok Google, I want to watch a YouTube video. Device: Ok, playing recommended videos on YouTube. *Stops reading the news* | 
| ID | HEUXIVA10 | 
|---|---|
| Name | Help Users Recognize, Diagnose, and Fix Errors | 
| Definition | Error messages should be expressed in simple language (not codes), accurately indicate the problem, and constructively suggest a solution that mostly uses voice commands or actions. | 
| Explanation | When an error or problem occurs while the user is interacting with the device, the device must state the error in language the user can understand and provide an appropriate solution and help, preferably through the voice interface. | 
| Priority | (3) Critical | 
| UX/Usability attribute | Usability: Effectiveness, Efficiency UX: Valuable, Useful | 
| Voice Assistant Feature | Culturizable/adaptable, Voice interface, Multi-linkable | 
| Sets of heuristics related | |
| Checklist | |
| Examples | Compliance: User: Ok Google, call Fernanda O. Device: I’m sorry, I can’t do that. To make a call you must first link the device with Google’s Duo App. | 
| | Non-compliance: *There is an alarm set for the next day at 9 p.m.* User: Ok Google, create a new alarm for tomorrow at 9 p.m. Device: Sorry, I didn’t understand. | 
| ID | HEUXIVA11 | 
|---|---|
| Name | Data Privacy | 
| Definition | The device must inform the user about the privacy and use of personal data. Likewise, it must grant the possibility of rejecting the collection and analysis of their data, thus being transparent and truthful with the user. | 
| Explanation | The device must request the user’s permission for the use of the data that will be collected over time, and the user must have the possibility to reject this option. | 
| Priority | (3) Critical | 
| UX/Usability attribute | Usability: Satisfaction UX: Valuable, Credible | 
| Voice Assistant Feature | Security and privacy, Activity management | 
| Sets of heuristics related | |
| Checklist | |
| Examples | Compliance: *Initializing the device for the first time* Device: Hello, our conversations help me improve, do you allow me to collect data? User: No thanks. Device: Okay, the data from our conversations will not be collected. | 
| | Non-compliance: *Initializing the device for the first time* Device: Hello, our conversations help me improve, do you allow me to collect data? User: No thanks. Device: If you do not accept, I will not be able to function properly. | 
| ID | HEUXIVA12 | 
|---|---|
| Name | Voice Assistant Reliability | 
| Definition | Reliability must be transmitted through the behavior of the device both in interaction with the user and when the user is inactive. | 
| Explanation | The device must explain to the user how active listening works, in order to build trust between the user and the device. Active listening should be triggered only by the activation command. | 
| Priority | (3) Important | 
| UX/Usability attribute | UX: Valuable, Credible | 
| Voice Assistant Feature | Customizable, Security and privacy | 
| Sets of heuristics related | |
| Checklist | |
| Examples | Compliance: User: Ok Google, call Daniela. Device: Ok, calling Daniela. | 
| | Non-compliance: *User talking to another person in the environment* *GA device activates* Device: Calling Daniela. | 
| ID | HEUXIVA13 | 
|---|---|
| Name | Guides and Documentation | 
| Definition | The device must provide simple and comprehensive physical or electronic documentation of the internal and external workings of the device, either through a request from the user or external search. | 
| Explanation | The device must come with a user manual or guide that makes first use, installation, and reinstallation in a new location easy for novice and first-time users. This guidance should be available through the voice assistant itself (preset installation instructions explained before the device is connected to Wi-Fi). It should contain all the information and usage examples the user needs to interact with the device properly, covering the device’s internal workings, its external elements (buttons and their operation), and its configuration. | 
| Priority | (2) Important | 
| UX/Usability attribute | Usability: Effectiveness, Satisfaction UX: Findable/locatable, Valuable, Usable | 
| Voice Assistant Feature | Guidance and assistance | 
| Sets of heuristics related | |
| Checklist | |
| Examples | Compliance: The device has a physical instruction manual and an online one on its website. | 
| | Non-compliance: The device has no information on basic functions. | 
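
The template used for each heuristic in this appendix can be read as a simple record type. The following Python sketch is illustrative only: the class and field names (`Heuristic`, `Priority`, `critical_ids`) are hypothetical and not part of the methodology, and only the two priority levels that appear in the tables above are modeled. The two sample records copy their values from the HEUXIVA11 and HEUXIVA13 tables.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Priority(IntEnum):
    # Only the levels shown in the specification tables are modeled here.
    IMPORTANT = 2
    CRITICAL = 3


@dataclass
class Heuristic:
    """One heuristic in the template (field names are hypothetical)."""
    id: str
    name: str
    definition: str
    priority: Priority
    usability_attributes: list = field(default_factory=list)
    ux_attributes: list = field(default_factory=list)
    features: list = field(default_factory=list)
    checklist: list = field(default_factory=list)


heuxiva11 = Heuristic(
    id="HEUXIVA11",
    name="Data Privacy",
    definition="The device must inform the user about the privacy and use "
               "of personal data.",
    priority=Priority.CRITICAL,
    usability_attributes=["Satisfaction"],
    ux_attributes=["Valuable", "Credible"],
    features=["Security and privacy", "Activity management"],
)

heuxiva13 = Heuristic(
    id="HEUXIVA13",
    name="Guides and Documentation",
    definition="The device must provide simple and comprehensive physical "
               "or electronic documentation.",
    priority=Priority.IMPORTANT,
    usability_attributes=["Effectiveness", "Satisfaction"],
    ux_attributes=["Findable/locatable", "Valuable", "Usable"],
    features=["Guidance and assistance"],
)


def critical_ids(heuristics):
    """Return the IDs of heuristics whose priority is Critical."""
    return [h.id for h in heuristics if h.priority is Priority.CRITICAL]


print(critical_ids([heuxiva11, heuxiva13]))
```

A structure like this would let an evaluator filter the set by priority or by voice assistant feature before an inspection session.
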
Appendix K. Coverage Matrix Linking Voice Assistant Features, Heuristics, Checklist Items, and Problem Types
| Voice Assistant Feature | HEUXIVA Heuristic | Checklist Item (Example) | Problem Type (UX Aspect) | Example (Compliance/Non-Compliance) | 
|---|---|---|---|---|
| Effective communication | HEUXIVA1, HEUXIVA2, HEUXIVA3, HEUXIVA5, HEUXIVA6, HEUXIVA7, HEUXIVA9 | (HEUXIVA1) The device has lighting signals when it interacts with the user. | Lack of system feedback | ✅ The device lights up and says “I’m listening”./❌ No response after the “wake” word. | 
| | | (HEUXIVA6) The device executes the user’s requests. | Lack of control | ✅ “Stop music”. Command immediately halts playback./❌ Must wait for assistant to finish speaking. | 
| | | (HEUXIVA9) The device rephrases unclear input for confirmation. | Ambiguous input handling | ✅ “Did you mean alarm for 7 AM or 7 PM?”/❌ Executes wrong command without clarifying. | 
| Effective | HEUXIVA3, HEUXIVA5 | (HEUXIVA3) The device provides a continuous conversation option and maintains context between consecutive interactions. | Context loss | ✅ Understands follow-up question: “And what about tomorrow?”/❌ Requires repeating full command each time. | 
| Activity management | HEUXIVA1, HEUXIVA2, HEUXIVA6, HEUXIVA7, HEUXIVA11 | (HEUXIVA2) The device allows users to perform and manage functional tasks (such as scheduling appointments or setting alarms) through voice commands. | Task management and functionality coverage | ✅ The assistant successfully schedules a meeting or sends a message via voice command./❌ The assistant fails to complete management actions or requires manual confirmation on a secondary device. | 
| Customizable | HEUXIVA6, HEUXIVA8, HEUXIVA12 | (HEUXIVA12) The device performs tasks accurately even under varying conditions (e.g., background noise). | Performance | ✅ Recognizes commands in noisy environments./❌ Fails to respond when music is playing. | 
| Multi-user | HEUXIVA3, HEUXIVA4, HEUXIVA8 | (HEUXIVA4) The device recognizes and differentiates the voices of multiple users, allowing everyone in the same environment to interact with the assistant naturally. | Multi-user inclusiveness | ✅ The assistant identifies different household members and adapts responses (e.g., personalized calendar or music)./❌ Only responds to the registered user’s voice, ignoring others in the same space. | 
| Security and privacy | HEUXIVA11, HEUXIVA12 | (HEUXIVA11) The device requests authorization for the use of the data collected during the dialog. | Transparency issue | ✅ “Do you agree to save this recording?”/❌ Stores voice data automatically. | 
| Multi-linkable | HEUXIVA8, HEUXIVA10 | (HEUXIVA8) The device allows linking or integrating external services and smart devices (e.g., music apps, lighting, appliances, temperature control) and enables their management through voice commands. | Integration | ✅ The assistant connects to Spotify and smart lights, allowing full control by voice./❌ Integration with external apps or devices fails or requires manual configuration. | 
| Culturizable/adaptable | HEUXIVA2, HEUXIVA4, HEUXIVA7, HEUXIVA10 | (HEUXIVA10) The device suggests possible solutions or recovery options. | Lack of recovery adaptation | ✅ “Try saying the command again”./❌ Offers no instruction to fix issue. | 
| Voice interface | HEUXIVA1, HEUXIVA2, HEUXIVA4, HEUXIVA5, HEUXIVA7, HEUXIVA9, HEUXIVA10 | (HEUXIVA5) The response of the device is coherent and cohesive with the user request. | Irrelevant or excessive information | ✅ Gives only relevant weather data./❌ Reads the entire Wikipedia page. | 
| | | (HEUXIVA7) The device uses a consistent tone, vocabulary, and personality across interactions. | Inconsistent persona | ✅ Maintains friendly tone and terminology./❌ Changes voice or phrasing randomly. | 
| Guidance and assistance | HEUXIVA13 | (HEUXIVA13) The device provides access to guides or help resources through voice. | Lack of support resources | ✅ “You can say ‘Help’ to learn available commands”./❌ No help option available. | 
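
The first two columns of the coverage matrix can be treated as a mapping from features to heuristics. As a minimal sketch (the names `FEATURE_TO_HEURISTICS` and `invert` are hypothetical), the following Python code transcribes those columns, inverts the mapping, and checks that each of the 13 heuristics is linked to at least one feature.

```python
from collections import defaultdict

# Feature -> heuristic numbers, transcribed from the coverage matrix.
FEATURE_TO_HEURISTICS = {
    "Effective communication": [1, 2, 3, 5, 6, 7, 9],
    "Effective": [3, 5],
    "Activity management": [1, 2, 6, 7, 11],
    "Customizable": [6, 8, 12],
    "Multi-user": [3, 4, 8],
    "Security and privacy": [11, 12],
    "Multi-linkable": [8, 10],
    "Culturizable/adaptable": [2, 4, 7, 10],
    "Voice interface": [1, 2, 4, 5, 7, 9, 10],
    "Guidance and assistance": [13],
}


def invert(mapping):
    """Build heuristic ID -> list of features from feature -> heuristic numbers."""
    inverse = defaultdict(list)
    for feature, numbers in mapping.items():
        for number in numbers:
            inverse[f"HEUXIVA{number}"].append(feature)
    return dict(inverse)


heuristic_to_features = invert(FEATURE_TO_HEURISTICS)

# Heuristics that no feature points to (empty if coverage is complete).
uncovered = [f"HEUXIVA{n}" for n in range(1, 14)
             if f"HEUXIVA{n}" not in heuristic_to_features]

print("Uncovered heuristics:", uncovered)
print("HEUXIVA12 covers:", heuristic_to_features["HEUXIVA12"])
```

The same inverted mapping can be compared against the "Voice Assistant Feature" row of each heuristic's specification table to spot inconsistencies between the two appendices.
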
References
- Nielsen, J.; Molich, R. Heuristic evaluation of user interfaces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems Empowering People - CHI ’90, Seattle, WA, USA, 1–5 April 1990; pp. 249–256.
- Quiñones, D.; Rusu, C.; Rusu, V. A methodology to develop usability/user experience heuristics. Comput. Stand. Interfaces 2018, 59, 109–129.
- Rzepka, C.; Berger, B.; Hess, T. Voice assistant vs. Chatbot–examining the fit between conversational agents’ interaction modalities and information search tasks. Inf. Syst. Front. 2022, 24, 839–856.
- Santos, J.; Rodrigues, J.J.P.C.; Casal, J.; Saleem, K.; Denisov, V. Intelligent personal assistants based on internet of things approaches. IEEE Syst. J. 2016, 12, 1793–1802.
- Santos, J.; Rodrigues, J.J.P.C.; Silva, B.M.C.; Casal, J.; Saleem, K.; Denisov, V. An IoT-based mobile gateway for intelligent personal assistants on mobile health environments. J. Netw. Comput. Appl. 2016, 71, 194–204.
- Han, S.; Yang, H. Understanding adoption of intelligent personal assistants: A parasocial relationship perspective. Ind. Manag. Data Syst. 2018, 118, 618–636.
- Aymerich-Franch, L.; Ferrer, I. Investigating the use of speech-based conversational agents for life coaching. Int. J. Hum. Comput. Stud. 2022, 159, 102745.
- Massai, L.; Nesi, P.; Pantaleo, G. PAVAL: A location-aware virtual personal assistant for retrieving geolocated points of interest and location-based services. Eng. Appl. Artif. Intell. 2019, 77, 70–85.
- Cowan, B.R.; Pantidi, N.; Coyle, D.; Morrissey, K.; Clarke, P.; Al-Shehri, S.; Earley, D.; Bandeira, N. “What can i help you with?” infrequent users’ experiences of intelligent personal assistants. In Proceedings of the 19th International Conference on Human-Computer Interaction with Mobile Devices and Services, Vienna, Austria, 4–7 September 2017; pp. 1–12.
- Langevin, R.; Lordon, R.J.; Avrahami, T.; Cowan, B.R.; Hirsch, T.; Hsieh, G. Heuristic evaluation of conversational agents. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, Virtual, 8–13 May 2021; pp. 1–15.
- Sánchez-Adame, L.M.; Mendoza, S.; Urquiza, J.; Rodríguez, J.; Meneses-Viveros, A. Towards a set of heuristics for evaluating chatbots. IEEE Lat. Am. Trans. 2021, 19, 2037–2045.
- Zwakman, D.S.; Pal, D.; Triyason, T.; Vanijja, V. Usability of voice-based intelligent personal assistants. In Proceedings of the 2020 International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Republic of Korea, 21–23 October 2020; pp. 652–657.
- Google Actions on Google Glossary (Dialogflow). 2024. Available online: https://developers.google.com/assistant/df-asdk/glossary (accessed on 15 January 2025).
- Google Nest Explore What You Can Do with Google Nest and Home Devices. 2025. Available online: https://support.google.com/googlenest/answer/7130274 (accessed on 15 January 2025).
- Google Nest Customize Smart Plug or Smart Switch Voice Commands with Device Type. 2025. Available online: https://support.google.com/googlenest/answer/9921419 (accessed on 15 January 2025).
- Google Nest Customize Your News Experience. 2025. Available online: https://support.google.com/googlenest/answer/7551674 (accessed on 15 January 2025).
- Google Nest Guests and Your Google Connected Home Devices. 2025. Available online: https://support.google.com/googlenest/answer/7177221 (accessed on 15 January 2025).
- García, N.H.; Martínez, I.L.; Gutiérrez, M.S.; Veracruzana, X. Development of new commands for Google Assistant using Dialogflow, Firebase and NodeMCU (ESP8266) as an intermediary. Abstr. Appl. 2020, 29, 74–87.
- Google Nest FAQs on Privacy: Google Nest. 2025. Available online: https://support.google.com/googlenest/answer/9415830 (accessed on 15 January 2025).
- Google Assistant What It Can Do—Get Started. 2025. Available online: https://assistant.google.com/learn/ (accessed on 15 January 2025).
- Google Assistant Control Smart Home Devices with Google Assistant. 2025. Available online: https://support.google.com/assistant/answer/7314909? (accessed on 15 January 2025).
- ISO 9241-210:2019; Ergonomics of Human-System Interaction—Part 210: Human-Centred Design for Interactive Systems. ISO: Geneva, Switzerland, 2019.
- Park, J.; Han, S.H.; Kim, H.K.; Cho, Y.; Park, W. Developing elements of user experience for mobile phones and services: Survey, interview, and observation approaches. Hum. Factors Ergon. Manuf. Serv. Ind. 2013, 23, 279–293.
- Lykke, M.; Jantzen, C. User experience dimensions: A systematic approach to experiential qualities for evaluating information interaction in museums. In Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval, Chapel Hill, NC, USA, 13–17 March 2016; pp. 81–90.
- Morville, P. User Experience Design. Semantic Studios. 2004. Available online: https://semanticstudios.com/user_experience_design/ (accessed on 15 January 2025).
- Lewis, J.R. Usability: Lessons Learned … and Yet to Be Learned. Int. J. Hum. Comput. Interact. 2014, 30, 663–684.
- Kendrick, A. Formative vs. Summative Evaluations. Nielsen Norman Group. 2019. Available online: https://www.nngroup.com/articles/formative-vs-summative-evaluations/ (accessed on 15 January 2025).
- Nielsen, J. Thinking Aloud: The #1 Usability Tool. Nielsen Norman Group. 2012. Available online: https://www.nngroup.com/articles/thinking-aloud-the-1-usability-tool/ (accessed on 15 January 2025).
- Experience Research Society UX Expert Evaluation. 2024. Available online: https://experienceresearchsociety.org/ux-methods/ux-expert-evaluation/ (accessed on 15 January 2025).
- Harley, A. UX Expert Reviews. Nielsen Norman Group. 2018. Available online: https://www.nngroup.com/articles/ux-expert-reviews/ (accessed on 15 January 2025).
- Nielsen, J. 10 Usability Heuristics for User Interface Design. Nielsen Norman Group. 2024. Available online: https://www.nngroup.com/articles/ten-usability-heuristics/ (accessed on 15 January 2025).
- Brooke, J. SUS: A quick and dirty usability scale. Usability Eval. Ind. 1996, 189, 4–7.
- Zwakman, D.S.; Pal, D.; Triyason, T.; Arpnikanondt, C. Voice usability scale: Measuring the user experience with voice assistants. In Proceedings of the 2020 IEEE International Symposium on Smart Electronic Systems (iSES) (Formerly iNiS), Chennai, India, 14–16 December 2020; pp. 308–311.
- Nielsen, J. Severity Ratings for Usability Problems. Nielsen Norman Group. 1994. Available online: https://www.nngroup.com/articles/how-to-rate-the-severity-of-usability-problems/ (accessed on 15 January 2025).
- ISO 9241-11:2018; Ergonomics of Human-System Interaction—Part 11: Usability: Definitions and Concepts. ISO: Geneva, Switzerland, 2018. Available online: https://www.iso.org/standard/63500.html (accessed on 1 June 2022).
- Nielsen, J. Usability 101: Introduction to Usability. Nielsen Norman Group. 2012. Available online: https://www.nngroup.com/articles/usability-101-introduction-to-usability/ (accessed on 15 January 2025).
- Nowacki, C.; Gordeeva, A.; Lizé, A.-H. Improving the usability of voice user interfaces: A new set of ergonomic criteria. In Proceedings of the Design, User Experience, and Usability. Design for Contemporary Interactive Environments: 9th International Conference, DUXU 2020, Held as Part of the 22nd HCI International Conference, HCII 2020, Copenhagen, Denmark, 19–24 July 2020; Proceedings, Part I. Springer: Berlin/Heidelberg, Germany, 2020; pp. 117–133.
- Google Store Nest Mini—Overview. 2025. Available online: https://store.google.com/us/product/google_nest_mini?hl=en-US (accessed on 15 January 2025).
- Scapin, D.L.; Bastien, J.M.C. Ergonomic criteria for evaluating the ergonomic quality of interactive systems. Behav. Inf. Technol. 1997, 16, 220–231.
- de Barcelos Silva, A.; Gomes, M.M.; da Costa, C.A.; da Rosa Righi, R.; Barbosa, J.L.V.; Pessin, G.; De Doncker, G.; Federizzi, G. Intelligent personal assistants: A systematic literature review. Expert Syst. Appl. 2020, 147, 113193.
- Quiñones, D.; Ojeda, C.; Herrera, R.F.; Rojas, L.F. UXH-GEDAPP: A set of user experience heuristics for evaluating generative design applications. Inf. Softw. Technol. 2024, 168, 107408.

| | Experimental Group | Control Group | Observations |
|---|---|---|---|
| Number of evaluators | 3 | 3 | - | 
| Set of heuristics used | Heuristics for evaluating voice assistants (HVA) | Conversational agents’ heuristics (CAH) [10] | - | 
| Number of heuristics | 12 | 11 | - | 
| Total of problems identified | 31 | 26 | - | 
| Total of the correct associations | 15 | 17 | - | 
| Total of the incorrect associations | 16 | 9 | - | 
| Percentage of the correct associations (CA) | CA1 = 48.38% | CA2 = 65.38% | CA1 < CA2: the control set performs better than the proposed set, as it has a higher percentage of correct associations (HVA requires refinement). | 
| Percentage of the incorrect associations (IA) | IA1 = 51.62% | IA2 = 34.62% | IA1 > IA2: the control set performs better, since the proposed set has a higher percentage of incorrect associations (HVA requires refinement). | 
| Problems identified by both groups (P1) | 7 | 7 | P2 > P3: the experimental group identified more problems than the control group, so the proposed set performs better than the control set. | 
| Problems identified by the experimental group (P2) | 24 | - | |
| Problems identified by the control group (P3) | - | 19 | |
| Number of specific problems identified | 19 | 14 | - | 
| Effectiveness in terms of number of specific problems identified (ESS) | ESS1 = 61.29% | ESS2 = 53.84% | ESS1 > ESS2: the proposed set identified more specific problems than the control set, so it performs better. | 
| Number of problems identified and qualified with a severity greater than 2 | 13 | 19 | - | 
| Effectiveness in terms of number of problems identified and qualified with a severity greater than 2 (ESV) | ESV1 = 41.93% | ESV2 = 73.07% | ESV1 < ESV2: the control set performs better, since it finds more problems rated as severe than the proposed set (HVA requires refinement). | 
| Number of problems identified and qualified with a criticality greater than 4 | 15 | 22 | - | 
| Effectiveness in terms of number of problems identified and qualified with a criticality greater than 4 (ESC) | ESC1 = 48.38% | ESC2 = 84.61% | ESC1 < ESC2: the control set finds more problems rated as critical than the proposed set (HVA requires refinement). | 
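
The percentage indicators in the table above can be recomputed from the reported counts. The sketch below assumes that each indicator is the corresponding count divided by the total number of problems identified by that group, an assumption that is consistent with most of the tabulated values. The dictionary keys and the function name `indicators` are hypothetical.

```python
# Counts transcribed from the comparison table; key names are hypothetical.
COUNTS = {
    "experimental": {"total": 31, "correct": 15, "incorrect": 16,
                     "specific": 19, "severe": 13, "critical": 15},
    "control": {"total": 26, "correct": 17, "incorrect": 9,
                "specific": 14, "severe": 19, "critical": 22},
}

# Assumed definition: indicator = count / total problems identified, in percent.
INDICATORS = {"CA": "correct", "IA": "incorrect", "ESS": "specific",
              "ESV": "severe", "ESC": "critical"}


def indicators(counts):
    """Return every indicator for one group, rounded to two decimals."""
    total = counts["total"]
    return {name: round(100 * counts[key] / total, 2)
            for name, key in INDICATORS.items()}


results = {group: indicators(counts) for group, counts in COUNTS.items()}
for group, values in results.items():
    print(group, values)
```

Because this sketch rounds, its last digit can differ by 0.01 from tabulated values that were truncated. A larger gap between a recomputed and a tabulated value would suggest a transcription error worth checking against the original data.
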
| Heuristic | D1—Utility | D2—Clarity | D3—Ease of Use | D4—Need of Additional Elements | 
|---|---|---|---|---|
| HVA1: System Status Visibility | 5.0 | 5.0 | 4.0 | 4.3 | 
| HVA2: Feedback and Help Users Prevent Errors | 4.6 | 4.6 | 3.6 | 4.3 | 
| HVA3: Brevity and Relevance of Information | 4.0 | 4.6 | 2.6 | 4.6 | 
| HVA4: Natural Communication | 4.6 | 4.0 | 3.3 | 5.0 | 
| HVA5: Match Between the System and the Real World | 4.3 | 4.3 | 4.3 | 4.3 | 
| HVA6: Consistent Voice Interface | 3.3 | 4.6 | 4.0 | 3.6 | 
| HVA7: User Control and Freedom | 4.3 | 4.6 | 3.3 | 3.6 | 
| HVA8: Flexibility and Personalization | 4.3 | 4.0 | 4.0 | 4.6 | 
| HVA9: Help Users Recognize, Diagnose, and Fix Errors | 4.3 | 4.3 | 4.3 | 4.0 | 
| HVA10: System Guidance and Capabilities | 4.6 | 4.3 | 4.6 | 4.0 | 
| HVA11: Reliability and Data Privacy | 4.0 | 4.3 | 2.6 | 5.0 | 
| HVA12: Guides and Documentation | 4.0 | 4.3 | 3.3 | 4.6 | 
| Average per dimension | 4.3 | 4.4 | 3.7 | 4.4 | 
| Heuristic | D1—Utility | D2—Clarity | D3—Ease of Use | D4—Need of Additional Elements | 
|---|---|---|---|---|
| HEUXIVA1: System Status Visibility | 5.0 | 4.9 | 4.8 | 5.0 | 
| HEUXIVA2: System Guidance and Capabilities | 4.4 | 4.1 | 4.1 | 4.5 | 
| HEUXIVA3: Effective and Fluid Communication | 4.6 | 4.3 | 4.3 | 4.5 | 
| HEUXIVA4: Environment Match Between Assistant and User Language | 4.5 | 4.8 | 4.1 | 4.8 | 
| HEUXIVA5: Information Accuracy | 4.4 | 4.1 | 3.9 | 4.9 | 
| HEUXIVA6: User Control and Freedom | 5.0 | 4.0 | 4.8 | 4.9 | 
| HEUXIVA7: Consistent Voice Interface | 4.3 | 4.3 | 3.9 | 4.8 | 
| HEUXIVA8: Voice Shortcuts, Flexibility and Personalization | 4.5 | 4.3 | 4.0 | 4.8 | 
| HEUXIVA9: Error Prevention | 4.6 | 4.8 | 4.4 | 4.6 | 
| HEUXIVA10: Help Users Recognize, Diagnose, and Fix Errors | 4.9 | 5.0 | 4.9 | 4.9 | 
| HEUXIVA11: Data Privacy | 4.4 | 4.9 | 4.3 | 5.0 | 
| HEUXIVA12: Voice Assistant Reliability | 4.4 | 4.3 | 4.0 | 4.5 | 
| HEUXIVA13: Guides and Documentation | 3.8 | 5.0 | 4.9 | 4.5 | 
| Average per dimension | 4.5 | 4.5 | 4.3 | 4.7 | 
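
The "Average per dimension" row of the second-iteration table can be reproduced from its thirteen rows. The following sketch is illustrative: the variable and function names are hypothetical, and the ratings are the one-decimal values shown in the table rather than the evaluators' raw scores.

```python
from statistics import mean

# Ratings (D1, D2, D3, D4) transcribed from the second-iteration table.
SECOND_ITERATION = {
    "HEUXIVA1": (5.0, 4.9, 4.8, 5.0),
    "HEUXIVA2": (4.4, 4.1, 4.1, 4.5),
    "HEUXIVA3": (4.6, 4.3, 4.3, 4.5),
    "HEUXIVA4": (4.5, 4.8, 4.1, 4.8),
    "HEUXIVA5": (4.4, 4.1, 3.9, 4.9),
    "HEUXIVA6": (5.0, 4.0, 4.8, 4.9),
    "HEUXIVA7": (4.3, 4.3, 3.9, 4.8),
    "HEUXIVA8": (4.5, 4.3, 4.0, 4.8),
    "HEUXIVA9": (4.6, 4.8, 4.4, 4.6),
    "HEUXIVA10": (4.9, 5.0, 4.9, 4.9),
    "HEUXIVA11": (4.4, 4.9, 4.3, 5.0),
    "HEUXIVA12": (4.4, 4.3, 4.0, 4.5),
    "HEUXIVA13": (3.8, 5.0, 4.9, 4.5),
}


def dimension_averages(ratings):
    """Average each dimension (column) over all heuristics, to one decimal."""
    columns = zip(*ratings.values())
    return [round(mean(column), 1) for column in columns]


averages = dimension_averages(SECOND_ITERATION)
print(dict(zip(("D1", "D2", "D3", "D4"), averages)))
```

Averages recomputed from rounded table entries may differ in the last digit from averages computed on the raw evaluator scores.
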
| | D1—Utility | D2—Clarity | D3—Ease of Use | D4—Need of Additional Elements |
|---|---|---|---|---|
| Average first iteration | 4.3 | 4.4 | 3.7 | 4.4 | 
| Average second iteration | 4.5 | 4.5 | 4.3 | 4.7 | 
| Heuristic | ID | Iteration | D1—Utility | D2—Clarity | D3—Ease of Use | D4—Need of Additional Elements | 
|---|---|---|---|---|---|---|
| User Control and Freedom | HVA7 | 1st It. | 4.3 | 4.6 | 3.3 | 3.6 | 
| | HEUXIVA6 | 2nd It. | 5.0 | 4.0 | 4.8 | 4.9 | 
| Consistent Voice Interface | HVA6 | 1st It. | 3.3 | 4.6 | 4.0 | 3.6 | 
| | HEUXIVA7 | 2nd It. | 4.3 | 4.3 | 3.9 | 4.8 | 
| Task (T) | Percentage of Task Fulfillment | Average Time | Observations | Most Expressed Emotions | Usability/UX Problems Related (P) | Heuristic Related (HEUXIVA) | 
|---|---|---|---|---|---|---|
| T1: Make a call | 100% | 79.6 s | All users performed the task correctly. Users appreciated that their requests were carried out quickly and efficiently. It was evident that the device needs to communicate to the user through voice, sound, and/or light that it is performing an action. | Neutral (41.7%) and Happiness (33.3%) | P1: The user forgets the activation word; P2: The device does not understand what the user says | P1 is covered by HEUXIVA6; P2 is covered by HEUXIVA4 | 
| T2: Check available flights | 100% | 131 s | All users performed the task correctly. Users became confused when they realized that the device sometimes could not understand their requests. It is necessary to reconsider whether the activation command is too complex for users. | Irritation (41.7%) and Confusion (33.3%) | P3: The device does not effectively communicate the error; P4: The user forgets the activation word | P3 is covered by HEUXIVA10; P4 is covered by HEUXIVA6 | 
| T3: Speak colloquially with the device | 100% | 77.8 s | All users performed the task correctly. Users expected the device to provide a response that matches their request. If the device performs an activity or delivers a response different from what was requested, users indicated that they tend to doubt both themselves and the device. | Neutral (41.7%) and Happiness (33.3%) | P5: The device does not respond immediately after the request is completed; P6: The device provides incoherent and unrelated responses to what the user requested; P7: The device does not understand the user’s idiolect | P5 is covered by HEUXIVA3; P6 is covered by HEUXIVA5; P7 is covered by HEUXIVA4 | 
| T4: Make queries in the area/field of History | 100% | 122.3 s | All users performed the task correctly. User feedback highlighted the importance of the device communicating in the user’s language and providing information in an understandable manner. | Neutral (33.3%) and Happiness (33.3%) | P8: The device provides extensive and confusing information; P9: The device does not follow instructions | P8 is covered by HEUXIVA5; P9 is covered by HEUXIVA1 | 
| T5: Set an alarm | 100% | 118.8 s | All users performed the task correctly. Users appreciated that the voice assistant responded to their requests; however, they became confused when they made a request and the device did not carry out the specified action. | Happiness (41.7%) | P10: The device provides extensive information; P11: The device interrupts the user while they are giving instructions | P10 is covered by HEUXIVA5; P11 is covered by HEUXIVA3 | 
| T6: Delete or edit an alarm | 100% | 85.3 s | All users performed the task correctly. Some users were confused and annoyed when they noticed that the device did not properly carry out the request they had just made. | Confusion (41.7%) and Irritation (33.3%) | P12: The device does not allow editing of an instruction; P13: The device performs a function different from what was requested; P14: The device does not effectively communicate the error; P15: The device ignores the user; P16: The device does not distinguish commands from questions | P12 is covered by HEUXIVA6; P13 is covered by HEUXIVA12; P14 is covered by HEUXIVA10; P15 is covered by HEUXIVA1; P16 is covered by HEUXIVA4 | 
| T7: Customize assistant attributes | 100% | 79.1 s | All users performed the task correctly. Users expected the voice assistant to allow them to perform the same actions they do when interacting with their mobile phone. Some users were surprised when the device redirected them to the mobile interface. | Irritation (33.3%) and Neutral (33.3%) | P17: The device requests manual configurations to be made; P18: The device provides lengthy instructions | P17 is covered by HEUXIVA7; P18 is covered by HEUXIVA5 | 
| T8: Find device | 91.66% (11 of 12) | 168.2 s | Most users completed the task. Users showed annoyance and/or frustration when they noticed that the device was not following their instructions, was delivering incorrect responses, and was also ignoring them. | Confusion (41.7%) and Irritation (33.3%) | P19: The device ignores the user (does not perform or respond to requests); P20: The device performs a function different from what the user requested | P19 is covered by HEUXIVA1; P20 is covered by HEUXIVA12 | 

| Heuristic | Number of Problems | Problems Related | 
|---|---|---|
| HEUXIVA5: Information accuracy | 4 | P6, P8, P10, P18 | 
| HEUXIVA6: User control and freedom | 3 | P1, P4, P12 | 
| HEUXIVA4: Environment match between assistant and user language | 3 | P2, P7, P16 | 
| HEUXIVA1: System status visibility | 3 | P9, P15, P19 | 
| HEUXIVA3: Effective and fluid communication | 2 | P5, P11 | 
| HEUXIVA10: Help users recognize, diagnose, and fix errors | 2 | P3, P14 | 
| HEUXIVA12: Voice assistant reliability | 2 | P13, P20 | 
| HEUXIVA7: Consistent voice interface | 1 | P17 | 
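
The per-heuristic counts in the table above can be re-derived from the "covered by" column of the task table. The following is a minimal Python sketch of that tally, not the authors' tooling: the dictionary is transcribed by hand from the task table, and the names `PROBLEM_TO_HEURISTIC`, `count_problems`, and `group_problems` are hypothetical, introduced only for illustration.

```python
from collections import Counter

# Problem -> heuristic mapping, transcribed from the "covered by" column
# of the task table (T1-T8). The variable name is an illustrative assumption.
PROBLEM_TO_HEURISTIC = {
    "P1": "HEUXIVA6", "P2": "HEUXIVA4",                      # T1
    "P3": "HEUXIVA10", "P4": "HEUXIVA6",                     # T2
    "P5": "HEUXIVA3", "P6": "HEUXIVA5", "P7": "HEUXIVA4",    # T3
    "P8": "HEUXIVA5", "P9": "HEUXIVA1",                      # T4
    "P10": "HEUXIVA5", "P11": "HEUXIVA3",                    # T5
    "P12": "HEUXIVA6", "P13": "HEUXIVA12", "P14": "HEUXIVA10",
    "P15": "HEUXIVA1", "P16": "HEUXIVA4",                    # T6
    "P17": "HEUXIVA7", "P18": "HEUXIVA5",                    # T7
    "P19": "HEUXIVA1", "P20": "HEUXIVA12",                   # T8
}


def count_problems(mapping):
    """Return a Counter of how many problems each heuristic covers."""
    return Counter(mapping.values())


def group_problems(mapping):
    """Return {heuristic: [problem ids]} in the order the problems appear."""
    grouped = {}
    for problem, heuristic in mapping.items():
        grouped.setdefault(heuristic, []).append(problem)
    return grouped


counts = count_problems(PROBLEM_TO_HEURISTIC)
groups = group_problems(PROBLEM_TO_HEURISTIC)

# Print heuristics from most to least associated problems.
for heuristic, n in counts.most_common():
    print(f"{heuristic}: {n} problem(s) -> {', '.join(groups[heuristic])}")
```

Heuristics that never appear in the mapping (HEUXIVA2, HEUXIVA8, HEUXIVA9, HEUXIVA11, and HEUXIVA13) are simply absent from the tally, which matches their absence from the table.
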

| ID | Name | Description | Voice Assistant Feature | Usability/UX Attribute | 
|---|---|---|---|---|
| HEUXIVA1 | System status visibility | The device must indicate to the user via voice, sound or light every action that is being performed. | Effective communication, Voice interface, Activity management | Effectiveness, Efficiency, Useful, Valuable, Satisfaction | 
| HEUXIVA2 | System guidance and capabilities | The device must guide the user through dialog and activities using words that the user recognizes (and that do not increase their cognitive load). It should also explain its capabilities in a simple way. | Culturizable/adaptable, Voice interface, Effective communication, Activity management | Useful, Effectiveness, Wearable, Satisfaction, Desirable, Valuable | 
| HEUXIVA3 | Effective and fluid communication | The device must adapt to the context and situations that arise in the conversation, and remember previous requests and conversations with the user. | Effective communication, Effective, Multi-user | Efficiency, Effectiveness, Useful, Wearable | 
| HEUXIVA4 | Environment match between assistant and user language | The device must understand the user’s particular way of speaking, in addition to interacting in their language with words, phrases and concepts familiar to the user. | Culturizable/adaptable, Multi-user, Voice interface | Useful, Effectiveness, Valuable, Desirable | 
| HEUXIVA5 | Information accuracy | The responses delivered by the device must be relevant, brief and according to what is requested by the user. Similarly, the device must provide truthful information during interaction with the user. | Effective communication, Effective, Voice interface | Effectiveness, Efficiency, Useful, Valuable, Satisfaction | 
| HEUXIVA6 | User control and freedom | The device allows the user to perform, redo, and undo actions or requests. | Activity management, Effective communication, Customizable | Credible, Valuable, Satisfaction, Learning capacity, Useful | 
| HEUXIVA7 | Consistent voice interface | The device must be able to provide information through voice and be consistent in its personality. | Effective communication, Voice interface, Culturizable/adaptable, Activity management | Satisfaction, Credible, Desirable, Useful, Valuable | 
| HEUXIVA8 | Voice shortcuts, flexibility and personalization | The device should answer depending on the environment in which the user is located, providing shortcuts according to the context, allowing customization and adapting according to the needs of the user. | Customizable, Multi-user, Multi-linkable | Effectiveness, Efficiency, Satisfaction, Usable, Learning capacity | 
| HEUXIVA9 | Error prevention | The device must provide the necessary information to warn the user when an error is about to occur. | Effective communication, Voice interface | Effectiveness, Efficiency, Useful | 
| HEUXIVA10 | Help users recognize, diagnose, and fix errors | Error messages should be expressed in simple language (not codes), accurately indicate the problem, and constructively suggest a solution that mostly uses voice commands or actions. | Culturizable/adaptable, Voice interface, Multi-linkable | Valuable, Useful, Effectiveness, Efficiency | 
| HEUXIVA11 | Data privacy | The device must inform the user about the privacy and use of personal data. Likewise, it must grant the possibility of rejecting the collection and analysis of their data, thus being transparent and truthful with the user. | Security and privacy, Activity management | Valuable, Satisfaction, Credible | 
| HEUXIVA12 | Voice assistant reliability | Reliability must be transmitted through the behavior of the device both in interaction with the user and when the user is inactive. | Customizable, Security and privacy | Valuable, Credible | 
| HEUXIVA13 | Guides and documentation | The device must provide simple and comprehensive physical or electronic documentation of the internal and external workings of the device, either through a request from the user or external search. | Guidance and assistance | Findable/locatable, Valuable, Useful, Effectiveness, Satisfaction | 

| Study | Domain | Description | Number of Elements | Validation | Limitations | 
|---|---|---|---|---|---|
| Nielsen heuristics (1990) [1,31] | General desktop applications | Set of heuristics. Focus on usability. | 10 heuristics | Expert review, heuristic evaluation. | Not specific to voice assistants, limited to usability. | 
| Cowan et al. (2017) [9] | Intelligent personal assistants (IPAs) | 6 main areas related to usability/UX problems. Focus on user experience. | 6 key themes | Not reported | Only qualitative analysis; does not propose heuristics. | 
| Langevin et al. (2021) [10] | Conversational agents | Set of heuristics, adapted from Nielsen [31]. Focus on usability. | 11 heuristics | Expert review, heuristic evaluation. | Not specific to voice assistants; limited to usability. | 
| Sánchez-Adame et al. (2021) [11] | Chatbots | Set of heuristics. Focus on usability. | 5 heuristics | Expert review, heuristic evaluation. | Only for text-based devices, limited to usability. | 
| Zwakman et al. (2020) [12] | Voice assistants | Survey (scale), adapted from SUS [32]. Focus on usability. | 10 items | Quantitative validation (exploratory factor analysis). | Does not propose heuristics; limited to perceived usability. | 
| Nowacki and Gordeeva [37] | Voice user interfaces (VUIs) | Ergonomic criteria, based on [31,39]. Focus on usability and ergonomics. | 8 criteria and 20 sub-criteria | Preliminary user testing | Preliminary validation; does not propose heuristics; limited to ergonomics. | 
| HEUXIVA | Voice assistants | Set of heuristics, based on [9,10,11,12,31,37]. Focus on user experience. | 13 heuristics | Heuristic evaluation, expert judgment, user testing. | Preliminary validation scope (single device). | 

| ID | Name | Type | Origin | Novel Aspect Introduced for UX Evaluation | 
|---|---|---|---|---|
| HEUXIVA1 | System status visibility | Adapted heuristic | Heuristics: Nielsen and Langevin et al. | Focus on feedback (voice, light, sound) | 
| HEUXIVA2 | System guidance and capabilities | Adapted heuristic | Heuristics: Nielsen and Langevin et al. | Guidance through dialog and capability explanation | 
| HEUXIVA3 | Effective and fluid communication | New heuristic | Heuristics: Sánchez-Adame et al.; Ergonomic criteria: Nowacki and Gordeeva | Conversational fluidity, contextual continuity, memory | 
| HEUXIVA4 | Environment match between assistant and user language | New heuristic | Heuristics: Langevin et al. | Adaptation to user language and linguistic environment | 
| HEUXIVA5 | Information accuracy | New heuristic | Heuristics: Nielsen; Ergonomic criteria: Nowacki and Gordeeva | Accuracy, brevity, and contextual relevance of responses | 
| HEUXIVA6 | User control and freedom | Adapted heuristic | Heuristics: Nielsen and Langevin et al.; Ergonomic criteria: Nowacki and Gordeeva | Undo/redo through conversational commands | 
| HEUXIVA7 | Consistent voice interface | New heuristic | Heuristics: Nielsen and Langevin et al.; Ergonomic criteria: Nowacki and Gordeeva | Voice consistency and coherence | 
| HEUXIVA8 | Voice shortcuts, flexibility and personalization | New heuristic | Heuristics: Nielsen and Langevin et al.; Ergonomic criteria: Nowacki and Gordeeva | Voice shortcuts, customization, and adaptability | 
| HEUXIVA9 | Error prevention | Adapted heuristic | Heuristics: Nielsen and Sánchez-Adame et al.; Ergonomic criteria: Nowacki and Gordeeva | Preemptive voice feedback before execution | 
| HEUXIVA10 | Help users recognize, diagnose, and fix errors | Adapted heuristic | Heuristics: Nielsen and Langevin et al.; Ergonomic criteria: Nowacki and Gordeeva | Constructive voice-based error communication | 
| HEUXIVA11 | Data privacy | New heuristic | Heuristics: Langevin et al. | Data transparency, privacy management, user consent | 
| HEUXIVA12 | Voice assistant reliability | New heuristic | Heuristics: Langevin et al.; Ergonomic criteria: Nowacki and Gordeeva | Reliability and trust in autonomous voice behavior | 
| HEUXIVA13 | Guides and documentation | Adapted heuristic | Heuristics: Nielsen | Simplified physical and digital documentation | 
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Quiñones, D.; Rojas, L.F.; Serrá, C.; Ramírez, J.; Barrientos, V.; Cano, S. HEUXIVA: A Set of Heuristics for Evaluating User eXperience with Voice Assistants. Appl. Sci. 2025, 15, 11178. https://doi.org/10.3390/app152011178