Article

User Experience Enhancement of a Gamified Speech Therapy Program Using the Double Diamond Design Framework

1 Department of Industrial and Management Engineering, Pohang University of Science and Technology, Pohang 37673, Republic of Korea
2 Hidden Figures Inc., Seoul 04799, Republic of Korea
3 School of Contents Convergence Design, Handong Global University, Pohang 37554, Republic of Korea
4 Department of Industrial and Safety Engineering, University of Ulsan, Ulsan 44610, Republic of Korea
5 Department of Physical Medicine and Rehabilitation, Jeonbuk National University Medical School, Jeonju 54907, Republic of Korea
6 Research Institute of Clinical Medicine of Jeonbuk National University—Biomedical Research Institute of Jeonbuk National University Hospital, Jeonju 54907, Republic of Korea
7 Department of Speech-Language Pathology, Jeonbuk National University, Jeonju 54896, Republic of Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(2), 826; https://doi.org/10.3390/app16020826
Submission received: 6 December 2025 / Revised: 3 January 2026 / Accepted: 8 January 2026 / Published: 13 January 2026
(This article belongs to the Special Issue Novel Approaches and Applications in Ergonomic Design, 4th Edition)

Abstract

The global rise in childhood speech disorders highlights the need for accessible and engaging home-based rehabilitation tools. This study applied the Double Diamond design framework to enhance the user experience (UX) of Smart Speech, a gamified functional speech therapy program. Using heuristic evaluation, expert interviews, and benchmarking, six core UX problem areas were identified, including insufficient guidance, low personalized motivation, limited feedback, and accessibility issues. Through an iterative ideation process, 78 UX improvement concepts were generated, encompassing motivational reinforcement (e.g., praise stickers and character interaction), automated training guidance, enhanced feedback mechanisms, and error-prevention features. A usability evaluation with 20 participants, including speech-language pathologists (SLPs) and parents, showed significant improvements across key dimensions, with increases of 1.1 to 2.6 points on a 7-point scale. These findings demonstrate that systematic UX design can substantially improve engagement, usability, and the potential therapeutic utility of home-based speech therapy systems.

1. Introduction

The prevalence of speech disorders among children continues to increase globally, stimulating research into more interactive and user-centered treatment methods. In particular, gamified functional speech therapy games have proven effective in enhancing motivation and therapeutic outcomes [1]. For instance, in the United States, approximately 1.2 million children under 12 years old were diagnosed with speech disorders in 2022—nearly twice the 600,000 reported in 2019 [2]. Similarly, in the Republic of Korea, the number of children under 14 with speech disorders increased 2.3-fold, from 2,023 in 2016 to 4,612 in 2023 [3]. Over the past two decades, interdisciplinary research integrating neuroscience, linguistics, and computer science has advanced speech therapy through therapeutic and technological integration [4]. Among these, gamified programs are particularly effective for maintaining engagement among children who often lose interest in repetitive rehabilitation exercises [5].
Various functional speech therapy games have been developed and validated to support effective treatment. Representative gamified programs that have demonstrated positive effects on speech improvement and motivation include Dr. Speech (Tiger DRS, Inc., Seattle, WA, USA) [6], Speech Mirror (CluSoft, Seoul, Republic of Korea) [7], TheraVox (WEVOSYS Medical Technology GmbH, Baunach, Germany) [8], Apraxia World (Texas A&M University, College Station, TX, USA), Feeling Factory (Drexel University, Philadelphia, PA, USA), MITA (ImagiRation LLC, Boston, MA, USA), SpokeIt (University of California, Santa Cruz, Santa Cruz, CA, USA), and Smart Speech (Humanopia, Inc., Pohang, Republic of Korea) [9,10,11,12,13,14,15,16]. Apraxia World, designed for children with hearing and speech impairments, combines avatar-controlled gameplay with speech training and has been shown to increase voluntary participation and therapy duration [11]. MITA, an early-intervention program for children with autism spectrum disorder, achieved a 120% improvement in users’ language scores [16]. Smart Speech, a PC-based functional speech therapy game co-developed by Jeonbuk National University Hospital and Pohang University of Science and Technology, provides game-based training for articulation and speech disorders [15]. Its development incorporated input from key stakeholders—including speech therapists, physicians, and parents of children with speech disorders—to ensure clinical relevance and usability. In a study of 30 patients aged 25–83, participants trained with Smart Speech for six weeks (30 min per session, three times weekly), resulting in significant improvements in phonation time, jitter, and U-TAP scores (p < 0.05) [15].
While many speech therapy games have enhanced usability through post-development testing, the existing literature on serious games for health remains largely limited to initial development or component-level evaluation. A significant gap therefore remains in research that applies structured service design methodologies, such as the Double Diamond framework, to the systematic renewal of established therapeutic tools so that the overall user experience (UX) improves across the entire therapy process. Apraxia World was evaluated with 11 children diagnosed with speech disorders, and while improvement points were identified, the evaluation was limited to specific in-game preferences (e.g., timing of training levels, avatar control) [11]. Similarly, SpokeIt was improved through a co-creation design approach involving children post–cleft lip surgery, their parents, and speech therapists [10]. However, these efforts focused primarily on interface intuitiveness and feedback for individual game tasks rather than the overall therapy process. Smart Speech was also evaluated by 19 experts (12 physicians and 7 speech therapists), who emphasized the need for improvements in instructions, color contrast, tutorial videos, voice support, accessibility, and data management [12]. These findings underscore the need to enhance overall UX beyond basic usability to support independent home-based use and maximize therapeutic effectiveness. To address this gap, it is essential to establish a systematic renewal process that integrates clinical reasoning from healthcare experts into UX design, theoretically anchoring the design interventions within Self-Determination Theory (SDT) and the Technology Acceptance Model (TAM) [17,18,19]. This theoretical and clinical linkage is crucial, as optimized user engagement in digital interventions has been shown to significantly increase therapy adherence, a primary predictor of successful clinical outcomes [17,18].
To ensure scholarly precision, we distinguish between usability, user experience (UX), and engagement in this study. While usability focuses on the functional efficiency and learnability of the interface, UX encompasses the broader emotional and perceptual responses of the user. Furthermore, engagement is defined as the motivational drive that ensures sustained therapy adherence—a critical outcome in digital rehabilitation. By clarifying these constructs, we establish a multi-dimensional evaluative framework to analyze how specific design interventions, categorized under these three pillars, translate into clinical value through the systematic renewal process.
The present study qualitatively investigates the comprehensive user experience (UX) of Smart Speech using the structured Double Diamond UX design framework. From the user’s perspective, pain points and contextual usability issues were analyzed to identify improvement strategies. Based on these findings, UX-enhanced design solutions were developed, and a comparative usability evaluation between the original and improved versions of Smart Speech was conducted to assess the effectiveness of the proposed improvements.

2. Application of the Double Diamond Design Framework

This study applied the Double Diamond design framework [20] to improve the user experience (UX) of Smart Speech. The framework consists of four iterative stages—Discover, Define, Develop, and Deliver—and is widely adopted in service and experience design for user-centered innovation. Unlike single-method approaches such as usability testing, which address specific interaction issues, the Double Diamond framework holistically explores all experiences related to a product or service. For example, in the case of a game console, it encompasses not only the act of playing but also pre- and post-use interactions, such as setup, maintenance, and storage [21]. In this study, the framework was used to systematically identify usability challenges and unmet user needs across the full Smart Speech user journey.

2.1. Discover Phase

The first stage, Discover, involved investigating the experiences of key stakeholders to uncover pain points and UX improvement opportunities on Smart Speech. Three complementary methods—heuristic evaluation, expert interviews, and benchmarking—were employed to triangulate insights. A heuristic evaluation was conducted by five UX specialists based on Jakob Nielsen’s ten usability heuristics [22]. Each screen and interaction flow of Smart Speech—including installation procedures, game selection, and training setup—was analyzed to identify usability issues and inconsistencies. Semi-structured interviews were conducted with ten speech disorder and rehabilitation experts (one rehabilitation physician, two speech therapy professors, and seven SLPs). Each participant used Smart Speech for approximately one week before being interviewed. The interview questions, developed by UX researchers, addressed gameplay context and included follow-up probes derived from the heuristic evaluation results.
To identify child-oriented design features that could enhance engagement, benchmarking was performed on 17 educational programs, including Khan Academy Kids and Todo Math. This analysis focused on instructional clarity, motivational strategies, and visual design elements relevant to Smart Speech.
The Discover phase revealed 139 UX improvement opportunities, primarily related to insufficient instructional guidance and limited user motivation. The heuristic evaluation uncovered 85 major issues, including (1) inadequate program explanations, (2) lack of sustained interest, (3) insufficient feedback, (4) inadequate constraint design, (5) inconsistent interactions, and (6) distracting in-game elements (Table 1). The expert interviews generated 39 improvement points, identifying UX deficiencies in four areas: (1) personalized motivation, (2) detailed training methods and progress feedback, (3) customized training tasks, and (4) game reliability (Table 2). Finally, benchmarking produced 15 improvement points across three dimensions: (1) motivation for the primary user group, (2) provision of detailed learning guidance, and (3) development of a customized graphical user interface (GUI) (Table 3).

2.2. Define Phase

The second stage, Define, aims to clearly identify and structure the key issues requiring improvement based on the UX data collected during the Discover phase. In this stage, the collected user experiences were analyzed to uncover users’ explicit needs and latent problems, which were then organized to define actionable improvement areas. The UX data obtained through the three research methods—heuristic evaluation, expert interviews, and benchmarking—were converted into pain point datasets. The affinity diagram technique was applied to filter irrelevant information, merge duplicates, and group similar issues to form higher-level improvement categories. Additionally, cross-validation was performed to confirm whether an issue identified in one screen or game also appeared in other games with similar mechanics. For example, if a usability issue was detected in one game involving pitch setting and vocalization (e.g., the Airplane Game), data were re-examined in similar games (e.g., the Piano Game) to ensure consistency. After grouping and verifying all issues, representative labels were assigned to define the final pain point structure.
Based on the Discover phase findings, 58 improvement points were identified in the Define phase and categorized into six major groups (Table 4) as follows:
  • Lack of effective usage methods: Users lacked clear explanations of how each training game contributed to therapy and how to play effectively. For example, in the breathing training game (Paper Blowing), the instruction only stated “s (스 in Korean),” whereas the actual task required producing the /s/ sound by blowing air between the tongue and palate. This lack of clarity limited therapeutic precision.
  • Lack of personalized motivation: The system provided insufficient incentives to start or sustain training, especially for children with speech disorders. While some exercises (e.g., Candle Blowing, Cat Rescue) offered engaging visual feedback and scores, others (e.g., Oral Motor and Word Games) lacked mechanisms to maintain user motivation.
  • Lack of constraints to prevent errors: The interface allowed unrestricted inputs (e.g., entering non-numeric or excessively large values for duration settings), causing confusion and reducing reliability.
  • Lack of affordance cues: The interface provided inadequate visual or interactive indicators for clickable or controllable elements. For example, in the pitch control training, users were not guided to set the target pitch before starting, leading to skipped steps.
  • Lack of customized GUI: The visual design did not adequately consider children or users with disabilities. Overly colorful or visually complex screens distracted attention from key gameplay elements (e.g., the tube in the Tube Blowing game). Moreover, text-heavy interfaces made it difficult for non-literate users to navigate effectively.
  • Lack of feedback for correct usage: Feedback mechanisms confirming system recognition of vocalizations were insufficient. For instance, after performing a microphone test, no confirmation was provided, leaving users uncertain whether their input was detected.
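The "lack of constraints" pain point above (unrestricted duration inputs) can be made concrete with a small input-validation sketch. The function name and the 1–60 s bounds are hypothetical illustrations, not taken from Smart Speech:

```python
def validate_duration(raw: str, min_s: int = 1, max_s: int = 60) -> int:
    """Validate a training-duration setting entered as free text.

    Rejects non-numeric input and values outside [min_s, max_s] --
    the kind of constraint the original interface lacked, which
    allowed non-numeric or excessively large values.
    """
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"Duration must be a whole number of seconds, got {raw!r}")
    if not (min_s <= value <= max_s):
        raise ValueError(f"Duration must be between {min_s} and {max_s} seconds")
    return value
```

Surfacing such a check at the input field (rather than failing silently later) is one way the improved design could prevent the confusion described above.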

2.3. Develop Phase

The third stage, Develop, focuses on generating concrete solutions to the pain points defined in the previous phase and consolidating these ideas into actionable UX improvement proposals. In this stage, multiple ideation techniques—such as mind mapping, brainstorming, and mandal-art—were employed to derive a broad range of concepts. Five UI designers individually sketched initial ideas for each pain point, as shown in Figure 1, and then collaboratively reviewed and refined them through group discussions. While some solutions emerged from a single idea, most final concepts resulted from combining and iteratively improving two or more ideas. To prevent fragmented or conflicting design directions, pain points were first reorganized by functional category or screen context, and ideation proceeded based on these clusters. For example, although issues such as the absence of microphone setup reminders, lack of affordance on the microphone test button, and lack of signifiers for starting the microphone test belonged to different pain point groups, they were addressed collectively because they all relate to “microphone setup” interactions within the settings screen. The designers’ proposals spanned graphic improvements, information architecture (IA) restructuring, and new functional features. For instance, to reduce visual distraction during gameplay, one idea involved lowering the saturation of background elements while maintaining visual emphasis on core interactive elements (e.g., the tube in the tube-blowing game). IA-oriented solutions included a mandatory pop-up prompting microphone setup on the first program launch. Additionally, new motivational features—such as a praise-sticker reward system—were proposed to enhance user engagement.
The Develop stage yielded a total of 78 improvement proposals, which were organized into three overarching design directions (Figure 2): (1) providing motivational elements to sustain continuous engagement; (2) offering guidance and features that support correct execution of each training task to maximize therapeutic effectiveness; and (3) presenting GUI and UI components tailored to the characteristics and needs of child users with speech disorders. For continuous motivation, the proposals included reward-based features such as praise stickers granted upon completing individual games or daily missions, as well as motivational enhancements delivered by friendly characters that address users by name. For self-guided training support, improvements involved developing a mandatory, comprehensive usage guide displayed upon the first program launch (with optional access thereafter), offering detailed explanations of each training task and its therapeutic purpose, and implementing mission maps and daily missions categorized by difficulty to help first-time users navigate Smart Speech without confusion. The final direction centered on UI enhancements tailored to the primary users—children with speech disorders. Proposed solutions addressed usability gaps and adapted the interface to children’s cognitive and perceptual needs, including restricting input ranges for difficulty settings to prevent errors and integrating voice-guided instructions to ensure accessibility for non-literate users.
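The reward-based motivational direction above (praise stickers granted on completing individual games or daily missions) can be sketched in a few lines. The class name, game identifiers, and mission rule are illustrative assumptions, not the actual Smart Speech implementation:

```python
class RewardTracker:
    """Minimal sketch of a praise-sticker rule: one sticker per
    completed game; a daily mission is met once all of its required
    games have been completed that day."""

    def __init__(self):
        self.stickers = 0
        self.completed_today = set()

    def complete_game(self, game: str) -> int:
        """Record a completed game and award one praise sticker."""
        self.stickers += 1
        self.completed_today.add(game)
        return self.stickers

    def daily_mission_done(self, required: set) -> bool:
        """True once every game required by the mission is completed."""
        return required <= self.completed_today
```

In a real system this state would drive the character praise and sticker-board screens rather than return plain values.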

2.4. Deliver Phase

Finally, in the Deliver stage, the proposed solutions were translated into a functional prototype. This phase focuses on visualizing the improved concepts as detailed screens or interactive scenarios, validating feasibility, and refining design completeness through iterative feedback. Prototypes are typically strengthened through collaboration with relevant stakeholders, increasing alignment with their expectations and practical needs. For all improvement items requiring interface modifications, high-fidelity prototypes were produced using Figma Desktop App version 116.15.4 (Figma, Inc., San Francisco, CA, USA) and Adobe Illustrator v25.0 (Adobe Inc., San Jose, CA, USA). The visual style and interaction patterns were intentionally aligned with the original Smart Speech design to ensure aesthetic and functional consistency. The prototype screens were then shared with stakeholders, who provided targeted feedback regarding clarity, usability, and therapeutic appropriateness. Based on this input, additional refinements were incorporated to finalize the improved designs. The resulting prototypes—examples of which are shown in Figure 3—reflect integrated enhancements, with multiple UX improvements often applied within a single screen.

3. Validation of Improvement Effects

To assess the effectiveness of the proposed UX improvements, usability evaluations were conducted using the original and improved versions of Smart Speech. From the total of 78 improvement proposals, five researchers selected nine representative items as shown in Table 5, prioritizing those expected to substantially influence overall usage or address critical usability barriers. For evaluation purposes, paired comparison videos were created for each improvement. To minimize bias, the videos did not indicate whether they represented the original or the improved version.

3.1. Method

3.1.1. Participants and Materials

Twenty adults participated in the study: ten SLPs (the primary expert users of Smart Speech) and ten parents in their 30s and 40s raising children aged 4 to 10.
To evaluate the effectiveness of the proposed UX improvements, a video-based comparative evaluation was conducted. While functional prototypes were developed in Figma to simulate the improved interactions, video representations were chosen for the formal validation to ensure high experimental control. This approach allowed participants to focus strictly on the specific design interventions without being distracted by unrelated gameplay mechanics or technical latencies. Furthermore, using standardized simulations minimized potential confounding variables such as the learning effect or the familiarization period required for new users to master the software’s basic controls [23,24]. This method effectively captures the dynamic aspects of the intended interaction while maintaining consistent evaluation conditions across all participants [24].

3.1.2. Experimental Procedure

The evaluations were conducted in participants’ natural environments, including homes and speech therapy clinics. After briefly familiarizing themselves with Smart Speech and the evaluation procedure, participants viewed paired videos showing the original and improved versions of each UX improvement and rated both versions on a 7-point scale; they also provided a direct comparative rating of the improved version relative to the original, using the same scale. The evaluation items comprised seven categories: visibility, learnability, motivation, effectiveness, familiarity, error prevention, and overall satisfaction. Items irrelevant to a given improvement were excluded. For example, for Improvement No. 7 (Recognizable UI), which enlarges the mouse cursor and adapts its appearance to the interaction context, learnability and motivation were considered irrelevant, so only the visibility- and interaction-related measures were applied.

3.1.3. Analysis Method

Descriptive statistics for each evaluation item were calculated using Minitab 19. Paired t-tests were performed to determine statistically significant differences between the original and improved versions.
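As a concrete illustration of this analysis step, the paired t statistic can be computed directly from matched original/improved ratings. The ratings below are made up for illustration only; the study itself analyzed 20 participants in Minitab 19:

```python
import math
import statistics

def paired_t_statistic(before, after):
    """Paired t statistic for matched before/after ratings:
    t = mean(d) / (stdev(d) / sqrt(n)), where d = after - before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# Hypothetical 7-point ratings from five participants (illustration only)
original = [4, 5, 3, 4, 5]
improved = [6, 6, 5, 6, 6]
t = paired_t_statistic(original, improved)  # compare against t-distribution, df = n - 1
```

For these illustrative numbers, t is roughly 6.5 with df = 4, which corresponds to p < 0.01 in a two-tailed test, mirroring the significance levels reported in Section 3.2.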

3.2. Results

Figure 4 presents a consolidated quantitative comparison of the usability evaluation across the nine improvement proposals. All proposals produced statistically significant gains relative to the original system, with overall satisfaction increasing by 1.1 to 2.2 points across the redesigned variants (p < 0.01 to p < 0.001). The improvements were also consistent across individual dimensions: 33 of 36 dimension-level comparisons (92%) across visibility, learnability, familiarity, motivation, effectiveness, error prevention, and satisfaction showed gains of at least 1.0 point, underscoring the consistency and magnitude of the design effects.
  • Motivational and Engagement Gains: The most substantial impact on user engagement was driven by Improvements 1 (Sufficient Guide & Motivation) and 2 (Solid Motivation). These proposals achieved two of the top four satisfaction gains, with Motivation scores increasing by 1.5 and 2.2 points, respectively (p < 0.01).
  • Guidance and System Transparency: Proposals focused on clarity, such as Friendly Situational Help (Improvement 3) and Required Setting Notification (Improvement 4), showed statistically significant improvements in learnability (2.0 and 2.2) and visibility (1.8 and 2.6). Notably, the 2.6-point visibility leap in Improvement 4 represented the largest single-dimension gain across the entire study.
  • Structural Constraints and Error Prevention: Implementing structural constraints through a Sequentially Constrained UI (Improvement 5) and Clear Hierarchy (Improvement 6) effectively mitigated setup-related errors. Error-prevention scores rose by 1.7 and 1.2 points, respectively, demonstrating that enforcing task order and restricting invalid inputs directly correlates with increased overall satisfaction (1.4 and 1.3 points; p < 0.01).
  • Perceptual Clarity: The final three proposals (Improvements 7, 8, and 9) focused on sensory feedback and signifiers. Recognizable UI produced one of the study’s largest visibility increases (2.5 points), while Clear Signifiers led in error prevention (1.5 points). Additionally, Complete Feedback improved both learnability (1.3) and error prevention (1.4), underscoring the critical role of immediate perceptual confirmation during system parameter settings.

4. Discussion

This study demonstrates the value of applying the widely used Double Diamond design framework to the systematic renewal of a therapeutic game, filling a methodological gap where structured design is seldom used to bridge clinical goals with user-centered needs. Unlike previous studies that often relied on single methods [11,25] or focused on specific gameplay moments [10,26], we concurrently applied heuristic evaluation, expert interviews, and benchmarking across all four Double Diamond stages. This multi-method approach enabled a comprehensive UX process spanning the full journey of in-home use—from system initialization to exercise selection—ensuring that improvements were closely aligned with real-world user experiences [27,28]. By mapping clinical reasoning to these design interventions, we theoretically anchored the improvements in TAM and SDT [19], using features like character-led motivation to support user competence and autonomy. Ultimately, these usability gains serve as a necessary gateway to therapeutic effectiveness, as optimized engagement in digital interventions is proven to increase therapy adherence, a primary predictor of successful clinical outcomes [17,18].
Another distinct feature of this study lies in its cross-disciplinary and stakeholder-inclusive design process. The improvements were derived and refined through collaborative verification involving speech therapy experts, UX designers, UI engineers, developers, ergonomics specialists, and prospective users—all key stakeholders in therapeutic game development. Although previous work has involved multi-expert collaboration—for example, Scott et al. (2024) engaged speech therapists, developmental psychologists, and game designers—the concretization of goals has often been led by psychology or philosophy experts, with limited leadership from designers [29]. Furthermore, many prior usability evaluations have been restricted to either expert-only or user-only samples, with development frequently executed solely by developers [30]. In contrast, this study is meaningful in that UX designers and UI engineers played a central leadership role throughout the Double Diamond process, while speech therapy experts and other stakeholders provided domain-specific insights, resulting in improvements that balance clinical appropriateness with practical usability.
Furthermore, this study highlights the theoretical implications of adopting a UX-led design approach over traditional developer-centric models in healthcare. Developer-centric approaches often perceive a tension between ‘medical rigor’ and ‘user engagement,’ prioritizing functional stability at the expense of usability. However, our findings demonstrate that UX-led interventions resolve this tension by reframing ‘engagement’ not as an optional add-on, but as a critical clinical function that drives adherence. By translating rigid clinical protocols into intuitive game mechanics, the UX-led framework proves that medical devices can be both rigorous and engaging.
A major limitation of the study is that the validation of improvements was conducted using video-based representations rather than fully implemented game versions. While this approach may not fully capture the embodied cognitive load or real-time learning effects, it was purposefully adopted as a formative evaluation step to isolate the impact of specific UX interventions from the confounding ‘noise’ of general system familiarization. By presenting idealized simulations derived from high-fidelity Figma prototypes, we ensured that participants evaluated the ‘intended interaction state’ without the interference of technical latencies or unrelated gameplay mechanics. This strategy is consistent with established HCI methodologies, which suggest that video prototyping is a superior medium for capturing the dynamic aspects of interaction while minimizing extraneous learning effects that could skew early-stage usability perceptions [23,24].
Another limitation is the reliance on adult proxies—SLPs and parents—rather than direct evaluation by child users. In the early formative stages, prioritizing expert and caregiver feedback was essential to ensure clinical integrity and operational safety [23,24]. However, we acknowledge that adult proxy evaluations may not fully capture the intrinsic motivation or ludic engagement patterns unique to children [31,32]. This study thus focused on establishing a robust clinical and functional framework as a necessary precursor to direct pediatric testing.
Despite this limitation, the application of the study’s results and the deployment of the improved Smart Speech system have important practical implications. For speech-impaired patients who face financial or logistical barriers to frequent face-to-face therapy, an enhanced home-based rehabilitation tool can support more continuous training while enabling close therapist–program collaboration to accelerate progress and potentially reduce in-person sessions. The transition from clinically sufficient but unstructured content to a guided, mission-based system significantly reduced the ‘decision burden’ for users, a key factor in preventing early drop-outs. Features like detailed training records (Improvement 2) further bridge the gap between home-based training and professional oversight, enabling more precise clinical guidance without increasing in-person visits.
The core UX issues identified in this study, the lack of continuous motivation and insufficient guidance, were consistently identified as critical barriers by both UX designers and speech therapy experts (physicians and SLPs). To address these shared concerns, we introduced structured mission maps and motivation-enhancing features, such as friendly characters and praise stickers, to guide users through training sequences. While these interventions significantly outperformed the original version, the practical value of these results lies in the qualitative leap they represent. The observed score increments effectively move the baseline from a ‘marginal’ or ‘neutral’ state to a ‘satisfied’ or ‘excellent’ level [33]. An increase exceeding 1.0 point on a 7-point scale—rarely achieved through minor interface adjustments—indicates that fundamental usability barriers were successfully addressed, transforming sub-optimal experiences into robustly positive interactions essential for sustained clinical adherence [34]. While these usability gains do not directly measure clinical speech recovery, they serve as a critical gateway to therapeutic effectiveness. By transforming ambiguous setup procedures into intuitive interactions, these UX interventions ensure that the user’s cognitive resources are reserved for the therapeutic task itself rather than struggling with the system interface, thereby fostering the sustained engagement essential for clinical adherence. Parent interviews further suggested that motivation could be strengthened by visualizing accumulated rewards (e.g., prizes for parents when children collect a certain number of praise stickers) and by having characters address children by name with personalized encouragement. Such guidance and motivation-focused improvements are likely to be broadly applicable to other therapeutic games and may enhance usage and facilitate the potential for improved therapeutic outcomes in self-care contexts. 
Beyond the motivational strategies implemented here (e.g., praise stickers, certificates, character praise, game item acquisition), addressing issues such as appropriate difficulty adjustment and prevention of frustration from repeated failures, as highlighted by Saeedi et al. (2022), and incorporating social motivation mechanisms such as teamwork and friendly competition, as proposed by Pereira et al. (2019), could further increase user engagement and game usage in future iterations [35,36].
Crucially, the visibility of the renewed game recorded the most substantial improvement (+2.6 points). This sharp increase was not merely a result of superficial interface adjustments, such as font resizing. Rather, it indicates the resolution of a fundamental functional issue in the legacy system: the 'ambiguity of feedback.' The original system often left users uncertain about whether their speech input was successfully recognized, creating significant cognitive friction. By implementing immediate, multimodal feedback (visual cues and sound effects) for every interaction, the renewed design eliminated these functional blind spots, thereby assuring users that the system had registered their actions.
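The multimodal feedback principle described above can be illustrated with a minimal sketch. This is an illustrative assumption only; the actual Smart Speech implementation is not documented here, and all names (function, cue labels) are hypothetical:

```python
# Illustrative sketch of immediate multimodal feedback for speech input.
# All names are hypothetical; Smart Speech's actual implementation may differ.
from dataclasses import dataclass

@dataclass
class Feedback:
    visual_cue: str   # e.g., an icon or animation shown on screen
    sound_cue: str    # e.g., a short confirmation or retry sound

def on_speech_input(recognized: bool, loud_enough: bool) -> Feedback:
    """Return an explicit cue for every input event so users are never
    left wondering whether their speech was registered."""
    if not loud_enough:
        # Input detected but too quiet: prompt a microphone check.
        return Feedback("mic_level_warning", "soft_retry_tone")
    if recognized:
        return Feedback("green_checkmark_animation", "success_chime")
    # Even a failed recognition gets immediate, non-punitive feedback.
    return Feedback("gentle_retry_animation", "neutral_retry_tone")
```

The key design point is that every branch, including failure, returns both a visual and an auditory cue, closing the "feedback blind spots" discussed above.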
Future research should extend usability evaluation to the full set of proposed improvements and conduct clinical trials to verify the speech therapy effectiveness of Smart Speech. While video-based comparisons were employed in this study as a preliminary step to validate specific design concepts and clinical reasoning, subsequent research will involve iterative pre-testing to ensure the system’s technical completeness and stability. During this preliminary stage, the reliance on adult proxies was a strategic choice to ensure clinical integrity before pediatric deployment; however, we plan to bridge the potential divergence between adult reports and pediatric experiences [31,32] through direct longitudinal testing. Once this maturity is achieved, we plan to conduct a 12-week randomized controlled trial (RCT) with children diagnosed with speech disorders to empirically verify the therapeutic effectiveness of the improved Smart Speech based on the hypothesis that enhanced usability (TAM) and autonomy (SDT) will directly improve therapy adherence [19]. Unlike the present formative evaluation, this longitudinal study will utilize the actual interactive game to monitor concrete clinical endpoints, including therapy adherence (usage frequency), training dosage (completed exercises), and clinical speech outcomes such as phoneme accuracy and intelligibility [17,18]. By establishing this causal pathway, we aim to confirm that systematic UX design functions as a critical mediator for maximizing clinical efficacy in home-based digital rehabilitation [19].

5. Conclusions

This study applied the Double Diamond UX design framework to systematically improve the user experience of Smart Speech, a functional speech therapy game. Through heuristic evaluation, expert interviews, and benchmarking, major pain points related to insufficient guidance, low motivation, inadequate feedback, and interface inconsistencies were identified. Using these insights, 78 UX improvement proposals were generated, and nine representative enhancements were validated through a comparative usability evaluation. All nine improvements significantly increased user ratings across multiple usability dimensions, with 92% of dimension-level comparisons showing increases of 1.0 point or more. Features such as mission maps, daily missions, motivational characters, and structured task flow demonstrated strong potential to enhance learnability, visibility, error prevention, and overall satisfaction. These results highlight the importance of integrating user-centered design principles into therapeutic game development and demonstrate that high-quality UX directly contributes to more sustainable and engaging home-based rehabilitation. Beyond specific usability metrics, this study contributes to the field of assistive technology by demonstrating a paradigm shift in development leadership: moving from an engineering-driven to a UX-designer-led process can fundamentally transform clinical software. By positioning UX designers as the bridge between clinical requirements and technical implementation, this research provides a replicable model for developing "human-centered" medical devices that respect the patient's experience as much as the therapeutic outcome. A major limitation of this study is the reliance on video-based prototypes for usability testing. Future research should evaluate fully interactive implementations of the improvements and conduct longitudinal clinical trials to assess their therapeutic effects.
Nonetheless, the findings provide a strong foundation for advancing UX-driven speech therapy tools. By addressing core motivation and guidance issues, the improved Smart Speech system has the potential to support more frequent and effective home-based rehabilitation for individuals with speech disorders, ultimately enhancing accessibility and clinical outcomes in digital speech therapy.

Author Contributions

Conceptualization, S.K. and H.Y.; methodology, S.K., Y.C. and H.Y.; design, S.K., E.K. and J.Y.; ideation, S.K., E.K., J.Y., Y.C., M.-H.K., Y.-j.J., H.-G.K. and H.Y.; validation, S.K., E.K., J.Y., Y.C. and H.Y.; analysis, S.K. and Y.C.; writing—original draft preparation, S.K., Y.C. and H.Y.; writing—review and editing, S.K., E.K., J.Y., Y.C., M.-H.K., Y.-j.J., H.-G.K. and H.Y.; visualization, S.K., Y.C. and H.Y.; supervision, H.Y.; project administration, H.Y.; funding acquisition, H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was jointly supported by the research programs (2022R1A2C1013198; RS-2024-00322570) through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (MEST), the Industrial Technology Innovation Program (RS-2025-02653087) funded by the Ministry of Trade, Industry and Energy (MOTIE), and the Institute of Information & Communications Technology Planning & Evaluation (IITP)-Global Data-X Leader HRD program grant funded by the Korea government (MSIT) (IITP-2024-RS-2024-00441244), and the Biomedical Research Institute Fund, Chonbuk National University Hospital.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board (IRB) of the Pohang University of Science and Technology (PIRB-2024-E014 on 31 July 2024).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available from the corresponding author upon reasonable request.

Acknowledgments

ChatGPT-4o (OpenAI) was used to support English language refinement during manuscript preparation. All generated content was carefully reviewed and edited by the authors, who take full responsibility for the final manuscript.

Conflicts of Interest

Author Eunjin Kwon was employed by the company Hidden Figures Inc. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
UX: User Experience

References

  1. Dan, B. Gamification of therapy: The fun factor in rehabilitation. Dev. Med. Child Neurol. 2022, 64, 276. [Google Scholar] [CrossRef] [PubMed]
  2. Khan, T.; Freeman, R.; Druet, A. Louder than Words: Pediatric Speech Disorders Skyrocket Throughout Pandemic. Available online: https://www.komodohealth.com/ (accessed on 25 August 2025).
  3. Statistics Korea. Number of Registered Persons with Disabilities (by Age). KOSIS. Available online: https://kosis.kr/statHtml/statHtml.do?orgId=117&tblId=DT_11761_N003 (accessed on 25 August 2025).
  4. Georgiou, A.M.; Jerger, S. Editorial: Methods in speech and language: 2023. Front. Hum. Neurosci. 2024, 18, 1475311. [Google Scholar] [CrossRef] [PubMed]
  5. Liu, N.; Barakova, E.I.; Zhang, F.; Han, T.; Feng, J. Motivating Online Game Intervention to Enhance Practice Engagement in Children with Functional Articulation Disorder. In Proceedings of the HAI ’23: International Conference on Human-Agent Interaction, Gothenburg, Sweden, 28 November–1 December 2023; ACM: New York, NY, USA, 2023. [Google Scholar] [CrossRef]
  6. Tiger DRS, Inc. Dr. Speech; Computer software; Tiger DRS, Inc.: Seattle, WA, USA, 2025; Available online: https://www.drspeech.com (accessed on 25 August 2025).
  7. CluSoft. Speech Mirror; Computer software; CluSoft: Seoul, Republic of Korea, 2025; Available online: https://www.speechmirror.com/ (accessed on 25 August 2025).
  8. WEVOSYS Medical Technology GmbH. Thera Vox; Computer software; WEVOSYS Medical Technology GmbH: Baunach, Germany, 2025; Available online: https://www.wevosys.de (accessed on 25 August 2025).
  9. Dunn, R.; Vyshedskiy, A. Mental Imagery Therapy for Autism (MITA)—An early intervention computerized brain training program for children with ASD. Autism Open Access 2015, 5, 153. [Google Scholar] [CrossRef]
  10. Duval, J.S.; Márquez Segura, E.; Kurniawan, S. SpokeIt: A co-created speech therapy experience. In Proceedings of the Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; p. D501. [Google Scholar] [CrossRef]
  11. Hair, A.; Ballard, K.J.; Monroe, P.; Ahmed, B.; Gutierrez-Osuna, R. Apraxia World: A speech therapy game for children with speech sound disorders. In Proceedings of the 17th ACM Conference on Interaction Design and Children, Trondheim, Norway, 19–22 June 2018; pp. 119–131. [Google Scholar] [CrossRef]
  12. Kim, G.-W.; Kang, S.-R.; Han, K.S.; Jo, Y.-J.; Kim, R.-Y.; Jung, Y.-J.; Kim, J.-H.; You, H.; Ko, M.-H. A usability evaluation for the development of speech therapy solutions for speech-impaired people. Korean J. Cleft Lip Palate 2021, 24, 68–76. [Google Scholar] [CrossRef]
  13. Hajesmaeel-Gohari, S.; Goharinejad, S.; Shafiei, S.; Bahaadinbeigy, K. Digital games for rehabilitation of speech disorders: A scoping review. Health Sci. Rep. 2023, 6, e1308. [Google Scholar] [CrossRef] [PubMed]
  14. Lyon, N.E. Feeling Factory: A Prosody Improvement Game for Children with ASD. Ph.D. Thesis, Drexel University, Philadelphia, PA, USA, 2015. Publication No. 1594280. [Google Scholar]
  15. Yang, X.; Sadika, E.D.; Pratama, G.B.; Choi, Y.; Kim, Y.K.; Lee, J.Y.; Jo, Y.; Kim, G.; Lee, J.K.; Yu, M.J.; et al. An analysis on serious games and stakeholders’ needs for vocal training game development. Commun. Sci. Disord. 2019, 24, 800–813. [Google Scholar] [CrossRef]
  16. Vyshedskiy, A.; Khokhlovich, E.; Dunn, R.; Faisman, A.; Elgart, J.; Lokshina, L.; Gankin, Y.; Ostrovsky, S.; DeTorres, L.; Edelson, S.M.; et al. Novel prefrontal synthesis intervention improves language in children with autism. Healthcare 2020, 8, 566. [Google Scholar] [CrossRef] [PubMed]
  17. Baranowski, T.; Buday, R.; Thompson, D.I.; Baranowski, J. Playing for real: Video games and stories for health-related behavior change. Am. J. Prev. Med. 2008, 34, 74–82. [Google Scholar] [CrossRef] [PubMed]
  18. Des Roches, C.A.; Balachandran, I.; Ascenso, E.M.; Tripodis, Y.; Kiran, S. Effectiveness of an iPad-based software platform (Constant Therapy) in providing rehabilitation for individuals with aphasia and stroke. Front. Hum. Neurosci. 2015, 9, 114. [Google Scholar] [CrossRef]
  19. Holden, R.J.; Karsh, B.T. The Technology Acceptance Model: Its past and its future in health care. J. Biomed. Inform. 2010, 43, 159–172. [Google Scholar] [CrossRef] [PubMed]
  20. Design Council. The Double Diamond: A Universally Accepted Depiction of the Design Process. Available online: https://www.designcouncil.org.uk/our-resources/the-double-diamond/ (accessed on 25 August 2025).
  21. Design Council. Framework for Innovation: The Double Diamond. Available online: https://www.designcouncil.org.uk/our-resources/framework-for-innovation/ (accessed on 25 August 2025).
  22. Nielsen, J.; Molich, R. Heuristic evaluation of user interfaces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Seattle, WA, USA, 1–5 April 1990; pp. 249–256. [Google Scholar] [CrossRef]
  23. Mackay, W.E. Video Prototyping: A technique for developing hypermedia systems. In Proceedings of the CHI’88 Conference Companion Human Factors in Computing Systems, Washington, DC, USA, 15–19 May 1988; pp. 219–224. [Google Scholar]
  24. Karras, O.; Unger-Windeler, C.; Glauer, L.; Schneider, K. Video as a By-Product of Digital Prototyping: Capturing the Dynamic Aspect of Interaction. In Proceedings of the 2017 IEEE 25th International Requirements Engineering Conference Workshops (REW), Lisbon, Portugal, 4–8 September 2017; pp. 41–47. [Google Scholar] [CrossRef]
  25. Saeedi, S.; Ghazisaeedi, M.; Ramezanghorbani, N.; Seifpanahi, M.-S.; Bouraghi, H. Design and evaluation of a serious video game to treat preschool children with speech sound disorders. Sci. Rep. 2024, 14, 17299. [Google Scholar] [CrossRef] [PubMed]
  26. Ishaq, K.; Rosdi, F.; Zin, N.A.M.; Abid, A. Heuristics and think-aloud method for evaluating the usability of game-based language learning. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 136–142. [Google Scholar] [CrossRef]
  27. Kuniavsky, M. Observing the User Experience: A Practitioner’s Guide to User Research, 2nd ed.; Morgan Kaufmann: Burlington, MA, USA, 2020. [Google Scholar]
  28. Sanders, E.B.-N.; Stappers, P.J. Co-creation and the new landscapes of design. CoDesign 2008, 4, 5–18. [Google Scholar] [CrossRef]
  29. Scott, A.M.; Clark, J.; Cardona, M.; Atkins, T.; Peiris, R.; Greenwood, H.; Wenke, R.; Cardell, E.; Glasziou, P. Telehealth versus face-to-face delivery of speech language pathology services: A systematic review and meta-analysis. J. Telemed. Telecare 2024, 31, 1203–1215. [Google Scholar] [CrossRef] [PubMed]
  30. Mahmood, E.; Hassan, N.; Qazi, F.; Gohar, S. Effectiveness of game-based interactive approach using deep learning framework for dyslogia. VFAST Trans. Softw. Eng. 2024, 12, 11–22. [Google Scholar] [CrossRef]
  31. Eiser, C.; Morse, R. Can parents rate their child’s health-related quality of life? Results of a systematic review. Qual. Life Res. 2001, 10, 347–357. [Google Scholar] [CrossRef] [PubMed]
  32. Varni, J.W.; Limbers, C.A.; Burwinkle, T.M. The PedsQL™ 4.0 Generic Core Scales: Concordance, reliability, and validity of the self-report of children with cerebral palsy and proxy-report of their parents. Med. Care 2005, 43, 73–84. [Google Scholar]
  33. Bangor, A.; Kortum, P.; Miller, J. Determining what usability scores actually mean: Adopting an adjective rating scale. J. Usability Stud. 2009, 4, 114–123. [Google Scholar]
  34. Akter, S.; d’Ambra, J.; Ray, P. User perception of service quality of mHealth systems: An application of hierarchical semantic differential scale. J. Assoc. Inf. Sci. Technol. 2013, 64, 176–195. [Google Scholar]
  35. Saeedi, S.; Bouraghi, H.; Seifpanahi, M.-S.; Ghazisaeedi, M. Application of digital games for speech therapy in children: A systematic review. J. Healthc. Eng. 2022, 2022, 4814945. [Google Scholar] [CrossRef] [PubMed]
  36. Pereira, F.; Bermúdez i Badia, S.; Ornelas, R.; Cameirão, M.S. Impact of game mode in multi-user serious games for upper limb rehabilitation: A within-person randomized trial on engagement and social involvement. J. Neuroeng. Rehabil. 2019, 16, 109. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Example of idea sketch.
Figure 2. Three UX improvement directions and representative improvement proposals for Smart Speech.
Figure 3. Example of improvement prototype.
Figure 4. Usability comparison of nine improvements of Smart Speech to the original. The grey bars represent the Original version, and the blue bars represent the Proposed version (n = 20). N/A = Not Applicable.
Table 1. Representative pain points from heuristic evaluation.
1. Lack of usage instructions: [Game selection] Users face difficulty selecting appropriate games, difficulty levels, or training durations due to unclear or hidden instructions.
2. Lack of motivation & recurring errors: [Game outcome] When tasks require increased effort (e.g., sustaining vocalization), insufficient motivational elements reduce user engagement and persistence.
3. Lack of feedback for efficient training: [Game setting] In the Sound Height game, only visual cues are provided during pitch setting; the absence of auditory feedback makes it difficult for users to anticipate the proper vocalization height.
4. Lack of constraints to prevent errors: [Game difficulty] Difficulty settings do not restrict invalid inputs (e.g., non-numeric entries, reversed difficulty values, extreme values), leading to confusion and reduced reliability.
5. Lack of consistency: [Interaction flow] Interactive actions for the same command are inconsistent across screens; for example, the function of the "X" button varies, causing confusion about how to exit or proceed.
6. Other usability issues: [Game screens] Insufficient visual contrast between key gameplay elements and backgrounds reduces immersion; a lack of affordance on in-game screens makes clickable items unclear.
Table 2. Representative pain points from expert interviews.
1. Lack of personalized motivation: Continuous motivation such as praise stickers is required for children; cute characters guiding the game would help improve engagement for school-age children.
2. Lack of detailed training methods & evaluation: The purpose and location of the articulatory stimulus should be emphasized clearly; narration similar to home-training videos could improve clarity and engagement; pronunciation guidance is insufficient (for example, the required hu sound resembles blowing out a candle, not reading 'hu').
3. Lack of customized home-based assignments: Patients need simple, repetitive tasks that can be done independently at home; guidance on the sequence and types of exercises is currently missing.
4. Lack of game reliability: Microphone settings must be calibrated before training so that ambient noise does not affect scoring, but most users skip this step; in word games, the absence of system responses to vocalizations makes training feel ineffective.
Table 3. Representative pain points from benchmarking.
1. Lack of motivation for child users: Use engaging colors and appealing characters to sustain attention; provide motivational rewards such as praise stickers.
2. Lack of structured learning guidance: Offer a clear learning roadmap and daily task lists to support goal setting and consistent practice.
3. Lack of customized GUI: Replace text-heavy instructions with voice or graphic-based guidance; design visuals using metaphors aligned with children's cognitive models.
Table 4. Pain point structure of Smart Speech user experiences.
1. Lack of effective usage methods: No explanation to guide effective game selection for therapy; key instructions are buried or unclear.
2. Lack of personalized motivation: Absence of motivational triggers (e.g., praise stickers, character interaction, verbal encouragement); some games such as Oral Training lack feedback or scoring.
3. Lack of constraints to prevent errors: No input limits for difficulty or duration; missing guidance for essential settings such as microphone calibration.
4. Lack of affordance cues: Users struggle to interpret mouse cursor or button states; unclear controls (e.g., record button resembles volume control).
5. Lack of customized GUI: Interfaces do not reflect children's reduced attention spans; excessive colors distract focus; lack of audio or visual aids for non-literate users.
6. Lack of feedback for correct usage: No confirmation of vocal input detection; no auditory feedback when adjusting pitch or volume.
Table 5. Improvement proposals selected for usability evaluation.
1. Sufficient guide & motivation: Provides mission maps and daily missions organized by difficulty level, allowing users to begin training without deciding which game or difficulty to select. Completing missions provides encouraging messages and praise stickers to enhance motivation. A full-game mode remains available for free game selection.
2. Solid motivation: Allows therapists, guardians, and adult users to review training records. In-game characters provide praise based on performance indicators (e.g., high scores, consistent participation). The program or therapist can deliver encouraging chat messages to sustain user engagement.
3. Friendly situational help: Provides a one-time comprehensive introduction upon first launch (e.g., therapeutic purpose of each training type), with optional access thereafter. Voice guidance supports users with low literacy to continue training independently.
4. Required setting notification: Guides users through mandatory microphone calibration needed to set baseline vocal levels. Step-by-step instructions ensure correct setup before training.
5. Sequentially constrained UI: Enforces completion of prerequisite settings, such as target pitch or sound-range adjustment, before starting a training game. Game start is disabled until all essential requirements are met.
6. Clear hierarchy: Separates default and user-defined difficulty levels to maintain a consistent hierarchy. Applies input limits (e.g., maximum duration or breath count) and prevents invalid or non-numeric entries to reduce errors.
7. Recognizable UI: Enlarges the mouse cursor by 200% and applies context-dependent cursor changes (e.g., move, hover, click) to help users easily identify interactive elements and control states.
8. Complete feedback: Provides auditory or visual feedback when users set target parameters (e.g., pitch height), ensuring they understand the correct settings before beginning training.
9. Clear signifiers: Enhances clarity of buttons, guide texts, and sound-recognition indicators by aligning visual forms with users' mental models, reducing confusion during game navigation and pre-task steps.
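The error-prevention constraints of Improvement 6 (rejecting invalid or non-numeric difficulty entries) can be sketched as a simple validation routine. The field names and numeric bounds below are hypothetical assumptions for illustration, not the actual limits used in Smart Speech:

```python
# Illustrative input validation for difficulty settings (cf. Improvement 6).
# Bounds and field names are hypothetical, not Smart Speech's actual values.
def validate_difficulty(raw_duration: str, raw_breaths: str,
                        max_duration_s: int = 30, max_breaths: int = 10):
    """Reject non-numeric, negative, or out-of-range entries before they
    reach the game engine, instead of letting them cause errors mid-training."""
    errors = []
    try:
        duration = int(raw_duration)
        if not 1 <= duration <= max_duration_s:
            errors.append(f"duration must be between 1 and {max_duration_s} s")
    except ValueError:
        errors.append("duration must be a whole number")
    try:
        breaths = int(raw_breaths)
        if not 1 <= breaths <= max_breaths:
            errors.append(f"breath count must be between 1 and {max_breaths}")
    except ValueError:
        errors.append("breath count must be a whole number")
    # Return a flag plus specific messages so the UI can explain each problem.
    return (len(errors) == 0, errors)
```

Returning specific messages (rather than a bare pass/fail) lets the interface tell the user exactly which constraint was violated, which is the error-prevention behavior the improvement proposal targets.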

Share and Cite

Kim, S.; Kwon, E.; Yu, J.; Choi, Y.; Ko, M.-H.; Jo, Y.-j.; Kim, H.-G.; You, H. User Experience Enhancement of a Gamified Speech Therapy Program Using the Double Diamond Design Framework. Appl. Sci. 2026, 16, 826. https://doi.org/10.3390/app16020826
