Article

Exploring the Roles of Age and Gender in User Satisfaction and Usage of AI-Driven Chatbots in Digital Health Services: A Multigroup Analysis

by
Latifa Alzahrani
1,* and
Vishanth Weerakkody
2
1
Department of Management Information Systems, College of Business Administration, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
2
School of Management, University of Bradford, Emm Lane, Bradford BD9 4JL, UK
*
Author to whom correspondence should be addressed.
Systems 2026, 14(1), 113; https://doi.org/10.3390/systems14010113
Submission received: 15 November 2025 / Revised: 11 January 2026 / Accepted: 19 January 2026 / Published: 22 January 2026
(This article belongs to the Section Artificial Intelligence and Digital Systems Engineering)

Abstract

As chatbot technology becomes increasingly prevalent across a wide range of industries, it is crucial to explore the factors that shape user satisfaction with this AI-driven innovation. This research provides insights into how age and gender impact user perceptions and engagement with AI-driven health technologies in Saudi Arabia. The information systems success model was utilised to determine the effect of age and gender on user satisfaction. A self-administered questionnaire was distributed in two hospitals in Makkah City, Saudi Arabia, and 527 responses were collected from chatbot users. Structural equation modelling via analysis of moment structures validated the model constructs. The findings revealed that the effect of privacy concerns on user satisfaction was significantly stronger for males than for females. However, the relationships between user satisfaction and both continuance usage intention and net benefits were considerably stronger among females. Notable differences were also found in the paths from user satisfaction to net benefits and from continuance usage intention to net benefits when comparing younger and older participants. Across all age groups, user satisfaction consistently emerged as a central driver of continuance usage intention and net benefits, underscoring the importance of fostering satisfaction to enhance the effectiveness of AI-driven chatbots in digital health services. This study can serve as a guide to the importance of chatbot user satisfaction, and it concludes with implications, limitations, and future research opportunities.

1. Introduction

Chatbot technology has become a prominent focus of research and practice in recent years. Alongside the rapid expansion of messaging platforms such as Facebook Messenger and Slack, and voice-based services like Amazon Alexa and Apple Siri, recent advances in artificial intelligence (AI)—particularly in machine learning and deep learning—have significantly enhanced chatbot capabilities in natural language processing, contextual understanding, response generation, and decision-making [1]. Broadly, chatbots are computer programs designed to interact with users through natural language [2]. AI-driven chatbots have emerged as a viable alternative for organisations seeking to provide individualised and timely support to users in geographically dispersed markets [3]. In this sense, chatbots function as virtual assistants that offer a dynamic and scalable solution for interacting with users, answering queries, resolving problems and delivering information in real time [4].
Despite these advances, the impact of AI-driven chatbots on user satisfaction (US) remains ambiguous, revealing a substantial gap in both empirical research and practical implementation [5,6,7]. While contemporary AI systems demonstrate promising capabilities, there is still limited understanding of how chatbot performance relates to user demographics and satisfaction [5,7]. The increasing deployment of chatbots has attracted the interest of many organisations, particularly in the domain of digital health services. According to Gartner [8], more than half of companies have already invested in chatbot technologies, and by 2030, chatbots are expected to support approximately 25% of all digital health service operations. Nevertheless, scepticism persists, largely due to high failure rates in user–chatbot interactions and the frequent lack of personalised user experiences. For example, a travel search engine surveyed British users to examine their awareness of and expectations regarding chatbots. The results were discouraging: only 5% considered chatbots more reliable than humans, whereas 75% reported at least one concern, including data security, inaccurate responses, misinterpretation or potential manipulation of chatbot-generated answers [9]. Such concerns may cause organisations to hesitate before adopting chatbots in their operations. When evaluating a new technology that typically requires substantial initial investment—such as AI-based chatbots—it is therefore crucial to assess the value it can provide. One of the most effective ways to evaluate technological quality is to examine the user experience.
In the context of digital health services, US is a multi-dimensional construct influenced by system quality (SYQ), information quality (INQ), service quality (SVQ), privacy concerns (PC), net benefits (NETB), and continuance usage intention (CUI) of the system, all of which shape whether user needs are understood and met [4]. A key demographic factor influencing US and the adoption of AI-driven chatbots is gender. Although, globally, men are more likely than women to have access to information technology [10], the pronounced gender divide in access in the United States has nearly disappeared [2]. However, gender differences persist in the intensity and patterns of Internet use and in Internet-related knowledge. Men tend to use the Internet more frequently and for longer durations than women, and they are more likely to engage in online commercial transactions. By contrast, women are more inclined to use the Internet to maintain social relationships with family and friends. Moreover, men generally report higher levels of Internet-related knowledge than women [11].
The use of AI-based chatbots in digital health services has been extensively examined in Western societies, yet their effectiveness in non-Western contexts, such as Saudi Arabia, remains underexplored. This study investigates the influence of age and gender on the adoption of AI-based chatbots in Saudi Arabia. Employing multi-group analysis to identify subtle differences between demographic groups, the study seeks to address this gap by examining how age and gender shape user interactions with AI-driven chatbots designed for digital health services. This leads to the following research question:
RQ: 
To what extent does the impact of chatbot use for digital health services on user satisfaction differ across age groups and genders in Saudi Arabia?
In addressing this research question, the study focuses specifically on AI-driven chatbots and aims to identify the key factors that underpin satisfactory service experiences and shape perceptions of SVQ. Understanding these factors is essential for promoting the effective utilisation of AI solutions in digital health. The remainder of this paper is organised as follows. First, the existing literature in this field is reviewed, and a theoretical framework is developed. The subsequent section presents the methodology in detail, followed by an analysis of the empirical findings. Finally, the results are discussed, and their theoretical and practical implications are outlined.

2. Literature Review

This section reviews prior research investigating the influence of age and gender on the use of AI chatbots.

2.1. Importance and Challenges in Digital Health Services

The global market for healthcare chatbots is projected to grow at a compound annual growth rate of 20.8% from 2022 to 2030, underscoring their increasing strategic importance [12,13]. Unlike chatbots deployed in hospitality or consumer services, health-related chatbots are used in high-stakes contexts where accuracy, safety, and trust are paramount. They are credited with improving scalability, efficiency, and convenience in the provision of health services, among other benefits [14]. Numerous chatbot-based applications have been developed to promote healthy lifestyles, physical activity, mental health, and psychological well-being. A recent scoping review suggests that users are generally willing to engage with therapeutic chatbots and tend to view them positively [15]. However, the evidence base regarding the therapeutic effectiveness of chatbots remains mixed and insufficient to draw definitive conclusions about their capacity to foster positive health outcomes and sustained healthy behaviour [16].
Moreover, chatbots often respond slowly or in ways that users perceive as unnatural or inappropriate, particularly when confronted with unexpected or ambiguous input. Safety concerns have also been raised, especially in relation to chatbots that provide health-related information or advice [15]. In terms of US, Xue et al. [17] treated in-app ratings and user reviews as key indicators of user experience. Their analysis of such evaluations revealed that users commented on a wide range of topics, including the technical performance of the app, positive experiences with health-related support, chatbots as supportive companions, concerns about AI capabilities and general recommendations. These topics elicited diverse user reactions, highlighting the need for continual refinement of chatbot design and functionality to address user concerns and enhance satisfaction.
The study further found that different chatbots available in app stores varied considerably in terms of user ratings and the number of reviews. Notably, personalisation features were strongly associated with higher ratings and more frequent reviews, underscoring their importance in promoting US and engagement [16]. Given the expanding research on chatbots and the proliferation of chatbot applications, it is therefore essential to evaluate both the quality of the empirical literature and the specific functional features of chatbots. Such evaluation is necessary to improve system design so that chatbots can more effectively promote health and support behavioural change.

2.2. AI-Driven Tools in Saudi Arabia and Vision 2030

Saudi Arabia has experienced a rapid digital transformation across multiple sectors in recent years. In healthcare, chatbots have been integrated into applications such as Sehhaty and Mawid to facilitate appointment scheduling, medical consultations, and support related to COVID-19 [18,19]. It has been reported that more than 70% of citizens have accessed digital health services, indicating a comparatively high level of adoption [20]. Healthcare is an area undergoing substantial reform, and the deployment of emerging technologies, including AI, is expected to enhance service delivery as part of the Vision 2030 human-capital development agenda.
The Saudi government regards AI and digital health as central to improving access, efficiency, and quality of care under Vision 2030 [21]. Subsequent policy initiatives have further promoted digital health adoption in order to advance Vision 2030 goals related to human-capital development and the creation of a knowledge-based society [22]. However, variations in digital skills and literacy among both patients and providers have influenced the uptake and effective use of these technologies. Training and managerial support for healthcare professionals thus remain crucial to ensuring that digital health tools add maximum value to the wider digital transformation envisaged by Vision 2030 [23].
These efforts are coordinated by the Saudi Data and AI Authority, which oversees the integration of AI in alignment with Vision 2030 objectives of economic diversification and innovation. Through strategic investment in AI, Saudi Arabia aims to develop a knowledge-based economy, enhance the quality of public services—particularly SVQ—and position itself as a global leader in digital transformation [24].

2.3. User Satisfaction with AI-Driven Chatbots

US is widely recognised as a critical determinant of the success and effectiveness of information systems (IS). Although it is frequently used as a primary measure of system performance, defining US remains challenging because it is an evaluative and subjective construct rather than a directly observable metric [25]. Consequently, no single, universally accepted definition exists. In some cases, US serves as the sole criterion for determining whether an IS adequately meets organisational needs [26,27]. To ensure the success of an IS, it is therefore essential to identify and incorporate, during design, development, and implementation, the factors that influence or are associated with US [25].
Empirical studies have shown that satisfaction with chatbots is affected by factors such as usability, responsiveness, and the system’s ability to understand user intent [14,28]. For example, Xu et al. [29] demonstrated that social media chatbots create new opportunities for delivering personalised user engagement at scale. By mediating interactions between users and digital services, these chatbots can improve service performance while enabling users to derive social, informational, and economic benefits. Similarly, Ashfaq et al. [2] examined human–chatbot interaction to better understand US and users’ intentions to continue interacting with chatbots. They recommended integrating human support alongside chatbot services to enhance user experience.
In contrast, Cheng and Jiang [5] found that perceived privacy risks associated with chatbot use negatively affected US. Other studies have indicated that user loyalty and the intention to continue using chatbot services exert a significant influence on US [26,30]. Collectively, these findings underscore that US is shaped by a complex interplay of system attributes (such as usability and personalisation), contextual factors (such as privacy and safety), and user-related factors (such as loyalty, demographics, and behavioural intentions). Understanding how these factors operate in specific cultural and societal contexts—such as digital health services in Saudi Arabia—is therefore essential for designing AI-driven chatbots that users perceive as trustworthy, useful, and satisfactory.

2.4. Role of Age in Chatbot Usage

Age exerts a strong effect on chatbot usage, both directly and indirectly through mediators such as self-efficacy, perceived usability, and perceived usefulness. As in the technology acceptance model, where external variables such as age shape usage largely through these mediators, age also has a direct impact on chatbot usage [31]. Age is therefore a crucial demographic factor in chatbot adoption and usage trends. Shah et al. [6] analysed a variety of dialogue systems and found that younger cohorts gave higher performance ratings than older cohorts. Likewise, Sheehan et al. [7] concluded that users aged 18–34, owing to their familiarity with digital tools, are more comfortable with AI-driven interactions. In contrast, meta-analytic evidence suggests that technology acceptance among older adults is not attributable to chronological age alone [32]. Instead, factors such as functional, subjective, or psychosocial age may be more relevant, and measuring age based on perceived physical condition can provide better insights into age-related biophysical and psychosocial changes [33]. Elderly users face various challenges when interacting with the digital world, including feelings of helplessness and frustration, lack of knowledge, negative attitudes, cognitive deficits, and physical impairments (e.g., vision/hearing loss and mobility limitations) [34]. These findings indicate a need to create age-friendly chatbot platforms that suit the requirements of such users.

2.5. Role of Gender in Chatbot Usage and Satisfaction

In gender studies, various theories highlight differences in how men and women utilise services. For instance, the selectivity hypothesis examines gender-based distinctions in information-processing strategies [35]. According to this hypothesis, biological differences between men and women influence how they process information when assessing a service. Women are more visually oriented than men when analysing information about a product or service [36], and they tend to gather more information before planning, whereas men do not assimilate all available information. These differing motives promote different behaviours when males and females choose a service [35]. Studies have shown that men and women exhibit different preferences and behaviours when using chatbots [14]. Terblanche and Kidd [14] found that gender moderated the effect of performance expectancy, which had a more substantial influence on behavioural intention for females than for males; it is therefore more important for females than for males that a chatbot delivers on their expectations. Similarly, Wang et al. [28] showed that females were significantly more satisfied than males with chatbot responses to current affairs commentary, indicating possible gender-based variations in chatbot evaluations. Zhang et al. [10] found that performance expectancy affects males significantly more than females, and that the relationship between habit and continuance intention is more relevant for males than for females. By contrast, no significant effect was found for matching the gender of users and chatbots [30]. Such gender-based preferences highlight the need for tailored chatbot designs that accommodate diverse user expectations.

2.6. Gaps in Literature

The current literature has analysed user perceptions of chatbots, particularly regarding usefulness and user experience (e.g., [5]). However, a significant gap remains in terms of chatbot utilisation. Limited research has examined the effects of age and gender on US and usage intention, whether individually or in combination. Additionally, most of the available literature has focused on developed nations (e.g., [2,6]), leaving the knowledge gap regarding chatbot acceptance and satisfaction in developing countries unaddressed. Cultural factors may also affect user preferences and expectations, necessitating further research in diverse contexts.

3. Conceptual Framework

Success refers to the extent to which the goals established for a task are achieved. For ISs, corresponding indicators must be measured to determine the level of success [37]. One of the most widely used models in IS research is the DeLone and McLean (D&M) model [38]. It has been used in numerous studies to evaluate IS success across different contexts and countries. The model proposes that user perceptions play a central role in determining IS success [37,39,40]. As ISs evolved to offer both information and service functions, D&M [41] incorporated service quality (SVQ) into the updated model.
In this study, a task-oriented chatbot is treated as an IS. User satisfaction refers to the extent to which users’ expectations are fulfilled through their interaction with an information system. In post-adoption contexts, satisfaction is widely recognised as a key antecedent of users’ continuance usage intention, which reflects their intention to continue using a system rather than discontinue it.
Various aspects of perceived quality influence user satisfaction and users' intention to continue using chatbots. If a chatbot lacks sufficient quality, users may find it difficult to navigate, which can reduce their intention to use it. In addition, the information quality (INQ) delivered by chatbots strongly shapes user perceptions and reliance on these systems [41]. Without adequate service quality (SVQ), chatbots may fail to meet user expectations. Because chatbots represent a relatively new technology, each dimension of perceived quality may involve characteristics that differ from those considered in the original DeLone and McLean (D&M) model. System quality (SYQ), information quality (INQ), service quality (SVQ), system use, user satisfaction (US), and net benefits (NETBs) constitute the primary constructs of the updated D&M Information Systems Success Model. The research framework for this study is illustrated in Figure 1. This study develops a research framework grounded in the updated D&M model; however, the model is extended to examine the relationships between demographic factors (i.e., age and gender) and the updated D&M model constructs.

3.1. System Quality

SYQ refers to the system’s technical efficiency, including response time, security, flexibility, usability, and reliability. It is regarded as one of the most significant components of the IS success model [41]. When system evaluations depend on user perceptions and usability, a high-quality chatbot can greatly enhance user interaction. SYQ plays a crucial role in chatbot usage, and when a chatbot is well-designed and effectively implemented, it can contribute to the overall success of an IS. Although the IS success model does not explicitly propose a direct link between SYQ and chatbot usage, previous studies have examined this relationship with inconclusive results [39,42].
Hypothesis H1:
SYQ positively influences US with AI-driven chatbots in digital health services.

3.2. Information Quality

INQ refers to the degree to which system outputs possess desired characteristics such as accuracy, relevance, and completeness [41]. The INQ provided by chatbots determines US. Users anticipate receiving highly relevant information after investing time and effort in utilising tools and services. If users cannot obtain the necessary assistance, they may need to seek help from alternative sources [37]. This may lead to dissatisfaction with the chatbot service and reluctance to use it. Additionally, if chatbots fail to provide adequate information during interactions, users may have to repeat their queries, which can cause frustration. Ultimately, this may result in a negative perception of the service [42].
Hypothesis H2:
INQ positively influences US with AI-driven chatbots in digital health services.

3.3. Service Quality

SVQ is the level of assistance users receive from the support team. Digital platforms supply many services, and it is critical to recognise their main distinctions. Reliability and effectiveness encompass the service's technical and functional reliability, responsiveness, and usability [41]. The aesthetics of a chatbot may also affect US: a well-designed user interface may improve perceived SVQ through its attractiveness. User interface design influences service effectiveness by enabling easy navigation, which enhances usability and speed [1]. A digital service environment differs from the traditional conception of service and should be evaluated through distinct dimensions [37].
Hypothesis H3:
SVQ positively influences US with AI-driven chatbots in digital health services.

3.4. Privacy Concerns

PCs refer to a sense of unease regarding the information practices of platforms or third parties [41]. Such concerns can disrupt user expectations for chatbot services, reducing satisfaction. Furthermore, PCs can adversely influence user attitudes towards ISs. If users have considerable reservations about how chatbots handle their information, they may perceive that their needs are not being adequately addressed [5]. Previous studies utilising the D&M model have primarily evaluated SVQ, INQ, and SYQ [5,39]. However, limited research has explored the impact of PCs within this framework. Chatbot services may increase PCs, as users must share personal information to receive tailored recommendations.
Hypothesis H4:
PCs influence US with AI-driven chatbots in digital health services.

3.5. User Satisfaction

User satisfaction is a construct that indicates the extent to which user expectations are fulfilled by products or services [41]. The level of satisfaction significantly influences users' behavioural intention towards technology, and satisfaction is crucial in linking user feelings to sustained loyalty to services [2]. US represents user comfort with, and acceptance of, an application while engaging with the system [39]. Floropoulos et al. [40] assessed US using factors such as technological support for work, fulfilment of expectations, and overall satisfaction.
Hypothesis H5:
US positively influences the continuance usage intention of AI-driven chatbots in digital health services.
Hypothesis H6:
US positively influences the NETBs of AI-driven chatbots in digital health services.

3.6. Continuance Usage Intention (CUI)

Continuance usage intention refers to users’ intention to maintain ongoing use of an information system based on their prior experiences and evaluations. In survey-based IS research, particularly in cross-sectional designs, continuance usage intention is commonly considered as a proxy for post-adoption usage intention when objective system usage data are unavailable. Prior studies have shown that satisfied users are more likely to rely on a system and to express strong intentions to continue its use over time. Accordingly, this study conceptualises continuance usage intention as a post-adoption outcome that reflects sustained engagement with AI-driven chatbots in digital health services.
Hypothesis H7:
Continuance usage intention positively influences the net benefits of AI-driven chatbots in digital health services.

3.7. Gender and Age Effects

According to the selectivity hypothesis, gender is expected to have a significant influence on how users evaluate AI-driven chatbots and on how satisfaction translates into continuance usage intention [35]. Studies have shown that females are more likely to focus on affective evaluations, such as satisfaction, when forming post-adoption behavioural intentions, whereas males tend to emphasise instrumental and habitual considerations [28,36]. In addition, age-related differences in digital literacy, cognitive processing, and technology familiarity may influence user satisfaction and post-adoption intentions [31]. Sheehan et al. [7] concluded that younger users (digital natives) have lower cognitive barriers and greater comfort with AI-driven systems than older users (digital immigrants). Therefore, the effects of user satisfaction on continuance usage intention are expected to differ systematically across gender and age groups. Accordingly, this study proposes the following hypotheses:
Hypothesis H8:
The positive relationship between user satisfaction and continuance usage intention will be stronger for female users than for male users.
Hypothesis H9:
The positive relationship between user satisfaction and continuance usage intention will be stronger for younger users than for older users.

4. Methodology

4.1. Measurement Development

A questionnaire was originally developed in English and later translated into Arabic. The items were drawn from previous studies on IS, with the metrics tailored to the context of this study (Table 1). The finalised questionnaire was then shared for review, followed by a preliminary empirical pre-test. As highlighted by Mumtaz et al. [43], testing the questionnaire was essential; this was carried out by three domain experts in chatbot usage. The pre-test aimed to confirm the questionnaire's design and the clarity and relevance of its items [44]. Based on the pre-test feedback, some items were adjusted to improve readability. Responses were measured on a Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree).

4.2. Data Collection

A self-administered questionnaire was distributed in two hospitals in Makkah City, Saudi Arabia. Ethical approval for this study was obtained from the Local Committee of Ethical Research in Makkah, Health Makkah Cluster, under approval number [H-02-K-076-0325-1309]. Only active users of AI-driven digital health chatbots with prior interaction experience were included. Specifically, participants had previously used government-provided digital health chatbots, such as Sehhaty and Mawid, which are widely used in Saudi Arabia for appointment booking, medical inquiries, and other digital health services. Respondents who had not previously used such chatbots were excluded to ensure the validity of the user satisfaction and continuance usage intention measures. All participants were informed about the purpose of the study, and informed consent was obtained before data collection. A total of 527 responses were collected from chatbot users through a convenience sampling strategy. A primary disadvantage of convenience sampling is potential sample bias [44]: because participants are selected by convenience rather than at random, the sample may not accurately reflect the population of interest [43]. Participation was voluntary, and respondents were neither paid nor coerced; the absence of compensation also reduced the risk of incentive-driven response bias. All responses were kept anonymous, so that no personal link could be established to any particular response and the data could be used solely for statistical analysis [45].
The survey was administered via Google Forms, chosen for its user-friendly design and the authors' familiarity with the platform from previous projects; Google Forms has also been successfully used by Cheng and Jiang [5] and several other researchers. Demographic data, such as gender and age, were also collected, and a simplified Arabic version of the questionnaire was created to satisfy participants' linguistic preferences. In January 2025, a pre-test was conducted to evaluate the reliability of the measurement instrument. Data were then gathered from 527 participants, and the findings showed high reliability, with Cronbach's α values above 0.7.

4.3. Data Analysis

We evaluated our model using structural equation modelling (SEM) in IBM Analysis of Moment Structures (AMOS) software, Version 21. AMOS implements covariance-based SEM, also known as covariance structure analysis or causal modelling. This approach encompasses various traditional statistical techniques, such as the general linear model, factor analysis, regression, correlation, and analysis of variance. It also facilitates the development of attitude and behaviour models, capturing complex relationships more precisely than conventional multivariate statistical methods through an intuitive graphical or programmatic interface [46]. AMOS-based SEM was appropriate for our research because we were testing an exploratory research model [47]; since weak measurement of any construct can significantly affect all estimates in covariance-based SEM, the measurement model was validated carefully before the structural model was tested [48]. SEM is a versatile multivariate technique increasingly employed to examine complex causal relationships among multiple variables, and AMOS is extensively utilised in IS research for its flexibility and its suitability for analysing complex models involving numerous constructs and indicators [43,47,48].
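In a multigroup SEM of the kind described in this paper, moderation by age or gender is typically assessed with a chi-square difference test: a path coefficient is first constrained to be equal across groups, then estimated freely, and the deterioration in fit is compared against a χ² distribution with Δdf degrees of freedom. The sketch below illustrates that comparison; the fit statistics are invented for illustration and do not come from the study:

```python
from scipy.stats import chi2

def chi_square_difference(chi2_constrained, df_constrained,
                          chi2_free, df_free):
    """Return (delta chi2, delta df, p-value) for nested SEM models."""
    delta_chi2 = chi2_constrained - chi2_free
    delta_df = df_constrained - df_free
    return delta_chi2, delta_df, chi2.sf(delta_chi2, delta_df)

# Hypothetical example: constraining the US -> CUI path to be equal
# for males and females worsens model fit by 6.8 chi-square points (1 df)
d_chi2, d_df, p = chi_square_difference(512.4, 231, 505.6, 230)
print(p < 0.05)  # prints True: the path differs significantly across groups
```

A significant p-value indicates that the constrained path cannot be treated as equal across groups, which is the evidential basis for moderation hypotheses such as H8 and H9.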

5. Results

Demographic statistics for the respondents are listed in Table 2. We collected data from 527 participants. Regarding gender, there were 242 male respondents, making up 45.9% of the total, and 285 female respondents, accounting for 54.1%. Regarding age distribution, 122 respondents (23.1%) were 18–30 years old. The age group of 31–40 years had the highest representation, with 191 respondents (36.2%). The age group of 41–50 comprised 115 respondents (21.8%), while the age group over 50 included 99 respondents (18.8%).

5.1. Reliability and Validity Evaluation

We ran the AMOS-SEM algorithm and evaluated the measurement model’s reliability and validity. Composite reliability (CR) evaluates the internal consistency of a construct within a model, determining how effectively the observed variables reflect the latent construct [46]. As shown in Table 3, the CR values for all examined constructs exceeded the acceptable threshold of 0.70, indicating sufficient internal consistency. Constructs such as SYQ, NETBs, PC, and US were highly reliable, while INQ, SVQ, and CUI demonstrated acceptable reliability.
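As a point of reference, composite reliability is conventionally derived from standardised factor loadings. The sketch below illustrates the standard formula; the loadings shown are hypothetical and are not the study’s estimates:

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    where each error variance is 1 - loading^2 for standardised loadings."""
    s = sum(loadings)
    errors = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + errors)

# Illustrative standardised loadings for a four-indicator construct
print(round(composite_reliability([0.82, 0.79, 0.75, 0.71]), 3))  # → 0.852
```

A value above 0.70, as here, would be interpreted as sufficient internal consistency under the criterion used in Table 3.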
The validity of the measurement model was evaluated by analysing convergent and discriminant validity. Convergent validity examines whether the indicators of a construct are highly correlated and accurately reflect the same underlying construct. A key measure for assessing convergent validity is the average variance extracted (AVE), which should be at least 0.50 for each construct [48]. As shown in Table 4, the AVE values for all seven constructs were above 0.50, confirming acceptable convergent validity. SYQ had the highest value (0.679), followed by NETBs (0.654); SYQ, NETBs, and PC exhibited good convergent validity, while INQ, SVQ, US, and CUI showed moderate but sufficient convergent validity. CUI had the lowest value (0.525), and further refinement of its indicators could strengthen its AVE.
Discriminant validity evaluates the distinction between constructs that are meant to be unrelated, guaranteeing that a latent variable accounts for a greater proportion of variance in its own indicators than other constructs do [48]. It was assessed by comparing the correlations between constructs with the square roots of their respective AVE values; Table 5 lists each latent variable’s square root of AVE on the diagonal. In every case, the off-diagonal correlation values were lower than the diagonal values, so every construct satisfied the Fornell–Larcker criterion. The model’s constructs were thus distinct and non-overlapping, and the discriminant validity of the latent variables was deemed adequate [49].
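The Fornell–Larcker check described above is mechanical: compute each construct’s AVE as the mean of its squared standardised loadings, then verify that the square root of each AVE exceeds that construct’s correlations with all other constructs. A minimal sketch, using hypothetical loadings and correlations rather than the study’s estimates:

```python
import numpy as np

def ave(loadings):
    """Average variance extracted: mean of the squared standardised loadings."""
    return float(np.mean(np.square(loadings)))

def fornell_larcker_ok(ave_values, corr):
    """True if sqrt(AVE) of every construct exceeds its correlations with all others."""
    roots = np.sqrt(ave_values)
    n = len(ave_values)
    return all(roots[i] > abs(corr[i][j])
               for i in range(n) for j in range(n) if i != j)

# Illustrative values for two constructs (not the study's estimates)
aves = [ave([0.85, 0.80, 0.82]), ave([0.78, 0.74, 0.70])]
corr = [[1.0, 0.55], [0.55, 1.0]]
print(fornell_larcker_ok(aves, corr))  # → True: both sqrt(AVE) values exceed 0.55
```

In the study’s Table 5, the same comparison holds for all seven constructs, which is what establishes discriminant validity.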

5.2. Structural Model Evaluation

Once the reliability and validity of the measurement model were confirmed, the structural model was evaluated. This section outlines the key steps in assessing the structural model. Figure 2 presents the final structural model together with the path coefficient results. An unstandardised estimate reflects the raw regression coefficient, indicating the change in the dependent variable produced by a one-unit increase in the predictor. A standardised estimate converts these values into a common scale (z-scores), enabling comparison across variables with different units of measurement [49].
The structural model was assessed using the unstandardised estimate, standardised estimate, critical ratio (CR), and p-value, as summarised in Table 6. The results show that SYQ explains 27.6% of the variance in US, INQ explains 19.3%, SVQ explains 16.1%, and PC explains 22%. US strongly influences CUI, accounting for 49.3% of its variance, and contributes 31.5% to NETBs, while CUI contributes 36.5% to NETBs. The standardised indirect effect of US on NETBs is 0.180 (p < 0.001), indicating that US indirectly influences NETBs through CUI and that this mediation effect is statistically significant.
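The indirect (mediated) effect reported above is, by the standard product-of-coefficients approach, the product of the US → CUI and CUI → NETBs paths; its significance can be gauged with a Sobel test. The sketch below is illustrative only: the path estimates and standard errors are hypothetical stand-ins, not the study’s values.

```python
import math

def indirect_effect(a, b):
    """Indirect effect of X on Y through mediator M: product of X->M and M->Y paths."""
    return a * b

def sobel_z(a, se_a, b, se_b):
    """Sobel test statistic for the significance of the indirect effect a*b."""
    return (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)

# Hypothetical path estimates and standard errors (not the study's values)
a, se_a = 0.49, 0.05   # X -> M path (e.g., US -> CUI)
b, se_b = 0.37, 0.06   # M -> Y path (e.g., CUI -> NETBs)
print(round(indirect_effect(a, b), 3))  # → 0.181
z = sobel_z(a, se_a, b, se_b)
print(z > 1.96)  # True indicates significance at the 0.05 level
```

In practice, AMOS reports bootstrapped indirect effects, which are generally preferred to the Sobel approximation; the sketch simply shows the underlying arithmetic.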
Model fit indices were then examined to determine how well the proposed model aligned with the observed data. The comparative fit index value of 0.959 demonstrates an excellent model fit. Overall, the findings indicate that the hypothesised model fits the data well and does not require substantial modification.

5.3. Multi-Group Analysis

The multi-group analysis tested whether there is a statistically significant difference in the group-specific parameter estimates. In particular, the current study focused on investigating the age and gender effects on US and the usage of AI-driven chatbots in digital health services through multi-group analysis. We employed SEM and confirmatory factor analysis to measure interrelations between a set of constructs, which included SYQ, SVQ, INQ, PCs, US, CUI, and NETBs.

5.3.1. Gender and US

Male group: PC is the only significant predictor of US for male respondents, with a standardised effect of 0.297. US strongly and significantly influences CUI (0.543), indicating that satisfied male users are likely to continue using the system. The effect of US on NETBs is small and non-significant (0.165), and CUI has almost no influence on NETBs (0.003, non-significant). The indirect effect of US on NETBs via CUI is 0.002 and non-significant (p = 0.922), suggesting no meaningful indirect relationship.
Female group: SYQ has the strongest influence on US among the predictors, with a standardised effect of 0.303. INQ and PC also contribute significantly to US, with standardised effects of 0.220 and 0.206, respectively, and SVQ has a smaller but significant impact (0.148). US strongly influences CUI (0.436) and NETBs (0.370), and CUI has a significant positive effect on NETBs (0.376). The indirect effect of US on NETBs via CUI is 0.164 (p < 0.001), indicating that US influences NETBs both directly and indirectly through CUI. A comparison of males and females through standardised and unstandardised estimates, along with their critical ratios and p-values, is presented in Table 7.
Male vs. female groups: The critical ratio values assess differences in path estimates between male and female participants, determining whether the constructs influence each group differently. A critical ratio of at least 1.96 indicates a significant difference at the 0.05 level, while a value below 1.96 suggests no significant difference. The findings reveal that the relationships between US and NETBs, between US and CUI, and between CUI and NETBs are significantly stronger among females, whereas the relationship between PC and US is significantly stronger among males. The effects of SYQ, INQ, and SVQ on US do not differ significantly between males and females. The critical ratio values with interpretation (males vs. females) are presented in Table 8.
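The pairwise critical ratio used in this comparison is the difference between two groups’ unstandardised path estimates divided by the pooled standard error. A minimal sketch, with hypothetical estimates and standard errors rather than the study’s values:

```python
import math

def critical_ratio_diff(b1, se1, b2, se2):
    """Critical ratio for the difference between two groups' unstandardised path estimates:
    z = (b1 - b2) / sqrt(se1^2 + se2^2)."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)

# Hypothetical estimates for one path in two groups (not the study's values)
z = critical_ratio_diff(0.30, 0.07, 0.52, 0.08)
print(abs(z) > 1.96)  # True: this path would differ significantly between groups at the 0.05 level
```

A |z| of at least 1.96 corresponds to the 0.05 significance threshold applied throughout the multi-group comparisons in Tables 8 and 10.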

5.3.2. Age and User Satisfaction

The age attribute was categorised into groups A (18–30 years), B (31–40 years), C (41–50 years), and D (over 50 years).
Group A (18–30 years): SVQ is the strongest predictor of US among the exogenous variables, with a standardised effect of 0.288, and PC also significantly impacts US (0.222). SYQ and INQ positively influence US but are not significant contributors. US strongly influences CUI (0.496) and contributes to NETBs (0.214), and CUI significantly affects NETBs, with a standardised effect of 0.431. The indirect effect of US on NETBs through CUI is 0.214 and significant (p = 0.002), indicating that US also contributes to NETBs indirectly via CUI.
Group B (31–40 years): INQ has the strongest influence on US, with a standardised effect of 0.313, followed by SYQ (0.268) and PC (0.276). However, SVQ shows a negative and non-significant effect on US, suggesting that for group B, SVQ does not play a critical role in determining US. US significantly influences CUI (0.385) and NETBs (0.547). CUI also significantly impacts NETBs, but its effect (0.199) is smaller than that of US. Moreover, the indirect effect of US on NETBs through CUI is 0.077, which is significant (p = 0.035). This means that US directly and indirectly contributes to NETBs via CUI.
Group C (41–50 years): SYQ has the strongest influence on US, with a standardised effect of 0.437, followed by PC (0.359) and SVQ (0.259). INQ positively impacts US, but the effect is not statistically significant. US significantly influences CUI (0.682) and NETBs (0.402), showing its central role in driving system usage and benefits. Furthermore, CUI has a positive but non-significant effect on NETBs (0.193). The indirect effect of US on NETBs through CUI is 0.132, which is non-significant (p = 0.305). This indicates that while US directly impacts NETBs, its indirect effect through CUI is weaker and non-significant.
Group D (>50 years): SYQ has the strongest influence on US, with a standardised effect of 0.437, followed by PC (0.359) and SVQ (0.259). INQ positively impacts US, but the effect is not statistically significant. US significantly influences CUI (0.682) and NETBs (0.402), showing its central role in driving system usage and benefits. However, CUI has a positive but non-significant effect on NETBs (0.193). The indirect effect of US on NETBs through CUI is 0.132, which is non-significant (p = 0.305). This indicates that while US strongly impacts NETBs directly, its indirect effect through CUI is weaker and non-significant for this group. A comparison of various age groups through standardised and unstandardised estimates, along with their critical ratios and p-values, is presented in Table 9.
Groups A and B: The results reveal that the two groups differ significantly in how SVQ affects US (SVQ → US (−2.123)), indicating that the importance of SVQ varies between them. The relationship between US and NETBs also differs significantly between the two groups (US → NETBs (2.098)). In contrast, the effects of SYQ, INQ, and PC on US do not differ significantly between groups, and the effects of US on CUI and of CUI on NETBs are consistent across the two age groups.
Groups A and C: The findings suggest that none of the paths have a critical ratio value exceeding the threshold of 1.96. This indicates that the relationships between the constructs are consistent across groups A and C.
Groups A and D: The relationship between US and CUI differs significantly between groups A and D (US → CUI (2.583)). The relationship between CUI and NETBs differs significantly between the two groups (CUI → NETBs (−2.212)). Still, paths SYQ → US, SVQ → US, INQ → US, PC → US, and US → NETBs show no significant differences, indicating that these relationships are consistent between groups A and D.
Groups B and C: The results show that the impact of US on NETBs differs significantly between groups B and C (US → NETBs (−4.074)), as does the effect of CUI on NETBs (CUI → NETBs (2.456)). In addition, the influence of PC on US approaches significance (PC → US (−1.941)), suggesting that PC may play a somewhat different role in satisfaction for these groups. The effects of SYQ, SVQ, and INQ on US, as well as of US on CUI, are consistent across groups B and C.
Groups B and D: The impact of SVQ on US differs significantly between groups B and D (SVQ → US (2.212)), suggesting that SVQ plays a different role in influencing satisfaction for these two age groups. Similarly, the relationship between US and CUI differs significantly between the two groups (US → CUI (3.188)). The effects of SYQ, INQ, and PC on US, as well as the paths US → NETBs and CUI → NETBs, are consistent across groups B and D.
Groups C and D: The relationship between US and NETBs differs significantly between groups C and D (US → NETBs (2.175)). The effect of US on CUI also differs significantly between the two groups (US → CUI (2.188)), as does the impact of CUI on NETBs (CUI → NETBs (−2.984)). The paths SYQ → US, SVQ → US, INQ → US, and PC → US show no significant differences, indicating that these relationships are consistent across groups C and D. Overall, the analysis shows significant differences in the relationships between US and NETBs and between CUI and NETBs, particularly when comparing younger and older age groups. A pairwise comparison of critical ratio values between the age groups is presented in Table 10.
The findings highlight the complexity of US and chatbot usage, with different constructs influenced by demographic groups. For males, PCs were crucial for satisfaction, while SYQ and SVQ played a more significant role for females. Across all age groups, US consistently emerged as a central driver of CUI and NETBs, underscoring the importance of fostering satisfaction to enhance the effectiveness of AI-driven chatbots in digital health services.

6. Discussion

6.1. The Moderating Role of Age and Gender on Chatbot Adoption

This study provides strong evidence that both gender and age play an important role in shaping how users adopt chatbots. The quantitative results indicate that older adults are generally less inclined to continue using chatbots, suggesting that demographic factors continue to matter even as conversational technologies become more widespread. Older individuals often face a range of barriers when engaging with digital tools, including frustration, limited digital literacy, negative attitudes, and, in some cases, cognitive or physical difficulties [34]. However, acceptance cannot be reduced to age alone [32]; factors such as functional, subjective, and psychosocial age frequently offer a more accurate explanation for variations in continued technology use [33].
By contrast, younger adults tended to show a greater openness to technological innovation, and many preferred to avoid direct human interaction by communicating with chatbots instead. A closer look at the age trends also suggests that younger users may experience lower satisfaction levels than other age groups. Members of this group often face considerable pressure in both their work and personal lives, which may heighten their expectations for quick and efficient service. Previous research has noted that middle-aged users are often especially critical in their evaluations, with even minor delays shaping their perceptions. In this study, however, such effects were not statistically significant and should be interpreted only as a slight tendency rather than a robust difference. Larger and more systematic studies are needed to determine whether these patterns reflect genuine age-related effects or are simply due to sampling variation.
Gender differences also emerged. The impact of privacy concerns on user satisfaction was noticeably stronger for men, while women showed a stronger relationship between satisfaction and both actual use and perceived net benefits. The four groups used in the quantitative analysis were defined according to user satisfaction, reflecting the tendency for individuals to rely on chatbots for quick answers to relatively straightforward questions. Overall, the results suggest that age and gender meaningfully moderate satisfaction and adoption, although the present study does not explain the underlying causes of these behavioural differences.

6.2. Alignment with Existing Research

A key finding of this research is that privacy concerns had a stronger effect on user satisfaction among men than women, while women showed a more pronounced influence of satisfaction on actual use and perceived benefits. The existing literature also reports gender differences in chatbot interactions [14]. Women tend to emphasise performance expectancy and satisfaction, valuing chatbots that reliably meet their expectations, whereas men are more influenced by habit and intentions to continue using the system [28]. Some studies indicate a preference among women for more empathetic chatbot responses, while others find that gender matching has no meaningful effect, suggesting diverse patterns in how men and women evaluate chatbot services [10,30].
The four user groups in the analysis were formed based on satisfaction levels, and most users relied on chatbots to obtain clear and straightforward answers. These findings align with results from other quantitative studies examining user experiences with chatbots [14,32]. The analysis also suggests that older participants valued human interaction more, preferring to speak with a person over the phone. Younger adults demonstrated the opposite pattern and were more comfortable interacting with chatbots. Other quantitative studies have shown that younger users often have positive experiences with chatbots because they are more familiar with digital tools and generally more comfortable with AI-driven interactions [7,50,51]. As noted earlier, chronological age alone cannot explain these patterns [32]; functional, subjective, and psychosocial age often provide deeper insight [33]. Older adults frequently encounter challenges such as limited digital confidence, negative experiences with technology, and physical or cognitive constraints [34], which can make continued use of chatbots more difficult.

6.3. Positioning Within Theoretical Frameworks

The selectivity hypothesis [35] underpins the theoretical approach taken in this study and remains relevant for future research. The findings support the idea that older adults place a higher value on meaningful and socially rich interactions than younger adults. While the framework helps interpret the observed behavioural patterns, it does not directly explain the mechanisms behind age- or gender-related differences in chatbot adoption.

6.4. Implications for Digital Health Practice

These findings carry important implications for digital health services, especially as health systems increasingly rely on automated tools to cope with rising demand. Older adults often encounter obstacles such as low digital literacy, a sense of helplessness, and health-related barriers. However, chatbots may help mitigate some of these difficulties. Their conversational structure allows information to be broken into manageable steps, which may be particularly helpful for users with declining memory or slower processing capacity. Given these insights, policymakers and designers of digital health tools should consider tailoring chatbot interfaces and communication styles to suit different age groups. Doing so may promote wider adoption, improve user experience, and support more equitable access to digital health services. Ultimately, improved uptake may translate into better health outcomes and reduced operational strain on healthcare systems.

7. Conclusions

This research adds to the growing literature on AI-driven chatbots by examining how age and gender moderate user satisfaction within the IS success model. The study aimed to test the applicability of the D&M model in evaluating chatbot satisfaction among users in Saudi Arabia. The results confirm the relevance of age and gender in shaping satisfaction and adoption. Privacy concerns had a significant effect on satisfaction for men but not women, while satisfaction had a stronger influence on continued use intention and perceived benefits for women. Differences also emerged between younger and older adults in the relationships among net benefits, continued use intention, and satisfaction.
A notable finding is that older adults were generally less willing to adopt chatbots, which may reflect prevailing socio-cultural norms and religious values that encourage more traditional forms of communication. Across all age groups, satisfaction consistently emerged as a major driver of both continued use intention and perceived benefits. These results highlight the importance of designing inclusive and adaptive chatbot systems and reinforce the value of applying the D&M IS success model across diverse cultural contexts.

8. Research Limitations and Future Work

Although the study offers useful insights into chatbot adoption in the Saudi Arabian context, several limitations should be acknowledged. First, convenience sampling was used, which may limit the generalisability of the results. Hospital visitors and patients may differ from the wider population, as their health-seeking context could influence their perceptions of continued use and perceived benefits of chatbots. Future studies should include more diverse and representative samples.
The cross-sectional design also limited the ability to observe changes over time or draw firm causal conclusions. These findings may serve as an early step toward developing hypotheses about age-related differences in chatbot satisfaction and adoption. Further research is needed to determine whether distinct age groups genuinely prefer chatbot communication over human interaction. Future work should consider larger sample sizes, randomised sampling methods, and longitudinal designs to track changes in perceptions over time. Such enhancements would improve the robustness of the findings and offer a more detailed understanding of the factors shaping AI technology adoption in a variety of settings.
In addition, this study adopted self-reported survey data and employed a cross-sectional design. As a result, continuance usage intention was used as a proxy for post-adoption usage behaviour rather than objective system usage metrics, such as frequency or duration of use. Although this approach is consistent with prior IS research, future studies should incorporate system log data or longitudinal designs to better capture usage behaviour over time. Finally, while the convergent validity of all constructs met recommended thresholds, the average variance extracted (AVE) for continuance usage intention was relatively close to the minimum acceptable value. Future research could refine the measurement items or employ alternative operationalisations to strengthen construct validity and enhance the robustness of investigations into AI-driven chatbot adoption in digital health services.

Author Contributions

Conceptualisation, L.A. and V.W.; Methodology, L.A. and V.W.; Formal analysis, L.A.; Investigation, L.A. and V.W.; Writing—original draft, L.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Ethical approval for the study was obtained from the Local Committee of Ethical Research in Makkah, Health Makkah Cluster, under approval number H-02-K-076-0325-1309.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study. Participation in the survey was voluntary, and respondents were informed about the purpose of the research and assured of the confidentiality and anonymity of their responses.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors would like to acknowledge the Deanship of Graduate Studies and Scientific Research, Taif University, for funding this work. They also wish to thank the Health Makkah Cluster and the two hospitals in Makkah City for facilitating data collection and the participants for their time and cooperation.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Paliwal, S.; Bharti, V.; Mishra, A.K. Ai chatbots: Transforming the digital world. In Recent Trends and Advances in Artificial Intelligence and Internet of Things; Springer International Publishing: Cham, Switzerland, 2019; pp. 455–482. [Google Scholar]
  2. Ashfaq, M.; Yun, J.; Yu, S.; Loureiro, S.M.C. I, Chatbot: Modeling the determinants of users’ satisfaction and continuance intention of AI-powered service agents. Telemat. Inform. 2020, 54, 101473. [Google Scholar] [CrossRef]
  3. Prahalad, C.K.; Ramaswamy, V. Co-creating unique value with customers. Strategy Leadersh. 2004, 32, 4–9. [Google Scholar] [CrossRef]
  4. Castro, D.; New, J. The promise of artificial intelligence. Cent. Data Innov. 2016, 115, 32–35. [Google Scholar]
  5. Cheng, Y.; Jiang, H. How do AI-driven chatbots impact user experience? Examining gratifications, perceived privacy risk, satisfaction, loyalty, and continued use. J. Broadcast. Electron. Media 2020, 64, 592–614. [Google Scholar] [CrossRef]
  6. Shah, H.; Warwick, K.; Vallverdú, J.; Wu, D. Can machines talk? Comparison of Eliza with modern dialogue systems. Comput. Hum. Behav. 2016, 58, 278–295. [Google Scholar] [CrossRef]
  7. Sheehan, B.; Jin, H.S.; Gottlieb, U. Customer service chatbots: Anthropomorphism and adoption. J. Bus. Res. 2020, 115, 14–24. [Google Scholar] [CrossRef]
  8. Gartner. Gartner Says 25 Percent of Customer Service Operations Will Use Virtual Customer Assistants by 2020. Gartner Newsroom. 2018. Available online: https://www.gartner.com/en/newsroom/press-releases/2018-02-19-gartner-says-25-percent-of-customer-service-operations-will-use-virtual-customer-assistants-by-2020 (accessed on 3 April 2024).
  9. Elsner, N. KAYAK mobile travel report: Chatbots in the UK. Dostupno 2017, 23, 2020. [Google Scholar]
  10. Zhang, B.; Zhu, Y.; Deng, J.; Zheng, W.; Liu, Y.; Wang, C.; Zeng, R. “I am here to assist your tourism”: Predicting continuance intention to use ai-based chatbots for tourism. does gender really matter? Int. J. Hum.–Comput. Interact. 2023, 39, 1887–1903. [Google Scholar] [CrossRef]
  11. Barrett, C. Pew internet and American life project. In Encyclopedia of Behavioral Medicine; Springer: Berlin/Heidelberg, Germany, 2020; pp. 1656–1658. [Google Scholar]
  12. George, A.H.; George, A.S. From Pulse to Prescription: Exploring the Rise of AI in Medicine and Its Implications. Partn. Univers. Int. Innov. J. 2023, 1, 38–54. [Google Scholar]
  13. Parveen, A.; Kannan, G. Healthcare transformed: A comprehensive survey of artificial intelligence trends in healthcare industries. In Digital Healthcare in Asia and Gulf Region for Healthy Aging and More Inclusive Societies; Volume 4 in Information Technologies in Healthcare Industry; Academic Press: Cambridge, MA, USA, 2024; pp. 395–424. [Google Scholar]
  14. Terblanche, N.; Kidd, M. Adoption factors and moderating effects of age and gender that influence the intention to use a non-directive reflective coaching chatbot. Sage Open 2022, 12, 21582440221096136. [Google Scholar] [CrossRef]
  15. Laranjo, L.; Dunn, A.G.; Tong, H.L.; Kocaballi, A.B.; Chen, J.; Bashir, R.; Surian, D.; Gallego, B.; Magrabi, F.; Lau, A.Y.S.; et al. Conversational agents in healthcare: A systematic review. J. Am. Med. Inform. Assoc. 2018, 25, 1248–1258. [Google Scholar] [CrossRef] [PubMed]
  16. Zhang, J.; Oh, Y.J.; Lange, P.; Yu, Z.; Fukuoka, Y. Artificial intelligence chatbot behavior change model for designing artificial intelligence chatbots to promote physical activity and a healthy diet. J. Med. Internet Res. 2020, 22, e22845. [Google Scholar] [CrossRef] [PubMed]
  17. Xue, J.; Zhang, B.; Zhao, Y.; Zhang, Q.; Zheng, C.; Jiang, J.; Li, H.; Liu, N.; Li, Z.; Fu, W.; et al. Evaluation of the current state of Chatbots for digital health: Scoping review. J. Med. Internet Res. 2023, 25, e47217. [Google Scholar] [CrossRef] [PubMed]
  18. Alghareeb, M.; Albesher, A.; Asif, A. Studying Users’ Perceptions of COVID-19 Mobile Applications in Saudi Arabia. Sustainability 2023, 15, 956. [Google Scholar] [CrossRef]
  19. Asiri, A.A.; Al-Qahtani, F.S.; Al-Saleh, M.M.; Alhayyani, R.M.; Alfaya, F.A.; Alfaifi, S.H.; Al-Badour, H.M. Impact of electronic health services on patient satisfaction in primary health care centers in Southwestern Saudi Arabia. J. Fam. Med. Prim. Care 2024, 13, 85–92. [Google Scholar] [CrossRef]
  20. Mani, Z.A.; Goniewicz, K. Transforming healthcare in Saudi Arabia: A comprehensive evaluation of vision 2030’s impact. Sustainability 2024, 16, 3277. [Google Scholar] [CrossRef]
  21. Housawi, A.A.; Lytras, M.D. A strategic framework for digital transformation in healthcare: Insights from the Saudi Commission for Health Specialties. In Digital Transformation in Healthcare in Post-COVID-19 Times; Elsevier: Amsterdam, The Netherlands, 2023; pp. 173–192. [Google Scholar]
  22. Muafa, A.M.; Al-Obadi, S.H.; Al-Saleem, N.A.I.; Taweili, A.A.; Al-Amri, A.G. The Impact of Artificial Intelligence Applications on the Digital Transformation of Healthcare Delivery in Riyadh, Saudi Arabia (Opportunities and Challenges in Alignment with Vision 2030). Acad. J. Res. Sci. Publ. 2024, 5, 61–102. [Google Scholar] [CrossRef]
  23. Alatawi, A.; Ahmed, S.; Niessen, L.; Khan, J. Systematic review and meta-analysis of public hospital efficiency studies in Gulf region and selected countries in similar settings. Cost Eff. Resour. Alloc. 2019, 17, 17. [Google Scholar] [CrossRef]
  24. Memish, Z.A.; Altuwaijri, M.M.; Almoeen, A.H.; Enani, S.M. The Saudi Data & Artificial Intelligence Authority (SDAIA) vision: Leading the kingdom’s journey toward global leadership. J. Epidemiol. Glob. Health 2021, 11, 140–142. [Google Scholar]
Figure 1. Research framework.
Figure 2. Final structural model.
Table 1. Questionnaire concerning evaluated constructs.
System Quality
- The conversational AI and virtual assistant system operate efficiently and without errors.
- The system is easy to use.
- The system responds quickly to my requests and inquiries.

Information Quality
- The information provided by the system is accurate.
- The system offers comprehensive answers to my needs.
- The information is up-to-date and useful.

Service Quality
- The system interacts professionally and politely.
- The system generally meets my needs.
- The service is consistently available.

Net Benefits
- The system has saved me time and effort when using government services.
- The system has helped me complete my tasks more efficiently.
- Using the system enhances my overall experience with government services.

Privacy Concern
- I am concerned about the system collecting my data.
- The system respects my privacy when delivering the service.
- I trust that my data will not be used for unauthorised purposes.

User Satisfaction
- I am satisfied with my overall experience with the system.
- The system meets my expectations.
- I prefer using the system in the future over traditional methods.

Continuance Usage Intention
- I regularly use the system to meet my needs.
- I rely on the system as essential to my interaction with government services.
- I see myself continuing to use the system in the future.
Table 2. Demographic statistics.
Demographics   Category       Frequency   Percent
Gender         Male           242         45.9
               Female         285         54.1
Age            18–30          122         23.1
               31–40          191         36.2
               41–50          115         21.8
               More than 50    99         18.8
Table 3. CR of each construct with interpretation.
Construct   CR Value   Interpretation
SYQ         0.864      Good reliability. The observed variables reliably measure the construct.
INQ         0.781      Acceptable reliability, though slightly lower than SYQ.
SVQ         0.778      Acceptable reliability, indicating consistency in measurement.
NETB        0.849      Good reliability. This construct is measured consistently by its indicators.
PC          0.821      Good reliability. This construct is measured well.
US          0.815      Good reliability. Consistency in measurement is present.
CUI         0.768      Acceptable reliability, but close to the threshold of 0.70. Further refinement of the indicators might improve it.
Table 4. AVE from each construct with interpretation.
Construct   AVE Value   Interpretation
SYQ         0.679       Good convergent validity. The construct explains a significant portion of the variance in its indicators.
INQ         0.544       Acceptable convergent validity, slightly above the threshold.
SVQ         0.540       Acceptable convergent validity, indicating moderate shared variance.
NETB        0.654       Good convergent validity, demonstrating strong shared variance.
PC          0.604       Good convergent validity, explaining a substantial portion of the variance.
US          0.596       Acceptable convergent validity, close to good reliability.
CUI         0.525       Acceptable convergent validity, though relatively close to the threshold.
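Both statistics in Tables 3 and 4 are computed from the standardised factor loadings: composite reliability is CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)) and average variance extracted is AVE = Σλ² / k. A minimal sketch follows; the three loadings are hypothetical illustrations, not the study's actual estimates.

```python
def cr_ave(loadings):
    """Composite reliability and average variance extracted from
    standardised loadings: CR = (sum(l))^2 / ((sum(l))^2 + sum(1 - l^2)),
    AVE = sum(l^2) / k."""
    total = sum(loadings)
    sum_sq = sum(l * l for l in loadings)
    error = sum(1 - l * l for l in loadings)  # residual variance per item
    cr = total ** 2 / (total ** 2 + error)
    ave = sum_sq / len(loadings)
    return cr, ave

# Hypothetical three-item construct; common thresholds: CR > 0.70, AVE > 0.50
cr, ave = cr_ave([0.82, 0.80, 0.85])
print(f"CR = {cr:.3f}, AVE = {ave:.3f}")  # CR = 0.863, AVE = 0.678
```

Values above these thresholds, as reported for all seven constructs in Tables 3 and 4, indicate adequate internal consistency and convergent validity.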
Table 5. Discriminant validity.
        SYQ     INQ     SVQ     NETB    PC      US      CUI
SYQ     0.824
INQ     0.137   0.737
SVQ     0.160   0.390   0.735
NETB    0.386   0.256   0.243   0.809
PC      0.377   0.178   0.114   0.424   0.777
US      0.378   0.313   0.292   0.449   0.323   0.772
CUI     0.334   0.217   0.158   0.519   0.441   0.454   0.724
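The diagonal entries of Table 5 are the square roots of the AVE values in Table 4 (e.g. √0.679 ≈ 0.824 for SYQ). Under the Fornell–Larcker criterion, discriminant validity holds when each diagonal entry exceeds every correlation in its row and column. A short sketch checking this directly against Table 5's lower triangle:

```python
# Fornell-Larcker check: diagonal (sqrt of AVE, stored last in each row)
# must exceed all inter-construct correlations for that construct.
constructs = ["SYQ", "INQ", "SVQ", "NETB", "PC", "US", "CUI"]
lower = [
    [0.824],
    [0.137, 0.737],
    [0.160, 0.390, 0.735],
    [0.386, 0.256, 0.243, 0.809],
    [0.377, 0.178, 0.114, 0.424, 0.777],
    [0.378, 0.313, 0.292, 0.449, 0.323, 0.772],
    [0.334, 0.217, 0.158, 0.519, 0.441, 0.454, 0.724],
]

for i, row in enumerate(lower):
    diag = row[-1]                                         # sqrt(AVE) for construct i
    col = [lower[j][i] for j in range(i + 1, len(lower))]  # correlations below the diagonal
    ok = all(diag > v for v in row[:-1] + col)
    print(f"{constructs[i]}: sqrt(AVE) = {diag:.3f} -> {'pass' if ok else 'fail'}")
```

Every construct passes: the largest inter-construct correlation (0.519, NETB–CUI) is below the smallest diagonal value (0.724, CUI).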
Table 6. Unstandardised estimates along with their critical ratios and p-values.
Path            Unstandardised Estimate   Standardised Estimate   CR Value   p-Value   Interpretation
US <--- SYQ     0.293                     0.276                   5.297      ***       Positive and significant effect
US <--- INQ     0.164                     0.193                   3.474      ***       Positive and significant effect
US <--- SVQ     0.238                     0.161                   2.917      0.004     Positive and significant effect
US <--- PC      0.238                     0.220                   4.132      ***       Positive and significant effect
CUI <--- US     0.427                     0.493                   8.475      ***       Strong positive and significant effect
NETB <--- US    0.355                     0.315                   5.591      ***       Positive and significant effect
NETB <--- CUI   0.475                     0.365                   6.063      ***       Positive and significant effect
Legend: *** The p-value is less than 0.001, indicating a highly significant relationship.
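The critical ratio reported by AMOS is the unstandardised estimate divided by its standard error, and it is approximately standard normal under the null hypothesis, so the two-tailed p-value can be recovered as p = 2(1 − Φ(|CR|)). A quick sketch of that conversion, checked against the SVQ → US path in Table 6:

```python
import math

def p_from_cr(cr):
    """Two-tailed p-value for a critical ratio under the standard normal
    approximation: p = 2 * (1 - Phi(|cr|)), with Phi built from math.erf."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(cr) / math.sqrt(2))))

# SVQ -> US in Table 6 has CR = 2.917
print(round(p_from_cr(2.917), 3))  # 0.004, matching the reported p-value
```

The same conversion shows why paths with CR values above roughly 3.3 are simply flagged ***: their p-values fall below 0.001.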
Table 7. Comparison of males and females through standardised and unstandardised estimates, along with their critical ratios and p-values.
Path            Group    Unstandardised Estimate   CR Value   p-Value   Standardised Estimate   Interpretation
US <--- SYQ     Male     0.181                     1.667      0.096     0.131                   Positive but non-significant
                Female   0.277                     4.229      ***       0.303                   Positive and significant
US <--- INQ     Male     0.149                     1.734      0.083     0.161                   Positive but marginally non-significant
                Female   0.171                     3.023      0.003     0.220                   Positive and significant
US <--- SVQ     Male     0.170                     1.258      0.209     0.116                   Positive but non-significant
                Female   0.220                     2.125      0.034     0.148                   Positive and significant
US <--- PC      Male     0.856                     2.677      0.007     0.297                   Positive and significant
                Female   0.171                     2.948      0.003     0.206                   Positive and significant
CUI <--- US     Male     0.231                     3.872      ***       0.543                   Strong positive and significant
                Female   0.462                     5.787      ***       0.436                   Strong positive and significant
NETB <--- US    Male     0.097                     1.523      0.128     0.165                   Positive but non-significant
                Female   0.401                     4.941      ***       0.370                   Positive and significant
NETB <--- CUI   Male     0.005                     0.028      0.978     0.003                   No significant effect
                Female   0.384                     5.133      ***       0.376                   Positive and significant
Legend: *** The p-value is less than 0.001, indicating a highly significant relationship.
Table 8. Critical ratio values with interpretation (males vs. females).
Path         CR Value   Interpretation
SYQ → US     0.759      There is no significant difference between males and females in the effect of SYQ on US.
US → NETB    2.938      Significant difference. The effect of US on NETB is stronger for females than males.
US → CUI     2.325      Significant difference. The effect of US on CUI is stronger for females than males.
INQ → US     0.208      There is no significant difference between males and females in the effect of INQ on US.
SVQ → US     0.291      There is no significant difference between males and females in the effect of SVQ on US.
PC → US      −2.109     Significant difference. The effect of PC on US is stronger for males than females.
CUI → NETB   2.019      Significant difference. The effect of CUI on NETB is stronger for females than for males.
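The critical ratios for differences in Table 8 (and the pairwise age-group comparisons in Table 10) follow the standard multigroup z-test computed on the unstandardised estimates: z = (b₁ − b₂) / √(SE₁² + SE₂²), with |z| > 1.96 indicating a significant group difference at the 5% level. A sketch of that test; the two estimates are taken from Table 7 (US → NETB, male vs. female), but the standard errors are purely hypothetical since the article does not report them here, and the sign of z only reflects which group is subtracted first.

```python
import math

def path_difference_z(b1, se1, b2, se2):
    """Multigroup critical ratio for the difference between two
    unstandardised path coefficients: z = (b1 - b2) / sqrt(se1^2 + se2^2)."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)

# US -> NETB unstandardised estimates from Table 7 (male 0.097, female 0.401),
# paired with hypothetical standard errors of 0.06 and 0.08:
z = path_difference_z(0.097, 0.06, 0.401, 0.08)
print(f"z = {z:.2f}, significant difference: {abs(z) > 1.96}")
```

With any plausible standard errors of this magnitude the male–female gap on this path clears the ±1.96 cutoff, consistent with the significant CR of 2.938 reported in Table 8.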
Table 9. Comparison of age groups through standardised and unstandardised estimates, along with their critical ratios and p-values.
Path            Group   Unstandardised Estimate   CR Value   p-Value   Standardised Estimate
US <--- SYQ     A       0.146                     1.214      0.225     0.120
                B       0.291                     3.107      0.268     0.268
                C       0.284                     2.470      0.014     0.284
                D       0.407                     3.216      0.001     0.120
US <--- INQ     A       0.131                     1.320      0.187     0.151
                B       0.258                     2.927      0.313     0.313
                C       0.063                     0.648      0.517     0.077
                D       0.139                     1.265      0.206     0.437
US <--- SVQ     A       0.379                     2.561      0.010     0.288
                B       −0.089                    −0.545     −0.058    −0.058
                C       0.259                     1.540      0.124     0.191
                D       0.579                     2.280      0.023     0.132
US <--- PC      A       0.235                     2.174      0.030     0.222
                B       0.381                     2.838      0.276     0.276
                C       0.040                     0.356      0.722     0.042
                D       0.297                     2.787      0.005     0.259
CUI <--- US     A       0.370                     4.100      ***       0.496
                B       0.285                     4.033      0.385     0.385
                C       0.409                     3.433      ***       0.444
                D       0.850                     5.228      ***       0.359
NETB <--- US    A       0.259                     1.958      0.050     0.214
                B       0.607                     6.058      0.547     0.547
                C       −0.042                    −0.337     0.736     −0.038
                D       0.428                     2.419      0.016     0.682
NETB <--- CUI   A       0.702                     3.482      ***       0.431
                B       0.298                     2.340      0.199     0.199
                C       0.859                     4.533      ***       0.727
                D       0.165                     1.229      0.219     0.402
Legend: *** The p-value is less than 0.001, indicating a highly significant relationship.
Table 10. Pairwise comparison of critical ratio values between various age groups.
Path          A and B   A and C   A and D   B and C   B and D   C and D
SYQ → US      0.954     0.829     1.497     −0.051    0.735     0.721
SVQ → US      −2.123    −0.534    0.683     1.484     2.212     1.051
INQ → US      0.956     −0.491    0.056     −1.488    −0.840    0.521
PC → US       0.849     −1.244    0.411     −1.941    −0.489    1.653
US → NETB     2.098     −1.660    0.764     −4.074    −0.882    2.175
US → CUI      −0.741    0.263     2.583     0.896     3.188     2.188
CUI → NETB    −1.691    0.571     −2.212    2.456     −0.716    −2.984