How Live Streaming Interactions and Their Visual Stimuli Affect Users’ Sustained Engagement Behaviour—A Comparative Experiment Using Live and Virtual Live Streaming

Abstract: With the massive expansion in live streaming, enhancing the sustained engagement of users has become a key issue in ensuring its success. This study examines the relationship between real-time interaction, user perceptions, and user intention to keep using live streaming, and whether this relationship differs between a live and a virtual live streaming environment. Using partial least squares (PLS) structural equation modelling (SEM), this paper analyses 240 valid questionnaire responses and finds that there is a link between real-time interactions, visual stimuli, and users’ sustained engagement. This shows that users’ active interactions while watching live streaming videos significantly affect their perceptions of social presence and trust, which in turn affect their sustained engagement behaviour. These effects were found to vary with differences in the live streaming environment. The findings of this paper will play a positive role in understanding the differences between various live streaming environments, in optimizing the design of live streaming content and in improving the perceptions of emotional warmth by live streaming users.


Introduction
In recent years, internet users have been eager to share the joyful moments of their lives through social media. Moving from text and images to short videos and then on to real-time streaming videos, the ways in which users share have been transformed. Among these, live streaming has achieved explosive growth as a social media tool for recording and sharing in real time. At the end of 2019, the COVID-19 pandemic broke out around the world, and the lockdowns imposed for public safety reasons further boosted the live streaming economy. According to the Global Live Streaming Market 2021 data published by Research and Markets, the global live streaming market was worth $59.14 billion in 2021, and it is expected to reach $223.98 billion by 2028.
Live streaming has expanded on a massive scale due to its various advantages, including a high level of real-time interactivity, a strong sense of consumer engagement and the satisfaction of seeking novelty [1,2]. This growth means that live streaming, based on social media, has an even greater influence on the viewing experience of internet users. High interactivity is the main feature that distinguishes real-time streaming from traditional media, and this has affected the consumption experiences of internet users in an unprecedented way [2]. In traditional or social media, the separation in time and space keeps users and influencers at a certain distance from each other. However, live streaming brings them together at the same moment, and this real-time interaction closes the social distance between viewers and influencers, as well as among viewers, thus achieving an immersive experience [3]. For example, a study by Zhou et al. [4] points out that text-based real-time chat rooms, or interactive mechanisms designed to offer incentives through likes and gifts, attract internet users to watch real-time videos in a more immersive manner. In addition, scholars have explored the influence of factors such as live streaming interface design, system services, personal attitudes, and perceived value on consumers' willingness to view [5]. Existing studies have discussed the influence of various factors on individual participation and interaction in live streaming, but these results are often limited to the initial stages of specific behaviours, such as consumer awareness and acceptance. Since individual behaviours often move through different stages before reaching a more stable state, it is necessary to explore which factors influence individuals' willingness to sustain their engagement, and through which internal mechanisms these factors act.
Currently, these questions have been studied only in a limited way in both academia and industry. For example, a study by Lim et al. [6] on users who watch live-streamed games points out that emotional involvement influences users' sustained viewing behaviours through parasocial relationships. However, such studies are limited to the one-sided intimacy that viewers feel towards the anchor. In live streaming, by contrast, interaction creates emotional connections not only between anchors and viewers but also among viewers [5]. This paper goes further by exploring how the interactions between viewers and anchors, and between viewers and other viewers, create emotional connections and thus come to influence viewers' sustained engagement behaviours.
The anchor, who plays the central role in a live broadcast, is able to mobilize users' emotions by communicating with them in real-time, thereby enhancing their immersive experience and influencing their subsequent behaviours. In addition to sports anchors and game anchors, who have been common since the early stages of the development of the live economy, more recently, there has been large scale expansion in the roles created for anchors in talent shows, reality shows and in the general sharing of life. Simultaneously, the nature of the anchor has become diversified through the application of emerging technologies. For example, virtual anchors, represented by cute, animated characters, have appeared, one after another, in a wave of e-commerce development. These anchors are 2D or 3D animated avatars that combine artificial intelligence with virtual simulation technology and are capable of performing a range of tasks, such as media content production and distribution, and are generally voiced-over by humans [7]. Since the debut of virtual anchor Kizuna AI on YouTube in 2016, virtual anchors have started to develop and multiply rapidly, but related research is limited. In recent years, some scholars have explored the use of virtual anchors; for example, Xu [8] explored the possibility of applying artificial intelligence technology to virtual anchors, while Lu, Shen, Li, Shen, and Wigdor [7] used qualitative studies such as interviews, to explore the attractiveness of virtual anchors to internet users. However, these studies were limited to the Otaku community, and the authors did not clarify whether other users had similar perceptions. In general, current research into the field of virtual hosting is still at the stage of initial acceptance and perceptions of users, and it lacks any exploration of users' sustained behaviours, or any in-depth analysis of the mechanisms underlying their behaviour, whether in terms of the technology or user-behaviour dimensions.
With the explosive growth in social media represented by live streaming, retaining users has become an urgent issue for both major companies and individual bloggers; however, this issue has been pursued in only a limited way by academia, and research results are limited to the initial stages of specific behaviours, such as the awareness and acceptance of live streaming users. In addition, the emergence of virtual anchors has enriched the anchor format, yet this niche culture has not attracted sufficient attention from academia, despite the fact that the corporate world is scrambling to launch its own virtual anchors. The competition between virtual anchors and real ones should be an issue worth exploring. Considering the questions raised above, the purpose of the current study is twofold: First, to discover how environmental factors (both visual and social) can achieve long-term relationships with live streaming audiences by mobilizing their emotional responses (through perceptions of presence and trust). Second, to illustrate the variability in the results of the study described above in relation to different anchors.
In summary, this paper argues that it is necessary to further explore the factors influencing users' willingness to sustain their engagement in a live streaming environment, together with their mechanisms of action, and to explore the variability of these influencing factors in both live and virtual streaming. Therefore, this paper poses the following research questions:
1. Can visual stimuli that convey emotions (e.g., emojis) influence the emotional connection between users of live streaming?
2. Can the social engagement behaviour of live streaming users influence their intention of continuous engagement?
3. Will the form of the anchor make a difference to the influencing factors identified?
Using social presence theory as its theoretical reference, this study developed a two-stage model of the development of consumers' willingness to sustain their participation in the context of live streaming. The study uses partial least squares (PLS) structural equation modelling (SEM) to analyse the data collected from 240 respondents and to test the research model and the hypotheses proposed in this paper. The study shows that interactive texts containing emotion-rich visual stimuli, such as love and gifts, in the live interactive environment can enhance the emotional connections and trust relationships among users, which in turn affect their sustained engagement behaviours. In addition, this paper compares different live streaming formats (virtual vs. real live streaming) through group experiments. These results will help readers to gain a comprehensive understanding of how the sustained engagement intention of live streaming users is formed and which factors influence it. In this way, the study provides theoretical references for related future research and helps platforms to optimize and adjust the interface design of their live streaming to meet user needs better.
The remaining sections are organised as follows: Section 2 provides a literature review of research into social presence and sustained engagement behaviour. Section 3 presents the theoretical foundations and research hypotheses related to the study. Section 4 elaborates on the research methodology. Section 5 derives the results of the analysis from the experimental data. Section 6 describes the implications of the study. This section also points out the study's limitations and makes recommendations for conducting future in-depth study. Section 7 draws conclusions.

Literature Review
In social media, user-company interactions are influenced by the characteristics of the media sites concerned. Among these, images [9] and visual presentations [10] are considered unique affective cues which prompt users' cognitive and emotional responses and thus, influence the user's interaction behaviour with the site. Similarly, whether it is a text chat or a virtual gift, the information is presented in real-time on the app interface and is quickly seen by viewers. In general, the visual senses dominate our perceptions, which in turn, profoundly influence our emotional responses and behaviours [11]. For example, Fiore and Yu [12] conducted research on advertising which showed that when consumers see imagination-stimulating content they may imagine the product being crafted, which then evokes pleasant emotions. With the further development of internet technology and the enrichment of web content, scholars have found that in internet marketing, the effect of positive emotional expressions, and especially the use of virtual expressions, often resonates more with consumers than mere textual expressions. Emoticons, as a surrogate for conveying emotional tone and non-verbal gestures, such as facial expressions, add to the richness of the message and promote a perception of fun among users [13]. It is likely that the combined effect of text messaging and emoticon use creates media richness, which facilitates the perception of playfulness in the social interactions of mobile instant message users, and in its turn, the perceived fun drives consumer engagement interactions [13].
Live streaming brings together liveness (as in live broadcasting) and participatory culture (social interactions) to an unprecedented degree [2]. Engagement is defined as the user's investment of physical and psychological energy to fulfill certain psychological needs [14], for example, searching for entertainment information [15], making new friends, and relieving social anxiety [16]. Existing research has explored the concept of engagement from two main perspectives: cognitive involvement and affective involvement. Cognitive involvement is associated with 'rational thinking' and is induced by utilitarian or cognitive motives [17]. When users are exposed to environmental stimuli, such as novel technological features (e.g., the 'bullet screen'), the characteristics of broadcasters (e.g., a super-high level of gameplay) or convenient interaction methods, the users' cognitive engagement is enhanced (i.e., their active participation in watching and learning useful game skills) [8]. Emotional engagement, on the other hand, denotes the emotional sharing behaviours of viewers. For example, when watching sports events, viewers often express their happy or frustrated feelings, and they also want to share their feelings with other viewers. This is the basic practice of emotional engagement [18]. In the overall design of a live streaming service, viewers can express their respective views through text chatting in the chat room, or they can express their joy and share their emotions by 'liking' and giving virtual gifts.
During the real-time engagement of live interactions, consumers' levels of emotional connection change with the different stimuli and interactions [7,19]. In summary, both cognitive and affective engagement are considered by scholars to be distinct psychological experiences that enable the social presence of individuals [18]. This is due to the tendency for emotional connections and perceptions of warmth between individuals to be enhanced through the occurrence of actual interactions.
In a study on the impact of social media engagement on sports channel loyalty, Lim, Hwang, Kim, and Biocca [18] found that the depth of viewers' engagement gave rise to enhanced channel loyalty by creating an emotional bond between the channel and the viewer as well as between the viewer and other viewers. Hajli et al. [20] noted that real-time interactions of engagement in social commerce enhance users' perceptions of warmth, effectively increasing their social skills and, in turn, their long-term purchase intention. A review of the literature reveals that the interactive engagement features of social media can increase user loyalty by enhancing the emotional bond between users, and that loyalty is considered an important factor in assessing the likelihood of subsequent user behaviour. Loyalty was first defined in behavioural terms because behaviours (e.g., repeat purchases) can be easily captured [21,22]; however, defining loyalty in terms of a single behavioural dimension seems to be inadequate because it cannot distinguish between false loyalty (high behavioural but low attitudinal loyalty) and true loyalty (high attitudinal with high behavioural loyalty) [23]. Subsequent researchers measured customer loyalty in terms of both the behavioural and the attitudinal dimensions, and they have indicated that consumer loyalty cannot be separated from positive attitudes and willingness to repeat purchases [24]. Later in-depth studies confirm that attitudinal loyalty positively influences behavioural loyalty; for example, in an interview study of branded toothpaste purchasers, Bandyopadhyay and Martell [25] found that behavioural loyalty is influenced by attitudinal loyalty to the brand. Both attitudinal and behavioural loyalty are considered key determinants of long-term brand survival, and retaining existing customers and enhancing customer loyalty is extremely important for service providers wishing to gain a competitive advantage [26].
In this study, in order to measure more accurately and to conduct comparative trials effectively, we studied the behaviour of live streaming users from a behavioural loyalty perspective and defined it as sustained engagement behaviour. Sustained engagement on the internet means that the site makes a good impression on consumers and attracts them to spend more time on it or to visit the site more frequently [27].
On 1 December 2016, the first-generation virtual YouTuber (VTuber) Kizuna AI opened her personal YouTube account and posted her first video. Since then, virtual anchors like Kizuna AI, which use real-time motion capture and facial expression capture technology to generate animated images, have been increasing rapidly. By June 2021, there were already 32,000 virtual idols and anchors on Bilibili, and in 2021 there were more than 60,000 companies related to virtual characters. According to iiMedia Research (Ai Media Consulting), the virtual idol industry has maintained a steady growth trend; in 2021, for example, the driven market and the core market for virtual idols were expected to be 107.49 billion RMB and 6.22 billion RMB, respectively. In China, these are expected to reach 186.61 billion RMB and 12.08 billion RMB in 2022. Although the virtual host industry is growing rapidly, related studies are limited. There is a noted lack of research on consumer willingness and behaviours in the live streaming industry. In the past year, scholars have begun to explore the interaction between consumers and virtual hosts; for example, Lu, Shen, Li, Shen and Wigdor [7] explored viewers' perceptions of and interactions with virtual hosts through interviews with members of the Otaku community.

Hypothesis Development
Scholars have defined social presence as 'the extent to which a medium allows users to experience others as being psychologically present' [28]. More specifically, social presence is the extent to which users perceive human warmth and social competence when participating in media activities. The act of engaging in an interaction in a live broadcast (e.g., text chats, liking, and swiping gifts) involves interactions between the user, the host, and other users. Communication and interaction are considered to be the basis for generating social presence [29]. The emergence of social media has made it possible for internet users to interact in real-time from anywhere, and this interaction is similar to interpersonal interaction in a real environment, which enhances the perception of social presence among the participants [30]. The advent of live streaming has accelerated the speed and scope of user participation in interactions, and as viewers engage in a fast-paced interactive chat environment, they become aware of the presence of others [31] resulting in a perception of immersion [32] and a subsequent emotional connection with others. This paper, therefore, proposes the following hypothesis:

H1. Users' participation and interaction behaviours in a live streaming environment positively influence their sense of social presence.
Trust can be defined as a sense of security that indicates a willingness to rely on someone or something [33]. It is established through extensive and continuous interactions between people [34]. Interpersonal interaction, whether face-to-face or virtual, is a prerequisite for trust [34], and the more frequent the interaction, the more conducive it is to building trusting relationships [35]. In online communities, the strength of users' interactions with others has also been shown to be a key factor in fostering trust between them, and the more active the communication and interaction between individuals, the deeper the trust that develops [36]. For example, Wu and Chang [37] studied the interaction between members and administrators in online travel communities and found that the more members communicated with the administrators, the more they trusted them. Jiang et al. [38], in a study of social commerce, found that interactive communication between consumers, merchants, and other consumers could reduce the sense of unreality caused by not being able to touch the real product, thus enhancing the consumers' perceptions of trust. On this basis, we propose the following hypothesis:

H2. Users' live participation and interaction behaviours positively affect their perceptions of trust.
Using social media to present emotional elements can enhance customers' emotional support for the company, which helps them to understand and improve their relationship with the company [39]. Scholars have found that visually appealing emotional elements, such as those conveying caring, understanding, and empathy, can significantly influence customers' perceived experience of emotional issues [40], while scholars such as Zhang et al. [41] have shown that visual elements, such as images and videos, not only improve the overall appearance of a website but also elicit positive emotional responses from users. Accordingly, this paper refers to the emotional influences presented by multimedia technologies, which are based on images, emojis, etc., as emotional visual elements. When these touch customers' emotions and prompt experiences of warmth, they promote the customers' participation in virtual community interactions, which in turn, satisfies their sense of belonging [42]. Users who are actively involved in social media are more likely to experience social interactions and to feel happy [43]. For example, anchors and viewers in live broadcasts use emojis, likes, and other forms of interaction to express their emotions, thus creating a positive emotional connection for participating users [6]. Therefore, we hypothesize that:

H3. Emotional visual elements positively influence the user's social presence.
Perceptions of trust are derived from rational characteristics, such as reliability, competence, and responsibility, as demonstrated by a trusted person, as well as from perceptions of factors such as emotional and social skills [44]. Because the virtual community is different from the traditional marketplace environment, individuals tend to look to limited symbolic cues to form impressions of others in a context where there is a lack of personalized cues. At the same time, due to the lack of information, individuals tend to have stereotypical perceptions of others' images, and therefore, trust levels tend to be low [45]. However, our visual senses dominate our perceptions; for example, Fogg [46] found that online articles that contain photographs are perceived as more trustworthy. Marketing research points out that advertising relies on the construction of friendly images to enhance positive consumer attitudes [47]. Hassanein and Head [48] also note in their study that descriptive and graphic visual elements designed to evoke emotions have a positive impact on consumer attitudes. Therefore, we propose the following hypothesis:

H4. Emotional visual elements positively influence perceptions of trust by users of live streaming.
Social presence has been shown to be an important influencing factor in increasing users' sustained engagement behaviours. For example, from a study on video pop-ups, Fang et al. [49] conclude that social presence influences individuals' intention to repeat viewing through their immersive experience and hedonic perceptions. Nadeem et al. [50] state, from the social commerce perspective, that social presence provides a human atmosphere that allows consumers to experience fun and warmth in social interaction, which leads to a willingness to persist. Lim, Hwang, Kim, and Biocca [18] studied sports channel loyalty and conclude that social presence positively influences users' willingness to keep watching by enhancing their commitment to the channel. Scholars agree that a high perception of social presence will influence individuals to join and to continue using social networks [51]. In summary, we hypothesize that:

H5. Social presence positively influences the sustained engagement behaviour of users of live streaming.
Research by Hong and Cho [52] states that, in the field of relationship marketing, trust is the cornerstone of building long-term relationships; it is a determinant of relationship commitment and an important relationship marketing tool available to companies. Many studies point out that trust directly affects the building of customer loyalty and has far-reaching effects [53,54,55]. For example, in a study on mobile instant messages in China, Deng, Lu, Wei, and Zhang [54] demonstrate that trust significantly influences consumers' sustained engagement behaviour. In a virtual community environment, in which face-to-face contact or physical evidence does not provide sufficient assurance, a perception of trust can reduce the sense of risk in the transaction process. Consumers, therefore, tend to be more willing to establish a long-term and stable trust relationship with a trustworthy service provider, to whom they will demonstrate loyalty [56], for example by using the service continuously (sustained engagement behaviour) or recommending the service to others. As a new form of e-commerce, live streaming still has inherent risks and uncertainties, and whether internet users develop a trusting relationship will determine their readiness to participate continuously. Therefore, we argue that:

H6. Perceived trust positively influences the sustained engagement behaviour of the users of live streaming.
This paper advocates for the following research model, as shown in Figure 1. In the proposed model, we illustrate structural relationships between visual stimuli and interactions, the affective responses of the live streaming users (presence perception and trust perception), and the intention to keep watching. Visual stimuli and interactions are derived from the live streaming environment and originate from observation and participation. Presence perception and trust perception are both personal emotional responses affecting the subsequent behaviour of the live streaming user. Sustained engagement, as one of the behavioural manifestations of loyalty, is the final dependent variable.

Experimental Design
To test our research hypotheses, we designed a questionnaire (Appendix A) around a scenario-based experiment. First, we measured the feasibility of the whole research model, and next, we conducted group experiments for different live streaming formats (live vs. virtual hosts). Live and virtual hosts that could understand the users' requirements in a timely and accurate manner and effectively fulfill their needs were used to interact with the respondents in the respective groups. To ensure that the participants could relate to the whole experimental scenario, we recruited subjects who met the following criteria: (1) older than 18 years; (2) proficient in internet skills (with basic skills such as web browsing and online shopping); (3) with at least one live-viewing experience in the past three months. Due to COVID-19, this experiment was conducted in an online manner. Subjects were randomly assigned to the two groups and then watched a pre-recorded live video (virtual or real streaming) for 15 min. They were then asked to answer a questionnaire (the same questionnaire was used for both groups).
To ensure the validity of the hypotheses testing, the design of the questionnaire was based on established scales from relevant literature, with appropriate modifications and adaptations according to the characteristics of the online live streaming environment and human-computer interaction. When designing the study, we interviewed 15 users in five regions and asked them to name the relevant factors that they thought affected their continuous participation in live streaming. We then made adaptations to existing scales based on their responses and combined them to arrive at our final questionnaire. The questionnaire we have finalized consists of two main parts: the first part collects basic information about the subjects, such as age, gender, and education level, and the second part presents four questions relating to each of the variables, identified separately. All questions, apart from the respondents' demographic information, were measured using a 7-point Likert scale. Specifically, the answers to each question were to be given a value of 1-7, with '1' indicating strong disagreement and '7' indicating strong agreement. Once the initial questionnaire design was completed, we conducted a small-scale pretest. In all, 30 questionnaires were completed to check whether the semantic and grammatical expressions of the options were easy to understand and whether their reliability and validity met the requirements. At this point, some statements on the questionnaire were modified according to the respondents' feedback, and this resulted in the final questionnaire.

Sample Selection
A total of 300 subjects were recruited and randomly divided into two groups (virtual vs. live streaming). Of the 300 responses, 31 were removed as being invalid for the following reasons: (1) the time taken to answer was less than three minutes: 7 samples; (2) the question answers were too focused on a single option: 6 samples; (3) the choice of options showed an obvious regularity: 3 samples; (4) the questionnaire was set up in such a way that it detected a lack of validity in the subjects' responses: 15 samples. The final number of valid samples was thus 269 (of which, 142 were in the virtual live streaming group and 127 in the live streaming group). Since our study is a group experiment to be conducted in the context of live and virtual streaming, we expect the data to be equal for each group, so we randomly selected 120 questionnaires from each group for analysis.
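The screening and balancing steps described above can be retraced with a short sketch. It only re-adds the counts reported in the text and then draws 120 responses per group; the respondent IDs, the function name `balanced_subsample`, and the fixed random seed are hypothetical and are not taken from the study.

```python
import random

RECRUITED = 300
# Exclusion counts as reported in the text, keyed by reason.
EXCLUDED = {
    "answered in under three minutes": 7,
    "answers concentrated on a single option": 6,
    "obvious regular pattern in the options": 3,
    "failed the built-in validity check": 15,
}
VALID_PER_GROUP = {"virtual": 142, "live": 127}
TARGET_PER_GROUP = 120


def balanced_subsample(ids_by_group, n_per_group, seed=0):
    """Draw n_per_group respondent IDs without replacement from each group."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    return {
        group: sorted(rng.sample(ids, n_per_group))
        for group, ids in ids_by_group.items()
    }


total_excluded = sum(EXCLUDED.values())
total_valid = RECRUITED - total_excluded

# Hypothetical respondent IDs standing in for the real questionnaires.
ids_by_group = {
    "virtual": [f"V{i:03d}" for i in range(VALID_PER_GROUP["virtual"])],
    "live": [f"L{i:03d}" for i in range(VALID_PER_GROUP["live"])],
}
analysis_sample = balanced_subsample(ids_by_group, TARGET_PER_GROUP)
analysis_n = sum(len(ids) for ids in analysis_sample.values())
```

The exclusion counts sum to 31, which leaves 269 valid responses, and the balanced draw yields the 240 responses used in the analysis.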
The demographic information is shown in Table 1. Female subjects accounted for 56.75% of the total number and male subjects accounted for 43.25%, so gender can be considered evenly distributed. The subjects were mainly between 18 and 45 years of age (88.33%), which is in line with current trends where middle-aged and young people are the major online-consumer groups. Minors were excluded from the sample group because they need the permission of their guardians to participate in an experiment, and their shopping behaviours are limited by their lack of economic power and the influence of their guardians. The respondents had generally received higher education, and 83.75% of them were either studying at undergraduate level or had already received a bachelor's degree. In relation to the frequency with which respondents watched live streaming, most of them watched live streaming 4-6 times per week, while about 25% of them watched live streaming more than ten times per week. These figures are consistent with information about consumers' live streaming behaviours in the e-commerce environment. The data for this experiment were not collected on the university campus, but the sample was selected through social media, so the data is representative of the population as a whole.
The study used PLS-SEM to analyse 240 questionnaires and conducted a multigroup analysis (MGA) to observe the heterogeneity between the virtual and real live streaming groups. It has been noted that PLS-SEM is suitable for small-sample studies in which the research model has complex relationships with many constructs and indicators, and that it is flexible when dealing with non-normal data [57,58]. Also, PLS-SEM is the preferred method when the study aims to explore theoretical extensions of established theories [57]. Therefore, in this study, 120 valid samples were randomly selected from each of the two groups and the data were modelled by PLS-SEM using SmartPLS 3.3.7 in two steps. First, a measurement model analysis was used to assess the reliability and validity of the constructs. Second, a structural model analysis was used to assess the associations between the constructs and to test the hypotheses. In addition, considering the heterogeneity between virtual and real live streaming, this paper conducted a multigroup comparison to test whether users had different levels of sustained intention to use the two streaming approaches. Henseler's nonparametric MGA technique is easy to apply, tests for potential group differences using bootstrap results, and does not require any distributional assumptions. Our study therefore uses Henseler's PLS-MGA approach to evaluate the group differences [58].
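To make the bootstrap-based group comparison more concrete, the following is a minimal sketch of the idea as we understand it, not the SmartPLS implementation. It assumes that bootstrap estimates of one path coefficient are already available for each group, reflects each bootstrap distribution around its mean, and counts how often the group-1 value exceeds the group-2 value across all pairs. The function name and the synthetic inputs are hypothetical.

```python
import numpy as np


def henseler_mga_pvalue(boot_g1, boot_g2):
    """One-sided probability for 'group 1 path > group 2 path'.

    boot_g1, boot_g2: bootstrap estimates of the same path coefficient
    in each group. Small values suggest the group-1 path is larger.
    """
    b1 = np.asarray(boot_g1, dtype=float)
    b2 = np.asarray(boot_g2, dtype=float)
    # Reflect each bootstrap draw around its group mean (centring step).
    c1 = 2.0 * b1.mean() - b1
    c2 = 2.0 * b2.mean() - b2
    # Compare every group-1 draw with every group-2 draw.
    diff = c1[:, None] - c2[None, :]
    return float(1.0 - np.mean(diff > 0))


# Synthetic bootstrap estimates (hypothetical, for illustration only).
rng = np.random.default_rng(0)
clearly_larger = henseler_mga_pvalue(
    rng.normal(0.8, 0.05, size=1000), rng.normal(0.2, 0.05, size=1000)
)
no_difference = henseler_mga_pvalue(
    rng.normal(0.5, 0.05, size=1000), rng.normal(0.5, 0.05, size=1000)
)
```

With well-separated bootstrap distributions the probability is close to 0, and with overlapping ones it is close to 0.5; no distributional assumption is needed because only the bootstrap draws are compared.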

Measurement Models
This study assessed the measurement model in terms of construct reliability and validity, using Cronbach's alpha (CA), composite reliability (CR), and average variance extracted (AVE). Reliability was measured with CA and CR. The CA and CR results are shown in Table 2: sustained engagement behaviour (0.952, 0.952), social presence (0.934, 0.934), perceived trust (0.936, 0.940), engagement interaction (0.893, 0.897), and affective visual elements (0.968, 1.324). According to Hair et al. [59], a CR value above 0.7 indicates high reliability and a CA greater than 0.7 indicates good reliability of the indicators. In this study, these were found to be within the acceptable range, thereby showing a high degree of internal consistency. The construct validity of this study was evaluated through content validity, convergent validity, and discriminant validity. All the scales in this study were selected from established scales in the classical literature, so they had good content validity. Convergent validity is usually evaluated by the average variance extracted (AVE) and composite reliability (CR). As shown in Tables 2 and 3, the standardized outer loadings of the indicators on their constructs and the AVE of each construct are greater than 0.85 [60,61]. Thus, all constructs had good convergent validity. As shown in Table 4, the square root of the AVE of each construct in the model was greater than its correlations with the other constructs. In addition, as shown in Table 3, the standardized outer loadings of all the indicators on the constructs to which they belonged were greater than their cross-loadings. This suggests that the measures of the different constructs in this study have sufficient discriminant validity [62,63]. In summary, the tests of reliability and validity indicate that the overall quality of the measurement model in this study was satisfactory.
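The reliability and validity indices discussed above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook formulas for CA, CR, AVE, and the Fornell-Larcker criterion, not the SmartPLS routines used in the study. All function names and the example loadings, AVE values, and correlations are hypothetical.

```python
import numpy as np


def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    lam = np.asarray(loadings, dtype=float)
    squared_sum = lam.sum() ** 2
    return squared_sum / (squared_sum + np.sum(1.0 - lam ** 2))


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized outer loadings."""
    lam = np.asarray(loadings, dtype=float)
    return float(np.mean(lam ** 2))


def cronbach_alpha(items):
    """Cronbach's alpha for a respondents x items score matrix."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1).sum()
    total_variance = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances / total_variance)


def fornell_larcker_ok(ave, corr):
    """True if sqrt(AVE) of every construct exceeds its correlations
    with all other constructs."""
    root = np.sqrt(np.asarray(ave, dtype=float))
    r = np.abs(np.array(corr, dtype=float))  # np.array copies the input
    np.fill_diagonal(r, 0.0)                 # ignore self-correlations
    return bool(np.all(root > r.max(axis=1)))


# Hypothetical standardized loadings for one four-item construct.
example_loadings = [0.90, 0.92, 0.88, 0.91]
print("CR :", round(composite_reliability(example_loadings), 3))
print("AVE:", round(average_variance_extracted(example_loadings), 3))

# Hypothetical AVE values and inter-construct correlation for two constructs.
print("Fornell-Larcker satisfied:",
      fornell_larcker_ok([0.81, 0.64], [[1.0, 0.7], [0.7, 1.0]]))
```

With standardized loadings between 0 and 1, this CR formula yields values between 0 and 1.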

Structural Model
In the second phase of the analysis, the study evaluated the structural model of consumers' sustained engagement in live streaming. Specifically, the statistical significance of the path coefficients in the research model was examined through their t-values, obtained by bootstrapping in SmartPLS 3.3.7. The original sample size was 240, and the structural equation model was evaluated using 5000 bootstrap subsamples. The path coefficients and the results of the statistical significance tests are shown in Table 5. The results of the structural model evaluation showed that all six hypotheses proposed in this paper were supported. Specifically, the relationship between engagement interaction and social presence showed a β value of 0.865 and a p-value of 0.000, while the relationship between engagement interaction and perceived trust showed a β value of 0.332 and a p-value of 0.000. Both relationships were significantly positive, meaning that H5 and H6 were supported. The relationship between affective visual elements and social presence showed a β value of 0.230 and a p-value of 0.000, and the relationship between affective visual elements and perceived trust showed a β value of 0.12 and a p-value of 0.005, so H1 and H3 were supported. Finally, both social presence and perceived trust showed a positive relationship with consumers' sustained engagement behaviour, so we can conclude that the social presence and trust perceived by consumers during the live broadcast positively drive their engagement behaviour; thus, H2 and H4 were also supported. The complete model is shown in Figure 2. The coefficient of determination (R²) is the most commonly used measure when evaluating structural models and indicates the predictive power of the model [63]. The R² value lies between 0 and 1; the higher its value, the greater the predictive power. Generally, when the R² value is between 0.5 and 0.75, the explanatory power is moderate. In this study, an R² value of 0.552 was reached for consumers' sustained engagement behaviour, indicating that the model proposed in this study has moderate explanatory power.
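The logic of bootstrapped t-values and R² can be illustrated with a short Python sketch. It is a simplified stand-in, not the PLS algorithm run in SmartPLS: it assumes that construct scores are already available and estimates the paths into one endogenous construct by ordinary least squares on standardized scores. The simulated scores, the variable names, and the functions `standardized_betas` and `bootstrap_t` are hypothetical and do not reproduce the study's data.

```python
import numpy as np


def standardized_betas(X, y):
    """OLS path coefficients on z-scored data, plus R^2."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    residual = yz - Xz @ beta
    r2 = 1.0 - np.sum(residual ** 2) / np.sum(yz ** 2)
    return beta, r2


def bootstrap_t(X, y, n_boot=5000, seed=0):
    """t-value of each path = original estimate / bootstrap standard error."""
    rng = np.random.default_rng(seed)
    n = len(y)
    estimate, _ = standardized_betas(X, y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample cases with replacement
        draws[b], _ = standardized_betas(X[idx], y[idx])
    return estimate, estimate / draws.std(axis=0, ddof=1)


# Simulated construct scores (illustrative only, n = 240 as in the study).
rng = np.random.default_rng(42)
n = 240
social_presence = rng.normal(size=n)
perceived_trust = rng.normal(size=n)
engagement = (0.5 * social_presence + 0.3 * perceived_trust
              + rng.normal(scale=0.7, size=n))

X = np.column_stack([social_presence, perceived_trust])
beta, r2 = standardized_betas(X, engagement)
estimate, t_values = bootstrap_t(X, engagement, n_boot=5000)
print("path coefficients:", np.round(beta, 3))
print("bootstrap t-values:", np.round(t_values, 2))
print("R^2:", round(r2, 3))
```

A path is conventionally treated as significant at the 5% level when the absolute t-value exceeds about 1.96.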

Multi-Group Analysis
Hair, Risher, Sarstedt, and Ringle [57] suggest using MGA for categorical moderators that affect all of the relationships between independent and dependent variables at the same time. On this basis, in order to test the effect of different anchors on consumers' sustained engagement behaviour, this study relied on PLS-MGA to test the different effects of virtual and real anchors on consumers. However, measurement invariance should be confirmed before using SmartPLS for MGA [58]. Unless measurement invariance is established, the accuracy of the results cannot be confirmed, because variations in the structural relationships between latent variables may stem from the groups interpreting or understanding the phenomena differently rather than from genuine differences in the structural relationships. In this study, configural invariance, compositional invariance, and the equality of composite means and variances were established through a multi-group confirmatory factor analysis.
Further, this study compared the two live streaming formats (virtual vs. real live streaming) using the parametric PLS-MGA approach proposed by Sarstedt et al. [64] to further test the research hypotheses. Equation (1) and the independent-samples t-test were used to determine whether there was a significant difference between the treatment groups. A parameter is considered to differ significantly between the groups when the p-value is less than 0.05; the comparison results are shown in Table 6. As Table 6 shows, for the two paths leading to sustained engagement behaviour, the p-values for social presence and perceived trust are 0.001 and 0.027, respectively, meaning that the group differences are significant. For the two paths leading to social presence, the p-value for engagement interaction is 0.764, so this difference does not pass the parametric significance test, while the difference for emotional visual elements (p = 0.000) does. For the two paths leading to perceived trust, the p-value for engagement interaction is 0.001, which means that the effect of engagement interaction on consumers' perceived trust differs significantly between real and virtual anchors. However, the p-value for the effect of emotional visual elements on perceived trust is 0.110, which does not pass the parametric significance test. Thus, the different types of anchor can give users different perceptions and eventually influence their sustained engagement behaviour. The specific structural model is shown in Figure 3.
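As an illustration of a parametric group comparison of this kind, the sketch below implements the pooled-standard-error t-test that is commonly presented for parametric PLS-MGA under the assumption of equal variances. It is offered as an assumption about the general form of such a test, not as a reproduction of the paper's Equation (1). The function name `parametric_mga` and the example coefficients and standard errors are hypothetical, and the p-value uses a normal approximation, which is close to the t distribution at 238 degrees of freedom.

```python
import math


def parametric_mga(beta1, beta2, se1, se2, n1, n2):
    """Pooled-standard-error t-test for the difference between two
    group-specific path coefficients (equal variances assumed).

    se1, se2 are the bootstrap standard errors of the two coefficients.
    Returns (t statistic, degrees of freedom, approximate two-sided p).
    """
    df = n1 + n2 - 2
    pooled = math.sqrt(
        ((n1 - 1) ** 2 / df) * se1 ** 2 + ((n2 - 1) ** 2 / df) * se2 ** 2
    )
    t_stat = (beta1 - beta2) / (pooled * math.sqrt(1.0 / n1 + 1.0 / n2))
    # Normal approximation to the two-sided p-value (reasonable for large df).
    p_two_sided = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return t_stat, df, p_two_sided


# Hypothetical path estimates for the virtual and real groups (n = 120 each).
t_stat, df, p_value = parametric_mga(0.60, 0.30, 0.05, 0.05, 120, 120)
print(f"t({df}) = {t_stat:.3f}, approximate p = {p_value:.5f}")
```

With equal group sizes, the pooled term is almost identical to the square root of the sum of the two squared standard errors, so the statistic behaves like a familiar two-sample comparison of the bootstrap estimates.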

Discussion
The following findings emerged from this study. First, the interactive information displayed on the live streaming interface affects viewers' perceptions of social presence and trust. Specifically, we found that interactions between live streaming users and hosts, as well as user-user interactions, enhanced users' perception of human warmth. This is consistent with the findings of Kruikemeier, van Noort, Vliegenthart, and de Vreese [30], Lim, Hwang, Kim, and Biocca [18] and others that individuals are able to perceive the presence of others in a rapid real-time interactive environment, which in turn enhances their emotional connection. This finding did not differ between the virtual and real live streaming environments. Furthermore, consistent with previous research, interaction enhances users' perceptions of trust and is a prerequisite for generating trusting relationships [33]. However, the effect on trust perceptions is weaker in the virtual live environment than in the real live environment. One possible reason is that the virtual nature of the host tends to keep users calm and rational, which makes it more difficult to build trust in a virtual environment [65].
Second, emotional visual elements were shown to significantly influence users' perceptions of both social presence and trust. Visual stimuli expressing emotions, such as emojis, hearts, and gifts, can trigger positive emotional responses, including perceptions of warmth and trust. This is because hearts and gifts often represent the users' praise and support for the host [66], or their appreciation and recognition of shared content [67], and the perception of trust gradually increases with recognition of and support for others. At the same time, emotional visual elements were once again shown to influence users' perceptions of social presence [68]. However, in the group comparison, we found that visual stimuli in the virtual live streaming environment had a greater impact on users' perception of social presence, which may be because the virtual hosts are presented as 2D or 3D anime characters. Users watching virtual live streaming are often anime enthusiasts, and when groups with similar interests watch live streaming at the same time, the sense of community within these groups is enhanced [69], which in turn enhances the emotional connection and perceptions of social presence among users.
Finally, as hypothesized, both social presence and perceived trust significantly influenced users' sustained engagement behaviour. Perceived social presence reflects the user's perception of human warmth in the environment [50], and when a user experiences social warmth in live streaming, he or she is more likely to engage in it over time. Trust was once again shown to be the cornerstone of long-term relationships [52], and users' perceptions of trust increased the time and frequency of their live viewing [70]. In addition, in the group comparisons, we found that social presence influenced users' sustained engagement behaviour to a greater extent in virtual live streaming than in real live streaming, while perceived trust showed the opposite pattern. A plausible explanation is that users of virtual live streaming watch mainly for enjoyment and do not need to consider whether the anchor is trustworthy, whereas in real live streaming, where live shopping is the most popular category, a perception of trust tends to reduce the risks of the transaction process and therefore helps to build long-term, stable relationships.

Theoretical and Practical Implications
As the live streaming industry continues to expand, research on live streaming is becoming increasingly abundant, and the research reported in this paper contributes to the development of this field. First, we demonstrated that interactive text, hearts, and gifts, which convey personal emotions on a live streaming interface, can bring a warm social experience to live streaming users and thus influence their willingness to continue watching. The group experiments show that the live streaming format (virtual vs. real host) does not affect this result. This result enriches the research into social presence theory in the field of virtual live streaming. Second, we found that research on perceived trust in the live streaming environment tends to focus on social commerce; however, because trust is the foundation of long-term relationships, we believe it is necessary to explore its impact on users' long-term viewing intention in any live streaming environment, and we confirmed this conjecture through quantitative research. These findings further enrich the research into the sustained engagement behaviour of live streaming users and are applicable in both real and virtual environments. Finally, the strictly controlled experimental setting and the resulting relatively high internal validity provide favourable conditions for exploring persistent behaviour in live streaming users.
In live streaming, both virtual and real anchors should aim to improve users' sense of social presence and trust through various means, so as to retain users and enhance user stickiness. Specifically, the frequent interactive messages that pop up in the live broadcast can significantly affect the emotional connection between users, so anchors should actively build an interactive atmosphere by raising topics and answering questions. In addition, real-time interaction takes place not only between users and anchors but also between users, and both kinds of interaction are found to be driven by needs such as self-presentation and interpersonal communication. Anchors should therefore hand the stage to users at the right time and give them opportunities to express themselves (such as liking and giving gifts) to encourage their continuous participation in the interaction and to enhance their perceptions of social presence and trust.
By comparing the experimental groups, we concluded that the sense of social presence has a greater impact on users' sustained engagement behaviour in virtual live streaming than in real live streaming. We believe that this difference mainly stems from differences between the user groups: most of the users watching virtual live streaming share an interest in anime, so they care more about emotional connection and a sense of belonging in their live streaming, whereas in real live streaming the viewers' preferences may be less concentrated. We therefore suggest that virtual hosts talk about anime-related topics where appropriate to stimulate the viewers' interest and motivate them to participate in the live streaming interactions.

Limitation and Future Work
Although this study provides a useful discussion of the sustained engagement behaviour of live streaming users, it has inherent limitations, and further improvements are needed in future studies. First, the sample size investigated in this study was 240. This meets the minimum sample size criterion required by PLS-SEM, but a larger sample would improve the accuracy of the model estimation and evaluation. Second, although we conducted group experiments to explore the similarities and differences in user engagement behaviours in the context of real and virtual live streaming, the study did not consider whether differences in live content would have different effects. Some recent studies and surveys point out that live streaming heterogeneity, such as different live streaming platforms and content, may result in different perceptions of presence because users' concerns differ. Third, we found that most of the users who watch virtual live streaming are concentrated in the circle of anime lovers (Otaku), and it remains to be considered whether anime lovers and non-anime lovers have different perceptions of presence and trust in the virtual live streaming environment. Therefore, in our future research, we will continue to deepen our work in the field of live streaming, for example, by exploring the impact of differences in live content on user engagement.

Conclusions
By observing the live streaming industry and reviewing a large body of literature, this paper identifies a viewer behaviour that is currently under-researched: continuous participation behaviour. By exploring the antecedents of continuous participation, we found that live streaming environmental stimuli (both interactions and visual stimuli) can influence the emotional cognition (sense of presence and trust) of live streaming viewers and thus influence their continuous participation behaviour. This study therefore constructed a structural equation model and demonstrated, through responses to the questionnaire, that the interactive information and visual stimuli displayed on a live streaming interface positively influence the emotional perceptions (presence and trust) of live streaming users and thus their continuous engagement behaviour. By comparing groups, the study argues, for the first time, that these influences differ between real and virtual live streaming. The analysis of the results suggests that companies should stimulate interaction between viewers and anchors, and among viewers, in various ways, as interaction is a decisive factor in promoting viewers' emotional perceptions in both real and virtual live streaming. Regarding visual stimulation, more may be needed from a virtual live interface: because the anchor and the interface itself are virtual animated images, people will demand more from the live streaming, such as visual aesthetics, and this poses challenges for the company. Finally, we demonstrated that perceived trust and social presence positively influence consumer loyalty.
Author Contributions: Conceptualization, J.L. and C.C.; data curation, Y.S.; formal analysis, X.S.; investigation, Q.X. and L.N.; methodology, J.L. and C.C.; resources, C.C., X.S. and Y.S.; writing-original draft, J.L., Q.X. and L.N.; writing-review and editing, C.C. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement: Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study. Written informed consent from the participants was not required to participate in this study in accordance with the national legislation and the institutional requirements.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.

Conflicts of Interest:
The authors declare no conflict of interest.

Table A1. Questionnaire Items.

Construct: INT. References: [5,18].
1. When watching a live-stream, I will exchange and share opinions with the streamer or other audiences.
2. When watching a live-stream, I interacted with other viewers using the hashtags related to the live streaming.
3. When watching a live-stream, I posted my feelings in real-time online conversation.
4. When watching a live-stream, I will answer questions from the anchor and other viewers.

Construct: EVE. References: [41].
1. The interactive content in the live interface is visually appealing.
2. The interactive content in the live interface is visually pleasing.
3. The interactive content in the live interface is visually cheerful.
4. The interactive content in the live interface is visually interesting.

Construct: PTR. References: [20,48].
1. Promises made by this live streaming are likely to be reliable.
2. I do not doubt the honesty of this live streaming.
3. Based on my experience with this live streaming, I know it is honest.
4. I feel that this live streaming is trustworthy.

Construct: SPR. References: [6,48].
1. When I participate in a live-streaming chat, I feel emotionally connected with users I am chatting with.
2. There is a sense of human contact on this live streaming.
3. There is a sense of sociability on this live streaming.
4. There is a sense of human warmth on this live streaming.

Construct: SEB. References: [6].
1. I feel more attached to my favorite live-streaming channels than other channels.
2. I will continue to watch my favorite live-streaming channel.
3. I will increase the amount of time I spend watching my favorite live-streaming channel.
4. I consider myself to be a committed fan of my favorite live-streaming channel.