Long-Term Effects of Perceived Friendship with Intelligent Voice Assistants on Usage Behavior, User Experience, and Social Perceptions

Abstract: Social patterns and roles can develop when users talk to intelligent voice assistants (IVAs) daily. The current study investigates whether users assign different roles to devices and how this affects their usage behavior, user experience, and social perceptions. Since social roles take time to establish, we equipped 106 participants with Alexa or Google assistants and some smart home devices and observed their interactions for nine months. We analyzed diverse subjective (questionnaire) and objective (interaction) data. By combining social science and data science analyses, we identified two distinct clusters: users who assigned a friendship role to IVAs over time and users who did not. These clusters exhibited significant differences in their usage behavior, user experience, and social perceptions of the devices. For example, participants who attributed more friendship to IVAs used them more frequently, reported more enjoyment during interactions, and perceived more empathy for IVAs. In addition, these users had distinct personal requirements; for example, they reported more loneliness. This study provides valuable insights into the role-specific effects and consequences of voice assistants. Recent developments in conversational language models such as ChatGPT suggest that the findings of this study could make an important contribution to the design of dialogic human–AI interactions.


Introduction
Intelligent voice assistants, or IVAs, which are integrated into smart speakers such as Amazon's Alexa, are rapidly gaining popularity and becoming an integral part of everyday life [1,2]. These devices can recognize voice-based requests, respond with human-like speech, and assist users with various tasks [3]. Some users consider voice assistants a useful tool, whereas others have formed closer relationships or even friendships with them [4]. As the social relationships with voice assistants become more established, their impact on interactions, user experience, and self-disclosure becomes more apparent [5][6][7]. For example, a study by Wu et al. [8] showed that users assign different roles to IVAs and that this attribution, in turn, determines the expectations and usage patterns. Consistent with other research [9,10], Wu, He, Peng, Li, Zhou, and Guan [8] found that one of the most common roles assigned to IVAs was that of a "friend". Therefore, the present study focuses specifically on the perception of IVAs as friends.
However, social roles and interactions are established over time in natural interaction environments. Therefore, longitudinal studies are valuable in understanding how the perceived role influences usage behavior and social perceptions of IVAs. They also give users the necessary time to establish a social relationship with the device [11]. Despite the growing importance of the relationship between humans and IVAs, there is a lack of longitudinal studies that focus on the social aspects of this relationship. Most longitudinal studies focus on describing usage patterns and behaviors [12,13], with relatively little attention given to the social aspects of human-technology interactions [14]. Since social roles are often attributed to technologies unconsciously, relying mainly on explicit measurement methods falls short or may even cause reactance [15][16][17]. Long-term studies also allow for a multifaceted capture of interactions, such as continuous interaction data.
In summary, initial short-term studies show that social role attribution affects users' expectations and use of IVAs. However, there is a lack of targeted long-term studies that include both subjective and objective data, as well as a temporal course of the phenomenon. Thus, the following research questions arise:
(1) Do participants perceive IVAs as friends?
(2) How does the perceived social role of IVAs as a friend influence usage behavior, user experience, and social perception over time?
(3) Does the user's personality influence the attribution of friendship to IVAs?
To investigate this, our participants interacted with a common IVA for nine months in their homes. We obtained self-report questionnaires and analyzed continuous interaction data by using social science and data science methods. Our multi-method approach aims to uncover differences in usage behavior as a function of the social role attribution of the device and visualizes changes over time. Thus, the present work differs from previous studies as it determines whether users differ in their attribution of friendship to voice assistants by examining role-specific interaction patterns using an interdisciplinary multi-method approach. We also explore the personality traits that may favor a perception of friendship with an IVA and analyze interactions with IVAs in natural environments from a long-term perspective.

Related Work
Anthropomorphism, or the attribution of human characteristics to non-human beings, has been the subject of extensive research in psychology and the study of human-computer interaction. Previous research has shown that people tend to anthropomorphize inanimate objects under certain circumstances, which may extend to other areas of human cognition, attitudes, and even behavior [18]. This section reviews social attributions and their impact on users' interactions with technological entities. Following the definitions of Russell and Norvig [19], in this study, we focus on voice-based AI systems and use the term "IVA" to refer to intelligent voice-based assistants, with smart speakers (e.g., Amazon Echo, Google Home) and voice-controlled intelligent personal assistants (e.g., Amazon Alexa, Google Assistant) being the most popular examples.

Voice Assistants as Social Actors in User Relationships
The media equation suggests that humans tend to attribute social and cognitive characteristics to technological entities, treating them as social actors due to the humanlike characteristics or behaviors that activate users' social scripts [20]. This has been found to be the case with PCs, smartphones [21], and even websites [22]. IVAs, unlike other technologies, have particularly high social affordances and anthropomorphic attributes with their conversational speech interaction and names [23][24][25]. Most noticeable is the personification of smart speakers or IVAs by users. Often, users personify these systems by using human pronouns and the respective name of the IVA instead of the device name for smart speakers [9,10,26]. The social perception of IVAs is also evident in effects that otherwise occur only in human-human interaction. For instance, Liu and Pu [5] found that voice assistants induce a social facilitation effect. In their experiment, they studied the effect of the presence of a smart speaker on the solution speed of easy or complex tasks. The results indicate that participants solved easy tasks faster in the presence of the smart speaker, whereas they reacted more slowly in difficult tasks. Other works studied emotional-affective responses toward IVAs. Carolus, Wienrich, Törke, Friedel, Schwietering, and Sperzel [6] found that observers of voice assistants feel more empathy for the assistant when they see the IVA being treated rudely. Furthermore, computer voices have been observed to increase respondents' socially desirable response behavior and encourage the disclosure of sensitive information [27,28]. Wienrich, Reitelbach, and Carolus [7] also found that voice assistants promote the disclosure of sensitive information the more trustworthy and competent they are perceived to be.
As research shows, IVAs can elicit social responses and behaviors in individuals. However, does this also mean that we ascribe concrete social roles to voice assistants? If so, to what extent does the perceived social role influence the usage behavior, user experience, and social perceptions in the long term once a social relationship has developed between the IVA and the user?

Social Roles of IVAs in Relationships with Users
Social roles are assigned to holders of certain positions or functions in social contexts, which are accompanied by demands of their behavior and character. Dreitzel [29] defines a social role as a set of expectations that are attached to the behavior of the holder in interaction situations. In other words, different expectations are attributed to the holder depending on the assigned social role. In addition, previous studies on AI-based technologies found that users attribute social roles to technology, which can affect their expectations and perceptions [8,30]. For example, users of IVAs often assign them either a tool-based role (e.g., assistant, service provider, tool) or a friend-based role (e.g., friend, family member, companion) [31]. Users expect helpful, cooperative, and understanding behavior from tool-based IVAs and respect, an emotional connection, and loyalty from friend-based IVAs. In particular, users may develop friendship-like feelings toward IVAs and associate them with social roles that can influence their interactions [32,33]. When users perceive IVAs as friends, they feel more sympathy toward them [33]. Feelings of friendship toward IVAs further motivate users to speak patiently and more slowly, thereby increasing the understanding of voice input [33]. The perception of an IVA as a friend can also positively influence attitudes toward products in the context of voice shopping [34], and products are liked more when IVAs provide product recommendations in the role of a friend.
Parasocial relationship theory can explain the use of IVAs, as they convey closeness and intimacy, positively influencing perceptions of the voice assistant as a friend, usage intention, and satisfaction [35,36]. In general, a parasocial relationship is defined as the extent to which a media consumer has developed a social relationship with a medium [37]. Parasocial relationships with media entities develop over time [38][39][40] and can motivate consumption [41] and influence recipient activity [42]. In the case of voice assistants, Hsieh and Lee [43] describe parasocial relationships as a key factor for perceived ease of use and future intention to use IVAs. Numerous findings support this assumption and show that the stronger the social relationship between IVAs and users, the higher the perceived user experience [44][45][46]. When there is a strong relationship between IVAs and users, it can encourage acceptance and social presence [24,47], as well as exploratory usage behaviors [48]. Technologies that are perceived as social and relational can result in a more positive, user-friendly, and useful user experience [44,45].
However, previous studies often rely on short-term measures, although establishing appropriate relationships takes time and usually occurs in natural environments such as homes [11,14,38]. Consequently, some researchers conducted longitudinal studies [11,49]. Gao, Pan, Wang, and Chen [49] found that some users assigned different roles to IVAs, most of them being human roles with positive emotions, whereas others attributed more impersonal roles with less positive emotions. Voit, Niess, Eckerth, Ernst, Weingärtner, and Woźniak [11] found that some users viewed smart speakers as social agents and described a social relationship with them. In contrast, others viewed them as technical tools and distanced themselves from them. Some results show that the context of the IVA's use influences the nature of the interactions and their associated emotions. Thus, these initial studies suggest two types of roles: personal and tool-like. However, it is difficult to capture the social roles that users associate with IVAs. The attribution of social roles or human characteristics to technological devices often occurs unconsciously [15,16], which makes it hard to assess the social role using explicit measurement methods. Users often deny the social and friendship roles they attribute to AI systems if directly and explicitly asked about them [17]. Therefore, incorporating more implicit measures could complement the measurement of role-specific relationships between users and IVAs and provide insights into role-specific interaction styles. The continuous interaction data recorded and stored by voice assistant providers such as Amazon and Google may be valuable [14]. However, to date, no study has evaluated continuous interaction data and systematically examined it in terms of the social roles attributed to IVAs and the trends over time.
Furthermore, previous long-term studies have not explored the role of the user's personality in determining which social role is attributed to an IVA. Whelan et al. [50] showed that anthropomorphizing tendencies are particularly salient in individuals with insecure attachment styles and attachment anxiety. Epley, Waytz, and Cacioppo [18] suggested that people with insecure attachment styles tend to anthropomorphize to compensate for unmet interpersonal attachment needs. In the field of IVAs, it has been shown that use over a more extended period can reduce situational feelings of loneliness among users [51,52], as IVAs provide a social presence [53] that can fulfill certain social needs and provide the sense of being in human company [18,54]. In the context of voice assistants, there is currently no empirical evidence to show how users' personality traits and attachment styles relate to social perceptions of IVAs and influence the development of feelings of friendship toward them. Similarly, although previous studies have shown that IVAs can reduce feelings of loneliness [52], they do not specifically address the role-specific significance of IVAs and are restricted to self-report methods for describing interaction styles.

Summary and Present Study
Voice assistants are a widely used technology with a broad social appeal that can influence users' perceptions and behaviors based on the social role assigned to them (e.g., assistant-like role or friend-like role). Research has also shown that the attribution of social roles and the corresponding responses are often unconscious, with users sometimes denying assigning such roles when explicitly asked about them [15][16][17]55]. Moreover, social processes evolve over time and in natural contexts of use [14]. In contrast to previous research approaches, our study considers the temporal conditions in which friendships develop between users and IVAs and examines them in a natural context of use (i.e., home). We conduct a longitudinal study and analyze the development of role-specific effects and relationships between users and IVAs over nine months. For users who attributed the role of friendship, we evaluate both explicit and implicit data, including the use of questionnaires and the assessment of continuous interaction data. By utilizing a multimethod approach that integrates both data science and social science analyses, we investigate the degree to which role attribution influences usage behavior, user experience, and social perception. We also analyze the personality traits that impact role attribution. Thus, our results contribute to a better understanding of how the social perception of IVAs can affect human-AI interactions. This may become even more important in the future as AI becomes more adaptive, intelligent, and social, which is associated with greater opportunities [4] and risks [56]. Using a long-term study in a natural context of use, our research provides insights into role-specific interactions with IVAs over time and into the evolving relationship between users and IVAs, with implications for future developments in this field.

Structure of Present Study
Our work takes a broader exploratory approach. Rather than formulating hypotheses, we developed a set of research questions to gain a deeper understanding of the role-specific effects of friend-like IVAs at multiple levels (including usage behavior, user experience, and social perception). To provide a better overview, the Methods and Results sections are divided into five sections, each addressing one of the following research questions: To what extent do personality traits of users differ in attributing vs. not attributing a friendship role to IVAs?

Participants
A total of 106 students who reported not owning a smart speaker were included in this study. Participants were excluded from the 9-month longitudinal study if they already owned a device with a voice assistant before the study (n = 5), participated in the study unreliably (n = 5), or left the university (n = 1). Finally, longitudinal study data were collected from 85 participants who ranged in age from 17 to 23 years (M = 19.42, SD = 1.37) and were predominantly female (n = 70; n = 7 male; n = 8 diverse). Because sample sizes differed for some variables, the gender composition of the samples is reported at the level of the constructs, as shown in Table S1.

Procedure
At the beginning of the study, half of the sample was equipped with a Google Home Mini and an Amazon Echo Dot. In addition, participants were given other smart home devices such as a smart socket or a light bulb. Participants were instructed to install all devices within one week. After successful installation, different channels (e.g., the messenger service Telegram) were used to ensure anonymous and low-threshold communication with the investigators. Participants were given randomized, anonymous subject codes to create the email addresses used to register the devices with Google or Amazon. The log files, which were used to generate user logs and analyze user behavior, were also linked to this email address.
The long-term study was divided into (1) the installation stage, (2) the free interaction stage, (3) the intervention stage, and (4) the interview stage (Figure 1). Usage behavior, user experience, and social perception variables in the context of IVA usage were collected over 15 time points (abbreviated "T"). In the installation stage, the devices were distributed to participants and prepared for use. In the free interaction stage, we analyzed the unrestricted interactions with the devices. In the intervention stage, we conducted experimental interventions to study their effect on usage behavior, user experience, and social perception. Intervention 1 provided a deeper understanding of the functions of IVAs. In intervention 2, participants were instructed to play games with their IVAs. In intervention 3, participants used the TK smart relaxation skill. In the interview stage, participants took part in a structured interview (11 main questions, 14 follow-up questions) that was tape-recorded. Questions were related to, for example, assessment of usage, acceptance of the IVA, and its perception as a social interaction partner. The present study addresses time points T1 (30 October 2021) to T12 (25 February 2022) (Table 1) to analyze the behavior of natural interaction in the field and exclude the effects of interventions starting at T13.
The idea of the free interaction stage was to allow the participants to interact with their new devices without any restrictions for four months. During this stage, participants answered 13 online questionnaires. Each questionnaire began with instructions and included privacy explanations. Participants then entered their unique codeword and answered the questionnaire. New surveys and information were announced via participants' messenger services. The questionnaires included five thematic blocks such as personality (e.g., demographics; personality traits); usage behavior (e.g., the function used, frequency of use); user experience (e.g., UX user motivations, usage ratings), and social perceptions of the IVA (e.g., friendship, empathy). Although personality traits were recorded only once during the free interaction phase, the other constructs were recorded repeatedly to track changes over time. After completion of the study, participants consented to or disagreed with the analysis of their data. Participants affirmed that the recorded voice input was theirs alone and that no third-party voice data were recorded. The devices were returned after nine months.

Data Analysis
To determine whether users differed in their attribution of friendship to IVAs (Section 1), they were divided into homogeneous groups using K-means clustering. A multivariate analysis of variance (MANOVA) then tested whether the groups differed significantly. In Sections 1 to 5, the groups are compared using appropriate procedures (e.g., Welch's t-test [57]) to reveal the differences in usage behavior, user experience, social perception, and personality traits. For group comparisons, two-sided tests were performed at an α-level of 0.05. In Sections 1 to 4, the effects over time are examined using repeated measures ANOVA (RM-ANOVA) with Greenhouse-Geisser and Bonferroni corrections in post hoc tests. RM-ANOVA is an appropriate procedure because the measurements were repeated among participants at time intervals and we wanted to examine the trends over time both within and across groups. The time between the individual measurement points is indicated in the respective analyses.

Measures, Results, and Discussions by Section
We describe the variables, measurement tools, and results per research question in the following sections: (1) Cluster Formation and Time Effects, (2) Usage Behavior, (3) User Experience, (4) Social Perception, and (5) Personality Traits.

Section 1-Cluster Formation and Time Effects
Research has shown that users attribute different social roles to IVAs such as tool-based or friend-based roles [31]. In this section, we analyze how participants explicitly associate IVAs with the social roles identified in prior work [10]. However, we also use implicit methods to capture the perceptions of friendship qualities, as social attributions to technologies are often unconscious [15,16]. Specifically, we examine (1) whether participants differ in their perceptions of friendship toward IVAs, (2) whether they can be grouped based on these perceptions, and (3) how friendship perceptions evolve over time.

Measures
Social Roles Scale-Explicit Measurement. Purington, Taft, Sannon, Bazarova, and Taylor [10] identified five social roles that users associate with voice assistants. Participants rated their voice assistant on a seven-point Likert scale (ranging from 1 "strongly disagree" to 7 "strongly agree") based on these social roles: (1) information source, (2) entertainment provider, (3) administrative assistant, (4) companion, and (5) friend (α = 0.052). The explicit measure allowed us to determine the role that the participants consciously associated with the IVA. Participants received the instruction: "People look at their voice assistant very differently. When you think of your voice assistant, to what extent do the following descriptions apply from your perspective?".
Friendship Quality-Implicit Measurement. The Intimate Friendship Scale (IFS; Sharabany [58]) was used to measure the extent to which participants perceived the IVA as a friend. The IFS is an appropriate instrument for measuring the depth and quality of perceived friendship. The scale focuses on the important aspects of the relationship between interaction partners. In addition, we can use the subscales of the IFS to measure the latent attributes of friendship perceptions that individuals associate with their voice assistants. Compared to the explicit Social Role Scale, the IFS measures the implicit dimensions of friendship and does not directly ask about the extent to which participants perceive the IVA as a friend. With its 32 items, the IFS measures 8 subscales (frankness and spontaneity (α = 0.83), sensitivity and knowing (α = 0.80), attachment (α = 0.81), exclusiveness (α = 0.76), giving and sharing (α = 0.76), imposition (α = 0.70), common activities (α = 0.63), and trust and loyalty (α = 0.67)) on a scale of 1 (strongly disagree) to 6 (strongly agree). We adapted the items to voice assistants. Items that deviated too much from the original were not included in the analysis (see Table S2); this affected the imposition and common activities subscales, as well as six items from the sensitivity and knowing, attachment, exclusiveness, and giving and sharing subscales.
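The α values reported for the subscales are presumably Cronbach's α, the usual internal-consistency coefficient for multi-item scales. The following is a minimal sketch of how such a coefficient can be computed, assuming complete data and sample variances; the function name `cronbach_alpha` and the ratings are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of item ratings."""
    k = len(item_scores[0])  # number of items in the (sub)scale
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 respondents x 4 items on a 1-6 agreement scale.
ratings = [
    [5, 4, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 1],
    [3, 3, 4, 3],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

Values closer to 1 indicate that the items of a subscale vary together; by convention, values around 0.70 or higher are often treated as acceptable.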

Results
Explicit and Implicit Measurements. The explicit measurement of social roles via the Social Role Scale at T10 showed that participants (N = 73) most often associated IVAs with the information source role (M = 6.06, SD = 1.15). This was followed by the entertainer (M = 5.54, SD = 1.48), assistant (M = 3.81, SD = 1.96), companion (M = 1.63, SD = 1.04), and friend (M = 1.20, SD = 0.60) roles. As expected, the implicit measure of the perceived friendship quality at T10 was higher than the explicit measure (see Table 2). Therefore, for the evaluation in Section 1, the cluster analysis was performed using the implicit measure.
Cluster Analysis. The aim was to group participants according to their perceived level of friendship with the voice assistant. This involved dividing participants into groups with similar ratings while ensuring that individuals in different groups had differing ratings. The use of the K-means cluster analysis was appropriate, as it can optimize both homogeneity within clusters and heterogeneity between clusters [59]. In addition, this method is popular and is widely used for data [60].
The parameters used to group participants in the K-means cluster analysis were Sharabany's six subscales of perceived friendship quality [58]. The procedure was computed using the R package stats, and the Euclidean distance was used as the measure of dissimilarity. The optimal number of clusters was determined using a gap-stat plot [61,62]. As shown in Figure 2, the optimal number of clusters in the data (k = 2) was determined using the NbClust package in R, which used 30 simultaneous processes to identify the optimal number of clusters (see the vertical dashed line in Figure 2). Ten indices supported a solution with two clusters according to the majority rule, whereas 6 indices suggested 3 clusters, 3 indices suggested 8 clusters, 1 index suggested 9 clusters, and 3 indices suggested a 10-cluster solution. On this basis, we decided on two clusters.
Table 3 shows the mean values and the number of participants in each cluster. It shows that the clusters can be divided into higher perceived friendship quality and lower perceived friendship quality. The first cluster (n = 40) showed higher scores in the attribution of friendship quality to IVAs than the second cluster (n = 33). To test the validity of the clusters, a multivariate analysis of variance (MANOVA) with post hoc tests was used to determine whether the clusters differed significantly based on Sharabany's subscales [58]. The MANOVA supported the validity of the clusters (see Table 3) by showing a significant difference in the perceived friendship quality between the two groups (F(7, 65) = 23.66, p < 0.001, Wilks' Λ = 0.28) [63]. To avoid ambiguity, we refer to the first cluster as the friend cluster, which is characterized by a higher perceived quality of friendship toward the voice assistant. Conversely, we refer to the second cluster as the non-friend cluster, which is characterized by a relatively lower perceived quality of friendship toward the voice assistant.
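To illustrate the clustering step, the following is a minimal sketch of K-means with Euclidean distance and k = 2, written in plain Python rather than with the R packages used in the study. The participant profiles (six subscale means each), the function name `kmeans`, and the random initialization are assumptions made for illustration; they are not the study's data or implementation.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Basic Lloyd's algorithm: returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumption: initialize centroids with k randomly chosen observations.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid (Euclidean).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:  # converged: centroids no longer move
            break
        centroids = updated
    return labels, centroids


# Hypothetical subscale means (1-6 scale) for six participants.
participants = [
    [4.5, 4.0, 4.2, 3.8, 4.1, 4.4],
    [4.8, 4.3, 4.0, 4.1, 4.5, 4.6],
    [4.2, 3.9, 4.4, 3.6, 4.0, 4.3],
    [1.5, 1.2, 1.8, 1.1, 1.4, 1.6],
    [1.2, 1.4, 1.5, 1.0, 1.3, 1.2],
    [1.8, 1.6, 1.3, 1.4, 1.7, 1.5],
]
labels, centroids = kmeans(participants, k=2)
print(labels)
```

In this toy data the first three profiles end up in one cluster and the last three in the other; which cluster receives label 0 depends on the initialization, so clusters are best named by their centroid means (higher vs. lower perceived friendship quality), as done above for the friend and non-friend clusters.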

Brief Discussion of Section 1
The results are interesting because the measurements of the perception of the voice assistant as a friend on both explicit and implicit levels reveal contradictions. The explicit measure shows that the majority of participants rejected the voice assistant as a friend. However, the implicit measures reveal that the voice assistant fulfills friendship qualities. The explicit measure is consistent with the findings of previous studies, which showed that social perceptions and social behavior toward technology can be denied even though they actually occurred [55]. This leads us to conclude that users view IVAs as friends far more than they admit or are aware of. Furthermore, the implicit measure of friendship quality was an appropriate variable for distinguishing between individuals based on whether they perceived IVAs as more or less friend-like. In addition, the relationship between these clusters differed over time (Figure 3). The friend cluster increasingly associated the IVA with friendship qualities over time, whereas there was no effect of time observed for the non-friend cluster. These findings guided the follow-up analysis, which investigated whether the two clusters differed in terms of their usage behavior, perceived user experience, and social perceptions over time.

Section 2-Usage Behavior
The perception of IVAs as friends can affect how users interact with and utilize voice assistants [32,33]. To determine usage behavior, we used subjective and implicit measurement techniques to further analyze usage habits, future usage intentions, and the frequency and type of features used by users. These factors provided a holistic understanding of both current and future IVA usage patterns. We then conducted a two-sided Welch's t-test to determine if there were any disparities in the usage behavior variables between the friend cluster and the non-friend cluster. Our choice of statistical tests followed the recommendations of [57].
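For readers unfamiliar with Welch's t-test, the sketch below computes its test statistic and the Welch-Satterthwaite degrees of freedom for two groups with unequal variances. The function name `welch_t` and the scores are hypothetical. The p-value is omitted because it requires the t distribution; statistical packages provide it directly, for example SciPy's `scipy.stats.ttest_ind` with `equal_var=False`.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on samples a and b."""
    # Squared standard error contributed by each group.
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical usage-habit scores for the two clusters.
friend_cluster = [4, 5, 6, 5, 5]
non_friend_cluster = [2, 3, 2, 3, 2, 3]
t, df = welch_t(friend_cluster, non_friend_cluster)
print(round(t, 3), round(df, 2))
```

Unlike Student's t-test, this variant does not pool the variances, which makes it suitable when the two clusters differ in size and spread.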

Measures
Subjective Measure-Usage Habits. To understand the adoption and changes in the use of voice assistants, we examined the integration of IVAs into daily routines and tested for group variances. Previous research has indicated that incorporating media into daily routines is a critical factor in determining usage patterns [64]. We adapted items from the Social Media Use Integration Scale [64] to capture the integration of IVAs into daily individual routines (13 items; e.g., The voice assistant wakes me up every morning) and daily routines with others (6 items; e.g., I call my friends using the voice assistant). The reliability of the questionnaire was α = 0.75.
Subjective Measure-Future Usage. A 5-point Likert scale (1 = I do not believe in it at all; 5 = I believe in it very much), which was adapted from Przybylski et al. [65], was used to measure participants' interest in continuing to use the voice assistant in the future (one item, "I will continue to use a voice assistant after the end of the study") and their tendency to recommend the voice assistant to others (word of mouth; two items, e.g., "I will tell others positive things about my voice assistant"). The internal consistency of the items in this study was α = 0.83.
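The reliabilities reported for these scales are Cronbach's α values. As a minimal sketch of how such a coefficient is obtained from item-level ratings, the following uses only the Python standard library; the function name `cronbach_alpha` and the ratings are hypothetical illustrations, not the study's data or analysis code.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the summed scale score across respondents.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings from five respondents on three 5-point items.
ratings = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

Values closer to 1 indicate that the items vary together; the coefficient also tends to rise with the number of items, so it is best read alongside the item counts given for each scale.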
Implicit Measure-Behavioral Data. With participants' consent, we examined their usage behavior using IVA interaction data from their exported activity logs. The information was then compiled into a CSV file that contained columns for each participant's anonymous subject code, the date of the command, the command and conversation content, and the response from the voice assistant. From the logs, we were able to extract a total of 22,436 speech entries (M = 467.42, SD = 787.35). Thus, we were able to implicitly derive actual usage behavior based on voice interactions with the IVAs. Due to technical complications in the provider platform, some user logs were incomplete. In total, we had access to 48 complete user logs.
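To illustrate the log structure described above, the following sketch reads a small CSV with the four described columns and derives the number of speech entries per participant, along with the mean and standard deviation across participants. The column names (`subject_code`, `date`, `command`, `response`) and all rows are assumptions made for illustration; the actual export format is not specified here.

```python
import csv
import io
from collections import Counter
from statistics import mean, stdev

# Hypothetical column names and rows; the real export format is not given.
SAMPLE_LOG = """subject_code,date,command,response
P01,2021-03-01,alexa what time is it,It is 7 a.m.
P01,2021-03-01,alexa play music,Playing your playlist.
P02,2021-03-02,hey google turn on the light,OK.
P01,2021-03-03,alexa good morning,Good morning!
"""


def entries_per_participant(csv_text):
    """Count speech entries per anonymous subject code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["subject_code"] for row in reader)


counts = entries_per_participant(SAMPLE_LOG)
total_entries = sum(counts.values())   # total number of speech entries
mean_entries = mean(counts.values())   # mean entries per participant
sd_entries = stdev(counts.values())    # sample SD across participants
print(total_entries, mean_entries, round(sd_entries, 2))
```

Read this way, the reported mean of 467.42 entries matches 22,436 entries spread over the 48 complete logs, and the large standard deviation suggests that usage intensity varied strongly between participants.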
To determine how users interacted with the IVA during the study period, we first took a sample of 2000 transcribed voice commands from the usage logs and examined them using Mayring's Qualitative Content Analysis [66]. If identification via the user's transcribed voice input was not possible, we used the IVA's transcribed voice output as an indicator of the function used. We then developed a fixed set of 45 subcategories (functions used by the voice assistant such as news, weather, listen to music) and 7 primary categories (a structural classification of functions at a higher level based on similarities such as knowledge acquisition, support, media entertainment) to categorize the interactions based on the voice commands used (see Table S3). This set was derived from categories determined in previous literature [12,26,67], feature reviews from device vendors [68], and inductively derived new categories, and was iteratively revised by all researchers until an agreement was reached.
We generated and used keywords to automatically categorize the transcribed voice input. For instance, for the subcategory "lamps", we manually searched for related voice commands containing the keywords "lamp", "bright", and "light". As we analyzed the commands from various usage logs, we were able to identify additional relevant keywords related to user intent (e.g., "bulb"), as well as find voice commands that contained similar keywords but were not related to user intent (e.g., "Is it already bright outside?"). After finding new keywords and exceptions, we added or removed keywords to differentiate this category from other categories of voice commands. A randomized sample of 1000 voice commands was used to verify that the majority of the categorized voice commands corresponded to the actual function (subcategory). This iterative approach allowed us to provide a unique classification for each subcategory.
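The keyword procedure can be sketched as simple substring matching with an exception list. The example below is an illustration only: the dictionaries `KEYWORDS` and `EXCEPTIONS`, the function `categorize`, and the "weather" keywords are hypothetical, and only the "lamps" keywords and the "bright outside" exception are taken from the description above.

```python
# Hypothetical keyword and exception lists for two subcategories; only the
# "lamps" entries follow the examples given in the text.
KEYWORDS = {
    "lamps": ["lamp", "bright", "light", "bulb"],
    "weather": ["weather", "rain", "temperature"],
}
EXCEPTIONS = {
    # Phrases that contain a keyword but do not express the user intent.
    "lamps": ["bright outside"],
}


def categorize(command):
    """Return the first subcategory whose keyword occurs in the command."""
    text = command.lower()
    for subcategory, keywords in KEYWORDS.items():
        # Skip the subcategory if a known exception phrase is present.
        if any(phrase in text for phrase in EXCEPTIONS.get(subcategory, [])):
            continue
        if any(keyword in text for keyword in keywords):
            return subcategory
    return "uncategorized"


for command in ["Alexa, turn on the lamp", "Is it already bright outside?"]:
    print(command, "->", categorize(command))
```

Plain substring matching also fires on unrelated words that contain a keyword (for example, "rain" inside "train"), which is one reason a manual check of a random sample, as described above, remains necessary.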
Once the keywords were finalized, we performed the automated categorization by string matching. In the specific case of the inductively created primary category "Social Interaction," we conducted a second, independent categorization process. The primary category captured the extent to which users personify the IVA [69]. We performed a second categorization process because we assumed that personification features could occur in all the functions (subcategories) used. For example, politeness phrases (such as "please" and "thank you") and greetings (such as "hello" or "hi") may be used in combination with other functions (e.g., listening to music, alarm clock, and time) such as "Alexa, please play a song" or "Can you please tell me what time it is?". To maintain the validity and reliability of the categorizations, each categorization was manually reviewed and adjusted as needed. A sample of the user input was verified, and the majority of the categorized speech input was found to match the actual function. A second independent coder categorized 10% of the whole sample (Cohen's kappa = 0.95).

Results
Subjective Measure-Future Usage. The reference time point for the intention to continue using and recommending the voice assistant in the future was T8. The intention to recommend the voice assistant in the future was significantly higher (t(64.93) = 3.75, p < 0.001, d = 0.89) for the friend cluster (M = 4.48, SD = 1.15, n = 40) than the non-friend cluster (M = 3.39, SD = 1.29, n = 33). Regarding interest in continuing to use the voice assistant in the future, there was no significant (p = 0.340) difference between the friend cluster (M = 3.03, SD = 1.46) and the non-friend cluster (M = 2.70, SD = 1.45).
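As a worked illustration of the Welch's t-test used throughout, the following sketch recomputes the test statistic, the Welch-Satterthwaite degrees of freedom, and Cohen's d from the summary statistics reported for the intention to recommend the voice assistant (friend cluster: M = 4.48, SD = 1.15, n = 40; non-friend cluster: M = 3.39, SD = 1.29, n = 33). The function names are hypothetical, and computing d from the pooled standard deviation is an assumption, since the effect size formula is not stated.

```python
from math import sqrt


def welch_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using the pooled standard deviation (an assumption here)."""
    pooled = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled


# Summary statistics as reported for the intention to recommend.
t, df = welch_from_summary(4.48, 1.15, 40, 3.39, 1.29, 33)
d = cohens_d(4.48, 1.15, 40, 3.39, 1.29, 33)
print(round(t, 2), round(df, 2), round(d, 2))
```

The recomputed values (t of about 3.77 on about 64.8 degrees of freedom, d of about 0.90) are close to the reported t(64.93) = 3.75 and d = 0.89; the small deviations are what one would expect from working with rounded means and standard deviations. The p-value is omitted because the standard library does not provide the t distribution.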
Objective Measure-Behavioral Data. The data on the voice assistant functions used by each participant throughout the entire study period were collected and an overall value was calculated. Using this index, Grubbs' test identified one significant outlier, which was excluded from further analyses. Both clusters were then tested for differences in frequency of use. We tested for differences between the thematic categories that summarized the individual functions of the IVA, as well as the individual functions themselves (subcategories).
Categories. Welch's t-tests across categories revealed significant and marginally significant differences between the two clusters (Table 4). For example, individuals in the friend cluster were, on average, more likely to use the IVA for support (p = 0.038) and social interactions (p = 0.028), with differences for use in knowledge acquisition (p = 0.086) and mood management (p = 0.073) being marginally significant and higher for the friend cluster. However, differences between the clusters in terms of media and entertainment (p = 0.796) and smart homes (p = 0.910) were not found to be significant.

Subcategories. Welch's t-tests revealed significant and marginally significant differences between the clusters (Table 5). The friend cluster was more likely to use the IVA for news (p = 0.074), as a local guide (p = 0.025), as an alarm clock and for the time (p = 0.017), as a calendar (p = 0.044), for cooking (p = 0.098), for audiobooks and stories (p = 0.073), for jokes (p = 0.061), for self-esteem (p = 0.077), as a fun gadget (p = 0.087), to apologize (p = 0.083), to show interest in social cues (p = 0.034), for greetings and goodbyes (p = 0.014), and for direct speech (p = 0.049).
Figure 4 shows the functions used and IVA usage across a week. The bar chart for the friend cluster is more colorful, indicating a wider range of functions used. Both clusters primarily used the voice assistant to control media, listen to music, and control their smart home. The friend cluster additionally showed more intensive use of the alarm clock and time function. Similarly, the friend cluster had more difficulty being understood by the voice assistant. Looking at the days of the week, both clusters were mainly active from Monday to Wednesday, whereas the non-friend cluster was most active on Thursday.

Interaction Modeling
Below, we model additional usage indicators based on IVA data to better understand the interactions with IVAs and potentially disaggregate usage differences as a function of perceived friendship.
Daily Use. To understand how the IVAs were used by the two clusters daily, participants' voice commands were examined in more detail. Analyses showed that there was no significant difference in the frequency of use across the study period (118 days) between the groups (t(39.09) = 0.81, p = 0.425, d = 0.25). The friend cluster sent an average of M = 5.55 (SD = 7.10) commands and the non-friend cluster sent an average of M = 3.73 (SD = 7.51) commands per day to their voice assistants.
Length of Voice Commands. In the two clusters, we looked at how many words on average the participants used per voice command throughout the study period (Figure 5). Figure 6 shows the number of words used per voice command per cluster over time. The average voice command length changed over time. Notably, the friend cluster showed a positive trend in the number of words used over time (minimum 2.05, maximum 4.31), whereas the non-friend cluster showed a negative trend (minimum 1.59, maximum 3.78). To assess the temporal effect, we compared the average command length of the two clusters in the first and last month of use. There was a significant difference between the two clusters (F(1, 31) = 7.13, p = 0.012, η²p = 0.19). In the first four weeks, the friend cluster (M = 3.88, SE = 0.26) and the non-friend cluster (M = 3.34, SE = 0.25) did not differ significantly (p = 0.84). However, in the last four weeks, the friend cluster (M = 3.92, SE = 0.25) used significantly (p = 0.027) more words per voice command than the non-friend cluster (M = 2.87, SE = 0.24). Notably, even in the first four weeks of use, the friend cluster used significantly more words per voice command than the non-friend cluster used in the last four weeks (p = 0.041).
[Figure caption: Word length per voice command over time.]
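The comparison of command length in the first and last four weeks can be sketched as follows. This is an illustrative reconstruction under stated assumptions: the log entries are invented, the helper names `mean_words` and `first_and_last_window` are hypothetical, words are counted by splitting on whitespace, and a four-week window is taken as 28 days from each participant's first and last logged command. Whether wake words were counted in the study is not stated.

```python
from datetime import date, timedelta
from statistics import mean


def mean_words(commands):
    """Average number of whitespace-separated words per command."""
    return mean(len(text.split()) for _, text in commands)


def first_and_last_window(commands, days=28):
    """Mean words per command in the first and last `days` days of a log.

    `commands` is a list of (date, text) tuples for one participant.
    """
    start = min(day for day, _ in commands)
    end = max(day for day, _ in commands)
    first = [c for c in commands if c[0] < start + timedelta(days=days)]
    last = [c for c in commands if c[0] > end - timedelta(days=days)]
    return mean_words(first), mean_words(last)


# Hypothetical log of one participant.
log = [
    (date(2021, 1, 1), "play music"),
    (date(2021, 1, 10), "what time is it"),
    (date(2021, 6, 1), "please tell me a joke"),
    (date(2021, 6, 20), "good night"),
]
first_mean, last_mean = first_and_last_window(log)
print(first_mean, last_mean)
```

Per-participant means of this kind, grouped by cluster, would then form the input to the repeated-measures comparison reported above.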

Brief Discussion of Section 2
Our analysis revealed differences in the usage behavior between individuals from the friend cluster and the non-friend cluster. The friend cluster was more likely to use the voice assistant for functions related to support, mood management, and knowledge acquisition. More specifically, IVAs were more often used, for example, for checking the news, as an alarm and clock, or as a calendar. There was also a tendency for individuals from the friend cluster to use the voice assistant to boost their self-esteem (sample voice command: "Say something nice to me") and have more fun with the IVA to entertain themselves (sample voice command: "Activate self-destruction"). The friend cluster was also more likely to apologize to their voice assistant than the non-friend cluster. The friend cluster tended to interact more socially with the IVA. Thus, the friend cluster was more interested in the voice assistant's personality (sample voice commands: "Do you have friends?", "How are you?", or "How old are you?"), more likely to greet or say goodbye to the voice assistant, and more likely to address it using the personal pronoun "you".
The subjective measures indicated that the friend cluster was more likely than the non-friend cluster to recommend the voice assistant to others in the future. In addition, the friend cluster self-reported that they had integrated the voice assistant into their daily lives and interactions with others more than the non-friend cluster. Interaction modeling based on the categorized user logs showed that users in the friend cluster used the IVA for a wider range of functionalities. Although both groups used the voice assistant primarily for media control, listening to music, or controlling their smart home, the friend cluster used the voice assistant more frequently for obtaining daily news, as an alarm clock, or for checking the time. In addition, the friend cluster used marginally more words in their voice interactions with the voice assistant. This may indicate that users who associated the voice assistant more strongly with friendship qualities used more complex sentence structures when speaking to the voice assistant. In contrast, the non-friend cluster used fewer words per speech input. It is also possible that the number of words per voice command used is related to the types of functions used by the respective clusters.

Section 3-User Experience
When users perceive technologies as social, it can have a positive impact on meeting their needs [44]. For example, the stronger the social relationship between the user and the IVA, the higher the perceived usefulness [46]. The fulfillment of usage motives (e.g., pragmatic and hedonic) and usage needs (e.g., autonomy and competence), along with usage evaluation (e.g., perceived value and awe), are considered essential components in evaluating the user experience [70-73]. Accordingly, we used valid scales to measure these elements of the user experience and conducted two-tailed Welch's t-tests to determine whether the two clusters differed in their user experience. To identify temporal trends in the user experience variables across clusters and examine differences within and between clusters, RM-ANOVAs were performed.

Measures
Fulfillment of Usage Motives. Four items were used to assess the pragmatic (e.g., "The interaction fulfilled my seeking for simplicity.", α = 0.80) and hedonic (e.g., "The interaction fulfilled my seeking for pleasure.", α = 0.79) quality based on the short version of the AttrakDiff mini [70]. The eudaimonic quality was assessed by four items (e.g., "The interaction fulfilled my seeking to do what you believe in.", α = 0.82) adapted from Huta [71]. Four items were used to evaluate the social quality (e.g., "The interaction fulfilled my seeking for social contact.") based on Hassenzahl, Wiklund-Engblom, Bengs, Hägglund, and Diefenbach [72] (α = 0.84). Questions were asked directly in terms of motive fulfillment through the interaction with the IVAs. The items were rated from 1 (not at all) to 7 (very much).

Fulfillment of Usage Needs (Group Differences). Individuals from the friend cluster showed significantly higher scores in competence (t( 51 7).

Brief Discussion of Section 3
Our analyses show significant differences in the user experience between the two clusters. Users in the friend cluster reported significantly higher satisfaction in terms of pragmatic, hedonic, eudaimonic, and social needs than the non-friend cluster, meaning that the friend cluster experienced a more enjoyable interaction with their voice assistant that was more meaningful and social. Similarly, key usage needs were better met for the friend cluster than for the non-friend cluster. In addition, the friend cluster perceived the interaction as more emotionally moving, valuable, meaningful, and inspiring than the non-friend cluster. In terms of expectations, the friend cluster was more likely to want IVAs to help with difficult tasks, value their opinions, and provide a sense of social closeness.
Time-related effects throughout the study's duration indicate that the satisfaction of eudaimonic needs decreased, regardless of the cluster assignment. Accordingly, interaction with the voice assistant over time (regardless of its social role) was seen less as a meaningful experience that enabled personal growth and the expression of self-actualization [74]. Regardless of the cluster assignment, participants felt that their autonomy was increasingly violated when interacting with the IVA. The non-friend cluster perceived the interaction with the voice assistant over time as increasingly less stimulating. These results complement previous studies that have reported a decline in usage interest after a diminishing novelty effect [11,26,75-77].

Section 4-Social Perception
Research has shown that parasocial interactions [36,42], empathy [78], social presence [24,47], attachment [79], and perceived humanness correlate with perceived relationship quality with IVAs and other AI systems. To examine whether these aspects of social perception differed between the friend cluster and the non-friend cluster, we conducted two-sided Welch's t-tests. To analyze the temporal patterns of these social variables within and across clusters, we performed RM-ANOVAs.

Measures
Parasocial Interaction. We measured parasocial interactions (PSI) using the Universal PSI Scale [80]. The scale measured the PSI processes on a total of 14 subdimensions, where each of the subdimensions contained four items. The PSI processes were summarized on cognitive, affective, and behavioral/non-verbal dimensions. All items were answered on a 5-point scale (1 = not at all; 5 = very much). The reliabilities of the individual subdimensions ranged from α = 0.69 (counter empathy) to α = 0.88 (antipathy).
Empathy. To measure participants' empathy toward their voice assistant, the Psychological Involvement-Empathy subscale (α = 0.86) from the Social Presence module of the Game Experience Questionnaire [81] was used and adapted to the voice assistant (e.g., I felt connected to the voice assistant). For six items, subjects indicated the extent to which the statements applied to them using a 5-point Likert scale (0 = not at all; 4 = extremely).
Social Sense. To measure how social participants perceived their voice assistant, the social presence (α = 0.73; With the voice assistant, it feels like there is another person in the room), likeability (α = 0.52; I like my voice assistant), and status (α = 0.66; The voice assistant has a higher social status than I do) subscales of Bailenson, et al. [82] were used, with a total of 10 items. On a 7-point Likert scale, participants indicated the extent to which the statements applied to them (−3 = strongly disagree; +3 = strongly agree).
Attachment. To measure how attached participants felt to their voice assistant, the Inclusion of Others in the Self (IOS) Scale [83] was used (α = 0.93). Participants were asked to select the circle illustration that best described their relationship with their voice assistant. The more the circles overlapped, the greater the perceived attachment to the voice assistant.
Uncanny Valley. The subscales of Ho and MacDorman [84] were used to assess the perceived humanness and eeriness of the voice assistant in terms of the Uncanny Valley effect with 14 items. Participants rated the voice assistant on a five-point Likert scale for bipolar adjectives forming the humanness (6 adjectives, e.g., artificial vs. natural; α = 0.85) and eeriness (8 adjectives, e.g., calming vs. scary; α = 0.74) indices.
Attachment (Group Differences). T10 served as the reference time point for analyzing attachment. Attachment to the voice assistant was significantly higher (t(48.03) = 3.11, p = 0.003, d = 0.70) in the friend cluster (n = 40, M = 1.58, SD = 0.93) than in the non-friend cluster (n = 33, M = 1.09, SD = 0.29).

Brief Discussion of Section 4
Our analyses of the social perception of IVAs revealed several differences between individuals in the friend cluster and those in the non-friend cluster. Cognitive parasocial interactions were more pronounced in the friend cluster, which suggests that they paid more attention to the voice assistant, evaluated its actions, and perceived similarities between themselves and the assistant [80]. These findings are consistent with previous research indicating that parasocial relationships can foster perceptions of friendship [85,86]. Both clusters showed an increase in affective parasocial interactions throughout the study, indicating that participants may have experienced emotional interactions marked by sympathy, empathy, or antipathy, regardless of the cluster assignment [80].
The friend cluster had a stronger perception of the voice assistant's social presence and a greater association of the voice assistant with higher status and likeability. Notably, the sample's perception of the voice assistant's social presence decreased over time, regardless of the cluster assignment. However, overall, it is evident that people from the friend cluster felt more connected with and had more empathy toward the voice assistant. The perceived empathy toward the voice assistant increased over time in the friend cluster. The data showed that the friend cluster perceived the voice assistant to be both more human and eerier. This relationship is unsurprising considering the Uncanny Valley effect. The Uncanny Valley effect refers to the phenomenon that as artificial entities (such as robots) become more human-like, viewers' positive impressions and sympathy initially increase but drop sharply once the entities appear almost (but not quite) human [87]. Accordingly, we can assume that the friend cluster may have perceived the voice assistant to be so human-like that this perception led to an unsettling feeling.

Section 5-Personality Traits
Previous research has shown that users' personalities (e.g., extraversion, agreeableness) [88], loneliness [52,54], and attachment styles [18] are related to how human and social IVAs and other AI systems are perceived to be. This may suggest that certain personality traits of humans can promote the perceived friendship quality of IVAs. Therefore, we performed two-sided Welch's t-tests to determine if there were any differences between the two clusters concerning the above personality traits.

Measures
Personality. The NEO-FFI [89] is a questionnaire that uses 60 items to measure the dimensions of the Big 5 personality model (neuroticism, extraversion, openness, agreeableness, and conscientiousness). Participants indicated the extent to which they agreed or disagreed with each statement on a five-point Likert scale (0 = strongly disagree; 4 = strongly agree). The German version of the questionnaire was developed by Borkenau and Ostendorf [90]. The reliability of the questionnaire is in the acceptable to good range (neuroticism α = 0.81, extraversion α = 0.77, openness α = 0.73, agreeableness α = 0.73, conscientiousness α = 0.83).
Loneliness. We used the De Jong Gierveld Loneliness Scale with six items [91] to measure loneliness. The questionnaire consisted of the subscales emotional loneliness (α = 0.74) and social loneliness (α = 0.73). Each item was assessed on a 7-point Likert scale (1 = strongly disagree; 7 = strongly agree). The internal consistency of the questionnaire was α = 0.76.

Results
Personality. When analyzing the potential personality differences between the clusters, we observed initial tendencies. The difference between the friend cluster (n = 40) and the non-friend cluster (n = 32) for neuroticism was marginally significant (t(62.18) = 1.75, p = 0.084, d = 0.42). The friend cluster showed higher scores in neuroticism (M = 4.28, SD = 0.88) than the non-friend cluster (M = 3.89, SD = 1.00). The differences in extraversion, openness, agreeableness, and conscientiousness were not significant.

Brief Discussion of Section 5
We found differences between the friend cluster and the non-friend cluster in terms of their personality traits. First, the friend cluster showed marginally higher scores in neuroticism. Other studies have previously found a positive relationship between neuroticism and anthropomorphizing tendencies [88,94]. Neuroticism is characterized by negative emotional states and instability and is associated with social anxiety and avoidance of social evaluations [95,96]. Moreover, neuroticism is positively correlated with loneliness [97,98], which can increase the probability of being pessimistic, anxious, and distrustful in social situations [99,100]. This may promote a motivation to value voice assistants as friends, as their social presence may provide a low-threat alternative to real-world contact and is less associated with a fear of judgment. Our results regarding loneliness show that individuals in the friend cluster were significantly more socially and emotionally lonely than individuals in the non-friend cluster. Consistent with this finding, Epley et al. [101] showed that the lonelier people are, the more likely they are to socialize with and anthropomorphize non-human entities. Additionally, individuals with insecure attachment styles and attachment anxiety tend to exhibit higher levels of anthropomorphization [50]. In addition, our results indicate that the friend cluster was characterized by attachment styles driven by a fear of closeness and lack of trust. The resulting relationship building with IVAs may then compensate for social needs and deficiencies [18]. The suspected relationships should be further investigated in future studies.

General Discussion
AI-based technologies with adaptive and intelligent features imitate human social traits and evolve into putative social interaction partners or friends. We combined data science and social science methods to cluster participants regarding their attribution of friend-like social roles to IVAs. The longitudinal study investigated how these roles developed over nine months and their impact on usage behavior, user experience, and social perceptions. The results revealed that users who associated IVAs with higher friendship quality differed from those who did not, as they used the devices more often for various types of tasks, were more satisfied, rated the interactions as more enjoyable, and indicated a greater intent to use voice assistants in the future. In addition, we found differences between the clusters in terms of the social perceptions of voice assistants. Users attributing friendship to their IVA reported feelings of empathy and connectedness toward their voice assistant. In addition, the user's personality determined the emergence and manifestation of role attribution. Thus, the cluster that associated the voice assistant more strongly with a perception of friendship showed a higher expression of loneliness and an insecure attachment style.
In relation to our first research question (see Section 1), we found significant differences in how users attributed friendship to voice assistants and how the perceived friendship quality changed over time. Our results showed that implicit measures could better classify role attribution than explicit measures. Thus, our findings support previous results that social roles are often perceived unconsciously by users [15,16] and explicit questioning may lead to the denial of these roles [17,55]. Our analysis identified two clusters based on whether participants associated voice assistants more (friend cluster) or less (non-friend cluster) with friendship qualities. Time-based analyses showed that the perceived friendship quality increased significantly over time for the friend cluster, whereas it remained unchanged for the non-friend cluster.
Regarding the second research question (see Section 2), we examined how user behavior with the IVA differed between the friend cluster and non-friend cluster and how it changed over time. We found that users who viewed the voice assistant as a friend (friend cluster) interacted with it in different ways than the non-friend cluster. Users who associated the IVA with higher friendship quality tended to use it more often for support and social interaction. They were also more likely to use it for knowledge acquisition and mood management. In social interaction, the friend cluster showed more interest in the voice assistant as a personality and addressed it directly. These results are consistent with those of Purington, Taft, Sannon, Bazarova, and Taylor [10], who found that users were more likely to interact socially with their voice assistant and address it as a personality when they saw it as a friend or companion. Data science methods were used to reveal additional role-specific interaction patterns over time. For example, individuals in the friend cluster used more words for their voice commands by the end of the study period, whereas the opposite trend was observed in the non-friend cluster.
The third research question (see Section 3) explored the differences in user experience with IVAs between the friend cluster and the non-friend cluster and how the user experience evolved over time. Individuals in the friend cluster perceived the IVA differently based on their user experience. They valued their interactions with the IVA more and evaluated their experiences as more meaningful, significant, inspiring, and emotional. We found that the need for competence and autonomy, for example, was more strongly fulfilled in individuals from the friend cluster. Thus, perceived friendship with the voice assistant may be a predictor and key variable for a positive user experience. Furthermore, we found that the IVA's role as a friend was associated with more social interactions. This supports the findings of Purington, Taft, Sannon, Bazarova, and Taylor [10], who found that personification (i.e., treating the voice assistant as a human) was associated with higher user satisfaction. The significant influence of the parasocial perception of IVAs on the user experience suggests a vast design space. Although our results show only correlations, the consequences can be positive or negative, underscoring the high responsibility in designing future human-AI interfaces.
In the fourth research question (see Section 4), we determined whether social perceptions differed between the friend cluster and the non-friend cluster and examined the temporal trends of these social perceptions. Users who associated IVAs with higher friendship quality had a different social perception of them. Individuals in the friend cluster felt more empathy, as well as an attachment to the voice assistant and cognitive parasocial interactions, which increased over time. The studies by Youn and Jin [36], Hernandez-Ortega and Ferreira [85], and Ramadan, F Farah, and El Essrawi [86] showed positive correlations between parasocial relationships and users' perceived attachment to IVAs, which may contribute to the perception of IVAs as friends. This could be one reason for the increase in friendship quality over time in the friend cluster. In addition, the anthropomorphic perception of AI can influence the nature of the relationship with the user [102]. Our study found that individuals in the friend cluster perceived the IVA as more human and socially present. The fact that voice assistants are always present and listening can foster a sense of friendship, but it should be noted that they cannot replace human friends and are merely programs that perform tasks and provide information. It is important to continue monitoring the impact of artificial intelligence on the perceived boundaries between technology and human relationships.
In our fifth research question (see Section 5), we explored the potential differences between the personality traits of users in the friend cluster and those in the non-friend cluster. In this way, we gained valuable insights into whether certain personality traits might make people more prone to associate voice assistants with friendly qualities. We found that individuals in the friend cluster were more likely to exhibit lonely and attachment-anxious personality traits and to score slightly higher on neuroticism. Voice assistants may be appealing to individuals with these personality traits because they can compensate for socially imbalanced needs and provide a sense of companionship [18,52,53]. In this way, the voice assistant can act as a friend that users can turn to when they are lonely or need someone to talk to. Although voice assistants cannot feel emotions, they can provide users with a sense of connection. In the future, it will be important to recognize that although IVAs are capable of human-like conversations and responses, they have no real emotions or needs and cannot replace human relationships.

Limitations
The participants in the study were very young and predominantly female, thereby limiting the representativeness and generalizability of the results. Furthermore, the study was limited to Alexa and Google Assistant, which are only two of the available voice assistants. Future research should consider other voice assistants to allow for a comparison of their impact on user perception and behavior. The present study is also limited in its internal validity. Unlike laboratory studies, for example, it was not possible to perfectly control whether participants conscientiously integrated smart speakers into their daily lives. This may also be why this study experienced some data losses. The usage analysis used participants' technical interaction data, which were documented by the provider in usage logs. We received some empty files, which could have been due to system complications or the fact that some participants were inconsistent in participating in the study. Nonetheless, the present study benefits from real-world interaction scenarios and external validity, which are usually limited in laboratory settings.
To measure the social roles attributed by participants to the IVAs, we created a questionnaire based on the findings of Purington, Taft, Sannon, Bazarova, and Taylor [10]. This questionnaire demonstrated low internal consistency, which may have limited the reliable measurement of social roles. Participants' denial of the voice assistant's role as a friend was very high. Although studies suggest that the conscious assignment of a friend-based role to voice assistants is low [31], measurement error cannot be ruled out due to the limitations of the assessment tool. We encourage future studies to develop psychometric instruments that measure the role-specific characteristics and perceptions of voice assistants and compare their effectiveness with implicit methods for measuring the role of IVAs.
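Internal consistency is commonly quantified with Cronbach's alpha. Whether this coefficient was the one used for the role questionnaire is an assumption here, and the function name and ratings below are hypothetical. The sketch computes alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores) for k items.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of questionnaire items
    items = list(zip(*responses))  # transpose rows into per-item columns
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings, NOT study data (rows: respondents, columns: role items).
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [4, 3, 4],
    [5, 5, 4],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}")
```

Values near 1 indicate that the items vary together across respondents, whereas low values indicate that the items do not measure one coherent construct.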

Future Directions
Future research should aim for greater participant diversity and consider factors such as cultural and socioeconomic backgrounds, usage levels, and device ownership. Our sample of predominantly female, tech-savvy participants without prior experience with voice assistants highlights the need for a more diverse range of participants. The inclusion of more diverse users and contextual variables could provide greater insight into the factors underlying the different clusters. In addition, it would be beneficial to investigate the presence of additional clusters with social relationships. Examining the impact of users' experience levels with voice assistants on user experience, social perception, and usage is also crucial. Regarding usage, our study found that the primary use of voice assistants was to control the smart home and listen to music. This usage pattern can be attributed to a variety of factors such as a lack of knowledge about the IVA's capabilities, lack of practicality, frustration with speech recognition errors, or motivation issues during the study. To gain a deeper understanding of the impact of voice assistants on user experience and behavior, future research should involve more extensive usage of these systems. It is important to consider that advancements in the field of voice-based systems, especially regarding their anthropomorphic design and ability to communicate proactively, fluently, and naturally, may have an impact on the quality of the interactions [103]. Therefore, training programs to educate participants on the full range of functions of voice assistants could increase or change the kind of usage. Evaluating the effectiveness of such training programs might be an interesting topic for future research.
Currently, IVAs are not able to initiate conversations and their dialogue flow often fails to meet users' expectations [104]. However, as IVAs become more advanced and capable of more natural and fluid conversations, social bonds, including feelings of friendship, may become stronger. Recent developments such as ChatGPT indicate the potential for this ability. OpenAI announced the powerful ChatGPT conversational language model, which can generate natural language texts and conduct dialogues [105]. This development could be particularly significant for smart speaker vendors, as machines will be better equipped to respond to follow-up questions and recognize the humor and mood of users [105,106]. Furthermore, even slight changes in the volume, speech rate, or pitch can affect the personified perception of voice assistants [25]. Therefore, future studies should investigate the effects of IVAs' dialogue capability or adaptability (e.g., gender of voice) on users' perceived friendship and relationship quality. We anticipate that even small social adaptations to AI-based voice systems can have large impacts. In our study, we observed that the perceived friendship quality scores of users with IVAs varied significantly between clusters, despite the differences in the descriptive values not being particularly large. Nevertheless, even these small differences between groups had significant effects on the variables we studied. As technology continues to advance and become more social, the potential consequences and effects of such developments may become even stronger. In this context, we must take a closer look at the potential negative effects that could arise from friendship relationships between users and IVAs, such as issues related to information credibility [76], impulsive shopping behavior [107], or disclosure of personal information [56]. Wienrich, Reitelbach, and Carolus [7] showed that the social role of IVAs can have an impact on the disclosure of sensitive information. Therefore, it would be particularly important to examine users' disclosure behavior as a function of perceived friendship quality and consider privacy concerns and behaviors in this context.
The results of this study have implications for the design and development of voice assistants. Friendship with voice assistants can increase brand engagement and customer loyalty or lead to higher customer satisfaction. Studies show that a strong social relationship can lead to user satisfaction with IVAs [46]. Sharabany's friendship quality subscales [58] could be used as design guidelines to build deeper, friend-like relationships between users and voice assistants. For example, a voice assistant could proactively ask users about their plans to fulfill the "Frankness and Spontaneity" subscale. Given our research and that of others, this may be particularly relevant for users who turn to IVAs to seek social interaction or compensate for a lack of social contact [108]. The effects of design using Sharabany's subscales should be investigated in future laboratory and longitudinal studies. The results of this study could have further applications in clinical psychology research and practice. In particular, computer-assisted methods in mental healthcare are becoming increasingly important. In this context, chatbots are an effective method for alleviating depressive and anxiety symptoms [109]. Furthermore, recent studies have shown that voice assistants are accepted and preferred by older people in the communication of therapy methods [110]. The relationship between the patient and therapist is a critical factor in determining the success of therapy [111][112][113]. Nevertheless, the quality of the friendships that patients cultivate in their personal lives can also positively impact treatment outcomes [114]. The findings from our study could be used in the design of future voice-based systems to make applications more personal and therapy services more helpful.

Conclusions
Overall, it was shown that people who attributed friendship to their voice assistant exhibited different usage behavior and had a different user experience and quality of interaction. In addition, our findings provide valuable insights into how user personality traits may influence the perceived social role of IVAs. Our long-term approach and interdisciplinary data collection and analysis contribute to a holistic analysis of users in their natural environment. The results provide new research and design ideas and promote a deeper understanding of AI-human interaction. The results show that the consideration of human needs and social processes is essential in the design of IVAs. Recent developments in conversational language models such as ChatGPT show that the current study is highly relevant and provides an outlook on the effects of the attributions and expectations of human-like voice-based AI.

Section 1 - Cluster Formation and Time Effects: Do users differ in their attribution of friendship to IVAs and how does perceived friendship quality change over time?
Section 2 - Usage Behavior: How do patterns of use differ as a function of perceived friendship with the IVA and how do they change over time?
Section 3 - User Experience: How does user experience differ as a function of perceived friendship with the IVA and how does it change over time?
Section 4 - Social Perception: How do social perceptions differ as a function of perceived friendship with the IVA and how do they change over time?
Section 5 - Personality Traits:

Figure 1. The course of the long-term study with the different stages, collected constructs, and data types.

Figure 2. The gap-stat plot of the optimal number of clusters.

Figure 3. Changes in friendship quality over time separated by cluster with standard errors.

Figure 4. Functions used and IVA usage across a week separated by cluster.

3.2.3. Interaction Modeling
Below, we model additional usage indicators based on IVA data to better understand the interactions with IVAs and potentially disaggregate usage differences as a function of

The friend cluster used marginally (t(39.36) = 1.87, p = 0.069, d = 0.58) more words per voice command on average (M = 3.37, SD = 2.38) than the non-friend cluster (M = 2.59, SD = 1.93).

Figure 5. Length (number of words) of voice commands separated by cluster.

Figure 6. The number of words used per voice command over time separated by cluster.

Figure 8. Change in cognitive parasocial interaction (PSI) over time separated by cluster with standard errors.

Figure 9. Change in empathy over time separated by cluster with standard errors.

3.4.3. Brief Discussion of Section 4
Our analyses of the social perception of IVAs revealed several differences between individuals in the friend cluster and those in the non-friend cluster. Cognitive parasocial interactions were more pronounced in the friend cluster, which suggests that they paid

Figure 10. Comparison of means with plotted standard deviations of loneliness and attachment styles separated by clusters.

Table 1. Measurement time (short "T") points and years of the long-term study.

Table 2. Means and standard deviations of perceived friendship quality.

Table 3. Scales, descriptive values, and t-tests of the friendship quality subscales of the IFS.

Table 4. Descriptive values and group differences at the category level separated by cluster.

Table 5. Means, standard deviations, percentages, and significance tests with the effect size of the function used (subcategories) separated by cluster.

Table 6. Overview of descriptive values and inferential statistics on the fulfilled usage motives.

Table 7. Overview of descriptive values and inferential statistics on the fulfilled usage motives.

Table 8. Overview of descriptive values and inferential statistics for usage evaluation.

Table 9. Overview of descriptive data and results of Welch's t-tests on social variables.