Using an Artificial Intelligence Based Chatbot to Provide Parent Training: Results from a Feasibility Study

Background. Online parenting training programs have been shown to be effective. However, no studies on parent training programs delivered through chatbots have been reported yet. Aim. This study aims to assess the feasibility of delivering parenting skills through a chatbot. Methods. A sample of 33 parents completed a pilot feasibility study. Engagement, knowledge, net promoter score, and qualitative responses were analyzed. Results. A total of 78.8% of the sample completed the intervention. On average, participants remembered 3.7 skills out of the 5 presented and reported that they would recommend the chatbot to other parents (mean net promoter score was 7.44 out of 10; SD = 2.31). Overall, parents sent a mean of 54.24 (SD = 13.5) messages to the chatbot, and the mean number of words per message was 3. The main themes parents discussed with the chatbot included issues regarding their child’s habits, handling disruptive behaviors, interpersonal development, and emotional difficulties. Parents generally commented on the usefulness of the intervention and suggested improvements to the chatbot’s communication style. Conclusions. Overall, users completed the intervention, engaged with the bot, and would recommend the intervention to others. This suggests parenting skills could be delivered via chatbots.


Introduction
The global prevalence of mental disorders in children and adolescents ranges between 12 and 15%, which corresponds to approximately 241 million young people (Verhulst and Koot 1995;Roberts et al. 1998;Polanczyk et al. 2015). Parent training programs have proven useful for the prevention and treatment of behavior problems (Michelson et al. 2013;Pidano and Allen 2015;Forgatch and Gewirtz 2018;Zisser-Nathenson et al. 2018). However, the implementation of these interventions often has limitations, such as the shortage of professionals with adequate training, the lack of time for parents to attend therapy (Enebrink et al. 2014), or the stigma associated with consulting a psychologist (Jones et al. 2016).
New therapeutic models of delivery are needed (Kazdin and Rabbitt 2013), and behavioral intervention technologies (BITs) have been proposed as an alternative to increase accessibility for parents. BITs refer to the use of technological devices (e.g., cell phones, tablets) in the field of health, with the aim of promoting behavioral changes (Mohr et al. 2014). Different parent training programs have already been implemented using BITs and have proven effective in reducing disruptive behavior, improving the sense of parental self-efficacy, and developing parental skills (Bausback and Bunge 2021;Cefai et al. 2010;Corralejo and Rodríguez 2018;Morawska et al. 2014;Sourander et al. 2016).
Within the framework of current technological developments, artificial intelligence (AI) is one of the fastest-growing areas. AI uses technologies that fulfill functions usually assigned to human intelligence (Luxton 2014) and can be implemented through conversational computer programs, better known as "chatbots", which are software that uses natural language to interact with human users (Shawar and Atwell 2007). The advantages of using chatbots for prevention, treatment, or follow-up purposes should be considered in comparison with human therapists: unlike therapists, chatbots do not become tired, nor do they have personal biases. They are available 24 hours a day, no matter where the patient is, and the use of algorithms and neural learning could allow them to offer the most appropriate intervention according to the patient's diagnosis and treatment evolution (Gaggioli 2017). Chatbots may be able to avoid the barriers endemic to traditional psychotherapy in order to offer psychoeducation or psychotherapy according to the user's needs (Miner et al. 2016).
The incorporation of chatbots in the mental health field is in the early stages but growing steadily (D'Alfonso et al. 2017). Several chatbots for mental health have been tested: Woebot and Tess for anxiety and depression (Fitzpatrick et al. 2017;Fulmer et al. 2018), Mylo for problem solving (Bird et al. 2018), Tess for social isolation, and others for substance use disorders, autism spectrum disorder, and post-traumatic stress (Laranjo et al. 2018). Previous research on usage patterns (Dosovitsky et al. 2020) and user experience of chatbots can inform future chatbot developments. However, most of this research has consisted of pilot studies (Bendig et al. 2019).
To our knowledge, there are no studies on parent training programs delivered through chatbots. Most parenting training programs teach skills such as how to praise effectively, how to give instructions, positive attention, quality time, and use of time out (Michelson et al. 2013;Pidano and Allen 2015;Forgatch and Gewirtz 2018;Zisser-Nathenson et al. 2018). A parent training program delivered via chatbot could teach all these skills in a manner that is interactive and engaging for parents. The current study aimed to test the feasibility of a brief parent training intervention delivered through a chatbot. The brief intervention teaches parents how to effectively praise their children. More specifically, the study aims to analyze adherence and parents' feedback.

Design
The current study was a pilot feasibility study. The pilot study included a small sample of parents who tested the chatbot for acceptability, feasibility, and technical issues. Researchers obtained quantitative and qualitative information to understand which aspects users found more and less useful, whether they would consider changes to the intervention and which ones, and potential technical difficulties.

Participants
Participants were recruited through Facebook posts using a non-probabilistic, voluntary sampling strategy. Parents aged 18 and older were eligible to participate if they resided in Argentina, had at least one child between 2 and 10 years old, and were not looking for psychological treatment but considered they could benefit from the intervention. Parents were excluded if they were under legal age or did not reside in Argentina.

Intervention
The intervention consisted of a chatbot that could be accessed through smartphones via Facebook Messenger and lasted approximately 20 min.
The content of the intervention was based specifically on praise strategies from The Incredible Years (IY) parent training program (Webster-Stratton and Reid 2006). There is evidence that IY programs improve parental attitudes and parent-child relationships, reduce the use of harsh discipline, and reduce behavior problems in children, both in clinical populations and as a preventive program with families of low socioeconomic status (Webster-Stratton 2011).
The primary purpose of the chatbot intervention was to teach parents how to use positive attention and praise to stimulate positive behaviors in their children. Specifically, the intervention taught five brief modules for parents to be effective when offering praise: 1. Define, 2. Be specific, 3. Avoid combining praise with criticism, 4. Show enthusiasm, and 5. Praise immediately.
Since the objective of the study was to explore a design that would help participants complete the intervention, the focus was on finding an appealing design and conversation style that parents perceived as useful. Initial discussions were focused on determining the appropriate length of the intervention, type of vocabulary used by the chatbot, and how to measure the results. Additionally, researchers sought to identify possible technical and content obstacles and aspects that may add value to the intervention.

Measures
Sociodemographic questionnaire. Participants were asked about their age, gender, age of their children, and country of residence.
Engagement measures. Consistent with standard practices in engagement studies (Dosovitsky et al. 2020;Rogers et al. 2021), user engagement was measured by the mean number of messages sent by the participant, the number of characters typed, and the number of words per message sent.
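As an illustration of the engagement measures just described, the following minimal sketch computes the three quantities for a single participant. The function name `engagement_metrics` and the example messages are hypothetical and are not taken from the study; words are assumed to be separated by whitespace.

```python
from statistics import mean


def engagement_metrics(messages):
    """Summarize the messages one participant sent to the chatbot.

    `messages` is a list of strings, one per message sent.
    Words are counted by splitting on whitespace (an assumption).
    """
    word_counts = [len(message.split()) for message in messages]
    return {
        "n_messages": len(messages),
        "n_characters": sum(len(message) for message in messages),
        "mean_words_per_message": mean(word_counts) if messages else 0.0,
    }


# Hypothetical messages, for illustration only.
example_messages = ["Hola", "Si, claro", "Mi hijo no quiere dormir"]
metrics = engagement_metrics(example_messages)
print(metrics)
```

Sample-level figures such as the mean number of messages per participant would then be obtained by averaging these per-participant summaries.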
Knowledge. Knowledge was measured by asking participants which skills they remembered (i.e., "What skills do you remember from the intervention?"). Open-ended responses were coded as correct or incorrect.
Net Promoter Score. To assess user satisfaction, participants were asked: "How likely is it that you would recommend the intervention to someone?" on a Likert scale of 0 ("would not recommend it") to 10 ("completely recommend it"). The NPS has been proposed as a measure to assess overall impressions of a product (Reichheld 2003) and has been used in other chatbot studies.
Qualitative questions. Parents' concerns: Participants were asked an open-ended question: "What concerns you most about your child's behavior?" User experience: Participants were asked three open-ended questions: "Which aspects of the conversation were most or least useful?", "Was there a skill or message that was difficult to understand?", and "Is there any recommendation you would make to improve the intervention?".
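The recommendation item can be summarized in more than one way. The sketch below, which is illustrative and uses hypothetical ratings, computes the mean and standard deviation of the 0 to 10 ratings (the form in which this study reports the item) alongside the classic net promoter score, defined as the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 to 6). The function name `summarize_recommendation` is an assumption made for this example.

```python
from statistics import mean, stdev


def summarize_recommendation(ratings):
    """Summarize 0-10 'would you recommend' ratings in two ways."""
    n = len(ratings)
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 to 6
    return {
        "mean": mean(ratings),
        "sd": stdev(ratings),  # sample standard deviation
        "classic_nps": 100 * (promoters - detractors) / n,
    }


# Hypothetical ratings, for illustration only (not study data).
example_ratings = [10, 9, 8, 7, 6, 3]
summary = summarize_recommendation(example_ratings)
print(summary)
```

The two summaries answer different questions: the mean describes the typical rating, whereas the classic score ranges from -100 to 100 and depends only on the balance of promoters and detractors.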

Procedures
Participants were recruited through regular Facebook posts. The posts offered the possibility of having a brief conversation with a chatbot that would teach parenting skills. Those who clicked on the post were automatically directed to a Facebook Messenger chat window. Once there, the chatbot provided the privacy policies and explained what the intervention consisted of. It also explained that the intervention was not a therapeutic intervention and that the conversation was part of a research study. Eligible participants were asked to provide informed consent and were told that they could withdraw from the study at any time. Those who consented completed the demographic questions and began the intervention immediately. Once the intervention was completed, participants were asked about their user experience and level of satisfaction.

Statistical Analysis
Descriptive statistics were used to present demographics, dropout rates by age category, engagement, knowledge, and net promoter score. A chi-square analysis was conducted to assess differences between dropout rates by age category.
A thematic analysis was conducted to identify the main themes of parents' responses and user experience using the process developed by Braun and Clarke (2006). The authors reviewed all the responses and identified the themes that were present; preliminary codes were then reduced to four categories by consensus among all authors. Researchers decided not to categorize according to frequencies because, in a pilot study, they considered all the information relevant for improving the feasibility of the software.

Results
A total of 85 participants residing in Argentina accessed the site: 53 met eligibility criteria and 33 provided consent; 10 were men (30%) and 23 women (70%). Of those who consented, 26 (78.8%) completed the intervention (see Figure 1). A total of 33.3% (n = 11) of the participants were between 30 and 33 years old, 30.3% (n = 10) between 34 and 37 years old, and 36.4% (n = 12) were 38 or older. Twenty-one parents (63.6%) had an only child, eleven had two children (33.3%), and only one had three children (3%). A χ² analysis revealed no statistically significant differences in dropout rates between parental age groups (χ² = 4.72, p = 0.094). Of the seven participants who dropped out, three did so during the first module (i.e., Define), one in the third (i.e., Avoid combining praise with criticism), and two during the last (i.e., Praise immediately). No participants dropped out in modules two (i.e., Be specific) or four (i.e., Show enthusiasm). One participant dropped out during the post-intervention assessment.
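For readers who wish to see how a chi-square test of dropout by age group is computed, the following sketch implements the Pearson statistic for a contingency table. It assumes a 3 x 2 table (three age groups by dropped out versus completed), which gives 2 degrees of freedom; for exactly 2 degrees of freedom the upper-tail p-value has the closed form exp(-χ²/2). The group sizes (11, 10, 12) and the total of seven dropouts come from the text, but the split of dropouts across groups is an assumption made for illustration, since the per-group counts are not reported here.

```python
import math


def pearson_chi_square(table):
    """Return the Pearson chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail p-value for a chi-square variable with exactly 2 dof."""
    return math.exp(-statistic / 2)


# Rows: age groups 30-33, 34-37, 38+ (sizes 11, 10, 12 as reported).
# Columns: dropped out, completed.
# ASSUMED split of the 7 dropouts; not reported in the study.
assumed_table = [[1, 10], [1, 9], [5, 7]]
chi2, dof = pearson_chi_square(assumed_table)
p = p_value_two_dof(chi2)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")

# p-value implied by the reported statistic, assuming 2 dof.
print(f"p for chi2 = 4.72: {p_value_two_dof(4.72):.3f}")
```

Under the 2-degrees-of-freedom assumption, the reported statistic of 4.72 corresponds to p ≈ 0.094, in agreement with the reported p-value. The assumed table is only one split consistent with the reported totals and should not be read as the study's data.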

Knowledge: The mean number of skills remembered by participants was 3.07 out of 5 (SD = 1.73). The most remembered skill was Define (19; 73.08%), followed by Avoid combining praise with criticism (16; 61.54%), Praise immediately (16; 61.54%), Be specific (13; 50%), and Show enthusiasm (13; 50%).
Net Promoter Score: User experience responses were obtained from the participants who completed the whole intervention (n = 26; 78.8% of the total sample). When asked the NPS question, participants rated their likelihood of recommending the intervention, on a scale of 0 to 10, at a mean of 7.44 (SD = 2.31).
Habits. This theme included answers that mentioned everyday activities in which parents found difficulties, such as their child sleeping on their own, taking showers, doing homework without repeated prompting, and using digital devices. The final subthemes were: "eating habits" (7) (e.g., "I would like my child to eat varied food"), "school homework" (6), "hygiene habits" (5) (e.g., "I would like my child to take a shower when I tell him/her"), "tidiness" (5) (e.g., "I would like him/her to tidy up his/her bed"), "sleeping habits" (4) (e.g., "I would like her to sleep in her bed"), and "handling technology" (4) (e.g., "I would like my son not to waste too much time on games and the internet").
Interpersonal Development. This theme included issues related to the way children bond with adults or peers and milestones related to child care and development. The domain included "Dialogue" (5) (e.g., "I hope I could help him to express what is happening to him"), "Independence" (4) (e.g., "I hope he can do things expected for his age, such as dressing by himself"), and "Relationship with siblings/peers" (3) (e.g., "I hope she could share with her brother").
Emotional difficulties. This theme included answers in which parents expressed emotional difficulties not related to disruptive behavior, such as sadness, fears, or difficulties with problem solving. The subthemes were "Emotional regulation" (6) (e.g., "Nerves management", "Solve without anguish", or "Moodiness when waking up") and "Frustration tolerance" (3) (e.g., "He gets frustrated when he has to turn off the TV").
Qualitative Analysis of User Experience. A total of 26 parents completed all the conversations with the chatbot and the post-intervention assessment. The user experience responses were categorized into two main themes: comments and suggestions.
Regarding the comments on the content of the bot conversation, 10 parents found the intervention useful. Of these, six said that everything was useful and four said that the advice was useful. Five parents reported on a specific skill that they found useful (e.g., "The most useful skill was the mnemonic"). One parent commented on the clarity of the chatbot (e.g., "Everything was clear"). Four parents reported on things that were not useful (e.g., "The least useful thing was when the chatbot asked and insisted on an example and I did not have one").
Regarding the suggestions about the bot's communication style, seven of the participants said they would not change anything. Six parents commented that the chatbot was too mechanical or sounded impersonal (e.g., "the answers were too predetermined", "I wish you were more flexible"), three reported technical problems (e.g., "once I put only nonsense letters and you congratulated me"), two said they wanted more examples, two said the information was repetitive, two commented on the length, two reported that it was boring, and six parents made miscellaneous comments that were not grouped.

Discussion
Several parenting programs have strong empirical evidence (Pidano and Allen 2015;Forgatch and Gewirtz 2018;Zisser-Nathenson et al. 2018). However, the implementation of these interventions often has limitations, such as the shortage of professionals with adequate training or the lack of time parents have to attend therapy (Enebrink et al. 2014). The incorporation of chatbots in the mental health field is in the early stages but growing steadily (D'Alfonso et al. 2017). No studies on the feasibility of parent training programs delivered through chatbots have been carried out in clinical or non-clinical settings. The aim of the present research was to conduct a feasibility study of a parent training microintervention delivered in a non-clinical setting through a chatbot. Furthermore, this study aimed to apply the principles of agile software design (Bunge et al. 2017) to an intervention that teaches parents how to effectively praise their children, conduct a feasibility study of the designed intervention, and analyze users' experiences.
In terms of completion rates, a total of 26 (78.8%) parents completed the intervention. Even though the sample was small, the completion rate was high compared to other digital interventions (Eysenbach 2005). More specifically, previous studies on chatbots have reported completion rates below 41% (Klos et al. 2021;Linardon and Fuller-Tyszkiewicz 2020). Although the interventions in these studies were longer than the current one, the 78.8% completion rate is encouraging. Interestingly, there were no statistically significant differences in dropout rates between parental age groups, suggesting that the intervention can be completed by parents across the age range sampled. The proportion of intervention completers seems to support the feasibility of the chatbot intervention.
Regarding engagement, parents sent an average of 54.24 messages to the bot and typed an average of 10,055.69 characters, with a mean of three words per message sent. This suggests that parents had a high level of engagement with the bot, sending many short messages. This represents a higher engagement level than the 17.57 messages reported in a previous study of a chatbot for older adults with depression in the US (Dosovitsky et al. 2020). One possible explanation is a cultural difference in communication style, although the present study did not test this. Indeed, a college sample in Argentina (Klos et al. 2021) showed a higher engagement level than a study on chatbots for college students in the US (Fulmer et al. 2018).
In terms of knowledge, participants were able to remember an average of three skills out of five, suggesting that parents were able to recall the majority of the intervention. Although the first module presented was the most remembered, the order of the modules did not appear to have an impact on which skills were remembered. This suggests that the modules were clear and the content was relevant enough for parents to remember. Moreover, two components may have helped to consolidate the learning: presenting the knowledge question just a few seconds after finishing the chatbot conversation (which requires participants to retrieve recently acquired information from memory) and using an acronym throughout the intervention as a memory aid (the word "felices", which in Spanish means "happy", summarized the initials of the 5 skills taught). Future research could include pre- and post-intervention assessments to measure acquired knowledge against baseline knowledge.
On average, participants rated their likelihood of recommending the intervention at 7.44 out of 10 (SD = 2.31). There are no other chatbot studies that report NPS scores, so it is uncertain how this score compares to other chatbots. However, the relatively high score suggests that, overall, users had a positive experience with the intervention and would consider recommending it to someone else.
Regarding the parents' concerns, four main themes emerged: the most frequent one was Habits, followed by Handling disruptive behaviors, Interpersonal development, and Emotional difficulties. For example, in terms of Habits, parents reported problems with daily habits including eating, school, hygiene, and tidiness. Overall, the themes observed show that parents reported behaviors similar to the ones frequently reported in clinical and educational settings, which suggests that parents were meaningfully engaged with the chatbot intervention. In addition, knowing what parents are concerned about can guide the development of future modules. It indicates which issues will be relevant to address, making the intervention not only more attractive but also more useful.
When parents were asked about their user experience, some parents commented that they found the intervention useful, and others commented that a specific skill was useful. A smaller portion of parents reported on things that were not useful or responsive to their experience (e.g., "The least useful thing was when the chatbot asked and insisted on an example and I did not have one"), that the chatbot was too mechanical or sounded impersonal, or that there were technical problems. Thus, parents generally found the intervention useful despite some technical difficulties and the need to continue to improve the conversational style.
Overall, completion and engagement rates, and participants' level of knowledge at the end of the intervention, suggest that additional studies should utilize chatbots to provide parenting skills training. The qualitative information collected seems to provide valuable data about other skills to train and issues to address in future research.

Limitations and Future Directions
The main limitation of the current pilot study is the small sample size. While pilot studies usually have small samples, the results of the current study may not generalize to a wider population. Future studies with a larger sample size could yield important information about variables such as attrition, engagement, and efficacy. This will allow a better understanding of the effectiveness of the intervention within different populations, such as older or younger parents. Since this was a pilot study that focused mostly on feasibility and user experience, the main outcomes of the intervention were not assessed. While the intervention may have been well accepted by the parents, it is unknown whether it actually improved parenting skills. Future studies involving randomized controlled trials would be able to assess the efficacy of parent training interventions delivered through AI. Finally, while completion rates were high, parents completed the whole intervention with the chatbot in a single session. It is unclear whether parents would have returned for a second set of modules with the bot. Since most users of digital interventions do not return after the first two sessions (Titov et al. 2013), future studies on chatbots for parents need to assess whether parents would continue engaging with the bot.
Future studies on chatbots for parenting should include components of conversational design that would address some suggestions users made, such as having customized answers and not general answers. More specifically, developers could design more interactive and engaging conversations, utilizing different types of questions. For example, questions that lead to yes/no answers are easy to reply to but require little engagement. On the other hand, questions that require more thoughtful responses are more engaging, but parents may not respond to them. Developers need to find a balance between questions that are easy to reply to and also require parents' reflections. Reaching this balance would require developers to create several iterations of this design with users.

Conclusions
Overall, these results are promising and suggest that users completed the intervention, engaged with the bot both quantitatively and meaningfully, remembered the skills taught, and would recommend the intervention to others. However, the sample was small, and a portion of parents commented on aspects that could be improved, such as the chatbot sounding impersonal or the technical problems experienced. Chatbots appear to be an acceptable and promising tool for teaching parenting skills, one that has yet to be evaluated in larger samples and with more robust interventions.