Proposal and Evaluation of a Robot to Improve the Cognitive Abilities of Novice Baseball Spectators Using a Method for Selecting Utterances Based on the Game Situation

Herein, an engineering method is developed to improve spectators' sports-watching abilities. We present an interactive robot that supports the cognitive abilities of baseball novices in sports spectating. The robot watches a game with a person and utters words applicable to the game situation. We propose two methods to support cognitive watching: five categories of utterances (confirmation, information, prediction, emotion, and evaluation) and utterance rules for player participation and game scenes. We also propose a method for generating utterances for each category. To evaluate the generated utterances, we conducted an experiment whereby spectators watched baseball footage with the robot. The results of the experiment showed that the robot's utterances could support the cognitive ability sub-factor of individual game intelligence, specifically "Spectating while knowing the player's strengths." In addition, the feeling of heaviness that occurs when watching alone could potentially be reduced by watching with a robot. This study is the first to attempt to support and improve spectators' abilities to watch sports using a human-robot interaction approach. We anticipate that this approach will be used in the future to build a relationship (increase intimacy) with an agent (robot) and to support and improve cognitive abilities regardless of the type of sport.


Introduction
In sports, spectating skills are critical to understanding and enjoying games. Spectators typically find value in sports by using their spectating skills [1]. A high level of spectator ability is required to enhance the experience of watching sports. However, beginners with lower spectating skills may find it difficult to understand and evaluate the game, and consequently to recognize the value of sports. Therefore, it is necessary to help beginners watch sports so as to improve their spectating skills. Saito et al. [2] revealed the structure of sports spectators' cognitive abilities; however, they did not consider methods to improve these abilities. Novice spectators can become more proficient at watching baseball games by learning from the words and actions of highly experienced spectators with high cognitive abilities. Improving the cognitive ability of novice spectators will contribute to the development of the sports industry. Professional sports teams (especially baseball teams) are the target group that can benefit from the results of this research, because elucidating the mechanism by which a robot improves spectating ability suggests ways in which sports spectators can increase their loyalty to their teams.
A previous study proposed a method to support game viewing by displaying athletes' statistics using AR [3]. This method allows users to watch the game without prior knowledge of the players by referring to quantitative information about them. Another study showed that the flow experience of watching a sporting event using VR improves the satisfaction of users who are not very interested in that sport [4]. Although these methods are effective means of improving users' satisfaction with the spectator experience, they do not improve users' cognitive abilities when watching sports. To improve users' cognitive abilities, it is considered effective to provide an environment in which users can learn game-watching strategies from the words and actions of skilled spectators.
In the field of human-robot interaction (HRI), research is being conducted on robots that watch television (TV) programs together with humans [5,6]. In particular, Nishimura et al. [6] proposed a robot that shares the atmosphere during TV viewing to improve dialog motivation, using soccer as the viewed sport in their experiment. The results of the experiment show that humans and robots can share in the excitement. Yamamoto et al. [7] proposed a group robot that expresses emotions during baseball viewing to improve the sense of presence. An experiment in which participants watched a baseball video showed that a group robot can create a sense of unity and presence. These studies demonstrate that it is possible for robots to watch sporting events together with humans.
In the aforementioned studies [6,7], one challenge was that the robots were not fully capable of behaving like sports spectators. In Nishimura's study [6], the robot performed linguistic emotional expressions based on comments on TV programs posted on social media platforms. In contrast, in Yamamoto's study [7], the robot performed nonverbal emotional expressions by moving its body. Sports spectators often exhibit emotional behaviors during games. However, seasoned spectators with advanced cognitive abilities exhibit not only emotional behaviors but also behaviors such as predicting future plays and criticizing plays depending on the game situation [8]. If robots could select different behaviors corresponding to game situations, similar to experienced spectators, they could help inexperienced spectators. In this study, we propose a robot that selects appropriate utterances depending on the game situation to improve the cognitive abilities of novice baseball spectators, focusing on the Nippon Professional Baseball Organization (NPB) as the object of spectatorship. The purpose of this study is to provide a method to improve users' cognitive spectating skills using a robot and to reveal the effectiveness of this system. The proposed method utilizes social media data to enable the robot to speak in response to game situations like a skilled spectator. We constructed a BERT model to classify the emotional categories of social media data. The proposed system was evaluated by having participants use it and analyzing the results of a questionnaire.
The structure of this study is as follows. In Section 2, we explain the cognitive abilities of sports watching. In Section 3, we propose an utterance method using robots to improve the cognitive abilities of novice baseball spectators. In Section 4, we present the results of an experiment in which a robot and a human watched a baseball game video together. In Section 5, we discuss the experimental results. Finally, Section 6 concludes the study.

Baseball Spectator Assistance
In sports games, augmented reality (AR) and virtual reality (VR) technologies are designed to enhance the spectator experience. As an application of AR technology, research has been conducted to create virtual scenes that can be viewed from any angle by synchronizing and synthesizing multiple sports videos [9][10][11]. Research has also been conducted to generate virtual sports scenes from TV videos [12][13][14], and systems have been proposed that can display these scenes in real time [15]. In addition, research has been conducted on AR systems that display statistical and supplementary information about teams and players on images of sports games [3,16,17].
In a study using VR technology, Mizushina et al. [18] proposed an "Interactive Instant Replay" system that allows users to experience past recorded sports plays as a 360-degree spherical image with tactile feedback. Systems using AR or VR technology can reproduce a sense of presence similar to that of a game in a stadium and will likely enhance the spectator experience. In addition, the information provided in the virtual space, which is not available in conventional spectating, can contribute to the understanding of sports. However, it has been pointed out that spectators should interact with others who share their viewing experience rather than approach the game itself [19].
There is research on the use of robots as companions for sports spectators. Yamamoto et al. [7] proposed a group robot that can spectate baseball games in a VR space. Nishimura et al. [6] proposed a robot that can watch sports programs together with a spectator and share in their excitement. However, in these studies, the behavior of the robots was limited to expressing emotions and enthusiasm, which is insufficient for spectator behavior. Therefore, the robots need to understand game situations and behave accordingly. Although studies have generated automatic commentary for sports games based on the situation [20][21][22][23][24][25][26][27], the commentary takes the position of an indirect third party with respect to spectators, hindering the shared experience.
In this study, we focused on robots watching games together with humans to create a shared viewing experience. The robots demonstrate their role as spectators by not only expressing emotions but also behaving according to the situation.

Cognitive Spectating Ability
Saito et al. [2] revealed the structure of cognitive ability in sports watching, which is the cognitive domain of spectator ability. Cognitive ability in sports watching is defined as "the ability to make sense of oneself by understanding and evaluating during observation using knowledge about the play or games" and includes the following six factors:

•
Individual game intelligence cognitive ability: The ability to focus on individual skills, understand the meaning of individual skills and movements, and analyze and evaluate the game.

•
Team-play intelligence cognitive ability: The ability to analyze and evaluate the tactical aspects of the movements of all team members.

•
Psychological empathy: The ability to sense and empathize with players' feelings of joy, anger, sadness, and emotion.

•
Physical empathy: The ability to understand and empathize with the sensation of moving a player's body.

•
Esthetic intuition: The ability to appreciate the excellence of a play, such as the beauty of individual skill and form.

•
Emphasis on fair play: The ability to respect the values associated with fair play.
However, Saito et al. [2] pointed out that their research only surveyed spectators of the J2 soccer league and that the constituent concepts were constructed based on this sample; therefore, the results may not reflect all sports. For example, one of the constituent factors, "team-play intelligence cognitive ability," applies to soccer, where the tactical movements of the team as a whole are important. However, it is difficult to apply this concept to baseball, where individual plays are more often emphasized. In addition, the ability to emphasize fair play translates well to soccer, where most plays involve contact between players, which makes fair play important. In baseball, although there are also contact plays, such as cross-plays and being hit by a pitch, the proportion is not as high as in soccer, thus placing less emphasis on fair play.
In this study, "individual game intelligence cognitive ability" and "psychological empathy" were identified as important abilities for watching baseball games. Therefore, our goal was to improve these cognitive abilities of novice baseball viewers using robots. Skills not addressed in this study include "esthetic intuition," which is strongly influenced by a person's subjective interpretation, and "physical empathy," which depends significantly on a person's sports and exercise experience, because we found these skills difficult to support with robots.

Baseball Spectator Assistance Robot
In this section, we propose a method for supporting novice baseball spectators in watching games using robots. The proposed method involves a robot watching a baseball game alongside a human while making appropriate utterances depending on the game situation. Figure 1 shows an overview of the robot that supports baseball spectating, Sota, developed by Vstone. Sota's size and functionality are sufficient to realize our proposed method. Although a similar robot, NAO, exists, it has a walking function and is larger than Sota; our proposed method does not require the robot to walk, and a small robot is desirable considering the cost associated with the robot's installation space. The system generates a sentence based on the game situation estimated from a live game video, and the robot delivers the utterance. To create an atmosphere of shared spectating, the robot supports the same team as the user. In this study, we implemented an utterance-generation process to determine the appropriate utterance content for the robot. In Section 3.1, we propose the robot's utterance content, and in Section 3.2, we implement the utterance-generation process.

Definition of Utterance Categories and Rules
To improve the cognitive abilities of novice baseball spectators, we defined five categories of utterances that the robot would use during spectating.The utterances were categorized as follows:

•
Confirmation: Utterances to confirm the name of the player currently performing or details of the play the player is performing.

•
Information: Utterances about the abilities or strengths of the player currently performing.

•
Prediction: Utterances that predict future moves of the player.

•
Emotion: Utterances that express emotion toward players' plays.

•
Evaluation: Utterances that evaluate a player's play.
Except for confirmation, each utterance category corresponds to the sub-items of individual game intelligence cognitive ability and psychological empathy [2] (Tables 1 and 2). These categories assume the behavior of an experienced spectator. Confirmation was included to increase the receptivity of inexperienced baseball spectators to the other four utterance categories. Therefore, confirmation is combined with the other categories when the robot speaks. For example, by uttering the confirmation first and then the information, users can hear a player's name and learn about their abilities and strengths. The definition of each utterance category is based on a study by Sumino et al. [8], who analyzed conversations between people while watching soccer. All conversations analyzed by Sumino et al. [8] were between people who had experience with soccer; we assumed that these conversations were conducted by experienced viewers with high cognitive abilities.

Items and corresponding utterance categories (Table 1):

1. Spectating while analyzing the player's abilities: Information
2. Spectating while knowing the player's strengths: Information
3. Spectating while predicting the tactics: Prediction
4. Spectating while understanding the meaning of the players' movements: Evaluation
5. Spectating and distinguishing between technical errors and judgment errors: Evaluation
6. Spectating while paying attention to the player's play choices: Prediction

Items and corresponding utterance categories (Table 2):

1. Spectating while empathizing with the player's frustration: Emotion
2. Spectating while being moved by the sadness of the player: Emotion
3. Spectating while empathizing with the player's psychological state: Emotion
4. Spectating while being moved by the sight of players being happy: Emotion
5. Spectating while empathizing with the player's anger: Emotion

Next, we considered the game situations in which the robot makes utterances. As a premise, we considered situations in which users are more receptive to the robot's utterances. In sports, there are moments when spectators are focused on the game, such as when the pitcher and batter are in the middle of a duel or when a player is making a play. If the robot expresses itself at such moments, users might feel annoyed, and their receptivity to the expressions might decrease. Therefore, the robot should instead speak at moments of relatively low concentration.
In this study, we identified two situations that lend themselves to robotic speech, namely, the player's appearance scene and the player's play scene. The player's appearance scene refers to the situation before the pitcher and batter face off in the game, that is, when the pitcher steps onto the mound or the batter steps into the batter's box. The player's play scene refers to the time when the game situation changes, such as when a runner is on base, after a certain number of outs, or when the score changes, and refers to the situation after a player has made a play.
Finally, to determine the situations in which the robot makes utterances for each category, we defined the following three rules:

1. If there are no runners on base in the player's appearance scene, the robot makes confirmation and information utterances.
2. If there are runners on base in the player's appearance scene, the robot makes confirmation and prediction utterances.
3. In the player's gameplay scene, the robot makes utterances in the order of emotion, confirmation, and evaluation.
In a player's appearance scene, confirmation and information utterances (rule 1) or confirmation and prediction utterances (rule 2) are made with respect to the player. The reason for using different utterance categories depending on the presence or absence of runners on base is that we assume spectators' demands differ between the two scenes. In the scene where a player appears, spectators are concerned about how the player will perform in the upcoming situation; therefore, there is a high demand for utterances that provide information on the player's abilities and strengths. In contrast, when runners are on base, the likelihood of scoring increases, and the excitement of the game for the spectators also increases; therefore, we assumed that there is a high demand for utterances that predict future play. In the player's gameplay scene, the utterances are made in the category order of emotion, confirmation, and evaluation (rule 3) regarding the play.
Figure 2 shows a flowchart of the utterance rules. The robot waits for changes in the game situation and selects an utterance category appropriate to the scene. In baseball, the player's appearance and play scenes are repeated; thus, utterances based on rules 1, 2, and 3 are generally repeated as well.
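The three utterance rules above can be sketched as a small selection function. This is an illustrative sketch in Python; the function name, scene labels, and category strings are our own, not taken from the authors' implementation.

```python
def select_utterance_categories(scene: str, runners_on_base: bool) -> list:
    """Return the ordered utterance categories for a game scene.

    Rules (Section 3.1):
      1. Appearance scene, no runners on base -> confirmation, then information.
      2. Appearance scene, runners on base    -> confirmation, then prediction.
      3. Gameplay scene                       -> emotion, confirmation, evaluation.
    """
    if scene == "appearance":
        if not runners_on_base:
            return ["confirmation", "information"]        # rule 1
        return ["confirmation", "prediction"]             # rule 2
    if scene == "gameplay":
        return ["emotion", "confirmation", "evaluation"]  # rule 3
    raise ValueError("unknown scene: " + scene)
```

Because appearance and play scenes alternate throughout a baseball game, calling this function on each scene change reproduces the repeating rule cycle described above.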

Confirmation
Utterances in the confirmation category are made in both the player's appearance scene and the player's play scene, with different content depending on the scene. The generation rules for confirmation utterances are listed in Table 3.

Game Situation: Generated Utterance

- Player's appearance scene (when the pitcher appears on the mound for the first time in the game): {player's name} + "sensyu ga maundo ni agatta ne" (in Japanese); "{player's name} has taken the mound, you see"
- Player's appearance scene (when a batter enters the batter's box): {player's name} + "sensyu ga daseki ni haitta ne" (in Japanese); "{player's name} has stepped up to the plate, you see"
- Player's play scene (when the player's play is advantageous to the robot's cheering team): {play content} + "dane" (in Japanese); "{play content}, you see"
- Player's play scene (when the player's play is disadvantageous to the robot's cheering team): {play content} + "ka" (in Japanese); "{play content}, huh"

In a player's appearance scene, utterances are generated to confirm the player's name. When the pitcher appears on the mound for the first time in the game, the utterance {player's name} + "sensyu ga maundo ni agatta ne" (in Japanese) is generated. When a batter enters the batter's box, the utterance {player's name} + "sensyu ga daseki ni haitta ne" (in Japanese) is generated.

In the player's play scene, the utterance {play content} + "dane" (in Japanese) or {play content} + "ka" (in Japanese) is generated. Examples of {play content} include "hit to left" and "grounded to second." The expression "dane" or "ka" is used depending on whether the player's play is advantageous or disadvantageous to the robot's cheering team. If the play is advantageous, "dane" is used to express enthusiasm; if the play is disadvantageous, "ka" is used to express disappointment.
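The confirmation templates can be sketched as a single generation function. This is a minimal illustration; the function name, scene labels, and the spacing of the romanized templates are our own assumptions, and the example player name is purely illustrative.

```python
def confirmation_utterance(scene: str, name: str = "", play: str = "",
                           advantageous: bool = True) -> str:
    """Generate a confirmation utterance (romanized Japanese) per Table 3."""
    if scene == "mound_entry":   # pitcher takes the mound for the first time
        return name + " sensyu ga maundo ni agatta ne"   # "... has taken the mound"
    if scene == "batter_box":    # batter steps into the batter's box
        return name + " sensyu ga daseki ni haitta ne"   # "... has stepped up to the plate"
    # Play scene: "dane" (enthusiasm) if the play favors the robot's cheering
    # team, "ka" (disappointment) otherwise.
    suffix = "dane" if advantageous else "ka"
    return play + " " + suffix
```

For example, `confirmation_utterance("play", play="hit to left", advantageous=True)` yields a "dane"-suffixed utterance, while a disadvantageous play yields the "ka" suffix.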

Information
The information category requires data regarding the players' abilities and strengths to generate utterances. In this study, we used Japanese Wikipedia articles as the data source for player information. Wikipedia articles of NPB players are written in the same format and usually contain one of the following sections: "Purēsutairu" (playing style), "Jinbutsu" (person), "Purēsutairu jinbutsu" (playing style and personality), "Senshu to shite no tokuchō" (player characteristics), "Senshu to shite no tokuchō jinbutsu" (player characteristics and personality), or "Toukyū sutairu" (pitching style). These sections describe the players' abilities and strengths, which are usually explained succinctly in the first sentence.
In this study, for utterances in the information category, we extracted the first sentence of the aforementioned sections of the player's Wikipedia article and generated utterances through a spoken-language conversion process. The conversion rules for spoken language are listed in Table 4. We selected the conversion rule based on the part of speech at the end of the sentence, which was identified using the MeCab morphological analysis engine. For example, a sentence ending in the noun "senshu" (player), such as "dageki de wa, hikume no dakyuu mo chouda ni dekiru pawā o motsu senshu" (in Japanese), is converted by appending "dayo," yielding "He is a player who has the power to hit low pitches for long hits."
The extracted sentences were converted to spoken language because Wikipedia text uses written language. If the robot directly pronounces a captured Wikipedia sentence, the user may feel uncomfortable with the language used. For the conversion process, we created rules that focus on the part of speech at the end of a sentence, referring to the research conducted by Hayashi et al. [28] on converting text from written to spoken language.
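The conversion step can be sketched as follows. This is a simplified illustration: the paper uses the MeCab engine and the full Table 4 rule set, whereas this sketch substitutes a tiny hand-made part-of-speech lookup and implements only the noun-final rule as an example.

```python
def pos_of_final_word(sentence: str) -> str:
    """Stand-in for MeCab: look up the part of speech of the sentence-final
    word in a tiny hand-made dictionary (MeCab itself is not reproduced here)."""
    final_word_pos = {"senshu": "noun", "motsu": "verb"}
    word = sentence.rstrip("。. ").split()[-1]
    return final_word_pos.get(word, "other")

def to_spoken(sentence: str) -> str:
    """Convert a written-language sentence ending to spoken language.
    Only the noun-final rule ("... <noun>" -> "... <noun> dayo") is sketched;
    the full rule set is given in Table 4 of the paper."""
    if pos_of_final_word(sentence) == "noun":
        return sentence.rstrip("。. ") + " dayo"
    return sentence
```

Applied to the Wikipedia-style example above, the noun-final sentence gains the colloquial "dayo" ending before being voiced by the robot.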

Prediction and Evaluation
The utterance category "prediction" is defined as an utterance that predicts a play to be made later by a player. Predictive utterances are created in advance and selected according to the game situation. Based on baseball theory [29], we created utterances for each base and out-count scenario. For example, if the team that the robot is cheering for is attacking and the out count is zero with a runner on first base, the robot will utter, "This is a situation where we want to advance the runner with a bunt or end run."

The "evaluation" utterance category is defined as an utterance that evaluates a player's play. Evaluation utterances are also created in advance and selected according to the game situation. We created utterances that evaluate the play based on the type and course of the pitches thrown by the pitcher. For example, when the batter of the team the robot is rooting for gets a hit on a straight ball thrown down the middle, the robot says, "He surely caught the easy-to-hit ball in the middle."

Emotion
Following Nishimura et al. [6], utterances in the emotion category are generated based on comments posted by baseball viewers on social media. In recent years, social media has become a rich resource for investigating a wide range of research questions [30]. Given the difficulty involved in gathering information from large data sources such as social media, a series of studies have used data mining and natural language processing to facilitate this task [31]. Therefore, using social media data obtained through natural language processing techniques is an effective approach for this study. However, in Nishimura's study [6], the comments were randomly selected from real-time posts, which makes it difficult to select appropriate utterances depending on the game situation. In this study, we propose a method for selecting appropriate utterances using a social media comment classification model. We used tweets posted by NPB spectators as the social media comments. The following section describes the process of collecting tweets for utterance selection.

Procedure 1. When the game situation is reflected in the player's play scene, capture tweets posted by NPB spectators within 10 s of the gameplay.

Procedure 2. Classify the captured tweets as "Emotional" or "Not_Emotional" using the classification model.

Procedure 3. Select utterances from the tweets classified as "Positive" when the team the robot is cheering for gains an advantage from the player's play and as "Negative" if it is detrimental.
In Procedure 1, tweets were collected using the Twitter API. For this study, we used tweets from viewers of the Yokohama DeNA BayStars, a team in the Central League of the NPB. To acquire relevant tweets, we conducted an OR search for the following hashtags: "#baystars," "#Baystars" (written in Katakana), "#Yokohama Baystars" (Yokohama written in Kanji, Baystars written in Katakana), and "#Yokohama DeNA Baystars" (Yokohama written in Kanji, Baystars written in Katakana) [32]. The Standard Search API (version 1.1) was used to collect the tweet data. This API can retrieve 100 tweets per request; however, there is a limit of 180 requests per 15 min. Therefore, when this limit was reached, we waited 15 min before sending the next request, allowing tweet data to be collected continuously.
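Under the stated API limits (100 tweets per request, 180 requests per 15-min window), the collection budget for a given number of tweets can be estimated as follows. This is an illustrative calculation, not the authors' collection code; the function name is our own.

```python
RATE_LIMIT_REQUESTS = 180   # requests allowed per rate-limit window
WINDOW_MINUTES = 15         # length of one rate-limit window
TWEETS_PER_REQUEST = 100    # tweets returned per Standard Search API request

def collection_budget(n_tweets: int) -> dict:
    """Estimate how many requests and 15-min waits are needed to pull
    n_tweets through the Standard Search API under the limits above."""
    requests = -(-n_tweets // TWEETS_PER_REQUEST)          # ceiling division
    waits = max(0, (requests - 1) // RATE_LIMIT_REQUESTS)  # full windows exhausted
    return {"requests": requests,
            "waits": waits,
            "min_minutes_waiting": waits * WINDOW_MINUTES}
```

For the 51,473 tweets collected for training (see below), this comes to 515 requests and at least two 15-min waits.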
To classify the tweets in Procedures 2 and 3, it was necessary to train a model. First, tweets were collected to obtain the training data for the classification model. Data were collected on five days (2, 9, 16, 26, and 30 August 2022), and a total of 51,473 tweets were collected. To eliminate any bias in the tweets, data were collected on days when the Yokohama DeNA BayStars and the five other Central League teams (Hiroshima Carp, Hanshin Tigers, Yomiuri Giants, Chunichi Dragons, and Tokyo Yakult Swallows) had games. In addition, the following preprocessing steps were performed to treat the tweets as training data.
Next, we extracted the tweets that were posted during the players' play scenes. In Procedure 1, we specified that we would handle tweets posted within 10 s of a player's play scene. However, during training, we processed tweets posted within 30 s to improve the accuracy of the classification model. To investigate the game situation at the time the tweets were posted, we manually recorded the times of players' play scenes using the game videos. After extracting the tweets posted within 30 s of each player's play scene based on these recordings, 7629 tweets were obtained. Note that tweets that became zero characters due to character removal during preprocessing were not treated as training data and were not counted.
Next, we created training data for the tweet classification model. To train the classification model in Procedure 2, we labeled the tweets that were classified as emotional as "Emotional" and those that were not as "Not_Emotional." Table 5 lists the labeling rules. In addition, even if a tweet was classified as "Emotional," it was labeled as "Not_Emotional" if the text was unnatural. Moreover, to train the classification model in Procedure 3, tweets labeled as "Emotional" were further classified into positive and negative content and labeled as "Positive" or "Negative." To evaluate the model, the labeled tweets were split into training, validation, and test data in an 8:1:1 ratio. Tables 6 and 7 list the number of tweets in each dataset. To train the classification model, we fine-tuned a pre-trained BERT model [33]. For pre-training, we used a Japanese Wikipedia pre-trained model [34] published by Tohoku University. The hyperparameters (batch size, dropout rate, learning rate, and number of epochs) were optimized over 1000 trials of Optuna, an automatic hyperparameter optimization system. The optimized hyperparameters are as follows.

•
Batch size: 16

Based on the learning results, the accuracy in classifying tweets into the "Emotional" and "Not_Emotional" categories was 0.858 for the validation dataset and 0.849 for the evaluation dataset. In addition, the accuracy in classifying tweets into the "Positive" and "Negative" categories was 0.927 for the validation dataset and 0.939 for the evaluation dataset. Therefore, the probability of selecting the intended utterance from the tweets obtained through Procedures 2 and 3 was 0.795 for the validation dataset and 0.797 for the evaluation dataset.
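The reported combined probabilities follow from multiplying the two stage accuracies, under the assumption (our reading of the arithmetic) that the two classification stages err independently:

```python
def intended_utterance_probability(emotion_acc: float, polarity_acc: float) -> float:
    """Probability that both classifier stages are correct, assuming the
    Emotional/Not_Emotional and Positive/Negative stages err independently."""
    return round(emotion_acc * polarity_acc, 3)
```

With the validation accuracies, 0.858 × 0.927 ≈ 0.795; with the evaluation accuracies, 0.849 × 0.939 ≈ 0.797, matching the figures above.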
Moreover, when multiple tweets were classified as "Positive" or "Negative" in Procedure 3, there were multiple candidate utterances. In such cases, we determined the most appropriate utterance by selecting the tweet with the highest output value from the classification model. Furthermore, in Procedure 1, there may have been no available tweets to use as utterances within the short span of 10 s after the player's gameplay scene. In such cases, we performed the classification in the same way using previously collected tweets. The following steps describe the procedure for selecting utterances from past tweets.

Procedure 1. Compare the play results associated with the past tweets with the current play result and extract the tweets whose results match.

Procedure 2. To compare the game situation at the time each tweet was posted with the current situation, add +1 to the classification model's output value for each matching situation element.

Procedure 3. Select the utterances from the tweets classified as "Positive" when the team the robot is cheering for gains an advantage from the player's play and as "Negative" if it is detrimental.
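The candidate-selection step (keeping tweets of the wanted polarity and picking the one with the highest model output value) can be sketched as follows. This is an illustrative sketch; the dictionary keys and sample tweets are our own, not the authors' data structures.

```python
def pick_utterance(candidates: list, want_positive: bool):
    """Among classified tweets, keep those matching the wanted polarity and
    return the text whose model output value is highest; None if no match."""
    label = "Positive" if want_positive else "Negative"
    pool = [c for c in candidates if c["label"] == label]
    if not pool:
        return None  # fall back to the past-tweet procedure described above
    return max(pool, key=lambda c: c["score"])["text"]
```

Returning `None` models the case where no usable tweet arrived within 10 s, which triggers the past-tweet selection procedure.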
Although there is a concern that bot-generated tweets may affect the analysis in this paper, bot-generated tweets are labeled as "Not_Emotional" according to the rules in Table 5. This is because bots that tweet about Japanese professional baseball games follow a specific format in most cases (e.g., "[Bottom of the 1st inning] Player XX hit a home run"). As shown in this section, the accuracy of the sentiment classification model in this study is sufficiently high, and the influence of bot-generated tweets is considered small.

Experiment
To investigate the influence of the defined utterance categories and rules on cognitive abilities during sports viewing, we conducted an evaluation experiment in which participants watched a baseball game video with the robot spectator.
In the experiment, participants watched a 30-min video of a baseball game between the Yokohama DeNA BayStars and the Yomiuri Giants that took place on 18 August 2022. The video consists of two innings, from the top of the 3rd inning to the bottom of the 4th inning. The game featured a run by the Yomiuri Giants in the top of the 4th inning and a run by the Yokohama DeNA BayStars in the bottom of the 4th inning. The video was played directly on a website provided by Dwango Co., Ltd., based in Tokyo, Japan, through its "Niconico Live Broadcast" service on the Niconico Pro Baseball Channel. Following an inquiry with the rights holders, we confirmed that there were no problems with using the game video for our experiment. This study was approved by the Ethics Committee of the Tokyo Polytechnic University (approval number: Rin2020-12).
The experimental setup is shown in Figure 3. The robot used for the experiment was Sota, from Vstone (shown in the lower-right corner of Figure 3). The baseball video was shown on a display (located in the upper center of Figure 3). To eliminate the possibility that participants could not hear the robot's spoken utterances, which may include baseball-related terms and player names, the utterance sentences [5] were displayed on a Microsoft Surface Pro 4 (located in the lower-left corner of Figure 3).

Condition
The experiment used a design with two conditions, namely, robot-present and robot-absent. In the robot-present condition, participants watched the game video with the robot. The robot supported the Yokohama DeNA BayStars, and the participants were instructed to support the same team as the robot. In the robot-absent condition, participants watched the game video alone; therefore, Sota and the display of the utterance sentences were removed from the experimental setup. Participants were also instructed to support the Yokohama DeNA BayStars in the robot-absent condition. By comparing the evaluations under the two conditions, we were able to examine the influence of the proposed method on participants' cognitive abilities.
In addition, in the robot-present condition, the robot's utterances were performed using the Wizard of Oz method [35], a simulation technique in which a human, acting as the system, interacts with a user. The experimenter sent prerecorded utterances (Tables 8 and 9) to the robot at specific times to initiate speech. The utterances were sent during two types of scenes: when a player appeared on the field and during the player's gameplay scene.

Game Situation | Utterance Category | Utterance
Player's participation scene | Prediction | (This is a situation where we want to aim for a strikeout because even a ground ball or a fly ball could score a run.) goro ya hurai demo tokten ga haitte simau kara sanshin wo neratte ikitaine (in Japanese)
Player's play scene | Evaluation | (His pitch went high and sweet.) bōru ga takame ni amaku haitte simattane (in Japanese)

To prevent the timing of the robot's utterances from significantly affecting the evaluation in each experiment, we standardized the timing at which utterances were sent. In the player's appearance scene, the utterance is sent after the pitcher throws the first pitch and receives the return throw. In the player's play scene, utterances are sent 10 s after the play has occurred. However, when the pitcher throws the first pitch and the game transitions directly to the player's play scene, no utterance is sent during the player's appearance scene.
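The timing rules above can be sketched as a small decision function. This is an illustrative reconstruction under our own naming, not the authors' actual control code; the function name, constant, and scene labels are assumptions for the sketch.

```python
PLAY_DELAY_S = 10  # play-scene utterances are sent 10 s after the play


def utterance_offset(scene, pitch_returned=True):
    """Seconds to wait before sending the prerecorded utterance,
    or None when the rules suppress the utterance entirely.

    scene          -- "appearance" (player enters the field) or "play"
    pitch_returned -- for appearance scenes: True once the pitcher has
                      thrown the first pitch and received the return;
                      False if the first pitch led straight into a play
    """
    if scene == "appearance":
        # The appearance utterance is sent only after the first pitch
        # is returned; if play begins instead, the utterance is skipped.
        return 0 if pitch_returned else None
    if scene == "play":
        return PLAY_DELAY_S
    return None
```

In a Wizard-of-Oz setup, the experimenter would apply this rule manually; encoding it as a function simply makes the standardized timing explicit.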

Procedure
As part of the experiment, participants completed a questionnaire about their experience of watching sporting events prior to the study. Subsequently, they received an explanation of the experiment and watched the 30-min baseball game video. Finally, they completed a post-test questionnaire that included questions assessing the cognitive ability, emotion, and spectating value of sports spectators.
The post-test questionnaire on sports spectator cognitive ability asked participants the extent to which they used spectating methods corresponding to the six items on individual gameplay intelligence cognitive ability (Table 1) and the five items on psychological empathy (Table 2) [2]. Responses were made on a 7-point scale ranging from "1. Not at all" to "7. Very much so." The effectiveness of the proposed method was investigated by comparing the ratings of cognitive spectating ability between the two conditions.
In the emotion questionnaire, participants were asked to rate how strongly they felt each of the 12 items presented in Table 10 while watching the video. Responses were recorded on a 7-point scale ranging from "1. Did not feel anything at all" to "7. Felt very strongly". The 12 emotion items were taken from the study by Sumino et al. [36], which examined the emotions that occur when watching games. These items include anger, joy, and sadness, which are emotions related to psychological empathy. Psychological empathy involves empathetic feelings toward the players, whereas the emotion questionnaire assesses the spectator's own emotions. The five evaluation items of psychological empathy are shown in Table 2.
In the questionnaire on spectating value, participants were asked to rate their level of agreement with six items (Table 11) on a 7-point Likert scale ranging from "1. Strongly disagree" to "7. Strongly agree". Items 4, 5, and 6 were adapted from previously published items.

Results
The evaluation results for the six items on individual gameplay intelligence cognitive ability are shown in Figure 5a. Ratings were obtained on a scale from 1 to 7, anchored at "1. Not at all" and "7. Did it a lot." We compared the average ratings for each item between the conditions. To confirm the significance of differences in ratings between conditions, we performed a non-parametric, unpaired test using the Mann-Whitney U test, one of the most frequently used nonparametric tests for evaluating the difference in medians between two independent samples [38,39]. The test confirmed that the item "watching while knowing the strengths of the player" had significantly higher ratings (p < 0.01) in the robot-present condition than in the robot-absent condition. No significant differences were found for the other items. In this study, statistical significance is recognized when p < 0.05; p < 0.01 corresponds to the 1% significance level. When p < 0.1, although statistical significance is not observed, we discuss the results as potentially providing valuable insights.
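As a minimal illustration of the statistic behind this test, the U value for two small independent samples can be computed by pairwise comparison, with ties counted as 0.5. This sketch is not the authors' analysis code, and the sample ratings below are hypothetical; in practice a library routine such as scipy.stats.mannwhitneyu would also supply the p-value.

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for sample x versus sample y.

    Each pair (xi, yj) contributes 1 if xi > yj, 0.5 on a tie,
    and 0 otherwise. This pairwise form is suitable for small
    ordinal samples such as 7-point Likert ratings; the p-value
    would then come from a U table or a normal approximation.
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical 7-point ratings for one questionnaire item
robot_present = [7, 6, 7, 5]
robot_absent = [3, 4, 2, 4]
u1 = mann_whitney_u(robot_present, robot_absent)
u2 = mann_whitney_u(robot_absent, robot_present)
# As a sanity check, u1 + u2 always equals len(x) * len(y).
```

The smaller of u1 and u2 is compared against the critical value for the two sample sizes to decide significance.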
The evaluation results for the five items on psychological empathy are shown in Figure 5b. Similarly, a Mann-Whitney U test was performed, and the item "watching while empathizing with the player's anger" showed a marginal tendency (p < 0.1) to be rated higher in the robot-absent condition than in the robot-present condition. No significant differences were found for the other items. The evaluation results for the 12 emotion items are shown in Figure 6. The Mann-Whitney U test showed a significant difference for the sadness-related item "feeling heavy", with higher scores in the robot-absent condition than in the robot-present condition (p < 0.05). No significant differences were found for any other items. The evaluation results for the six items related to spectating value are shown in Figure 7. Similarly, a Mann-Whitney U test was conducted, but no significant differences were found for any of the items.

Discussion
The evaluation of individual gameplay intelligence cognitive abilities showed that, for the item "watching while knowing the strengths of the player", ratings were significantly higher in the robot-present condition than in the robot-absent condition. This suggests that the robot's utterances generated by the proposed method have the potential to convey knowledge about the players. It is possible that participants found the information-category utterances, such as those describing a player's abilities and strengths, useful.
In the psychological empathy category, the robot-present condition tended to score lower than the robot-absent condition on the item "watching while empathizing with the player's anger." This suggests that watching baseball with a robot may decrease the empathic response to a player's anger. One participant commented in the post-questionnaire free-description section that "the robot's angry expressions were uncomfortable." Thus, the cause may lie in the emotion-category utterances. In this experiment, the emotion category expressed anger in three situations. However, all three utterances expressed anger toward the player rather than empathizing with the player's angry feelings. Therefore, it is possible that participants found the robot's angry utterances toward the player unpleasant, resulting in a lower empathy rating. No differences were found between conditions for the other items.
In addition, there was no significant difference in participants' emotion ratings for 11 of the 12 items. This suggests that the current method did not influence emotions, let alone promote empathy toward the players. For the item "feeling heavy", ratings were lower in the robot-present condition than in the robot-absent condition. Although the average rating in the robot-absent condition was close to "3. Did not feel much", this result suggests that the proposed method may alleviate the feeling of heaviness that occurs when watching alone.
In addition, no significant differences were found for any of the items on spectating value, indicating that the proposed method did not change the value participants derived from watching baseball. Spectators find value in sports by using their spectating abilities [1]. Therefore, we hypothesize that, because the method did not improve participants' gameplay intelligence cognitive ability or psychological empathy, it did not affect the spectating value scores.

Figure 1. Overview of the robot system for assisting in watching baseball games (assuming watching baseball games on TV or the Internet).

Figure 2. Flowchart of the utterance rules.

Figure 3. Experimental environment. (The display on the left side of the figure shows robot utterances in Japanese.)

Figure 5. Evaluation results of cognitive ability in sports spectating: (a) individual game intelligence cognitive ability; (b) psychological empathy.

Figure 7. Evaluation results of spectating value.

Table 1. Correspondence between each of the play intelligence cognitive ability sub-items and each utterance category.

Table 2. Correspondence between psychological empathy sub-items and each utterance category.

Table 3. Utterance generation rules for the confirmation category.

Table 4. Rules for adding sentences to the end of Wikipedia sentences.

Columns: Part of Speech at the End of the Sentence | Additional Sentence at the End of the Sentence | Example of Addition at the End of the Sentence
Example row: sou kou shu de yakudoukan ni afureru purē ga miryoku no gaiyoushu + dayo (in Japanese); He is an outfielder whose dynamic play in running, hitting, and fielding is full of excitement.

Table 6. The number of data points for each split label ("Emotional" and "Not Emotional" classification).

Table 7. The number of data points for each split label ("Positive" and "Negative" classification).

Table 8. The utterances of the robot used in the experiment (top of the 4th inning, Yomiuri Giants' offense).

Table 9. The utterances of the robot used in the experiment (bottom of the 4th inning, Yokohama DeNA BayStars' offense). Example utterance: (He is a slugger who hits powerfully to all fields from a compact form with the bat held behind his right shoulder, and he has a very high slugging percentage thanks to his natural power. He has an aggressive style of hitting, going after the ball from the first pitch.) migi ushiro ni kamaeta batto wo jyouge ni yurasu konpakuto na fōmu kara koukaku ni kyouda wo utsu suraggā deari, syokyuu kara sekkyokuteki ni uti ni iku sutairu de, motimae no pawā wo ikashita tyouda ritu ga hijyou ni takai sensyu dayo (in Japanese)