3. Research Questions (RQs) and Method
Based on the preceding theoretical analysis, the study sought to address the following research questions:
RQ1. Which functions should news chatbots perform when used in crisis reporting?
RQ2. How can news chatbots be designed and implemented so that they are trustworthy and reliable and act in the users’ interests when used in crisis reporting?
Drawing on the ideal chatbot features analyzed earlier, a news chatbot was created from the COVID-19 information offered on the web platform of the BBC. The study opted for a retrieval-based chatbot with predetermined responses, in line with the specific requirements of this study. Specifically, the aim was to develop and evaluate a news chatbot that would offer an alternative method of accessing existing information. Moreover, since the solution is proposed for crisis situations, an approach was selected that can be deployed rapidly and does not require sophisticated programming. It is worth noting that chatbot implementations based on natural language interaction can be considered overrated for disseminating existing information that is already structured to some degree (e.g., symptoms, existing cures, available medication, restrictions on movement, etc.).
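To illustrate what a retrieval-based chatbot with predetermined responses amounts to in practice, the following minimal Python sketch models the conversation as a small tree of fixed messages and menu options. It is an assumption-laden illustration only: the node names, menu labels, placeholder texts and the `respond` function are hypothetical and do not reproduce the content or the internal logic of the chatbot developed in this study.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One conversation step: a fixed message plus the menu options it offers."""
    message: str
    # Maps a menu label shown to the user to the id of the next node.
    options: dict[str, str] = field(default_factory=dict)


# Hypothetical content tree; all labels and texts are placeholders,
# not the actual content of the chatbot evaluated in the study.
TREE = {
    "start": Node(
        "Hello, I am a news chatbot. What would you like to know?",
        {"Symptoms": "symptoms", "Restrictions": "restrictions"},
    ),
    "symptoms": Node(
        "[Predetermined summary of symptoms taken from the news source]",
        {"Back to menu": "start"},
    ),
    "restrictions": Node(
        "[Predetermined summary of movement restrictions]",
        {"Back to menu": "start"},
    ),
}


def respond(node_id: str, choice: str) -> tuple[str, str]:
    """Return (next node id, message to display) for a user's menu choice.

    An unrecognized choice keeps the user on the current node, so the
    same message and menu are simply shown again.
    """
    node = TREE[node_id]
    next_id = node.options.get(choice, node_id)
    return next_id, TREE[next_id].message
```

Because every answer is retrieved from a fixed structure, updating such a chatbot during a crisis would only involve editing the content tree, not retraining or reprogramming a language model.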
The specific news organization was selected, first, because it attracts a global audience by using the English language and, second, because it presents not only national information regarding the UK but also global information on the rest of the world, in contrast to other national or local news organizations in Europe that tend to focus more on national/local aspects of the COVID-19 pandemic. This makes it easier for study participants to assess and evaluate the effectiveness of the news chatbot presented, as they are more familiar with the available information.
Once the news chatbot was created, two groups of journalism students (consisting of 45 participants each) were asked to evaluate its performance, both via mobile and computer screens. The questionnaire used for the focus groups’ evaluation is presented in Appendix A. The two focus groups were independent and did not interact with each other. The first group was selected from 2nd-year students attending the BA Program of Journalism at the University of Cyprus; the second group was selected from 2nd-year students attending the BA Program of Journalism at the Aristotle University of Thessaloniki, Greece. All participants in both groups were familiar with news chatbot applications, having earlier attended relevant teaching modules. The ratio of men to women was around 1:1, and their ages ranged from 20 to 24 years old. They were deemed appropriate to evaluate the news chatbot mainly due to their familiarity with relevant applications. After all, this study is not concerned with the overall evaluation of using news chatbots; instead, the main target, as already stated, is to design an effective news chatbot to be used during a crisis situation.
Both focus groups were conducted on the Microsoft Teams platform, which was used for distance learning courses, by the same independent researcher/moderator, who guided the interactive conversation. The moderator was accompanied by each group’s instructor, who remained only an observer throughout the online conversation. The initial questions were based on the specific features of effective news chatbots, as earlier analyzed in the Theoretical Framework. In the next stage of the study, all evaluation comments were categorized in terms of functionality, reliability, design and specific features of the news chatbot (see analysis below) and incorporated into the final application.
5. Evaluation of COVINFO Reporter Chatbot: Findings and Analysis
Drawing from earlier chatbot evaluation studies [6,10,11], the COVINFO Reporter chatbot was assessed in terms of the following characteristics: (a) performance, (b) reliability, (c) functionality, (d) personalization, (e) interactivity, (f) ethics and behavior, and (g) accessibility. Although the two groups of participants did not interact with each other, both came to the same conclusions and offered similar assessments; therefore, their comments are jointly presented based on the earlier mentioned categorization of the chatbot’s characteristics and not on the basis of two separate focus groups.
The performance of the COVINFO Reporter chatbot was examined in regard to its ability to respond in a timely and efficient manner, both via a larger screen (tablet or laptop) and a smaller one (mobile phone). All participants deemed it efficient, and most indicated that, in times of crisis, the chatbot can save time when looking for crucial information: “It helped me save time while looking for much needed information, for example guidelines on medical facilities”, stated M.A. (female, 20 years old, Cyprus). The vast majority of participants stated that there were no differences between the two screen categories used. However, they found the mobile phone screen to be more efficient, perhaps because they were more familiar with it: “All the information is there; either you use it via a computer or via a smartphone, but my opinion is that COVINFO Reporter was constructed for a smartphone” (C.G., female, 22 years old, Cyprus). Several problems were initially detected by participants of both focus groups, focusing mainly on the chatbot’s technical performance. “I detected some ‘technical problems’ in regard to the chatbot’s ability to return to the main screen that need to be fixed; otherwise it would be tiring to navigate through it” (A.L., male, 22 years old, Cyprus); “When using it through a computer, it does not seem to be able to use the full-screen mode, only part of the screen” (I.A., male, 23 years old, Cyprus); “I agree, this could entail further problems with people with visual disabilities and older users” (N.C., female, 21 years old, Cyprus).
Functionality was tested in terms of linguistic accuracy and how informative the content offered was. Most of the participants agreed that the language used is simple and accurate, and the information offered is useful for everybody during the pandemic crisis: “I found everything I was looking for; the information was filtered in a useful way and the news categories were expressed with simple words” (A.L., male, 23 years old, Greece). Although some participants indicated that they enjoyed the fact that the COVINFO Reporter offers the basic information needed, others argued that more information could be added: “At least another information category has to be added regarding information for the pandemic outside the UK, for people who want to know what is happening in other countries” (G.A., male, 20 years old, Greece; A.E., female, 21 years old, Cyprus). An important point was made by several participants regarding the colors used: “the use of red color could eventually make us tired while using the chatbot for [a] longer duration of time” (A.B., female, 22 years old, Greece); “the partial use of white color should be avoided in a news chatbot because it can make users feel bored” (G.A., male, 20 years old, Greece); “definitely the use of white color for the fonts should be avoided in the chatbot responses; it makes it difficult to read the information offered” (G.K., female, 21 years old, Greece).
Reliability was measured in regard both to the content offered and to the chatbot’s proper function. The majority of the participants indicated that the chatbot functioned properly, and the timeframe for providing answers to users was adequately calculated. “It functions properly in all categories tested, and this kept me going for longer than I thought I would have stayed inside the application, definitely better than reading a conventional news website” (S.Ma., male, 21 years old, Greece). Regarding the news content, all participants deemed it reliable and trustworthy, as “it is based on the information offered by the BBC web page; therefore, I consider it reliable enough” (P.P., male, 23 years old, Cyprus); “I trust that the information is reliable, because it comes from a reliable source, since it is the source that guarantees reliability, not the robot application” (C.Ch., female, 22 years old, Cyprus).
Personalization refers to the form and depiction of the chatbot. All participants found the selected form to be “representative”, “reliable” and “affective”. To this end, several students emphasized the fact that the picture selected for the COVINFO Reporter chatbot “encompasses politeness and personality traits that can make it seem more human” (E.A., female, 21 years old, Cyprus), “it looks like a real reporter and uses vivid color, which is very pleasant” (C.C., female, 20 years old, Cyprus), whereas it offers “crucial information in a customized form, and this is important in a crisis situation, for all users” (N.K.Th., female, 22 years old, Greece). All participants argued that the depiction of the chatbot in a humanlike form was not confusing: “it is clear that this is a robot application, although it is depicted in a humanlike form” (N.P., male, 21 years old, Cyprus); “the humanlike picture selected cannot confuse the users; this is clearly a robot we are interacting with, although he looks really friendly and polite, exactly as a reporter should look” (G.I., male, 22 years old, Cyprus).
Interactivity was tested in regard to the chatbot’s ability to easily interact with users. “It was fun and enjoyable to read the news in this format; it really helped me to move on with my next questions”, stated D.Th. (male, 22 years old, Greece). All participants agreed that this was a more enjoyable way to acquire the information they were looking for regarding the pandemic crisis than the way information is offered in a conventional news site: “Even if I am in a hurry and looking for specific information quickly, this way is far more effective, because it is like the chatbot is trying to answer all my questions” (M.A., male, 20 years old, Cyprus).
Ethics and behavior refer to issues regarding users’ privacy and sensitivity toward social concerns. All participants found it positive that no personal data were needed and that the information offered was in line with social concerns around an issue as serious as a health pandemic. “I liked the fact that there was no need to state personal data, i.e., my e-mail; it made it easier for me to search the specific information I wanted; for example, if someone infected looks for medical information, he/she does not need to identify themselves”, stated I.A. (male, 23 years old, Cyprus). “The chatbot provides information for every user of any age, and for me, this indicates social responsibility for all citizens”, argued Ch.K. (female, 21 years old, Greece).
Accessibility was tested in terms of users’ ability to easily access the chatbot and navigate through it. All participants in the study agreed that it was easily accessible and “fun to navigate through it, much more than any conventional news website” (M.Z., female, 21 years old, Greece). “It was really easy to access and navigate inside the chatbot; everyone can do it, even older users with limited knowledge regarding chatbots; in fact, the chatbot itself guides you to the information you are looking for” (A.S., female, 21 years old, Greece). “The Q and A process escalates smoothly, and in this way, it is easy to access the specific information you are looking for; it is an easier narrative for telling the news” (P.T., male, 23 years old, Greece). “It is a better narrative when you are looking for emergency information, easier to access and navigate through it” (M.P., female, 22 years old, Cyprus).
All expected characteristics, as well as achieved results, are depicted in Table 1.
As this analysis has shown, chatbots used for news dissemination in a crisis situation seem to present certain differences in comparison to commercial chatbots. First, according to users’ assessment, they need to be as simple as possible so that any user can access and navigate through the information offered. This is extremely important during a crisis situation, since users need to acquire information quickly and easily. In this light, the chatbot’s technical ability to respond in a timely manner is of equal importance during a crisis situation. The second basic difference is related to ethics and behavior. In a crisis situation, social concerns and “sensitive” issues may be related to the identification of patients and the publication of personal data. As such, news chatbots used for access to emergency information (e.g., nearby medical facilities, guidance according to medical protocols, etc.) need to be in line with social concerns and ethical boundaries.
This analysis has also shown that the design and development of chatbots used for news dissemination in a crisis situation is rapidly evolving and is shaped by two basic factors. The first is the latest technological trends, together with the technology available both to the developer and to the target audience. For example, while an international news organization can have access to the means and personnel needed to develop a more complex chatbot application, a local news entity does not have the means, nor does it employ the specialized personnel, to develop complex applications. In this light, news chatbots that are developed to meet urgent needs and audience demands need to be easy to design and manage with existing development tools. Accordingly, every developer needs to keep in mind the technology available to the target audience. For example, users in countries of the Western world tend to enjoy more advanced technological tools than users in underdeveloped countries, reflecting the existing digital divide.
Second, news chatbots need to be in line with the specific peculiarities of the crisis situation for which they provide information. Not every crisis situation presents similar characteristics to previous crises, even if they are related to the same social sectors. For example, the Great Recession of 2007 was radically different from previous economic crises mankind had to face. Accordingly, the pandemic crisis of 2020 due to COVID-19 was different compared to the SARS (Severe Acute Respiratory Syndrome) epidemic in the early 2000s. As such, users’ needs regarding information and news dissemination may differ, and this has to be taken into consideration during the development of the application.
6. Conclusions
This paper has focused on the design, implementation and evaluation of a news chatbot used in a crisis situation, examining how effectively it fulfills the social responsibility function of crisis reporting. In this light, the pandemic crisis of 2020 due to COVID-19 was used as a case study, and the COVINFO Reporter chatbot was developed, which aims to deliver timely and accurate information regarding the crisis. The novelty of the approach lies in the news chatbot’s easy implementation for news organizations, as well as in its ability to effectively deliver crucial information to a wide audience (users) in times of crisis.
Interesting conclusions can be drawn from the findings of the study. There is no doubt that automation is already having a significant impact on journalism and the dissemination of news in general. The introduction of chatbots in the media sector has shown that they can significantly reduce journalists’ workload, allowing them to concentrate on quality, in-depth analysis and reporting [14]. Chatbots can facilitate an alternative narrative that can be customized based on users’ preferences. This is significant in the case of crisis reporting, where the dissemination of accurate, timely and customized information is very important for the public.
The theoretical study of previous media chatbot projects informed the implementation of the COVINFO Reporter, a working chatbot that disseminates information published by an international media organization. The chatbot was developed on a commercially available chatbot platform (i.e., Quriobot) and can be easily customized and updated. It offers easy and predictive navigation, enabling users to access the information that interests them, without having to navigate through the significant number of webpages that a media organization site usually includes. Its programming is relatively straightforward and can be easily integrated into the workflow of a typical media organization.
A thorough evaluation of various characteristics (performance, reliability, functionality, personalization, interactivity, ethics and behavior and accessibility) of the COVINFO Reporter chatbot was conducted by two separate groups of participants. The chatbot was positively evaluated in terms of its efficiency, although some participants reported minor technical problems. The preferred platform for accessing it was mobile phones. The majority of the participants were satisfied with the functionality of the chatbot, reporting that the language used was simple and accurate, and the information it provided was useful. The participants agreed that the chatbot was reliable, was functioning properly and provided answers in an acceptable time frame. As far as personalization is concerned, the COVINFO Reporter was reported to be representative, reliable and affective. All participants appreciated the chatbot’s interactivity. No problems were reported in terms of users’ privacy and sensitivity toward social concerns. Finally, all participants agreed that the COVINFO Reporter’s accessibility was very good, and they experienced no problems in navigating the chatbot. Overall, the evaluation of the chatbot was very positive, and the minor problems that were detected were noted and corrected, thus improving its performance.
Future extensions of this work could include additional research into the ways in which chatbots can be employed in crisis reporting, with a focus on their smooth incorporation in the journalistic workflow, with added features and inputs (textual and voice), so as to further assist users looking for specific information. Special attention should be given to chatbots’ ability to collect data from users, thus enabling them to be utilized in crowdsourcing schemes, which can be extremely valuable during crisis situations. However, in this case, the use of a method to prevent the spread of “fake news” (misinformation) would be necessary so as to ensure the reliability of the chatbot and the information disseminated. There exist a variety of techniques to monitor complex systems, in which the average behavior of the whole network is compared to particular nodes. One example is the use of deep learning to detect faults in systems through entropy measurement, as proposed by Martinez-Garcia et al. (2019) [34].
Finally, since this implementation is proposed as an interaction with the general public, interaction in natural language was not the first choice. Nevertheless, the incorporation of a news chatbot that would support natural language interaction is considered to be one of the future extensions of this study.