Analízate: Towards a Platform to Analyze Activities and Emotional States of Informal Caregivers

Abstract: An informal caregiver is exposed to an emotional overload, which can lead to high stress levels. To provide caregivers with awareness about their emotions, we propose an emotion-tracking platform based on Facebook status updates. For this purpose, we trained several classification models to assign emotions to Facebook status updates. Then, we developed a Facebook application that connected to each user's profile, gathered their status updates, classified their emotions, and generated a summary. We tested the application with 54 participants, finding the relative precision to be 84%, while the platform was perceived to be valuable and novel. Participants thought the application allowed them to think about their emotions and the information they share.


Introduction
Suffering a complex, life-threatening illness such as cancer means a long and emotionally exhausting process for patients and their families. An informal caregiver is usually a close family member or friend who supports the patient during treatment, and whose own life is seriously affected by this role [1]. Informal caregivers of cancer patients suffer from emotional burnout, which may undermine their mental, social, and physical health. Caregivers suffer more frequently from anxiety-depressive disorders, greater social isolation, a worsening economic situation, higher general morbidity, and even higher mortality [2]. Families caring for cancer patients face problems in the areas of time and logistics, physical tasks, financial costs, emotional burdens, mental health, and physical health [3]. The general well-being of caregivers should be monitored to avoid emotional burnout and its consequences. Strategies to improve caregivers' well-being include support from healthcare professionals, who may provide emotional support, knowledge, and practical problem-solving suggestions [4].
Social network sites, such as Facebook, grant caregivers a space of virtual support through which they can share how they feel and get feedback from their friends and families [5]. For example, Facebook breast cancer support groups have been found to promote awareness, fundraise, promote services and products, and provide support to caregivers and patients [6]. The Facebook timeline gives users a personal record of events, helping to store significant moments of a user's life [7] and making the platform a space where it is possible to share and communicate with others [8]. However, few tools exist on Facebook for users to review and reflect on the information they have shared in the past.
When a child is under cancer treatment, the mother is commonly the informal caregiver, i.e., the person in charge of providing the main support and care for the ill child [9]. Parents of children with cancer use their personal Facebook accounts to communicate, share information, and mobilize social support, and new supportive programs helpful to caregivers could be designed for the Facebook platform [5].
This research presents the development of a web application called Analízate (Spanish for "analyze yourself"), which reviews the status updates a user has published on Facebook and provides a visualization with feedback about the emotions expressed in them. The main goal was to design and develop a tool that could be integrated into a caregiver's Facebook account and used privately, processing all their shared data and evaluating the person's emotional journey regarding their child's illness. The system was trained using information posted on Facebook by informal caregivers of pediatric cancer patients, and it was preliminarily evaluated with general Facebook users to identify relevant insights before testing it with caregivers. This was done because parent caregivers are undergoing an extremely challenging period and are therefore difficult to reach. This study was approved by the University ethical committee, project number 160622003.
The specific goals of this research were to create an automatic classification of emotions, to create a Facebook application capable of getting and processing user statuses, and to conduct a preliminary evaluation of the platform.
The main research questions were the following:
• RQ1: Is an application to analyze emotions in Facebook status updates useful to Facebook users?
• RQ2: Does the application help users self-reflect about their emotions?
Note that these questions deliberately target a general population of users rather than caregivers because, as previously discussed, at this stage of the research we wanted to validate the platform before testing it specifically with caregivers.
This paper is organized as follows. First, we discuss related work, considering theories about emotions and their classification through learning algorithms and text recognition. Then, we describe the design and implementation of the Analízate platform. Section 4 describes our methodology and the experiments conducted to analyze the operation of the Facebook application, and Section 5 describes the results. Finally, Section 6 presents our conclusions and discusses possible avenues of future work.

Emotions and Social Network Sites
Social network sites are described as "web-based services that allow individuals to (1) construct a public or semi-public profile within a bounded system, (2) articulate a list of other users with whom they share a connection, and (3) view and traverse their list of connections and those made by others within the system" [10]. After this concept emerged, many different social network sites (SNS) with different operating models were created, from SixDegrees.com in 1997 to Facebook in 2004 and Twitter in 2006. SNS are among the online tools with the most users, in large part due to their high usability and easy access in moderately developed countries (excluding some places, for example Kenya, where using Facebook was still considered a luxury in 2013 [11]). SNS are used for several different activities, e.g., sharing photos, writing comments, and gaming.
This work focuses on Facebook, which is the most popular SNS in the world, with 2190 million active users [12]. This study was conducted in Chile, a country with 18 million inhabitants and the highest rate of internet presence in the Latin America region (77%). In Chile, 82% of digital activities are related to visiting SNS [13], and Facebook is the most used one, with 13 million users (71.4%) [14].
In psychology, an emotion is defined as "a complex state of feeling that results in physical and psychological changes that influence thought and behavior" [15]. A classic classification of emotions defined six basic emotions typically recognizable by humans: Anger, Disgust, Fear, Happiness, Sadness, and Surprise [16]. A broader classification of emotions was proposed in 1980, including 28 adjectives to describe human emotions [17]. A set of emotions specific to caregivers was also proposed, including Happiness, Sadness, Energy, Tiredness, Relaxation, Stress, Calm, and Anger [1].
The information that users share through SNS depends on their own life events, and there is a hidden emotional component in what users express across these platforms. For example, photos shared by couples through Facebook could be an indicator of the quality of their relationship: couples were found to share information about the relationship on days in which they were more satisfied with it [18]. Posting on Facebook could help reduce feelings of loneliness, because it makes users feel more connected and in touch with friends [19]. Receiving feedback, e.g., likes and comments, is also helpful [20]. When emotions are shared through Facebook, there is a phenomenon of contagion, in which emotions may be passed on to the user's contacts [21]. Some users share negative emotions on Facebook. One study found that depressive symptoms of college students could be identified from their Facebook status [22,23]. However, some users avoid sharing negative news with their contacts, e.g., parents of ill infants did not disclose information when the treatment was not going well, to avoid worrying others and receiving questions about the situation [24]. Problems and bad news are sometimes not shared through Facebook even though users may need the support, because they try to avoid being boring, being perceived as a negative person, and overwhelming others [25].

Emotion Classification
A Support Vector Machine (SVM) is a supervised learning model with an associated learning algorithm that analyzes data and recognizes patterns. Another technique, used to group elements, is the k-means algorithm, which partitions the entire dataset into k clusters of elements with similar features [26,27]. Both techniques have been applied to emotions in previous studies [28,29]. Natural language processing explores how computers can be used to understand, recognize, and manipulate natural language in text or speech. The algorithm used in this work combines SVM and clustering to reduce the amount of information needed for training.
Balahur et al. [30] presented a method to identify and classify the valence of emotions within text using a culture-dependent lexical database. This work stands out for identifying emotions in a language other than English. Similarly, Garcia and Alias [31] present a text-based emotion identification system with a language-dependent architecture, focusing their recognition on English and Spanish texts. Kolz et al. [32] describe a tool to identify emotions in the text of chat sessions in Spanish. Another study performed a supervised extraction of emotions from blog sentences using Ekman's emotion classification and SVM, pre-processing emoticons and controlling intensities, although it faced difficulties in handling metaphors and irony [28].
A study comparing several algorithms analyzed Ekman's six basic emotions in news headlines, finding that some algorithms were better for some cases and worse for others [29]. A study analyzing emotions in Facebook status updates applied binary labels (positive, negative) and four emotions (happy, unhappy, skeptical, playful), finding variability between apparently similar status updates [33]. Sentiment analysis tools have also been used, e.g., to analyze emotions (including polarity and intensity) on Twitter [34].

Design and Development of Analízate Platform
In this section we describe the development of the Analízate application. Users access the application through their Facebook account, which allows the application to collect all their Facebook status updates. These are pre-processed by the application and then sent to the emotion classifiers, which label them. With this information, the application generates the charts and assembles the final visualization. Figure 1 details the interaction between the main components of Analízate.
The methodology to develop this application had five steps: (1) First, 9 parent caregivers agreed to download all of their Facebook wall information and share it with the researchers. The downloaded information was then filtered to keep only their status updates, creating an initial database; (2) A processing tool was developed to normalize the status updates of different users; (3) Because training the platform required a set of labeled status updates, we developed a web application for manual emotion classification to populate the training database; (4) The models that automatically label Facebook status updates according to the emotions they represent were created; (5) Finally, the Facebook application was developed, which encompasses all the above steps, allowing users to access it through their Facebook account and get an automated review of their own emotional states.

Data and Text Processing
As mentioned previously, we obtained the Facebook status updates of 9 parent caregivers. Participant caregivers were all women aged 23 to 45 (average: 31), with one child under treatment for pediatric cancer in a public hospital in the southwest area of Santiago, Chile. Participants had been on Facebook for an average of 3 years; none of them currently worked outside their home, and all were dedicated full time to caregiving. To collect data, we met with the participants, explained the purpose of the study, and asked them to sign a consent form. The first step of the study involved the caregivers downloading a copy of their Facebook data and then sharing the files index.html and wall.html with the researchers. These files include status updates and captions associated with photos, videos, and links that the user shared on their Facebook timeline. The total number of Facebook activities was 13,650, of which 2055 (15%) were Facebook status updates. We removed photos, videos, and links from the data, leaving only the status updates written by the users (therefore, there was no re-posted material). As a result, we obtained a database with 1800 Facebook status updates. All participant data was anonymized, and it was not shared with other participants.
People communicate through social network sites using everyday language, common expressions, abbreviations, emoticons, and emojis, so status updates can vary in their style and format. To neutralize these differences, we created a Python script that used regular expressions to standardize the text. First, we found many instances of text written with a repeated letter, e.g., noooo, whaaaat, so exciiiiitinggg (originally: noooo, quéééé, que emocióóóóónnn). In Spanish, a vowel or consonant can be repeated consecutively at most twice (and the only consonants that can be doubled are l, n, c, r, and b) [35]. Based on this rule, our code eliminated repetitions that are not valid in the Spanish language. Second, we processed and replaced emoticons by looking for emoticons commonly used on Facebook and replacing them with their equivalent word, e.g., :-) was recognized and changed to happy.
Although some approaches [36] suggest that filtering out emoticons may cause information loss, our study does not use word roots to create a new lexicon; instead, it increases the semantic value of emoticons by replacing them with words.
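The two normalization steps described in this section can be illustrated with a minimal Python sketch. It is not the script used in the study: the emoticon table beyond `:-)`, the function name `normalize_status`, and the decision to collapse any run of three or more identical letters to a single letter are assumptions made for illustration.

```python
import re
import unicodedata

# Letters that, per the rule cited in the text, may appear doubled in Spanish:
# vowels plus the consonants l, n, c, r, and b.
DOUBLE_OK = set("aeiouáéíóú") | set("lncrb")

# Hypothetical emoticon table; the paper only gives ":-)" -> "happy".
EMOTICONS = {":-)": "happy", ":)": "happy", ":-(": "sad", ":(": "sad"}

# A letter (no digits or underscores) followed by one or more copies of itself.
_RUN = re.compile(r"([^\W\d_])\1+", re.IGNORECASE)
_EMOTICON = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICONS, key=len, reverse=True))
)


def _collapse(match):
    letter, run = match.group(1), match.group(0)
    # Keep legitimate doubles such as "ll" or "ee"; collapse everything else.
    if len(run) == 2 and letter.lower() in DOUBLE_OK:
        return run
    # Assumption: longer runs are elongations and reduce to one letter.
    return letter


def normalize_status(text):
    text = unicodedata.normalize("NFC", text)  # keep accented vowels as single code points
    text = _RUN.sub(_collapse, text)           # collapse letters before inserting words
    text = _EMOTICON.sub(lambda m: f" {EMOTICONS[m.group(0)]} ", text)
    return " ".join(text.split())              # tidy the whitespace


print(normalize_status("noooo :-)"))
```

Repeated letters are collapsed before emoticons are replaced so that inserted words such as "happy" are not themselves shortened by the repetition rule.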

Status Labeling
A large dataset is necessary to train an emotion classifier. Using the ad-hoc database with the Facebook status updates from caregivers, the next step was to manually label each status. For this, we created a web application (using Ruby on Rails) to label the dataset. Each status update was reviewed by two researchers, who labeled it with the following information: a ternary label (negative, neutral, positive), an emotion based on Ekman's classification (anger, disgust, fear, happiness, sadness, and surprise) [16] and its intensity (high, medium, low), and an emotion based on caregiver emotions (happiness, sadness, energy, tiredness, relaxation, stress, calm, and anger), along with an intensity (high, medium, low) [1]. We call this step the "Labeling Task", and Figure 2 presents a view of the main screen used to tag a status update.
Although Ekman's emotions are commonly used for facial expression recognition, and some authors argue that they are not universally identified by all people [37], our work was inspired by the study of Das et al. [28], who extracted Ekman's emotions from text, i.e., sentences in blogs.
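The information recorded for each status update in the Labeling Task can be pictured as a small record type. The sketch below is only an illustration in Python (the labeling tool itself was written in Ruby on Rails); the class name `StatusLabel` and its field names are hypothetical, while the allowed values are the ones listed in the text.

```python
from dataclasses import dataclass

SENTIMENTS = ("negative", "neutral", "positive")
EKMAN_EMOTIONS = ("anger", "disgust", "fear", "happiness", "sadness", "surprise")
CAREGIVER_EMOTIONS = ("happiness", "sadness", "energy", "tiredness",
                      "relaxation", "stress", "calm", "anger")
INTENSITIES = ("high", "medium", "low")


@dataclass(frozen=True)
class StatusLabel:
    """One researcher's labels for one status update (hypothetical schema)."""
    status_id: int
    annotator: str
    sentiment: str
    ekman_emotion: str
    ekman_intensity: str
    caregiver_emotion: str
    caregiver_intensity: str

    def __post_init__(self):
        # Reject any value outside the label sets described in the text.
        checks = (
            (self.sentiment, SENTIMENTS),
            (self.ekman_emotion, EKMAN_EMOTIONS),
            (self.ekman_intensity, INTENSITIES),
            (self.caregiver_emotion, CAREGIVER_EMOTIONS),
            (self.caregiver_intensity, INTENSITIES),
        )
        for value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"{value!r} is not one of {allowed}")


label = StatusLabel(1, "researcher_a", "positive", "happiness", "high", "calm", "low")
print(label)
```

Because each status update was reviewed by two researchers, a dataset built this way would hold two such records per status, one per annotator.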

Emotion Classifier
We used supervised training to classify the emotions in our dataset, using 80% of the previously labeled data for training and 20% for testing. Three different classifiers were implemented on the Azure Machine Learning platform: (1) sentiments (negative/positive/neutral), (2) emotions according to Ekman [16], and (3) emotions according to the caregiver classification [1]. Each classifier was built using the following process: marking and encoding of features, dataset separation, and feature selection through the Filter Based Feature Selection module of Azure, which identifies the most predictive features in the dataset [38]. Once the features were selected and marked, the model was trained. We trained four different models (decision forest, decision jungle, logistic regression, and neural network) and compared their scores, choosing the classifier for each task based on recall and precision. The classifiers chosen were logistic regression (for sentiments and caregiver emotions) and a neural network (for Ekman's emotions); we based the decision on test results, without applying cross-validation [38]. Although we asked labelers for the intensity of each emotion, we did not use this value for classification.
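The evaluation procedure described above (an 80/20 split followed by a comparison of models on precision and recall) can be sketched in plain Python. This is not the Azure Machine Learning pipeline used in the study: the function names are hypothetical, the predictions are toy values, and ranking models by the mean of macro-averaged precision and recall is an assumption, since the text does not state how the two measures were combined.

```python
import random


def split_80_20(rows, seed=0):
    """Shuffle the rows and return (training 80%, testing 20%)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * 0.8)
    return rows[:cut], rows[cut:]


def macro_precision_recall(y_true, y_pred):
    """Precision and recall averaged over every label that occurs."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for label in labels:
        hits = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = sum(t == label for t in y_true)
        precisions.append(hits / predicted if predicted else 0.0)
        recalls.append(hits / actual if actual else 0.0)
    return sum(precisions) / len(labels), sum(recalls) / len(labels)


def pick_best(predictions_by_model, y_true):
    """Return the model name with the highest mean of precision and recall."""
    def score(name):
        precision, recall = macro_precision_recall(y_true, predictions_by_model[name])
        return (precision + recall) / 2  # assumption: equal weight for both
    return max(predictions_by_model, key=score)


# Toy test-set labels and predictions, for illustration only.
y_true = ["positive", "negative", "positive", "neutral"]
predictions = {
    "logistic_regression": ["positive", "negative", "positive", "neutral"],
    "decision_forest": ["positive", "positive", "positive", "neutral"],
}
print(pick_best(predictions, y_true))
```

Since the selection relies on a single held-out test set rather than cross-validation, the ranking can be sensitive to how the split falls, which is worth keeping in mind with a dataset of this size.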

Facebook Application
To develop the Facebook application, we used Django, a Python-based web framework, together with JavaScript to connect to Facebook through its SDK. The visualization and front end were developed using HTML, CSS, and Google Charts, which generated the graphs that represent the levels of emotional information of each user (see Figure 3).

Methods
Inspired by a similar study [39], in which Facebook status updates were classified into two categories (positive/negative), we sought to evaluate the precision of our platform. Additionally, we wanted to evaluate the value and usefulness of the platform, as perceived by potential users.

Participants
The participants in this stage were regular Facebook users (not necessarily caregivers), who evaluated the Analízate application using their own Facebook data. There were 54 participants (30 men, 23 women, and 1 non-binary user). The average age was 24.7 years, with a standard deviation of 5.52 and a range from 15 to 47.

Evaluation
To invite participants to evaluate the application, we used snowball sampling on Facebook, starting from an open invitation from one researcher. Participants were added as Testers of Analízate so that they could access the application, since it was in development mode and not a fully released application. When users entered the application, they were asked to grant permission to gather their published posts, and then the application automatically read and analyzed the posts to show them the results about their shared emotions. Afterwards, participants received a link to an online questionnaire, registered their informed consent, and answered questions about the application. Each participant could use the application whenever they wanted and could quit the experiment at any time, although most of them only opened the application once. This study used a non-probabilistic survey with direct sample selection, and only those participants who completed the questionnaire were included.

1. Precision: To measure the precision of the Analízate platform, each participant reviewed their first 10 status updates and the labels suggested by the application (sentiment, Ekman, and caregiver emotions). For each status, the user had to mark whether it was correctly classified, partially correct, or incorrectly classified.

2. Perceived value of Analízate: To analyze the perceived value we used the "Intrinsic Motivation Inventory" (IMI) [40], a questionnaire of intrinsic motivation guided by Self-Determination Theory (SDT) [41]. It is a multidimensional measurement device intended to assess participants' subjective experience related to a target activity. We used the Value/Usefulness sub-scale to measure the platform's perceived value. There were 7 questions using a 7-point Likert scale. The questions addressed the value of the activity, whether it was useful for reflection, whether it allowed participants to recognize emotions, whether they would do it again (recurrence), whether they could identify their own emotions, and whether they felt the activity to be beneficial and important. One additional question was included to evaluate the novelty of the platform.

3. Self-reflection support: To understand the impact of the platform on supporting self-reflection, we asked participants how the platform could help users reflect about the data they publish on Facebook. The question asked was: Do you believe that Analízate helps you reflect about what you post on Facebook? If so, why? The answers, written as free-form text, were analyzed using thematic analysis [42] to identify common themes about the support for reflection that Analízate could provide.

Results
To analyze the data, we defined basic demographic parameters to process the results. Since the invitation to participate was open, we had a heterogeneous group of participants. Therefore, we grouped participants by age to distribute the sample homogeneously when measuring performance, without seeking statistical representativeness.
Results were grouped into four age categories. Table 1 describes the distribution of participants. We also separated answers by gender. We wanted to check for marked differences between the defined segments in order to provide relevant guidelines for future research. The results are presented below. Table 2 shows that the total precision of the platform reached 60%, while the relative precision was 84%. Relative precision was measured by adding the number of status updates reported as correctly classified and as partially correct. It therefore includes status updates that may have been correctly classified by some of the trained models but incorrectly classified by others. Table 3 also shows that, at a global level, the percentages of properly classified status updates are approximately the same across age and gender. It should be noted that Table 2 also shows a 0% incorrect status classification rate for participants in the age group 29 or over, which could be explained by a more traditional writing style that avoids "internet slang" and therefore makes text processing simpler.
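The difference between total and relative precision can be made concrete with a few lines of Python. The function name is hypothetical, and the counts are illustrative values per 100 status updates, chosen only so that the output matches the percentages reported above. They are not the raw counts of the study.

```python
def precision_summary(correct, partial, incorrect):
    """Total precision counts only fully correct labels;
    relative precision also counts partially correct ones."""
    total = correct + partial + incorrect
    if total == 0:
        raise ValueError("no status updates were reviewed")
    return {
        "total_precision": correct / total,
        "relative_precision": (correct + partial) / total,
    }


# Illustrative counts per 100 reviewed status updates (not study data).
summary = precision_summary(correct=60, partial=24, incorrect=16)
print(summary)
```

Under these assumed counts, the 24-point gap between the two measures corresponds to the status updates that participants marked as partially correct.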

Value and Usefulness Perception
Tables 4 and 5 reveal that, in general, the rating of the platform on a 1 to 7 scale is close to 5. We also identified that the younger and older groups considered Analízate a tool that helps them reflect about their emotions. At the same time, users considered the platform helpful for recognizing their own emotions. The younger and older participants were also very interested in using Analízate again. Finally, the assessment of the novelty of the platform was positive, with an average of 4.3 (on a 5-point Likert scale).

Self-Reflection Supported by Analízate
In terms of how the platform contributes to a self-reflective process, results show that 80% of participants consider that Analízate allows them to think about the emotions they share on Facebook.
To understand how Analízate works as a tool to support reflective processes, we conducted a thematic analysis of the answers to the optional open-ended questions in the questionnaire. The themes that emerged were:

3. Emotions perception: describes how our platform helps users understand and recognize their own emotions.
"I believe there are subtle emotions in what we write, and probably when I wrote it I did not realize or analyze it, but when I did a second reading or having external help it was possible to see" (p. 17)

Conclusions
In this work we developed a platform for emotional monitoring based on the information shared through SNS, particularly Facebook. To create this platform, called Analízate, we developed several tools. The first one, for data and text processing, pre-processes text in Spanish. The second tool was a web application designed for status labeling, which produced a database of tagged status updates. This database was used for training emotion classification models. Finally, we created a Facebook application that automatically reviews and classifies the data and provides an interface where users can review the emotions and sentiments present in their publications.
An application such as Analízate could potentially be used in contexts where people need support, as in the case of informal caregivers. Informal caregivers commonly present high levels of stress and emotional burden, given the intense demands of their role. Before evaluating this platform in a real context with informal caregivers, it was necessary to test it first with common users in everyday activities, which makes it possible to evaluate the utility and performance of the platform, as has been proposed in other studies that work in difficult contexts and with elderly people [43,44].
One key limitation of our work is that the platform was built for Spanish, so its scope of applicability is limited. The dataset created was small and relied heavily on manual tagging, so there is an implicit bias in the dataset. Also, new data regulations will make studies such as this one more difficult.
As future work, we would like to study in depth the behaviour and frequency of use of Facebook, to establish correlations between different characteristics of users and their emotions on Facebook. As Facebook is a platform that caregivers use [5] as a tool for support and information, our future work will include an extensive evaluation with this group. Based on feedback from participants, there is a need to complement the information that Analízate shows in order to clarify each category. We would also like to test the application in other contexts in which people need support, e.g., adolescents.