Article | Open Access

21 October 2021

Emotion Classification in Spanish: Exploring the Hard Classes

Facultad de Ingeniería, Instituto de Computación, Universidad de la República, Julio Herrera y Reissig 565, Montevideo 11300, Uruguay
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
This article belongs to the Special Issue Sentiment Analysis and Affective Computing

Abstract

The study of affective language has had numerous developments in the Natural Language Processing area in recent years, but the focus has been predominantly on Sentiment Analysis, an expression usually used to refer to the classification of texts according to their polarity or valence (positive vs. negative). The study of emotions, such as joy, sadness, anger, surprise, among others, has been much less developed and has fewer resources, both for English and for other languages, such as Spanish. In this paper, we present the most relevant existing resources for the study of emotions, mainly for Spanish; we describe some heuristics for the union of two existing corpora of Spanish tweets; and based on some experiments for classification of tweets according to seven categories (anger, disgust, fear, joy, sadness, surprise, and others) we analyze the most problematic classes.

1. Introduction

The study of sentiments and emotions expressed in texts has been part of the Natural Language Processing field since the 1990s, with interest increasing over the last two decades due to the large amount of text available on the Internet [], especially messages on social networks, where large numbers of people give their opinion on all relevant topics and express their personal emotions. Within this area, work has been done mainly on the classification of texts by their polarity or valence (positive or negative, generally also distinguishing neutral texts); this problem is usually called Sentiment Analysis. The detection of more complex emotions, on the other hand, is a problem of greater complexity, where a finer classification of texts is made, taking into account different emotions, such as joy, sadness, anger, and surprise, among others. Emotion classification requires more expensive resources than those usually needed for polarity classification, as it requires more examples to cover the full set of classes, more annotators, and more attention to differentiate a varied number of classes and achieve reliable inter-annotator agreement.
This effort is worthwhile as it could help in the detection of different problems that people experience and frequently express in social networks. Situations of harassment, signs of mental health problems, hate speech, and many other situations, can be detected by analyzing the affective content of texts. Furthermore, the study of emotions can be a tool for decision making in the political or business environment, where knowing the opinions and feelings of citizens or users can be very useful.
In order to face the problem of detecting emotions in texts, it is necessary to define the set of classes to be considered, and for this purpose work has been done on the basis of psychological studies. Several studies have worked on the scheme of three dimensions (valence, arousal, and dominance) presented by Wundt [], carrying out different experiments that support this approach [,,]. Other authors have defined sets of basic emotions, such as the set of six classes proposed by Ekman []: anger, disgust, fear, happiness, sadness, and surprise; or Plutchik’s set of eight emotions []: anger, fear, sadness, disgust, surprise, anticipation, trust, and joy. These theoretical proposals have led to the creation of different emotion lexicons (see Section 2).
Although having specific lexicons can help with emotion detection, the main resources for text classification currently are annotated datasets for training machine learning models. Large datasets can be used with deep neural network approaches, usually without the need to define attributes such as emotion word counts using a lexicon. Creating these resources is costly and imprecise, since emotions must be interpreted, which places us in a clearly subjective field. Nonetheless, corpus annotation tasks have been carried out on the basis of different sets of emotions, using different types of texts and different annotation schemes, e.g., single-label or multi-label classification.
In this paper, we present a revision of the resources available for emotion analysis, focusing on resources for the Spanish language, and previous work on automatic emotion classification. We also describe a new dataset built by merging two existing emotion datasets for Spanish and then we present some experiments performed on the new dataset, taking as a starting point the systems we sent to the EmoEvalEs task at IberLEF 2021 [,]. Finally, we analyze the most problematic classes.
The paper is organized as follows. In Section 2, we present the related work. In Section 3, we describe the materials (the new corpus) and the experiments on automatic detection of emotions. In Section 4, we show and analyze the results of the experiments. Finally, in Section 5, we present the conclusions of the work.

3. Materials and Methods

Next, we describe a new emotion corpus, built by merging the EmoEvent corpus and the corpus used in the subtask on emotion classification at SemEval-2018 Task 1. Then, we present some experiments on emotion classification using the expanded corpus for training. For development and evaluation we used the original corpora from EmoEvalEs, in order to make some comparisons with previous results.

3.1. Corpora

We worked on the expansion of the EmoEvent corpus, which was used in the TASS-2020 and EmoEvalEs-2021 tasks (https://competitions.codalab.org/competitions/28682, accessed on 17 October 2021), by adapting the corpus used in the subtask on emotion classification (E-c) at SemEval-2018 Task 1 (https://competitions.codalab.org/competitions/17751, accessed on 17 October 2021). Table 1 shows the statistics of the EmoEvent dataset, taking the partition in train/development/test used in the EmoEvalEs task. The corpus contains 8223 tweets: 5723 for training, 844 for development, and 1656 for testing.
Table 1. Number of tweets per category in the EmoEvent dataset.
In order to compare the results with those published in the proceedings of EmoEvalEs, we kept the original EmoEvalEs development and test corpora and worked on extending only the training corpus. Since the corpora of both tasks are annotated based on different schemes, different options were tested for the creation of the extended corpus.
On the one hand, the sets of categories are not the same: EmoEvent uses Ekman’s six classes, while SemEval uses a set of eleven categories, which includes Ekman’s six. We tried two different approaches: the first was to keep only the six EmoEvent categories and discard the remaining ones; the second was to assign the label others to the five extra categories. As already stated, this class was used in EmoEvent to annotate tweets that received different emotions from different annotators, or neutral tweets, so this decision appears consistent.
On the other hand, the annotation of the two corpora differs in the number of classes per tweet. In the EmoEvent corpus each tweet is assigned a single category, while in the SemEval corpus multi-label tagging was performed. Again, two different approaches were tested: filtering only tweets in the SemEval corpus that have a single label or duplicating tweets with multiple labels, generating an instance for each value.
After testing these different options, we decided to keep the tweets that do not have emotions included in the six-value schema, assigning them the category others, and to create multiple instances of the tweets that have more than one class from the six-value schema (one instance for each class assigned to the tweet).
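The merging heuristic we settled on can be sketched as follows. This is a minimal illustration of the mapping, not the actual data-loading code; field layouts and the toy examples are assumptions.

```python
# Sketch of the corpus-merging heuristic: SemEval multi-label tweets
# are adapted to the EmoEvent single-label, six-class (plus others)
# scheme. Tweets with no Ekman label go to `others`; tweets with
# several Ekman labels are duplicated, one instance per label.

EKMAN = {"anger", "disgust", "fear", "joy", "sadness", "surprise"}

def adapt_semeval(tweets):
    """`tweets` is a list of (text, set_of_labels) pairs; returns a
    list of (text, single_label) training instances."""
    adapted = []
    for text, labels in tweets:
        ekman_labels = labels & EKMAN
        if not ekman_labels:
            # Only non-Ekman emotions (or none): file under `others`.
            adapted.append((text, "others"))
        else:
            # One training instance per Ekman label on the tweet.
            for label in sorted(ekman_labels):
                adapted.append((text, label))
    return adapted

examples = [
    ("tweet A", {"joy", "optimism"}),   # keeps only `joy`
    ("tweet B", {"trust"}),             # non-Ekman only: becomes `others`
    ("tweet C", {"anger", "disgust"}),  # duplicated into two instances
]
print(adapt_semeval(examples))
```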
We also tried to extend the corpus by translating the English and Arabic datasets from SemEval, and applying back-translation for data augmentation (using the huggingface MarianMT models (https://huggingface.co/transformers/model_doc/marian.html, accessed on 17 October 2021)), but the experiments performed with these versions of the training corpus always performed worse than using just the Spanish set from SemEval.
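As a rough sketch of the back-translation step, the round-trip logic can be separated from the translation backends, so the pipeline shape is visible without downloading any checkpoints. The MarianMT checkpoint names below are the standard Helsinki-NLP ones on the huggingface hub; the overall wiring is our assumption, not the authors' exact code.

```python
def back_translate(texts, es_to_en, en_to_es):
    """Round-trip Spanish texts through English and back, yielding
    paraphrases that can be added to the training corpus."""
    return en_to_es(es_to_en(texts))

def marian_backend(model_name):
    """Build a translation function from a MarianMT checkpoint,
    e.g. "Helsinki-NLP/opus-mt-es-en" or "Helsinki-NLP/opus-mt-en-es".
    Requires the `transformers` library and a model download."""
    from transformers import MarianMTModel, MarianTokenizer
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    def translate(texts):
        batch = tokenizer(texts, return_tensors="pt", padding=True)
        generated = model.generate(**batch)
        return [tokenizer.decode(g, skip_special_tokens=True) for g in generated]
    return translate

# With identity backends (no models needed) the round trip is a no-op:
print(back_translate(["hola mundo"], lambda t: t, lambda t: t))
```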
Table 2 shows the final EmoEvent + SemEval training corpus statistics.
Table 2. Number of tweets per category in the expanded training corpus (EmoEvent + SemEval).
A particular problem with these corpora, particularly the EmoEvent corpus, is the significant imbalance between classes. Two classes stand out as particularly sparse: surprise and fear. Although these two classes are better represented after merging both corpora, the imbalance is still significant.
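The degree of imbalance can be quantified with sklearn's balanced class-weight formula (n_samples / (n_classes * count)), shown here purely as an illustration; the counts below are invented, and the paper does not report applying class weighting.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy label distribution mimicking the imbalance described above
# (counts are illustrative, not the real corpus statistics).
labels = (["others"] * 60 + ["joy"] * 20 + ["sadness"] * 10 +
          ["anger"] * 6 + ["fear"] * 2 + ["surprise"] * 2)
classes = np.unique(labels)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=labels)
# Sparse classes receive much larger weights than the majority class.
print(dict(zip(classes, np.round(weights, 2))))
```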

3.2. Preprocessing

First of all we removed URLs that could be present in the tweets. This was the only preprocessing done to the text when using the tweet for generating a BERT-based encoding, as we used a case-sensitive model. On the other hand, when using the tweet for extracting word embeddings or detecting occurrences of lexicon words, we removed all special characters (particularly the hashtag symbols) and kept only letters and numbers, then we converted all the text to lowercase and split tokens based on blank spaces.
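The two preprocessing paths can be sketched as follows. This is a minimal interpretation of the steps above; the exact regular expressions are our assumptions.

```python
import re

URL_RE = re.compile(r"https?://\S+")

def preprocess_for_bert(text):
    """Minimal cleaning for the cased BERT encoder: drop URLs only."""
    return URL_RE.sub("", text).strip()

def preprocess_for_bow(text):
    """Cleaning for word embeddings / lexicon lookup: drop URLs and
    special characters (hashtag symbols included), keep only letters
    and digits, lowercase, and split on blank spaces."""
    text = URL_RE.sub("", text)
    text = re.sub(r"[^\w\s]", " ", text)  # \w covers accented letters in Python 3
    return text.lower().split()

tweet = "Guardaré tu última #mirada! https://t.co/abc"
print(preprocess_for_bert(tweet))  # 'Guardaré tu última #mirada!'
print(preprocess_for_bow(tweet))   # ['guardaré', 'tu', 'última', 'mirada']
```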
EmoEvalEs [] reported replacing hashtag mentions with a single token “HASHTAG” in the tweets, and also replacing user mentions with a “USER” token. We noticed that the user mentions were replaced, but not the hashtags, and we decided to keep it that way as the hashtag text might convey important information about tweet topics, which could influence the emotions.

3.3. Experiments

Based on the experiments we carried out for our participation in EmoEvalEs, we tested different variants of the merged corpus with the best model presented in the task. This model consists of an LSTM network that takes as input information generated by BETO for each tweet. The information we take from BETO is the encoding of the CLS token and the centroid of the encodings of all tokens. This combination gave us better results than using only the CLS embedding.
In addition, we added features to indicate whether the tweet contains any words from a list of relevant words obtained from the training corpus. These are the most relevant words for discriminating between the different classes in the training corpus, generated by the k_best function, which uses the ANOVA F-value method. We tested different values of k, choosing the ones that gave the best results with a simple baseline using an SVM model with bag-of-words features, and then tested them with the LSTM. While large values of k give good results for the baseline, with 2000 being the best value we found, for the LSTM model the best value turned out to be 30.
Besides these relevant word features, we tried including features for each one of the words in the NRC lexicon adapted to Spanish (https://saifmohammad.com/WebPages/AccessResource.htm, accessed on 17 October 2021). However, the experiments using this extended set of features from the lexicon did not result in any improvement.
In our experiments we did not use the extra information about the event to which the tweet belongs or whether it is offensive or not. This is information that is not available in new instances to be classified, so it does not make sense to rely on it. There is also no such information in the SemEval corpus with which we extended the EmoEvent corpus.
The final LSTM model uses a single bi-directional LSTM layer of size 96 and a dense layer of size 64 (with tanh activation), the output of the LSTM is concatenated with the BERT features and features for the 30 best words.
For all the experiments carried out with SVM, mentioned in later sections, we used the default parameter configuration of the SVC class of the sklearn library (https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html, accessed on 17 October 2021).

4. Results and Discussion

Table 3 shows the results on the development and test corpora from EmoEvalEs of the neural model trained with the EmoEvent corpus, on the one hand, and with the EmoEvent + SemEval corpus, on the other hand. The metrics we used for evaluation are Accuracy (Acc) and Weighted F1 (W-F1).
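For reference, both metrics as computed by sklearn on a toy set of labels (invented, not taken from the corpus):

```python
from sklearn.metrics import accuracy_score, f1_score

# Accuracy is the fraction of exact matches; Weighted F1 averages the
# per-class F1 scores weighted by each class's support in y_true.
y_true = ["joy", "joy", "anger", "others", "fear", "others"]
y_pred = ["joy", "others", "anger", "others", "others", "others"]

acc = accuracy_score(y_true, y_pred)
wf1 = f1_score(y_true, y_pred, average="weighted")
print(round(acc, 3), round(wf1, 3))  # 0.667 0.611
```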
Table 3. Results on the development and test sets of the neural model trained with EmoEvent and with EmoEvent + SemEval.
Table 4 shows the results of our current best model, which uses data from EmoEvent and the Spanish set of SemEval data, compared to the best and worst systems in EmoEvalEs. We also include our own submission to EmoEvalEs for comparison.
Table 4. Comparison of the best, worst, and our system (originally described in []) in the EmoEvalEs competition, together with the result of the new model trained with data from EmoEvent and SemEval. Metrics: Accuracy (Acc), Weighted Precision (W-P), Weighted Recall (W-R), and Weighted F1 (W-F1).
The results seem to indicate a slight improvement training with the extended train corpus, over training only with the EmoEvent corpus, but, as the confidence intervals are not completely separate, further experiments would have to be performed to confirm this. This result is of particular interest since the new corpus contains tweets on different topics, not only tweets on some specific events, as is the case with EmoEvent. The combination of the two corpora could have had a negative effect on the results, compared to the training with the original corpus, since the test corpus contains exclusively tweets related to the events selected for building the EmoEvent corpus.
Looking at the confusion matrix of the model trained with the EmoEvent + SemEval corpus, shown in Figure 1, we see that almost all classes tend to be confused with the class others. Due to the way this class was generated [], it is expected that many of these tweets express some emotion, since tweets that received different emotions by different annotators were assigned to the others category. It is not a category representing tweets without emotion, but tweets with some emotion or a mixture of several ones, and probably also neutral tweets. Something similar happened with the neutral class of the dataset for sentiment analysis of the TASS task, where tweets with a neutral polarity and also tweets with mixed polarity, i.e., with both positive and negative nuances, could be found. In [], we discuss this problem and show a detailed analysis of tweets belonging to that category.
Figure 1. Confusion matrix over the test set for the best model. Rows show expected values and columns show predicted values.
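A matrix with the same row/column convention as Figure 1 can be produced with sklearn; the labels and predictions below are invented placeholders.

```python
from sklearn.metrics import confusion_matrix

# Rows are expected (true) classes, columns are predicted classes,
# following the convention used in Figure 1.
labels = ["anger", "joy", "others"]
y_true = ["joy", "joy", "others", "anger", "others", "joy"]
y_pred = ["others", "joy", "others", "others", "others", "joy"]
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)
```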
This can be seen graphically in Figure 2 as well. In this diagram, the crossing arcs represent cases in which the classifier is wrong, and the bumps inside categories represent cases in which the classifier is right. We can see that the others category is the most numerous, and approximately a quarter of its expected values are classified as other categories. The joy class, on the other hand, has close to half of its examples classified as others. Note that the categories surprise, fear, and anger have almost no visible bump, as almost no example of these categories is correctly classified. Furthermore, we can see that the disgust class is confused with the anger and others categories in similar proportions.
Figure 2. Chord diagram representing the confusion matrix for the best model. An incoming arc from class X to Y represents that class Y was expected, but X was predicted.
An experiment performed with a new version of the corpus excluding the others category allows us to see how the remaining categories are better classified. In Table 5, we show the results for each category using both versions of the corpus: with and without the others class.
Table 5. Results on the test set training with the EmoEvent + SemEval corpus, with and without the others class.
As can be seen, the overall measures improve significantly. On the test corpus, Accuracy rises 5.87 points and Weighted F1 rises 5.50 points, the increase being even greater in the fear (+22) and joy (+25) categories. However, the most problematic class (disgust) does not improve.
Some examples of others tweets from the training corpus show the diversity we can find in this class:
  • A sad tweet: Guardaré en mis ojos tu última mirada… #notredame #paris #francia #photography #streetphotography;
  • A clearly positive tweet, that could even have been annotated as joy: Que clase táctica están dando estos dos Equipos… bendita #ChampionsLeague;
  • An informative tweet, with no emotion: El escrutinio en el Senado va mucho más lento. Solo el 14.85% del voto escrutado #28A #ElecccionesGenerales28A.
Besides the others class, we carried out some more experiments on the three most difficult classes to detect (disgust, fear, and surprise), to try to understand what makes these categories so difficult. One interesting experiment is analyzing which words are the most relevant for each class, i.e., the ones that would let us identify a tweet of one of these categories with the highest confidence. To do this, we trained several variants of SVM classifiers using different lists of BOW features found with the ANOVA F-value method. These classifiers only try to distinguish one class from all the rest, for example: disgust vs. no-disgust, or fear vs. no-fear.
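A one-vs-rest probe of this kind can be sketched as a short sklearn pipeline. The data and the value of k below are toy-sized placeholders; the pipeline composition (vectorizer, ANOVA feature selection, default SVC) is our interpretation of the setup described above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def one_vs_rest_probe(texts, labels, target, k):
    """Relabel everything that is not `target` as "no-<target>", then
    train an SVM on the k best ANOVA-selected BOW features."""
    binary = [l if l == target else f"no-{target}" for l in labels]
    clf = make_pipeline(CountVectorizer(),
                        SelectKBest(f_classif, k=k),
                        SVC())  # sklearn defaults, as in the paper
    clf.fit(texts, binary)
    return clf

texts = ["tengo miedo hoy", "miedo y pánico", "susto enorme",
         "alegría hoy", "alegría y paz", "contenta hoy"]
labels = ["fear", "fear", "fear", "joy", "joy", "joy"]
probe = one_vs_rest_probe(texts, labels, "fear", k=2)
print(probe.predict(["tengo miedo"]))
```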

4.1. The Disgust Class

Regarding the disgust class, our best neural models are not able to correctly spot even a single tweet from this class. Some examples from this class are:
  • Tweet transmitting a very low level of dislike: Me cuesta mucho entender la fiesta de Ciudadanos…#ElecccionesGenerales28A;
  • Tweet transmitting a very high level of dislike: Caterva de hijueputas venezolanos que le hacen juego al pilche golpe. Háganse los valientes en #Venezuela y no jodan en Ecuador. Dan asco….;
  • Informative tweet: Los gobiernos de #Argentina #Brasil #Canada #Chile #Colombia #CostaRica #Guatemala #Honduras #Panamá #Paraguay #Peru y #Venezuela, miembros del #GrupoDeLima, “conminan a USER a cesar la usurpación, para que pueda empezar la transición democrática” en #Venezuela;
  • Tweet that could have been annotated as anger: Si no fuésemos estúpidos/as, los gestores de nuestro sistema alimentario estarían en prisión, por actos criminales contra la naturaleza y la salud pública. #ExtinctionRebellion #GretaThunberg.
As we can see, there is a great disparity of opinions inside this category, many of them convey different levels of dislike, but some could perhaps be best represented as other categories. This is consistent with the fact that it was one of the categories with the lowest inter-annotator agreement [], which might indicate the difficulty to properly characterize this category.
As the confusion matrix in Figure 1 shows, this class is frequently confused with the anger class. This is to be expected, because these are two classes with strong negative content; they could even be seen as two degrees of the same emotion.
Our SVM analysis showed that, even using the 200 best words as features, it is only possible to achieve 0.1 F1 on the development set, and it still gets an absolute 0.0 F1 on the test set. This might indicate that the vocabulary overlap between those sets is rather small, which renders the problem even harder.

4.2. The Fear Class

When analyzing the fear class, we realized that a limited number of words occur in many tweets: miedo, ansiedad, temor, among others. We performed some experiments with an SVM classifier using BOW features, selecting the best words provided by the k_best function and replacing all the classes but fear with the tag no-fear in the corpus. We found that with just 20 words we could reach good results: an F1 of 0.60 on the development corpus and 0.45 on the test corpus. The 20 words used as features are:
  • Ansiedad;
  • Ansiosa;
  • Ansioso;
  • Asustada;
  • Asustado;
  • Asustar;
  • Co;
  • https;
  • Miedo;
  • Nerviosa;
  • Nervioso;
  • Peligroso;
  • Pesadilla;
  • Preocupación;
  • Preocupada;
  • Pánico;
  • Susto;
  • Temblor;
  • Temor;
  • Terror.
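The word list above can be turned directly into binary presence features, one per fear word. A minimal sketch (the word list is reproduced from above, lowercased; the example tweet is invented):

```python
# One binary feature per fear word: present in the tweet or not.
FEAR_WORDS = {"ansiedad", "ansiosa", "ansioso", "asustada", "asustado",
              "asustar", "co", "https", "miedo", "nerviosa", "nervioso",
              "peligroso", "pesadilla", "preocupación", "preocupada",
              "pánico", "susto", "temblor", "temor", "terror"}

def fear_features(tokens):
    """Map a lowercased token list to a dict of 20 boolean features."""
    present = set(tokens) & FEAR_WORDS
    return {word: (word in present) for word in sorted(FEAR_WORDS)}

feats = fear_features("qué miedo y ansiedad me da esto".split())
print([w for w, on in feats.items() if on])  # ['ansiedad', 'miedo']
```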

4.3. The Surprise Class

For the surprise class, the experiments with the SVM plus BOW features that gave the best results used the 200 words from the k_best function, reaching 0.23 for F1 both on the development and the test set.
Analyzing some tweets from this class, we observed that, more than the vocabulary, what many tweets have in common is the use of intensifying punctuation marks such as repeated exclamation or question marks (as can be seen in the first examples below). Another characteristic of this class is the use of explicit references to a surprise or to being surprised, or particular idioms like “sin palabras” (speechless).
  • Liverpool está paseando al barcelona, hace tiempo no lo veía tan presionado al barca…!! #ChampionsLeague;
  • Tremendo liderazgo de #GretaThunberg!! Tiene 16 años y nos está haciendo a TODOS mirar el mundo con otros ojos! Entremos en pánico, salvemos el planeta!! #CambioClimatico;
  • El Messi de hoy deslumbra! Que nivel #ChampionsLeague;
  • Primera vez que veo a Messi exagerar una falta. #ChampionsLeague;
  • El único episodio que me ha dejado sin palabras. #JuegoDeTronos #GameofThrones;
  • Menudo sorpresón final. Eso sí que no me lo esperaba. #JuegoDeTronos;
  • Lo que vino a sacarles #NotreDame, es TODA su amargura. A la madere, me tienen sorprendida.

5. Conclusions

Emotion detection and classification is a very hard task, but it has quickly gained traction in recent years. Although resources for this task, particularly for Spanish, are still scarce, we consider this situation might change in the future if the task keeps being developed for Spanish and for other languages. For the time being, the performance of emotion classification in Spanish still has a lot of room for improvement. We performed a series of experiments using different variants of the available data for classifying emotions in Spanish: the EmoEvent and SemEval tweet datasets, and an adaptation to Spanish of an English lexicon specialized in emotion detection.
The best model we found combines an LSTM neural network enriched with encoding features calculated with a Spanish BERT model (BETO) and features representing the most salient words from the corpus. This model is trained using the training set from EmoEvent, plus the Spanish set from the SemEval emotion classification challenge. It achieves an accuracy of 0.6860 and a weighted F1 score of 0.6620 over the EmoEvent test set. We compared our performance against our previous results and other results from the EmoEvalEs competition.
Despite our promising results, we are still a few points below the state of the art for this task in Spanish. Furthermore, even if our new model seems to show improvements for some of the least numerous classes (fear and surprise), it still cannot detect any tweets from the disgust class, and we must also consider that the confidence intervals for the new and previous models are not completely separate. We analyzed the three categories with the fewest examples (disgust, fear, and surprise) to try to understand why they behave that way. We conclude that the great disparity of negative emotions conveyed in the disgust tweets and the large variability of their vocabulary might be among the causes behind the difficulty of this class. We also analyzed the others class, which contains tweets with very diverse content, and observed that excluding these tweets from the corpus significantly improves the results.

Author Contributions

Conceptualization, A.R.; Formal analysis, A.R. and L.C.; Investigation, A.R. and L.C.; Software, L.C.; Writing—original draft, A.R. and L.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable, the study does not report any data.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
BERT: Bidirectional Encoder Representations from Transformers
BOW: Bag of Words
ANEW: Affective Norms for English Words
LSTM: Long Short-Term Memory
SEL: Spanish Emotion Lexicon
SVM: Support Vector Machine

References

  1. Liu, B. Sentiment Analysis. Mining Opinions, Sentiments, and Emotions, 2nd ed.; Cambridge University Press: New York, NY, USA, 2020. [Google Scholar]
  2. Wundt, W. Grundriss der Psychologie (Outlines of Psychology); Engelmann: Leipzig, Germany, 1874. [Google Scholar]
  3. Osgood, C.; Suci, G.; Tenenbaum, P. The Measurement of Meaning; University of Illinois Press: Urbana, IL, USA, 1957. [Google Scholar]
  4. Mehrabian, A.; Russell, J. An Approach to Environmental Psychology; The MIT Press: Cambridge, MA, USA, 1974. [Google Scholar]
  5. Bradley, M.M.; Lang, P.J. Measuring emotion: The self-assessment manikin and the semantic differential. J. Behav. Ther. Exp. Psychiatry 1994, 25, 49–59. [Google Scholar] [CrossRef]
  6. Ekman, P. Facial Expressions of Emotion: New Findings, New Questions. Psychol. Sci. 1992, 3, 34–38. [Google Scholar] [CrossRef]
  7. Plutchik, R. A psychoevolutionary theory of emotions. Soc. Sci. Inf. 1982, 21, 529–553. [Google Scholar] [CrossRef]
  8. Chiruzzo, L.; Rosá, A. RETUYT-InCo at EmoEvalEs 2021: Multiclass Emotion Classification in Spanish. In Proceedings of the Iberian Languages Evaluation Forum Co-Located with the XXXVII International Conference of the Spanish Society for Natural Language Processing (SEPLN 2021), Málaga, Spain, 21–24 September 2021. [Google Scholar]
  9. Plaza-del-Arco, F.M.; Jiménez-Zafra, S.M.; Montejo-Ráez, A.; Molina-González, M.D.; Ureña-López, L.A.; Martín-Valdivia, M.T. Overview of the EmoEvalEs task on emotion detection for Spanish at IberLEF 2021. Proces. Leng. Nat. 2021, 67, 155–161. [Google Scholar]
  10. Bradley, M.; Lang, P. Affective Norms for English Words (ANEW): Instruction Manual and Affective Ratings; Technical Report; Center for Research in Psychophysiology, University of Florida: Gainesville, FL, USA, 1999. [Google Scholar]
  11. Redondo, J.; Fraga, I.; Padrón, I.; Comesaña, M. The Spanish adaptation of ANEW (Affective Norms for English Words). Behav. Res. Methods 2007, 39, 600–605. [Google Scholar] [CrossRef] [PubMed]
  12. Warriner, A.B.; Kuperman, V.; Brysbaert, M. Norms of valence, arousal, and dominance for 13,915 English lemmas. Behav. Res. Methods 2013, 45, 1191–1207. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  13. Shaikh, S.; Cho, K.; Strzalkowski, T.; Feldman, L.; Lien, J.; Liu, T.; Broadwell, G.A. ANEW+: Automatic Expansion and Validation of Affective Norms of Words Lexicons in Multiple Languages. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), Portorož, Slovenia, 23–28 May 2016; pp. 1127–1132. [Google Scholar]
  14. Mohammad, S.; Turney, P.D. Crowdsourcing a Word-Emotion Association Lexicon. Comput. Intell. 2013, 29, 436–465. [Google Scholar] [CrossRef] [Green Version]
  15. Rangel, I.; Sidorov, G.; Guerra, S. Creación y evaluación de un diccionario marcado con emociones y ponderado para el español. Onomázein 2014, 29, 31–46. [Google Scholar] [CrossRef] [Green Version]
  16. Plaza-del-Arco, F.M.; Molina-González, M.D.; Jiménez-Zafra, S.M.; Martín-Valdivia, M.T. Lexicon Adaptation for Spanish Emotion Mining. Proces. Leng. Nat. 2018, 61, 117–124. [Google Scholar]
  17. Mohammad, S.M.; Bravo-Marquez, F.; Salameh, M.; Kiritchenko, S. SemEval-2018 Task 1: Affect in Tweets. In Proceedings of the International Workshop on Semantic Evaluation (SemEval-2018), New Orleans, LA, USA, 5–6 June 2018. [Google Scholar]
  18. Strapparava, C.; Mihalcea, R. SemEval-2007 Task 14: Affective Text. In Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007), Prague, Czech Republic, 23–24 June 2007; pp. 70–74. [Google Scholar]
  19. Mohammad, S.; Bravo-Marquez, F. Emotion Intensities in Tweets. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017), Vancouver, BC, Canada, 3–4 August 2017; pp. 65–77. [Google Scholar] [CrossRef] [Green Version]
  20. Mohammad, S.M.; Bravo-Marquez, F. WASSA-2017 Shared Task on Emotion Intensity. In Proceedings of the Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), Copenhagen, Denmark, 8 September 2017. [Google Scholar]
  21. Mohammad, S.; Kiritchenko, S. Understanding Emotions: A Dataset of Tweets to Study Interactions between Affect Categories. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC), Miyazaki, Japan, 7–12 May 2018. [Google Scholar]
  22. Plaza-del-Arco, F.; Strapparava, C.; Ureña-López, L.A.; Martín-Valdivia, M.T. EmoEvent: A Multilingual Emotion Corpus based on different Events. In Proceedings of the 12th Language Resources and Evaluation Conference, Marseille, France, 11–16 May 2020; pp. 1492–1498. [Google Scholar]
  23. García-Vega, M.; Díaz-Galiano, M.C.; García-Cumbreras, M.A.; del Arco, F.M.P.; Montejo-Ráez, A.; Jiménez-Zafra, S.M.; Martínez Cámara, E.; Aguilar, C.A.; Cabezudo, M.A.S.; Chiruzzo, L.; et al. Overview of TASS 2020: Introducing Emotion Detection. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020) Co-Located with 36th Conference of the Spanish Society for Natural Language Processing (SEPLN 2020), Málaga, Spain, 23 September 2020; pp. 163–170. [Google Scholar]
  24. Montes, M.; Rosso, P.; Gonzalo, J.; Aragón, E.; Agerri, R.; Álvarez-Carmona, M.Á.; Álvarez Mellado, E.; Carrillo-de Albornoz, J.; Chiruzzo, L.; Freitas, L.; et al. Iberian Languages Evaluation Forum 2021. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021), Málaga, Spain, 21–24 September 2021; Available online: http://ceur-ws.org/Vol-2943/ (accessed on 17 October 2021).
  25. Acheampong, F.A.; Wenyu, C.; Nunoo-Mensah, H. Text-based emotion detection: Advances, challenges, and opportunities. Eng. Rep. 2020, 2, e12189. [Google Scholar] [CrossRef]
  26. González, J.Á.; Hurtado, L.F.; Pla, F. ELiRF-UPV at SemEval-2018 Tasks 1 and 3: Affect and Irony Detection in Tweets. In Proceedings of the 12th International Workshop on Semantic Evaluation, New Orleans, LA, USA, 5–6 June 2018; pp. 565–569. [Google Scholar] [CrossRef]
  27. Ángel González, J.; Moncho, J.A.; Hurtado, L.F.; Pla, F. ELiRF-UPV at TASS 2020: TWilBERT for Sentiment Analysis and Emotion Detection in Spanish Tweets. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020) Co-Located with 36th Conference of the Spanish Society for Natural Language Processing (SEPLN 2020), Málaga, Spain, 23 September 2020; pp. 179–186. [Google Scholar]
  28. Gonzalez, J.A.; Hurtado, L.F.; Pla, F. TWilBert: Pre-trained deep bidirectional transformers for Spanish Twitter. Neurocomputing 2021, 426, 58–69. [Google Scholar] [CrossRef]
  29. Plaza-del-Arco, F.; Martín-Valdivia, M.T.; Ureña-López, L.A.; Mitkov, R. Improved emotion recognition in Spanish social media through incorporation of lexical knowledge. Future Gener. Comput. Syst. 2020, 110, 1000–1008. [Google Scholar] [CrossRef]
  30. Cañete, J.; Chaperon, G.; Fuentes, R.; Ho, J.H.; Kang, H.; Pérez, J. Spanish Pre-Trained BERT Model and Evaluation Data. PML4DC at ICLR 2020. 2020. Available online: https://users.dcc.uchile.cl/~jperez/papers/pml4dc2020.pdf (accessed on 17 October 2021).
  31. Vera, D.; Araque, O.; Iglesias, C.A. GSI-UPM at IberLEF2021: Emotion Analysis of Spanish Tweets by Fine-tuning the XLM-RoBERTa Language Model. In Proceedings of the Iberian Languages Evaluation Forum Co-Located with the XXXVII International Conference of the Spanish Society for Natural Language Processing (SEPLN 2021), Málaga, Spain, 21–24 September 2021. [Google Scholar]
  32. Tafreshi, S.; De Clercq, O.; Barriere, V.; Buechel, S.; Sedoc, J.; Balahur, A. WASSA 2021 Shared Task: Predicting Empathy and Emotion in Reaction to News Stories. In Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Online, 19 April 2021; pp. 92–104. [Google Scholar]
  33. Deng, J.; Ren, F. Multi-label Emotion Detection via Emotion-Specified Feature Extraction and Emotion Correlation Learning. IEEE Trans. Affect. Comput. 2020. [Google Scholar] [CrossRef]
  34. Ahmad, Z.; Jindal, R.; Ekbal, A.; Bhattachharyya, P. Borrow from rich cousin: Transfer learning for emotion detection using cross lingual embedding. Expert Syst. Appl. 2020, 139, 112851. [Google Scholar] [CrossRef]
  35. Abdul-Mageed, M.; Ungar, L. EmoNet: Fine-Grained Emotion Detection with Gated Recurrent Neural Networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, BC, Canada, 7–12 August 2017; pp. 718–728. [Google Scholar] [CrossRef] [Green Version]
  36. Colnerič, N.; Demšar, J. Emotion Recognition on Twitter: Comparative Study and Training a Unison Model. IEEE Trans. Affect. Comput. 2020, 11, 433–446. [Google Scholar] [CrossRef]
  37. Wadhawan, A.; Aggarwal, A. Towards Emotion Recognition in Hindi-English Code-Mixed Data: A Transformer Based Approach. In Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, Online, 19 April 2021; pp. 195–202. [Google Scholar]
  38. Chiruzzo, L.; Etcheverry, M.; Rosá, A. Sentiment analysis in Spanish tweets: Some experiments with focus on neutral tweets. Rev. Proces. Leng. Nat. 2020, 64, 109–116. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
