Open Access Article
Computers 2019, 8(1), 4; https://doi.org/10.3390/computers8010004

Sentiment Analysis of Lithuanian Texts Using Traditional and Deep Learning Approaches

1 Faculty of Informatics, Vytautas Magnus University, K. Donelaičio 58, 44248 Kaunas, Lithuania
2 Department of Software Engineering, Kaunas University of Technology, K. Donelaičio 73, 44249 Kaunas, Lithuania
3 Institute of Mathematics, Silesian University of Technology, Kaszubska 23, 44-100 Gliwice, Poland
* Author to whom correspondence should be addressed.
Received: 27 November 2018 / Revised: 21 December 2018 / Accepted: 24 December 2018 / Published: 1 January 2019

Abstract

We describe sentiment analysis experiments performed on a Lithuanian Internet comment dataset using traditional machine learning (Naïve Bayes Multinomial [NBM] and Support Vector Machine [SVM]) and deep learning (Long Short-Term Memory [LSTM] and Convolutional Neural Network [CNN]) approaches. The traditional machine learning techniques were used with features based on lexical, morphological, and character information. The deep learning approaches were applied on top of two types of word embeddings (Word2Vec continuous bag-of-words with negative sampling, and FastText). Both the traditional and the deep learning approaches had to solve the positive/negative/neutral sentiment classification task on the balanced and full versions of the dataset. The best deep learning result (an accuracy of 0.706) was achieved on the full dataset with CNN applied on top of the FastText embeddings, with emoticons replaced and diacritics eliminated. The traditional machine learning approaches achieved their best performance (an accuracy of 0.735) on the full dataset with the NBM method, with emoticons replaced, diacritics restored, and lemma unigrams as features. Although the traditional machine learning approaches outperformed the deep learning methods, deep learning demonstrated good results when applied to small datasets.
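The two preprocessing steps named in the abstract for the best deep learning configuration (replacing emoticons and eliminating diacritics) can be illustrated with a minimal sketch. It is not the authors' implementation: the emoticon list, the replacement tokens `emo_positive` and `emo_negative`, and the function names are all hypothetical, chosen only to show the idea with the Python standard library.

```python
import re
import unicodedata

# Hypothetical emoticon-to-token mapping (an assumption for illustration;
# the replacement list used in the paper is not given on this page).
EMOTICON_TOKENS = {
    ":)": "emo_positive",
    ":-)": "emo_positive",
    ":D": "emo_positive",
    ":(": "emo_negative",
    ":-(": "emo_negative",
}

# One regex that matches any listed emoticon, longest alternatives first.
_EMOTICON_RE = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_TOKENS, key=len, reverse=True))
)


def replace_emoticons(text: str) -> str:
    """Replace each known emoticon with its sentiment token, padded by spaces."""
    return _EMOTICON_RE.sub(lambda m: f" {EMOTICON_TOKENS[m.group(0)]} ", text)


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (e.g. 'ž' -> 'z')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def preprocess(text: str) -> str:
    """Replace emoticons, strip diacritics, lowercase, and collapse whitespace."""
    text = replace_emoticons(text)
    text = strip_diacritics(text)
    return " ".join(text.lower().split())


if __name__ == "__main__":
    print(preprocess("Labai gražu :)"))
```

Emoticons are replaced before lowercasing so that case-sensitive forms such as `:D` are still matched. Restoring diacritics, the setting reported for the best NBM result, is a harder problem that requires a language model or dictionary and is not covered by this sketch.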
Keywords: sentiment analysis; machine learning; deep learning; neural word embeddings; Internet comments; Lithuanian language
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Kapočiūtė-Dzikienė, J.; Damaševičius, R.; Woźniak, M. Sentiment Analysis of Lithuanian Texts Using Traditional and Deep Learning Approaches. Computers 2019, 8, 4.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Computers, EISSN 2073-431X, published by MDPI AG, Basel, Switzerland