Search Results (19)

Search Parameters:
Keywords = movie-reviews-based classification

29 pages, 8931 KB  
Article
CGPA-UGRCA: A Novel Explainable AI Model for Sentiment Classification and Star Rating Using Nature-Inspired Optimization
by Amit Kumar Srivastava, Pooja, Musrrat Ali and Yonis Gulzar
Mathematics 2025, 13(22), 3645; https://doi.org/10.3390/math13223645 - 13 Nov 2025
Viewed by 445
Abstract
In recent years, social media-related sentiment classification has been researched extensively and is applied in various fields such as opinion mining, commodity feedback, and market analysis. Therefore, it is important to understand and analyse the opinions of the public, their feedback, and data related to social media. Consumers continue to face challenges in accessing review-based sentiment classification expressed by their peers, and the existing method does not provide satisfactory results. Hence, an innovative sentiment classification method, the Convoluted Graph Pyramid Attention (CGPA) model, combined with the Updated Greater Cane Rat Algorithm (UGCRA), is proposed. This method improves sentiment classification by optimizing accuracy and efficiency while addressing inherent uncertainties, allowing for precise sentiment intensity evaluation across multiple dimensions. Explainable Artificial Intelligence (XAI) techniques, particularly SHapley Additive exPlanations (SHAPs), enhance the model’s transparency and interpretability. This approach enables the final ranking of classified reviews, predicts ratings on a scale of one to five stars, and generates a recommendation list based on the predicted user ratings. Comparison between other traditional existing methods and the result indicates that the proposed method achieves superior performance. From the experimental results, the proposed approach achieves an accuracy of 99.5% in the Restaurant Review dataset, 99.8% in the Edmund Consumer Car Ratings Reviews dataset, 99.9% in the Flipkart Cell Phone Reviews dataset, and 99.7% in the IMDB Movie database, showing its effectiveness in analysing sentiments with an increase in performance. Full article

11 pages, 463 KB  
Proceeding Paper
A Deep Convolutional Neural Network-Based Model for Aspect and Polarity Classification in Hausa Movie Reviews
by Umar Ibrahim, Abubakar Yakubu Zandam, Fatima Muhammad Adam, Aminu Musa, Mohamed Hassan, Mohamed Hamada and Muhammad Shamsu Usman
Eng. Proc. 2025, 107(1), 21; https://doi.org/10.3390/engproc2025107021 - 26 Aug 2025
Viewed by 2949
Abstract
Aspect-based sentiment analysis (ABSA) plays a pivotal role in understanding the nuances of sentiment expressed in text, particularly in the context of diverse languages and cultures. This paper presents a novel deep convolutional neural network (CNN)-based model tailored for aspect and polarity classification in Hausa movie reviews, as Hausa is an underrepresented language with limited resources and presence in sentiment analysis research. One of the primary implications of this work is the creation of a comprehensive Hausa ABSA dataset, which addresses a significant gap in the availability of resources for sentiment analysis in underrepresented languages. This dataset fosters a more inclusive sentiment analysis landscape and advances research in languages with limited resources. The collected dataset was first preprocessed using Sci-Kit Learn to perform TF-IDF transformation for extracting feature word vector weights. Aspect-level feature ontology words within the analyzed text were derived, and the sentiment of the reviewed texts was manually annotated. The proposed model combines convolutional neural networks (CNNs) with an attention mechanism to aid aspect word prediction. The model utilizes sentences from the corpus and feature words as vector inputs to enhance prediction accuracy. The proposed model leverages the advantages of the convolutional and attention layers to extract contextual information and sentiment polarities from Hausa movie reviews. The performance demonstrates the applicability of such models to underrepresented languages. With 91% accuracy on aspect term extraction and 92% on sentiment polarity classification, the model excels in aspect identification and sentiment analysis, offering insights into specific aspects of interest and their associated sentiments. The proposed model outperformed traditional machine models in both aspect word and polarity prediction. 
Through the creation of the Hausa ABSA dataset and the development of an effective model, this study makes significant advances in ABSA research. It has wide-ranging implications for the sentiment analysis field in the context of underrepresented languages.
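The TF-IDF weighting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook formulation (term frequency times the log of inverse document frequency), not the exact scikit-learn variant the authors used, which applies smoothing and normalization by default. The function name `tf_idf` and the English toy reviews are hypothetical placeholders, not data from the Hausa dataset.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy reviews (placeholders, not from the paper's corpus).
reviews = [
    "good story good acting".split(),
    "weak story".split(),
]
weights = tf_idf(reviews)
```

In this sketch a term that occurs in every document, such as "story", receives a weight of zero, while terms concentrated in one review receive positive weights.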

18 pages, 1983 KB  
Proceeding Paper
HauBERT: A Transformer Model for Aspect-Based Sentiment Analysis of Hausa-Language Movie Reviews
by Aminu Musa, Fatima Muhammad Adam, Umar Ibrahim and Abubakar Yakubu Zandam
Eng. Proc. 2025, 87(1), 43; https://doi.org/10.3390/engproc2025087043 - 9 Apr 2025
Cited by 1 | Viewed by 1987
Abstract
In this study, we present a groundbreaking approach to aspect-based sentiment analysis (ABSA) using transformer-based models. ABSA is essential for understanding the intricate nuances of sentiment expressed in text, particularly across diverse linguistic and cultural contexts. Focusing on movie reviews in Hausa, a language under-represented in sentiment analysis research, we propose HauBERT, a bidirectional transformer-based approach tailored for aspect and polarity classification, by fine-tuning a pre-trained mBERT model. Our work addresses the scarcity of resources for sentiment analysis in under-represented languages by creating a comprehensive Hausa ABSA dataset. Leveraging this dataset, we preprocess the text using state-of-the-art techniques for feature extraction, enhancing the model’s ability to capture nuanced aspects of sentiment. Furthermore, we manually annotate aspect-level feature ontology words and sentiment polarity assignments within the reviewed text, enriching the dataset with valuable semantic information. Our proposed transformer-based model utilizes self-attention mechanisms to capture long-range dependencies and contextual information, enabling it to effectively analyze sentiment in Hausa movie reviews. The proposed model achieves significant accuracy in aspect term extraction and sentiment polarity classification, with scores of 99% and 92% respectively, outperforming traditional machine models. This demonstrates the transformer’s ability to capture complex linguistic patterns and nuances of sentiment. Our study advances ABSA research and contributes to a more inclusive sentiment analysis landscape by providing resources and models tailored for under-represented languages. Full article
(This article belongs to the Proceedings of The 5th International Electronic Conference on Applied Sciences)

28 pages, 4947 KB  
Article
The Detection of Spurious Correlations in Public Bidding and Contract Descriptions Using Explainable Artificial Intelligence and Unsupervised Learning
by Hélcio de Abreu Soares, Raimundo Santos Moura, Vinícius Ponte Machado, Anselmo Paiva, Weslley Lima and Rodrigo Veras
Electronics 2025, 14(7), 1251; https://doi.org/10.3390/electronics14071251 - 22 Mar 2025
Cited by 1 | Viewed by 2083
Abstract
Artificial Intelligence (AI) models, including deep learning and rule-based approaches, often function as black boxes, limiting transparency and increasing uncertainty in decisions. This study addresses spurious correlations, defined as associations between patterns and classes that do not reflect causal relationships, affecting AI models’ reliability and applicability. In Natural Language Processing (NLP), these correlations lead to inaccurate predictions, biases, and challenges in model generalization. We propose a method that employs Explainable Artificial Intelligence (XAI) techniques to detect spurious patterns in textual datasets for binary classification tasks. The method applies the K-means algorithm to cluster patterns and interprets them based on their distance from centroids. It hypothesizes that patterns farther from the centroids are more likely to be spurious than those closer to them. We apply the method to public procurement datasets from the Court of Auditors of Piauí (TCE-PI) using models based on Support Vector Machine (SVM) and Logistic Regression with text representations from TFIDF and Word Embeddings, as well as a BERT model. The analysis is extended to the IMDB movie review dataset to evaluate generalizability. The results support the hypothesis that patterns farther from centroids exhibit higher spuriousness potential and demonstrate the clustering’s consistency across models and datasets. The method operates independently of the techniques used in its stages, enabling the automatic detection and quantification of spurious patterns without prior human intervention. Full article
(This article belongs to the Special Issue Advanced Natural Language Processing Technology and Applications)
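The distance-from-centroid hypothesis described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the clustering is taken as already done, the pattern vectors and centroids are hypothetical 2-D values, and `rank_by_centroid_distance` is an invented helper name rather than anything from the paper.

```python
import math


def rank_by_centroid_distance(patterns, centroids):
    """Sort pattern names from farthest to nearest to their closest centroid."""
    distances = {
        name: min(math.dist(vec, c) for c in centroids)
        for name, vec in patterns.items()
    }
    return sorted(distances, key=distances.get, reverse=True)


# Hypothetical pattern embeddings and centroids from an earlier K-means run.
patterns = {
    "excellent": (0.9, 1.0),
    "awful": (-1.0, -0.7),
    "the_1990s": (3.0, -2.5),
}
centroids = [(1.0, 1.0), (-1.0, -1.0)]
ranking = rank_by_centroid_distance(patterns, centroids)
```

Under the abstract's hypothesis, patterns at the head of `ranking` (here the one far from both centroids) would be the first candidates to inspect for spuriousness.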

18 pages, 1651 KB  
Article
Sentiment Analysis of Product Reviews Using Machine Learning and Pre-Trained LLM
by Pawanjit Singh Ghatora, Seyed Ebrahim Hosseini, Shahbaz Pervez, Muhammad Javed Iqbal and Nabil Shaukat
Big Data Cogn. Comput. 2024, 8(12), 199; https://doi.org/10.3390/bdcc8120199 - 23 Dec 2024
Cited by 27 | Viewed by 16423
Abstract
Sentiment analysis via artificial intelligence, i.e., machine learning and large language models (LLMs), is a pivotal tool that classifies sentiments within texts as positive, negative, or neutral. It enables computers to automatically detect and interpret emotions from textual data, covering a spectrum of feelings without direct human intervention. Sentiment analysis is integral to marketing research, helping to gauge consumer emotions and opinions across various sectors. Its applications span analyzing movie reviews, monitoring social media, evaluating product feedback, assessing employee sentiments, and identifying hate speech. This study explores the application of both traditional machine learning and pre-trained LLMs for automated sentiment analysis of customer product reviews. The motivation behind this work lies in the demand for more nuanced understanding of consumer sentiments that can drive data-informed business decisions. In this research, we applied machine learning-based classifiers, i.e., Random Forest, Naive Bayes, and Support Vector Machine, alongside the GPT-4 model to benchmark their effectiveness for sentiment analysis. Traditional models show better results and efficiency in processing short, concise text, with SVM in classifying sentiment of short length comments. However, GPT-4 showed better results with more detailed texts, capturing subtle sentiments with higher precision, recall, and F1 scores to uniquely identify mixed sentiments not found in the simpler models. Conclusively, this study shows that LLMs outperform traditional models in context-rich sentiment analysis by not only providing accurate sentiment classification but also insightful explanations. These results enable LLMs to provide a superior tool for customer-centric businesses, which helps actionable insights to be derived from any textual data. Full article

26 pages, 6325 KB  
Article
Improving the Accuracy and Effectiveness of Text Classification Based on the Integration of the Bert Model and a Recurrent Neural Network (RNN_Bert_Based)
by Chanthol Eang and Seungjae Lee
Appl. Sci. 2024, 14(18), 8388; https://doi.org/10.3390/app14188388 - 18 Sep 2024
Cited by 22 | Viewed by 11122
Abstract
This paper proposes a new robust model for text classification on the Stanford Sentiment Treebank v2 (SST-2) dataset in terms of model accuracy. We developed a Recurrent Neural Network Bert based (RNN_Bert_based) model designed to improve classification accuracy on the SST-2 dataset. This dataset consists of movie review sentences, each labeled with either positive or negative sentiment, making it a binary classification task. Recurrent Neural Networks (RNNs) are effective for text classification because they capture the sequential nature of language, which is crucial for understanding context and meaning. Bert excels in text classification by providing bidirectional context, generating contextual embeddings, and leveraging pre-training on large corpora. This allows Bert to capture nuanced meanings and relationships within the text effectively. Combining Bert with RNNs can be highly effective for text classification. Bert’s bidirectional context and rich embeddings provide a deep understanding of the text, while RNNs capture sequential patterns and long-range dependencies. Together, they leverage the strengths of both architectures, leading to improved performance on complex classification tasks. Next, we also developed an integration of the Bert model and a K-Nearest Neighbor based (KNN_Bert_based) method as a comparative scheme for our proposed work. Based on the results of experimentation, our proposed model outperforms traditional text classification models as well as existing models in terms of accuracy. Full article
(This article belongs to the Special Issue Natural Language Processing: Novel Methods and Applications)

14 pages, 2607 KB  
Article
Enhanced Text Classification with Label-Aware Graph Convolutional Networks
by Ming-Yen Lin, Hsuan-Chun Liu and Sue-Chen Hsush
Electronics 2024, 13(15), 2944; https://doi.org/10.3390/electronics13152944 - 25 Jul 2024
Cited by 1 | Viewed by 1675
Abstract
Text classification is an important research field in text mining and natural language processing, gaining momentum with the growth of social networks. Despite the accuracy advancements made by deep learning models, existing graph neural network-based methods often overlook the implicit class information within texts. To address this gap, we propose a graph neural network model named LaGCN to improve classification accuracy. LaGCN utilizes the latent class information in texts, treating it as explicit class labels. It refines the graph convolution process by adding label-aware nodes to capture document–word, word–word, and word–class correlations for text classification. Comparing LaGCN with leading-edge models like HDGCN and BERT, our experiments on Ohsumed, Movie Review, 20 Newsgroups, and R8 datasets demonstrate its superiority. LaGCN outperformed existing methods, showing average accuracy improvements of 19.47%, 10%, 4.67%, and 0.4%, respectively. This advancement underscores the importance of integrating class information into graph neural networks, setting a new benchmark for text classification tasks. Full article

26 pages, 5669 KB  
Article
A Natural-Language-Processing-Based Method for the Clustering and Analysis of Movie Reviews and Classification by Genre
by Fernando González, Miguel Torres-Ruiz, Guadalupe Rivera-Torruco, Liliana Chonona-Hernández and Rolando Quintero
Mathematics 2023, 11(23), 4735; https://doi.org/10.3390/math11234735 - 22 Nov 2023
Cited by 16 | Viewed by 4727
Abstract
Reclassification of massive datasets acquired through different approaches, such as web scraping, is a big challenge to demonstrate the effectiveness of a machine learning model. Notably, there is a strong influence of the quality of the dataset used for training those models. Thus, we propose a threshold algorithm as an efficient method to remove stopwords. This method employs an unsupervised classification technique, such as K-means, to accurately categorize user reviews from the IMDb dataset into their most suitable categories, generating a well-balanced dataset. Analysis of the performance of the algorithm revealed a notable influence of the text vectorization method used concerning the generation of clusters when assessing various preprocessing approaches. Moreover, the algorithm demonstrated that the word embedding technique and the removal of stopwords to retrieve the clustered text significantly impacted the categorization. The proposed method involves confirming the presence of a suggested stopword within each review across various genres. Upon satisfying this condition, the method assesses if the word’s frequency exceeds a predefined threshold. The threshold algorithm yielded a mapping genre success above 80% compared to precompiled lists and a Zipf’s law-based method. In addition, we employed the mini-batch K-means method for the clustering formation of each differently preprocessed dataset. This approach enabled us to reclassify reviews more coherently. Summing up, our methodology categorizes sparsely labeled data into meaningful clusters, in particular, by using a combination of the proposed stopword removal method and TF-IDF. The reclassified and balanced datasets showed a significant improvement, achieving 94% accuracy compared to the original dataset. Full article
(This article belongs to the Special Issue Machine Learning, Statistics and Big Data)
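The two-step check described in this abstract (presence across genres, then a frequency threshold) might look roughly like the sketch below. It is one possible reading, not the authors' algorithm: it assumes a word qualifies when it appears in at least one review of every genre and its share of all tokens exceeds the threshold. The function name, the toy reviews, and the threshold value are all hypothetical.

```python
from collections import Counter


def threshold_stopwords(reviews_by_genre, threshold):
    """Return words present in every genre whose corpus share exceeds threshold.

    Expects at least one genre in reviews_by_genre.
    """
    # Vocabulary of each genre, built from its tokenized reviews.
    genre_vocab = [
        {word for review in reviews for word in review}
        for reviews in reviews_by_genre.values()
    ]
    shared = set.intersection(*genre_vocab)
    # Token counts over the whole corpus.
    counts = Counter(
        word
        for reviews in reviews_by_genre.values()
        for review in reviews
        for word in review
    )
    total = sum(counts.values())
    return {word for word in shared if counts[word] / total > threshold}


# Hypothetical toy corpus grouped by genre.
reviews_by_genre = {
    "horror": ["the ghost was scary".split(), "the film dragged".split()],
    "comedy": ["the jokes landed".split(), "a funny film".split()],
}
stopwords = threshold_stopwords(reviews_by_genre, threshold=0.2)
```

Here "the" and "film" both occur in every genre, but only "the" exceeds the 0.2 share, so the threshold keeps the more informative word out of the stopword set.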

14 pages, 3079 KB  
Article
A Hybrid Model with New Word Weighting for Fast Filtering Spam Short Texts
by Tian Xia, Xuemin Chen, Jiacun Wang and Feng Qiu
Sensors 2023, 23(21), 8975; https://doi.org/10.3390/s23218975 - 4 Nov 2023
Cited by 2 | Viewed by 3303
Abstract
Short message services (SMS), microblogging tools, instant message apps, and commercial websites produce numerous short text messages every day. These short text messages are usually guaranteed to reach mass audience with low cost. Spammers take advantage of short texts by sending bulk malicious or unwanted messages. Short texts are difficult to classify because of their shortness, sparsity, rapidness, and informal writing. The effectiveness of the hidden Markov model (HMM) for short text classification has been illustrated in our previous study. However, the HMM has limited capability to handle new words, which are mostly generated by informal writing. In this paper, a hybrid model is proposed to address the informal writing issue by weighting new words for fast short text filtering with high accuracy. The hybrid model consists of an artificial neural network (ANN) and an HMM, which are used for new word weighting and spam filtering, respectively. The weight of a new word is calculated based on the weights of its neighbor, along with the spam and ham (i.e., not spam) probabilities of short text message predicted by the ANN. Performance evaluations on benchmark datasets, including the SMS message data maintained by University of California, Irvine; the movie reviews, and the customer reviews are conducted. The hybrid model operates at a significantly higher speed than deep learning models. The experiment results show that the proposed hybrid model outperforms other prominent machine learning algorithms, achieving a good balance between filtering throughput and accuracy. Full article
(This article belongs to the Section Sensor Networks)

14 pages, 2538 KB  
Article
ERF-XGB: Ensemble Random Forest-Based XG Boost for Accurate Prediction and Classification of E-Commerce Product Review
by Daniyal M. Alghazzawi, Anser Ghazal Ali Alquraishee, Sahar K. Badri and Syed Hamid Hasan
Sustainability 2023, 15(9), 7076; https://doi.org/10.3390/su15097076 - 23 Apr 2023
Cited by 24 | Viewed by 5020
Abstract
Recently, the concept of e-commerce product review evaluation has become a research topic of significant interest in sentiment analysis. The sentiment polarity estimation of product reviews is a great way to obtain a buyer’s opinion on products. It offers significant advantages for online shopping customers to evaluate the service and product qualities of the purchased products. However, the issues related to polysemy, disambiguation, and word dimension mapping create prediction problems in analyzing online reviews. In order to address such issues and enhance the sentiment polarity classification, this paper proposes a new sentiment analysis model, the Ensemble Random Forest-based XG boost (ERF-XGB) approach, for the accurate binary classification of online e-commerce product review sentiments. Two different Internet Movie Database (IMDB) datasets and the Chinese Emotional Corpus (ChnSentiCorp) dataset are used for estimating online reviews. First, the datasets are preprocessed through tokenization, lemmatization, and stemming operations. The Harris hawk optimization (HHO) algorithm selects two datasets’ corresponding features. Finally, the sentiments from online reviews are classified into positive and negative categories regarding the proposed ERF-XGB approach. Hyperparameter tuning is used to find the optimal parameter values that improve the performance of the proposed ERF-XGB algorithm. The performance of the proposed ERF-XGB approach is analyzed using evaluation indicators, namely accuracy, recall, precision, and F1-score, for different existing approaches. Compared with the existing method, the proposed ERF-XGB approach effectively predicts sentiments of online product reviews with an accuracy rate of about 98.7% for the ChnSentiCorp dataset and 98.2% for the IMDB dataset. Full article

14 pages, 1050 KB  
Article
Sentiment Analysis of Text Reviews Using Lexicon-Enhanced Bert Embedding (LeBERT) Model with Convolutional Neural Network
by James Mutinda, Waweru Mwangi and George Okeyo
Appl. Sci. 2023, 13(3), 1445; https://doi.org/10.3390/app13031445 - 21 Jan 2023
Cited by 131 | Viewed by 16513
Abstract
Sentiment analysis has become an important area of research in natural language processing. This technique has a wide range of applications, such as comprehending user preferences in e-commerce feedback portals, in politics, and in governance. However, accurate sentiment analysis requires robust text representation techniques that can convert words into precise vectors that represent the input text. There are two categories of text representation techniques: lexicon-based techniques and machine learning-based techniques. From research, both techniques have limitations. For instance, pre-trained word embeddings, such as Word2Vec, GloVe, and bidirectional encoder representations from transformers (BERT), generate vectors by considering word distances, similarities, and occurrences, ignoring other aspects such as word sentiment orientation. To address such limitations, this paper presents a sentiment classification model (named LeBERT) combining sentiment lexicon, N-grams, BERT, and CNN. In the model, sentiment lexicon, N-grams, and BERT are used to vectorize words selected from a section of the input text. CNN is used as the deep neural network classifier for feature mapping and giving the output sentiment class. The proposed model is evaluated on three public datasets, namely, Amazon products’ reviews, IMDb movies’ reviews, and Yelp restaurants’ reviews datasets. Accuracy, precision, and F-measure are used as the model performance metrics. The experimental results indicate that the proposed LeBERT model outperforms the existing state-of-the-art models, with an F-measure score of 88.73% in binary sentiment classification.
(This article belongs to the Special Issue AI Empowered Sentiment Analysis)

21 pages, 2066 KB  
Article
A Ranking Learning Model by K-Means Clustering Technique for Web Scraped Movie Data
by Kamal Uddin Sarker, Mohammed Saqib, Raza Hasan, Salman Mahmood, Saqib Hussain, Ali Abbas and Aziz Deraman
Computers 2022, 11(11), 158; https://doi.org/10.3390/computers11110158 - 8 Nov 2022
Cited by 18 | Viewed by 8054
Abstract
Business organizations experience cut-throat competition in the e-commerce era, where a smart organization needs to come up with faster innovative ideas to enjoy competitive advantages. A smart user decides from the review information of an online product. Data-driven smart machine learning applications use real data to support immediate decision making. Web scraping technologies support supplying sufficient relevant and up-to-date well-structured data from unstructured data sources like websites. Machine learning applications generate models for in-depth data analysis and decision making. The Internet Movie Database (IMDB) is one of the largest movie databases on the internet. IMDB movie information is applied for statistical analysis, sentiment classification, genre-based clustering, and rating-based clustering with respect to movie release year, budget, etc., for repository dataset. This paper presents a novel clustering model with respect to two different rating systems of IMDB movie data. This work contributes to the three areas: (i) the “grey area” of web scraping to extract data for research purposes; (ii) statistical analysis to correlate required data fields and understanding purposes of implementation machine learning, (iii) k-means clustering is applied for movie critics rank (Metascore) and users’ star rank (Rating). Different python libraries are used for web data scraping, data analysis, data visualization, and k-means clustering application. Only 42.4% of records were accepted from the extracted dataset for research purposes after cleaning. Statistical analysis showed that votes, ratings, Metascore have a linear relationship, while random characteristics are observed for income of the movie. On the other hand, experts’ feedback (Metascore) and customers’ feedback (Rating) are negatively correlated (−0.0384) due to the biasness of additional features like genre, actors, budget, etc. 
Both rankings have a nonlinear relationship with the income of the movies. Six optimal clusters were selected by the elbow technique, and the calculated silhouette score for the proposed k-means clustering model is 0.4926. Only one cluster showed a logical relationship between the two ranking systems.

24 pages, 693 KB  
Article
JUMRv1: A Sentiment Analysis Dataset for Movie Recommendation
by Shuvamoy Chatterjee, Kushal Chakrabarti, Avishek Garain, Friedhelm Schwenker and Ram Sarkar
Appl. Sci. 2021, 11(20), 9381; https://doi.org/10.3390/app11209381 - 9 Oct 2021
Cited by 13 | Viewed by 6566
Abstract
Nowadays, we can observe the applications of machine learning in every field, ranging from the quality testing of materials to the building of powerful computer vision tools. One such recent application is the recommendation system, which is a method that suggests products to users based on their preferences. In this paper, our focus is on a specific recommendation system called movie recommendation. Here, we make use of user reviews of movies in order to establish a general outlook about the movie and then use that outlook to recommend that movie to other users. However, a huge number of available reviews has baffled sophisticated review systems. Consequently, there is a need to find a method of extracting meaningful information from the available reviews and use that in classifying a movie review and predicting the sentiment in each one. In a typical scenario, a review can either be positive, negative, or indifferent about a movie. However, the available research articles in the field mainly consider this as a two-class classification problem—positive and negative. The most popular work in this field was performed on Stanford and Rotten Tomatoes datasets, which are somewhat outdated. Our work is based on self-scraped reviews from the IMDB website, and we have annotated the reviews into one of the three classes—positive, negative, and neutral. Our dataset is called JUMRv1—Jadavpur University Movie Recommendation dataset version 1. For the evaluation of JUMRv1, we took an exhaustive approach by testing various combinations of word embeddings, feature selection methods, and classifiers. We also analysed the performance trends, if there were any, and attempted to explain them. Our work sets a benchmark for movie recommendation systems that is based on the newly developed dataset using a three-class sentiment classification. Full article

14 pages, 1334 KB  
Article
Examining Attention Mechanisms in Deep Learning Models for Sentiment Analysis
by Spyridon Kardakis, Isidoros Perikos, Foteini Grivokostopoulou and Ioannis Hatzilygeroudis
Appl. Sci. 2021, 11(9), 3883; https://doi.org/10.3390/app11093883 - 25 Apr 2021
Cited by 71 | Viewed by 10789
Abstract
Attention-based methods for deep neural networks constitute a technique that has attracted increased interest in recent years. Attention mechanisms can focus on important parts of a sequence and, as a result, enhance the performance of neural networks in a variety of tasks, including sentiment analysis, emotion recognition, machine translation and speech recognition. In this work, we study attention-based models built on recurrent neural networks (RNNs) and examine their performance in various contexts of sentiment analysis. Self-attention, global-attention and hierarchical-attention methods are examined under various deep neural models, training methods and hyperparameters. Even though attention mechanisms are a powerful recent concept in the field of deep learning, their exact effectiveness in sentiment analysis is yet to be thoroughly assessed. A comparative analysis is performed on a text sentiment classification task in which, for every experiment, baseline models are compared with and without attention. The experimental study additionally examines the proposed models' ability to recognize opinions and emotions in movie reviews. The results indicate that attention-based models improve the performance of deep neural models, with accuracy gains of up to 3.5%.
(This article belongs to the Section Computing and Artificial Intelligence)

11 pages, 2912 KB  
Article
Improving Document-Level Sentiment Classification Using Importance of Sentences
by Gihyeon Choi, Shinhyeok Oh and Harksoo Kim
Entropy 2020, 22(12), 1336; https://doi.org/10.3390/e22121336 - 25 Nov 2020
Cited by 35 | Viewed by 4681
Abstract
Previous researchers have considered sentiment analysis as a document classification task, in which input documents are classified into predefined sentiment classes. Although some sentences in a document provide important evidence for sentiment analysis and others do not, previous work has treated the document as a bag of sentences. In other words, it has not considered the importance of each sentence in the document. To effectively determine the polarity of a document, each sentence in the document should be assigned a different degree of importance. To address this problem, we propose a document-level sentence classification model based on deep neural networks, in which the importance degrees of sentences in documents are automatically determined through gate mechanisms. To verify our new sentiment analysis model, we conducted experiments using sentiment datasets from four different domains: movie reviews, hotel reviews, restaurant reviews, and music reviews. In the experiments, the proposed model outperformed previous state-of-the-art models that do not consider importance differences of sentences in a document. The experimental results show that the importance of sentences should be considered in a document-level sentiment classification task.
