Article

CGPA-UGCRA: A Novel Explainable AI Model for Sentiment Classification and Star Rating Using Nature-Inspired Optimization

1 Department of Electronics and Communication, University of Allahabad, Prayagraj 211002, India
2 Department of Mathematics and Statistics, College of Science, King Faisal University, Al-Ahsa 31982, Saudi Arabia
3 Department of Management Information Systems, College of Business Administration, King Faisal University, Al-Ahsa 31982, Saudi Arabia
* Authors to whom correspondence should be addressed.
Mathematics 2025, 13(22), 3645; https://doi.org/10.3390/math13223645
Submission received: 23 September 2025 / Revised: 3 November 2025 / Accepted: 5 November 2025 / Published: 13 November 2025

Abstract

In recent years, social media-related sentiment classification has been researched extensively and is applied in various fields such as opinion mining, commodity feedback, and market analysis. It is therefore important to understand and analyse public opinions, feedback, and data related to social media. Consumers continue to face challenges in accessing review-based sentiment classification of opinions expressed by their peers, and existing methods do not provide satisfactory results. Hence, an innovative sentiment classification method, the Convoluted Graph Pyramid Attention (CGPA) model combined with the Updated Greater Cane Rat Algorithm (UGCRA), is proposed. This method improves sentiment classification by optimizing accuracy and efficiency while addressing inherent uncertainties, allowing for precise sentiment intensity evaluation across multiple dimensions. Explainable Artificial Intelligence (XAI) techniques, particularly SHapley Additive exPlanations (SHAP), enhance the model’s transparency and interpretability. This approach enables the final ranking of classified reviews, predicts ratings on a scale of one to five stars, and generates a recommendation list based on the predicted user ratings. Comparison with existing traditional methods indicates that the proposed method achieves superior performance. From the experimental results, the proposed approach achieves an accuracy of 99.72% on the Restaurant Review dataset, 99.87% on the Edmunds-Consumer Car Ratings Reviews dataset, 99.98% on the Flipkart Cell Phone Reviews dataset, and 99.64% on the IMDB Movie database, demonstrating its effectiveness in sentiment analysis.

1. Introduction

In sentiment analysis, categorizing text-based opinions is a crucial tool for understanding user perspectives by analysing emotions, sentiments, and viewpoints expressed in online content. This technique has widespread applications across various domains [1]. Reviews, a specific type of textual data, allow individuals to share opinions about products, services, or experiences, typically categorized as positive, negative, or neutral, providing valuable insights into product or service quality [2]. By analysing the tone and content of these reviews, sentiment analysis helps uncover customer perceptions and satisfaction levels. Research identifies three primary levels of sentiment analysis: sentence-level, document-level, and phrase-level [3]. Generally, a five-star rating indicates positive feedback, while a one-star rating signifies negative sentiment, often accompanied by text-based reviews. However, when new users encounter a vast number of reviews, extracting meaningful information about product quality efficiently becomes challenging [4]. Recent advancements in machine learning have driven the development of innovative sentiment analysis models, focusing on improving accuracy and interpretability. This review highlights key contributions, exploring diverse approaches and the integration of deep learning techniques [5].
Punetha N. et al. [6] developed a Multi-Criteria Decision-Making (MCDM) algorithm for sentiment classification, which introduced a mathematical optimization framework for understanding sentiments and emotions in reviews. The model first established the review’s sentiment polarity—negative or positive; next, the model attempted to determine the customer’s satisfaction—whether they were satisfied or not satisfied with the content. Kaur G et al. [7] presented a hybrid sentiment analysis approach, incorporating a Hybrid Feature Vector with Long Short-Term Memory (HFV-LSTM). This method addressed the challenge of achieving efficient review mining results for consumer review summarization. It was designed to accurately perform sentiment analysis from the perspective of a consumer review summarization model for capitalists. Tripathy G et al. [8] proposed an effective hybrid approach based on an Adaptive Enhanced Genetic Algorithm (AEGA) for online customer review analysis. Their feature minimization technique employed an evolutionary approach to select an optimal set of features, improving sentiment classification accuracy. The novel two-phase crossover approach ensured that the algorithm consistently executed with the fittest solution. Kotagiri S et al. [9] developed an approach for aspect-oriented extraction and sentiment analysis by leveraging optimized hybrid deep learning techniques, specifically the Reptile Search Optimization-based Extreme Gradient Boosting Algorithm (RSO-EGBA). The integration of these techniques aimed to enhance the precision and interpretability of aspect-oriented sentiment analysis.
He Z et al. [10] introduced a method called Integrated Degree-based K-shell decomposition (ID-KS) for conducting competitive analysis via product comparison networks. ID-KS was used to extract comprehensive insights into product comparison, competitor identification, product ranking, brand comparison, and market structure analysis.
Nawaz A et al. [11] proposed an Aspect-Based Sentiment Analysis (ABSA) approach to analyse sentiment in text, related to specific aspects. Their model was designed to identify quality issues in products. After successful deployment, the proposed model aimed to detect product issues and improve quality based on user reviews. Danyal MM et al. [12] developed an eXtreme Language Model Network (XLNet) utilizing a permutation-based language modelling objective. XLNet’s capability of capturing bidirectional word dependencies enabled it to better understand text context and meaning. By predicting words randomly during training, the model improved its representations and effectively captured long-range dependencies. Tan KL et al. [13] introduced a hybrid sentiment analysis model combining the Robustly Optimized BERT pretraining approach and Gated Recurrent Units (RoBERTa-GRU). The RoBERTa model was employed to generate representative word embeddings that captured the unique characteristics of the text. Devi NL et al. [14] proposed the Remora Optimization-Based Extreme Action Selection Gradient Boosting (RO-EASGB) algorithm for sentiment classification using benchmark datasets. This hybrid model aimed to enhance aspect-based sentiment analysis by effectively capturing semantic relationships between words through advanced natural language processing techniques. Abbas S et al. [15] suggested an Active Learning-based Machine Learning algorithm (ALML) for analysing Flipkart smartphone customer reviews. Their approach classified sentiment based on product reviews and aimed to improve the accuracy and reliability of customer sentiment analysis by leveraging active learning techniques. Table 1 represents a summary of existing literature.

1.1. Problem Statement

Despite significant advancements in sentiment analysis and emotion detection, several challenges persist, hindering broader applicability and efficiency. Many current approaches, such as those leveraging attention mechanisms, deep learning, and hybrid models, face limitations related to scalability, high computational cost, and reliance on domain-specific data. Techniques that improve interpretability often struggle with managing complex, multimodal data, while other models may overemphasize certain features, such as emojis, which distorts results in contexts where they are not prevalent. Additionally, approaches that are focused on specific industries, such as the textile sector, or platforms, like social media, lack generalizability across diverse applications. These limitations highlight the need for more scalable, efficient, and versatile sentiment analysis methods that balance accuracy, interpretability, and broad applicability.

1.2. Motivation

Online reviews have become vital, significantly shaping consumer decision-making and influencing product perceptions. Ranking products based on these reviews involves two critical aspects: effective sentiment analysis (SA) and multi-criteria decision-making techniques. However, existing methods often fail to fully capture the complexity and accuracy of consumer sentiments, which are essential for reliable product rankings. Many systems overly focus on quantitative metrics while overlooking the nuances of SA, leading to skewed results. The accuracy of SA plays a crucial role in determining the effectiveness of these rankings, yet this aspect is frequently neglected, highlighting the need for improved SA techniques that better integrate with decision-making frameworks to ensure more accurate and trustworthy rankings.

1.3. Contributions

This research proposes the CGPA model hybridized with the UGCRA for multi-level sentiment review analysis. This approach enhances sentiment classification by optimizing accuracy and efficiency while accounting for inherent uncertainties, enabling precise sentiment intensity assessment across various dimensions.
DistilBERT is employed to extract both explicit and implicit attributes from online reviews, enabling a nuanced understanding of sentiments and opinions expressed in the text.
XAI techniques, specifically SHAPs, are used to enhance the transparency and interpretability of the model’s predictions. This approach facilitates the final ranking of classified reviews, predicts ratings on a scale of one to five stars, and generates a recommendation list based on the predicted user ratings.
The structure of this research paper is as follows: Section 2 explains the proposed methods of sentiment classification and star rating in detail, Section 3 offers experimental outcome analysis, and Section 4 provides a discussion. Finally, Section 5 concludes and outlines the future scope.

2. Proposed Method

The proposed methodology aims to address the limitations of existing sentiment analysis and product-ranking approaches by introducing an enhanced framework with multi-criteria decision-making processes. This ensures more accurate and reliable product rankings based on comprehensive user feedback. The architecture of the proposed method is given in Figure 1.
The initial phase of the research involves collecting input data from four data sources to create a comprehensive dataset for analysis. The collected data are preprocessed to clean the input data, which involves removing redundant or unnecessary information. In the feature extraction phase, both explicit and implicit attributes from online reviews are extracted. Following this, the sentiment analysis is conducted, which classifies the extracted online reviews into five classes. After the classification, a final ranking process is performed to ensure a more accurate ranking and to predict ratings on a scale of one to five stars. To enhance transparency and interpretability, the XAI technique is utilized, offering valuable insights into the model’s decision-making process and improving its overall effectiveness.

2.1. Data Collection

The initial step in the research involves collecting the input data from several sources to form a comprehensive dataset for analysis. Sources include the Restaurant Review dataset [16], Edmunds-Consumer Car Ratings Reviews dataset [17], Flipkart Cell Phone Reviews dataset [18], and the IMDB Movie database [19]. This diverse data collection is specifically intended for effective sentiment analysis and product ranking. After data collection, the following step is the preprocessing stage, where redundant or unnecessary information is removed to clean the input data.

2.2. Preprocessing

Various preprocessing tasks, such as lowercasing, removing punctuation, removing numbers, removing extra whitespace, tokenization, stop word removal, and stemming, are conducted to clean the input data by removing redundant or unnecessary information. The preprocessing steps are given in Figure 2.
Lowercasing is an essential early step in text preprocessing for text mining, where all letters are converted to lowercase to prevent case sensitivity and improve classifier performance by ensuring text consistency, though it increases ambiguity. Removing punctuation, numbers, and white spaces enhances classification accuracy by simplifying the feature set and reducing unnecessary dimensions. Tokenization divides text into smaller components such as sentences, phrases, or words, removing certain characters like punctuation marks, and these tokens are then used in processes like text mining and parsing. Stop word removal eliminates common, non-informative words that do not contribute significantly to analysis, improving accuracy. Finally, stemming reduces words to their base form, minimizing redundant word calculations and streamlining processing [20].
This process ensures that the data is refined and optimized for analysis, enhancing the overall quality and relevance of the information used in subsequent stages of the research. Following the preprocessing shown in Table 2, the processed data undergoes feature extraction to identify both explicit and implicit attributes from online reviews. This step enhances the analysis by providing an efficient and effective contextual understanding of the text.
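A minimal sketch of this preprocessing pipeline is shown below, assuming the NLTK library; the exact tool chain used by the authors is not specified, so the English stop-word list and Porter stemmer here are illustrative choices.

```python
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)       # tokenizer models (older NLTK)
nltk.download("punkt_tab", quiet=True)   # tokenizer models (newer NLTK)
nltk.download("stopwords", quiet=True)   # English stop-word list

STOP_WORDS = set(stopwords.words("english"))
STEMMER = PorterStemmer()

def preprocess_review(text: str) -> list[str]:
    text = text.lower()                                  # lowercasing
    text = re.sub(r"[^a-z\s]", " ", text)                # remove punctuation and numbers
    text = re.sub(r"\s+", " ", text).strip()             # remove extra whitespace
    tokens = word_tokenize(text)                         # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop word removal
    return [STEMMER.stem(t) for t in tokens]             # stemming

print(preprocess_review("The food was AMAZING!!! 10/10, would visit again."))
# e.g. ['food', 'amaz', ...] depending on the stop-word list in use
```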

2.3. Feature Extraction

The process begins with the extraction of both explicit and implicit attributes from online reviews, utilizing DistilBERT, which enhances analysis by providing efficient and effective contextual understanding of the text. DistilBERT’s ability to generate rich embeddings allows for the identification of nuanced sentiments and opinions expressed within reviews. DistilBERT is a compact and efficient version of the BERT model, designed to maintain high effectiveness while being faster and smaller. It employs knowledge distillation, using a BERT-like teacher model to guide the training process on a self-supervised corpus. For feature extraction from online reviews, DistilBERT starts by converting input sequences of words (such as tweets or reviews) into embedding vectors representing the words. The transformer encoder’s self-attention mechanism is then applied to give context to each word and create contextual embeddings. These embeddings are concatenated into a single vector to represent the semantic meaning of the review. The resulting vector passes through a fully connected layer and is subsequently fine-tuned by a classification layer to predict the target classes, thereby allowing the extraction of explicit and implicit attributes from the review text. The DistilBERT [21,22] self-attention mechanism computes attention scores on the input vectors using Equation (1).
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V
where $Q$, $K$, and $V$ denote the query, key, and value matrices, respectively, and $d_k$ is the dimensionality of the key vectors. The structure of DistilBERT is represented in Figure 3.
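As a concrete illustration of Equation (1), the scaled dot-product attention can be computed in a few lines; this is a generic sketch in PyTorch, not the internal DistilBERT implementation.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q: torch.Tensor, K: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    """Equation (1): softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5   # pairwise similarity, scaled by sqrt(d_k)
    weights = F.softmax(scores, dim=-1)             # attention distribution per query
    return weights @ V                              # weighted sum of value vectors

# Toy example: 4 tokens with key/value dimension 8
Q = K = V = torch.randn(4, 8)
print(scaled_dot_product_attention(Q, K, V).shape)  # torch.Size([4, 8])
```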
After feature extraction, the extracted features undergo sentiment analysis to classify online reviews into distinct categories such as strongly positive, positive, neutral, strongly negative, and negative.
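Before moving to the sentiment model, the sketch below illustrates how review-level embeddings can be obtained from a pretrained DistilBERT, assuming the Hugging Face transformers library; the checkpoint name and the mean-pooling step are illustrative choices, not the authors’ exact configuration.

```python
import torch
from transformers import DistilBertModel, DistilBertTokenizerFast

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
encoder = DistilBertModel.from_pretrained("distilbert-base-uncased")
encoder.eval()

@torch.no_grad()
def embed_reviews(reviews: list[str]) -> torch.Tensor:
    """Return one contextual embedding vector per review."""
    batch = tokenizer(reviews, padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state           # (batch, seq_len, 768)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # ignore padding tokens
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # mean-pooled review vector

vectors = embed_reviews(["Battery life is excellent.", "Screen cracked within a week."])
print(vectors.shape)   # torch.Size([2, 768])
```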

2.4. Multi-Level Sentiment Review Analysis

A deep learning model, termed the CGPA model, combined with the UGCRA, is employed to evaluate sentiment intensities across multiple dimensions of online reviews. The CGPA model enhances sentiment analysis by utilizing attention mechanisms that effectively capture complex relationships and dependencies within the data, enabling a more nuanced understanding of sentiment. The sentiment analysis results are presented in a manner that accounts for the inherent uncertainties of sentiment classification, encompassing categories such as strongly positive, positive, neutral, negative, and strongly negative. Notably, self-attention and graph attention are not identical: the self-attention mechanism assigns attention weights to every node in a document, whereas the graph attention network (GAT) mechanism does not require prior knowledge of the full graph structure; instead, it flexibly assigns different weights to varying numbers of neighbouring nodes and processes all nodes, enabling greater computational effectiveness.
The GAT has two inputs; one of them is a feature matrix $M \in \mathbb{R}^{a \times b}$, where $M = \{m_1, m_2, \ldots, m_a\}$ and $m_j \in \mathbb{R}^{b}$. The other input is the adjacency matrix $D$, obtained via the syntax module, which the GAT uses to determine the neighbour connections of the nodes. The output produced by the GAT is the updated node feature matrix $M' = \{m'_1, m'_2, \ldots, m'_a\}$ with $m'_j \in \mathbb{R}^{b'}$. A single-layer graph neural network modifies a node as denoted in Equations (2)–(4).
s_{ji} = \mathrm{LeakyReLU}\left(p^{T}\left[w m_{j} \,\|\, w m_{i}\right]\right)
p_{ji} = \mathrm{softmax}_{i}\left(s_{ji}\right) = \frac{\exp\left(s_{ji}\right)}{\sum_{x \in A_{j}} \exp\left(s_{jx}\right)}
m'_{j} = \sigma\left(\sum_{i \in A_{j}} p_{ji}\, w m_{i}\right)
where $s_{ji}$ indicates the attention score from the first layer of a feed-forward neural network, $p^{T} \in \mathbb{R}^{2b'}$ represents a parameter vector, $w \in \mathbb{R}^{b' \times b}$ indicates a shared weight matrix comprising linear transformations that convert input characteristics into high-level features to achieve adequate expressiveness, $\|$ signifies the vector splicing (concatenation) operation, and $A_{j}$ describes the node’s first-order neighbourhood [23,24].
The attention in Equation (3) is computed by $X$ distinct attention mechanisms to calculate the nodes’ hidden states, with the $X$ outputs concatenated as input to the next layer, as denoted in Equation (5). A single graph attention layer aggregates only first-order neighbour information, so a two-layer GAT is used to capture higher-order neighbour information and enrich node features. The second GAT layer calculates the final word node representation using attention coefficients $p_{ji}$, as indicated in Equation (6).
M'_{j} = \big\Vert_{x=1}^{X}\, \sigma\left(\sum_{i \in A_{j}} p_{ji}^{x}\, w^{x} m_{i}\right)
m'_{j} = \sigma\left(\sum_{i \in A_{j}} p_{ji}\, w m_{i}\right)
Text classification is then converted into a graph classification problem by fusing the updated data from each node in the text graph to produce graph-level features for label prediction. Maximum pooling is applied to extract the most significant node features as a text graph representation, and predicted labels are obtained using a softmax function, as denoted in Equation (7).
Z_{j} = \mathrm{softmax}\left(\mathrm{maxpool}\left(m'\right)\right)
The training process aims to minimize the cross-entropy loss $L_{EC}$ between true and predicted labels, as defined in Equation (8).
L_{EC} = -\sum_{j=1}^{n} q_{j} \log Z_{j}
where $n$ is the number of reviews, $q_{j}$ is the true label, and $Z_{j}$ is the predicted label. This loss directly impacts the accuracy of sentiment analysis in review classification. To achieve accurate classification, it is crucial to minimize this loss function. For this purpose, the UGCRA is employed to enhance the accuracy and efficiency of sentiment classification, effectively minimizing the loss in the proposed method.
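A minimal sketch of a single graph attention layer implementing Equations (2)–(4) is given below, assuming PyTorch, a dense adjacency matrix with self-loops, and sigmoid as the nonlinearity $\sigma$; it is illustrative only and omits the multi-head concatenation of Equation (5) and the pooling/classification head.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    """Single GAT layer: s_ji = LeakyReLU(p^T [w m_j || w m_i]),
    p_ji = softmax over neighbours, m'_j = sigma(sum_i p_ji * w m_i)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.w = nn.Linear(in_dim, out_dim, bias=False)   # shared weight matrix w
        self.p = nn.Parameter(torch.empty(2 * out_dim))   # attention vector p
        nn.init.xavier_uniform_(self.w.weight)
        nn.init.normal_(self.p, std=0.1)

    def forward(self, M: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        # M: (N, in_dim) word-node features; A: (N, N) adjacency matrix (self-loops assumed)
        H = self.w(M)                                          # transformed features w m_i
        N = H.size(0)
        pairs = torch.cat([H.unsqueeze(1).expand(N, N, -1),    # [w m_j || w m_i] for all pairs
                           H.unsqueeze(0).expand(N, N, -1)], dim=-1)
        S = F.leaky_relu(pairs @ self.p)                       # raw scores s_ji, Equation (2)
        S = S.masked_fill(A == 0, float("-inf"))               # restrict to first-order neighbours
        P = torch.softmax(S, dim=-1)                           # attention coefficients, Equation (3)
        return torch.sigmoid(P @ H)                            # aggregated update, Equation (4)

# Toy usage: 5 word nodes with 16-dimensional features
layer = GraphAttentionLayer(16, 8)
M = torch.randn(5, 16)
A = (torch.eye(5) + torch.bernoulli(torch.full((5, 5), 0.3))).clamp(max=1)
print(layer(M, A).shape)   # torch.Size([5, 8])
```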
The GCRA is enhanced with Lévy flight, resulting in the UGCRA. This update improves search space diversification and helps prevent premature convergence, leading to more robust optimization performance. The UGCRA optimization process begins with the stochastic generation of the UGCRA population $R$ using Equation (9). This generation utilizes the upper bound $ub$ and lower bound $lb$.
R = \begin{bmatrix} R_{1,1} & R_{1,2} & \cdots & R_{1,p-1} & R_{1,p} \\ R_{2,1} & R_{2,2} & \cdots & R_{2,p-1} & R_{2,p} \\ \vdots & \vdots & R_{j,i} & \vdots & \vdots \\ R_{m,1} & R_{m,2} & \cdots & R_{m,p-1} & R_{m,p} \end{bmatrix}
where $m$ is the size of the population, $p$ indicates the dimension of the problem, and $R_{j,i}$ denotes the position of the $j$th entity in the $i$th dimension, which is created using Equation (10).
R_{j,i} = \mathrm{rand} \times \left(ub_{i} - lb_{i}\right) + lb_{i}
where $\mathrm{rand}$ denotes a random number between 0 and 1. The fitness function is utilized to minimize the loss function, thereby improving the overall performance of sentiment analysis, as represented in Equation (11).
\mathrm{Fitness\ function} = \mathrm{minimize}\left(L_{EC}\right)
In the GCRA framework, the dominant entity $R_{x}$ is identified as the most-fitted entity based on the objective function, leading the group to utilize prior knowledge of the routes. The positions of the other entities are adjusted based on the dominant entity, as defined in Equation (12). Depending on a parameter variable $\rho$, the GCRA alternates between the exploration and exploitation stages; the parameter variable is fine-tuned to 0.5 to achieve a balance between exploration and exploitation.
R_{j,i}^{new} = 0.7 \times \frac{R_{j,i} + R_{x,i}}{2}
where $R_{j,i}^{new}$ is the new UGCRA position, $R_{j,i}$ indicates the current position of the UGCRA, and $R_{x,i}$ represents the dominant position of the UGCRA in the $i$th dimension.
A new position of the population is updated based on the dominant Candidate’s position, as defined in Equation (13). If a Candidate’s objective function value surpasses the current fittest, the dominant position is updated, and others adjust accordingly. Otherwise, Candidates deviate from the dominant position, as modelled in Equation (14). During exploration, the UGCRA moves to a new position only if it improves the objective function value; otherwise, it retains its current position.
R_{j,i}^{new} = R_{j,i} + Q \times \left(R_{x,i} - a \times R_{j,i}\right)
R_{j} = \begin{cases} R_{j,i} + Q \times \left(R_{j,i} - \alpha \times R_{x,i}\right), & F_{j}^{new} < F_{j} \\ R_{j,i} + Q \times \left(R_{n,i} - \beta \times R_{x,i}\right), & \text{otherwise} \end{cases}
where $R_{j}$ denotes the new position of the $j$th UGCRA, $R_{j,i}^{new}$ indicates the $i$th dimension value, $R_{j,i}$ is the current position of the GCRA, $R_{x,i}$ represents the dominant Candidate in the $i$th dimension, $F_{j}$ denotes the objective function value of the dominant Candidate, $Q$ signifies a random number that represents the boundary of the problem space, and $a$ denotes a random number. The simulation of this phase begins by randomly selecting the best Candidate $b$ such that $b \neq x$. The adjustment process occurs around the selected best Candidate, as represented in Equation (15). If the newly computed position of the UGCRA improves the objective function’s value, it replaces the previous position.
R_{j,i}^{new} = R_{j,i} + Q \times \left(R_{x,i} - \mu \times R_{b,i}\right)
where $R_{b,i}$ indicates the position of a randomly chosen best Candidate in the $i$th dimension and $\mu$ is a randomly selected value from 1 to 4; the pseudocode (Algorithm 1) iterates to attain the best Candidate at the best position. The Lévy flight strategy is a powerful technique that enhances optimization algorithms by enabling them to escape local minima. The Lévy flight value is determined using Equation (16).
\mathrm{Levy} = \frac{u}{\left|v\right|^{1/\beta}}
where $u$ and $v$ follow Gaussian distributions. The new position of the UGCRA is then given by Equation (17).
R_{j,i}^{new} = R_{j,i} + \mathrm{Levy} \times \left(R_{x,i} - a \times R_{j,i}\right)
The pseudocode of UGCRA is represented in Algorithm 1.
Algorithm 1. Pseudocode of UGCRA.
Initialize the parameters $Q$, $\mu$, $\rho$, $a$, $\alpha$
Input: UGCRA population, maximum iterations
Output: best Candidate Cbest
Compute the fitness of each UGCRA entity
Choose the fittest UGCRA entity as the dominant Candidate $R_x$
Update the global optimal position
Update the other UGCRA entities according to the position of $R_x$ using Equation (12)
for iter = 1 : max_iter
    Determine $Q$, $\mu$, $\rho$, $a$, $\alpha$
    if rand < $\rho$:
        Update the position of the search agent using Equation (13)
        Verify the boundary constraint
    else
        Update the new position of the search agent according to Equation (14)
        Verify the boundary constraint
    end if
    Evaluate the fitness of each UGCRA entity at its new position
    Update the search agent according to Equation (15)
    Update the new search space using the Lévy flight Equations (16) and (17)
    Update the best position
    Choose the new Candidate Cbest
end for
return Cbest
In the hybrid model, the CGPA’s attention mechanisms capture complex relations while the UGCRA fine-tunes the optimization process for maximum accuracy and efficiency [25]. This hybrid model produces a further refined sentiment classification across categories such as strongly positive, positive, neutral, strongly negative, and negative, while accounting for the uncertainties inherent in sentiment analysis.
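The following is a minimal, self-contained sketch of a UGCRA-style optimizer with a Lévy flight perturbation, written to mirror Equations (9)–(17) and Algorithm 1. The exact update rules, parameter ranges, and acceptance criteria of the authors’ implementation are not fully specified, so several details here (the range of $Q$, how the Lévy step is applied, and the greedy acceptance rule) are assumptions.

```python
import numpy as np
from math import gamma, pi, sin

def levy_step(dim: int, beta: float = 1.5) -> np.ndarray:
    """Lévy flight step (Equation (16)): u / |v|^(1/beta), with u, v Gaussian."""
    sigma_u = (gamma(1 + beta) * sin(pi * beta / 2) /
               (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = np.random.normal(0.0, sigma_u, dim)
    v = np.random.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

def ugcra(fitness, dim, lb, ub, pop_size=30, max_iter=100, rho=0.5):
    """Minimise `fitness` (e.g. the cross-entropy loss of Equation (8))."""
    R = np.random.rand(pop_size, dim) * (ub - lb) + lb      # initial population, Equation (10)
    F = np.array([fitness(r) for r in R])
    best = R[F.argmin()].copy()                              # dominant entity R_x

    for _ in range(max_iter):
        Q = np.random.uniform(-1.0, 1.0)                     # assumed range for Q
        for j in range(pop_size):
            if np.random.rand() < rho:                       # exploration, cf. Equation (13)
                cand = R[j] + Q * (best - np.random.rand() * R[j])
            else:                                            # exploitation, cf. Equation (15)
                mu = np.random.randint(1, 5)
                b = R[np.random.randint(pop_size)]           # randomly chosen Candidate
                cand = R[j] + Q * (best - mu * b)
            # Lévy flight jump around the dominant entity, cf. Equation (17)
            cand = cand + levy_step(dim) * (best - np.random.rand() * R[j])
            cand = np.clip(cand, lb, ub)                     # boundary constraint
            f_cand = fitness(cand)
            if f_cand < F[j]:                                # keep only improving moves
                R[j], F[j] = cand, f_cand
        best = R[F.argmin()].copy()                          # update the dominant entity
    return best, F.min()

# Toy usage: minimise a sphere function in 5 dimensions
best_pos, best_val = ugcra(lambda x: float((x ** 2).sum()), dim=5, lb=-5.0, ub=5.0)
print(best_val)
```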

2.5. Polarity Finding

The fuzzy relation approach is utilized to determine the sentiment polarity of user reviews by assigning a polarity value within the range of [−1, 1]. To standardize the evaluation, this value is categorized into distinct sentiment classes. A review is classified as strongly negative if its polarity falls between (−1, −0.5], negative if it is within (−0.5, 0), neutral if the value is 0, positive if it lies within (0, 0.5], and strongly positive if it falls within (0.5, 1]. This classification framework ensures a structured analysis of sentiment intensity, enabling accurate interpretation of customer feedback by distinguishing between varying levels of sentiment expression.
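A direct translation of these polarity intervals into code is shown below as a simple sketch; the fuzzy relation step that produces the polarity value itself is not shown.

```python
def polarity_to_class(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to the five sentiment classes."""
    if polarity <= -0.5:
        return "strongly negative"   # (-1, -0.5]
    if polarity < 0.0:
        return "negative"            # (-0.5, 0)
    if polarity == 0.0:
        return "neutral"
    if polarity <= 0.5:
        return "positive"            # (0, 0.5]
    return "strongly positive"       # (0.5, 1]

print(polarity_to_class(0.72))   # strongly positive
```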

2.6. Ranking Model

After classifying the reviews, a final ranking is conducted to enhance accuracy and predict ratings on a scale of one to five stars. The ranking model is based on five sentiment classes: strongly positive, positive, neutral, strongly negative, and negative. A one-star rating corresponds to the strongly negative class, two stars to the negative class, three stars to the neutral class, four stars to the positive class, and five stars to the strongly positive class. Figure 4 shows the ranking model.
In an online shopping product recommendation system, this approach improves the prediction accuracy of recommendation lists and provides highly personalized and relevant products to customers. A recommendation list can be generated based on predicted user ratings. Sentiment analysis is employed to calculate the sentiment score of each review comment. This score, combined with the average customer review score, number of reviews, and associated target category, is used to compute an overall product rating score for each group. The product rating score is directly utilized to create recommendation lists. However, additional filtering is often required to further customize recommendations, ensuring they align closely with individual user preferences.
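The class-to-star mapping and review aggregation described above can be sketched as follows; the weighting used to combine the sentiment score, average review score, and review count is not specified in the paper, so the formula and weights below are purely illustrative.

```python
STAR_RATING = {
    "strongly negative": 1,
    "negative": 2,
    "neutral": 3,
    "positive": 4,
    "strongly positive": 5,
}

def review_to_stars(sentiment_class: str) -> int:
    """Map a predicted sentiment class to a one-to-five-star rating."""
    return STAR_RATING[sentiment_class]

def product_rating_score(sentiment_scores, avg_review_score, n_reviews,
                         w_sent=0.5, w_avg=0.4, w_volume=0.1):
    """Illustrative aggregate score per product (weights are assumptions)."""
    mean_sentiment = sum(sentiment_scores) / len(sentiment_scores)   # in [1, 5]
    volume_bonus = min(n_reviews, 100) / 100 * 5                     # saturating review-count term
    return w_sent * mean_sentiment + w_avg * avg_review_score + w_volume * volume_bonus

stars = [review_to_stars(c) for c in ["positive", "strongly positive", "neutral"]]
print(stars, product_rating_score(stars, avg_review_score=4.2, n_reviews=57))
```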

2.7. Interpretability Analysis

To enhance transparency and interpretability in explaining the CGPA model’s predictions, the XAI technique called SHAP is utilized. It is applied during the final ranking of classified reviews to predict ratings on a scale of one to five stars, enabling the generation of a recommendation list based on the predicted user ratings. As an XAI framework based on Shapley values from game theory, SHAP provides model-agnostic local interpretability through the identification of the most salient features for individual predictions. SHAP assigns unified importance values to features that demonstrate their effect on model decision-making, thereby enhancing transparency. SHAP calculates importance scores ranging from 0 to 1 for the key factors, ranking the parameters contributing to the predictions and thus underscoring their relevance to the model’s outputs. The resulting SHAP visualizations include force plots that reveal sentiment shifts and highlight the effects of particular features, illuminating the reasoning underlying the model’s decision path.
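A hedged sketch of how SHAP can be applied to a text classifier is shown below, using the shap library with a standard Hugging Face sentiment pipeline as a stand-in for the trained CGPA-UGCRA model, whose weights are not publicly released; the text plot corresponds to the word-level contributions discussed above.

```python
import shap
from transformers import pipeline

# Stand-in classifier: any transformers text-classification pipeline works the same way.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    top_k=None,               # return scores for every class
)

explainer = shap.Explainer(classifier)   # SHAP selects a Partition explainer with a text masker
shap_values = explainer(
    ["The battery lasts all day, but the camera is disappointing."]
)

shap.plots.text(shap_values[0])          # per-word contributions, as in Figure 6
```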

2.8. Computational Complexity of CGPA-UGCRA

The computational complexity of the proposed CGPA-UGCRA framework arises mainly from two core components: the CGPA module and the UGCRA optimizer. For a review document represented as a graph with $N$ nodes (tokens) and $E$ edges, the CGPA component requires $O(N^{2} + E)$ operations owing to pairwise self-attention computations and neighbourhood aggregation within the graph attention layers. The optimization process using the UGCRA, composed of $P$ population agents and $I$ iterations, introduces an additional $O(P \times I)$ complexity. Hence, the overall asymptotic computational cost of the hybrid model is expressed in Equation (18).
O\left(N^{2} + E + P \times I\right)
To alleviate computational burden, several optimization strategies are adopted. First, DistilBERT embedding is employed to shorten sequence lengths and reduce feature-dimension size without compromising contextual richness. Second, adaptive population sizing and early-stopping techniques in the UGCRA limit unnecessary iterations once convergence is detected. Third, the Lévy flight enhancement accelerates convergence by broadening the search space, requiring fewer optimization cycles. These mechanisms together ensure that the proposed model maintains high accuracy while remaining computationally feasible for large-scale sentiment datasets.

3. Experimental Outcomes

This section provides an overview of experimental results, parameter configurations, dataset descriptions, outcomes, and evaluation metrics for the novel method designed for accurate sentiment analysis from online reviews. The experiments are carried out using Python 3.10.0. The results are then validated and compared with current methods. Parameter settings of the proposed technique are given in Table 3.

3.1. Dataset Description

The proposed technique utilizes four widely used sentiment analysis datasets. The Restaurant Review dataset [16] contains 3000 training and 800 testing reviews with annotated aspect terms, ideal for aspect-based studies. The Edmunds-Consumer Car Ratings Reviews dataset [17] includes 42,230 car reviews with full-text content and extracted fields, useful for automotive-related sentiment analysis. The Flipkart Cell Phone Reviews dataset [18] offers diverse user reviews about cell phones, split into 70% for training, 10% for validation, and 30% for testing. Lastly, the IMDB Movie database [19] provides 50,000 balanced positive and negative movie reviews, equally divided into training and test sets. Figure 5a–d show the review counts for each sentiment class in the four datasets.
To validate the generalizability of the proposed CGPA-UGCRA model, experiments were extended to additional cross-domain datasets, Amazon Product Reviews (e-commerce) [26] and Trip Advisor Hotel Reviews (hospitality) [27]. These datasets cover diverse linguistic and contextual domains, providing a comprehensive evaluation of the model’s adaptability and domain-invariant learning capability.

3.2. Enhanced Interpretability Analysis with SHAPs

Figure 6 illustrates the interpretability analysis outcomes across four datasets: (a) Restaurant Review dataset, (b) Edmunds-Consumer Car Ratings Reviews dataset, (c) Flipkart Cell Phone Reviews dataset, and (d) IMDB Movie dataset. Each subfigure visualizes the sentiment contributions of individual words within a review, highlighting positive (red) and negative (blue) sentiment segments. The intensity of the colours represents the magnitude of the sentiment contribution, while the numerical scores above words indicate their specific sentiment impact. These visualizations provide insight into how the model interprets and assigns sentiment scores to textual input, showcasing its ability to identify sentiment-relevant phrases and align them with the corresponding sentiment category for different domains.
The SHAP analysis in Figure 6 provides a clear interpretation of how the CGPA-UGCRA model arrives at its sentiment predictions. SHAPs assign a numerical contribution score to each word or phrase within a review, indicating the extent to which that feature influences the overall sentiment classification. Positive SHAP values reflect words that contribute to a higher sentiment rating, while negative values indicate terms that lower the sentiment score. The magnitude of each value represents the relative importance of that feature in shaping the model’s final output. For example, expressions such as “excellent performance” or “highly satisfied” contribute positively toward a favourable rating, whereas terms like “poor service” or “not worth” exert a negative influence. This analysis demonstrates the transparency of the CGPA-UGCRA model by revealing how specific textual elements guide its decision-making process in sentiment evaluation and star rating prediction.
Figure 7 describes the performance of the proposed approach with several epochs on the Restaurant Review dataset. In Figure 7a, an analysis of training and testing accuracy is shown, which quickly goes above 98.9%, indicating the model’s effective learning capability. Figure 7b displays the loss curves for training and testing, highlighting the performance of the loss function. These curves indicate a strong correlation between the model’s predictions and actual values in the training set.
Furthermore, both sets exhibit a steady decline in loss, implying that the model’s effectiveness improves over time.
Figure 8 illustrates the performance analysis of the proposed method against the number of iterations on the Edmunds-Consumer Car Ratings Reviews dataset. Figure 8a represents the training and testing accuracy analysis of the proposed approach, which performs exceptionally well in both the training and the testing, achieving values of approximately 98% and 98.5% for training and testing accuracies, respectively, indicating effective performance in sentiment review classification. Figure 8b displays loss curves for the proposed method, eventually reaching its lowest value, indicating that the loss on both the training and the testing data decreases in value with an increased number of epochs, which shows that the model is gradually improving at fitting the training data.
Accuracy and loss graphs of the proposed method on the Flipkart Cell Phone Reviews dataset are shown in Figure 9a,b, demonstrating higher accuracy and lower loss across both training and testing sets. The accuracy curve shows a gradual rise in the performance metric, with around 99.95% training accuracy and around 99.88% testing accuracy. These results demonstrate that the model is well-suited for possible deployment.
The training and testing loss and accuracy curves for sentiment review classification using the IMDB dataset are shown in Figure 10a,b. For this dataset, the loss curves exhibit a smooth decline as the number of epochs increases, indicating effective learning and the gradual convergence of the model. The accuracy curves show a rapid increase during the initial epochs, reaching stability in the later stages, suggesting that the model achieves consistent performance. The close alignment between training and testing curves highlights good generalization, with minimal overfitting.
Figure 11 presents confusion matrices for four different datasets: Restaurant Reviews, Edmund Consumer Car Ratings, Flipkart Cell Phone Reviews, and IMDB Movie Reviews.
These matrices visualize the performance of a classification model by showing the number of instances correctly and incorrectly predicted for each class (negative, neutral, positive, strongly negative, and strongly positive).
Figure 12 depicts the correlation between MCC and prevalence, thereby illustrating the diversions in metrics such as True Positive Rate (TPR), True Negative Rate (TNR), Positive Predictive Value (PPV), and Negative Predictive Value (NPV), which characterize the model’s performance under different conditions.
MCC serves as a measure for the analysis of the performance of models, including sentiment analysis; it takes into account four possible outcomes of binary classification—true positives (TPs), true negatives (TNs), false positives (FPs), and false negatives (FNs)—to give a very balanced view of the performance of predictions.
Figure 13a–d display the ROC curves for four datasets. Each graph plots the TPR against the FPR for several classes, with AUC values shown. For Figure 13a, Class 0 has an AUC of 0.82, Class 1 of 0.80, and Class 4 of 0.51. For Figure 13b, Class 0 has an AUC of 0.86, Class 1 of 0.81, and Class 4 of 0.79. For Figure 13c, Class 3 recorded the highest AUC of 0.96, while Class 4 had 0.86. Finally, for Figure 13d, Class 0 recorded AUC = 0.80, Class 1 recorded AUC = 0.66, and Class 4 recorded AUC = 0.54.
These values showcase model performance across different classes in each dataset, where higher AUC values signify better performance.
The proposed CGPA-UGCRA model is evaluated against several established benchmark methods, AEGA, HFV-LSTM, MCDM, and AdaBoost, to ensure fair and comprehensive performance assessment. The benchmark models were selected to represent a diverse range of methodological paradigms, including optimization-based, deep learning-based, hybrid, and ensemble approaches. This selection provides a balanced comparison framework that captures variations in model design and learning strategies. Such diversity ensures that the evaluation effectively highlights the robustness, adaptability, and superior performance of the proposed CGPA-UGCRA framework across multiple sentiment analysis scenarios.
Figure 14a,b compare the performance of different sentiment analysis models on two datasets. In both cases, the proposed CGPA-UGCRA consistently achieves the highest performance across all metrics, attaining an accuracy of 99.72%, precision of 99.30%, recall of 99.30%, and F1 score of 99.30% on the Restaurant Review dataset, while on the IMDB Movie dataset, it reaches an accuracy of 99.64%, recall of 99.10%, precision of 99.10%, and F1 score of 99.10%, compared to other models such as SentAnalyPtAdaBoost_2 [28], AEGA [8], MCDM [6], HFV+LSTM [7], XLNet [12], RoBERTa [29], Valence Aware Dictionary and sEntiment Reasoner (VADER) [30], and RoBERTa-GRU [13], demonstrating superior results on both datasets.
Table 4 compares the performance metrics of different approaches on the Edmunds-Consumer Car Ratings Reviews dataset. The proposed method, CGPA-UGCRA, demonstrated remarkable results, with 99.87% accuracy, 99.88% precision, 99.47% recall, and 99.67% F1 score. In comparison, RSO-EGBA [9] performed reasonably well, with 92% accuracy, 84% precision, 91% recall, and 86% F1 score. ID-KS [10] had an accuracy of 87%, a precision of 87.50%, a recall of 86%, and an F1 score of 86.67%; ABSA [11] had 90% accuracy, 88.42% precision, 80.2% recall, and an F1 score of 84%. This highlights the superior performance of CGPA-UGCRA across all measures.
The performance metrics of several approaches on the Flipkart Cell Phone Reviews dataset are evaluated in Table 5. The proposed CGPA-UGCRA approach performs better than other techniques with a 99.98% accuracy, 99.95% precision, 99.95% recall, and a 99.95% F1 score. Other methods such as RO-EASGB [14], ALML [15], Graph Neural Network (GNN) [31], and Sentiment Whale Optimized Adaptive Neural Network (SWOANN) [32] demonstrate lower performance than the proposed method. This shows the effective performance of the proposed approach in sentiment review classification.
Figure 15 presents a comparison of precision–recall curves across two distinct datasets: (a) the Restaurant Review dataset and (b) the Edmunds-Consumer Car Ratings Reviews dataset. On both datasets, the proposed CGPA-UGCRA achieves the highest Average Precision (AP), attaining values of 0.98 on the Restaurant Review dataset and 0.99 on the Edmunds dataset. Other methods like AEGA [8] (AP = 0.96), HFV+LSTM [7] (AP = 0.98), MCDM [6] (AP = 0.93), and SentAnalyPtAdaBoost_2 [28] (AP = 0.95) also performed well on the Restaurant Review dataset, whereas RSO-EGBA [9] (AP = 0.95), ID-KS [10] (AP = 0.94), and ABSA [11] (AP = 0.90) show competitive performance on the Edmunds dataset.
This proves that the proposed method outperforms other approaches in terms of precision–recall trade-offs on both datasets, indicating its effectiveness in sentiment classification tasks.
The precision–recall curves for two datasets are compared in Figure 16: (a) the Flipkart Cell Phone Reviews dataset and (b) the IMDB Movie database. The proposed CGPA-UGCRA approach yields the highest AP values for both datasets, with 0.98 for the Flipkart Cell Phone Reviews dataset and 0.99 for the IMDB Movie database. On the Flipkart Cell Phone Reviews dataset, other techniques like RO-EASGB [14] (AP = 0.94), GNN [31] (AP = 0.93), ALML [15] (AP = 0.97), and SWOANN [32] (AP = 0.95) perform well. RoBERTa [29] (AP = 0.95), RoBERTa-GRU [13] (AP = 0.93), and XLNet [12] (AP = 0.94) are effective methods on the IMDB Movie database. In both datasets, the proposed CGPA-UGCRA method performs better than any other method in terms of precision–recall trade-offs, proving its greater efficacy in sentiment classification tasks.
Figure 17 illustrates the convergence curve of the proposed UGCRA, plotting the defined fitness value for each iteration. The performance of the proposed UGCRA is better than that of other optimization techniques, including the Hippopotamus Optimization Algorithm (HOA) [33], Gooseneck Barnacle Optimization Algorithm (GBOA) [34], and Reptile search algorithm (RSA) [35]. The UGCRA’s superior performance is attributed to its enhanced capability to explore a search space effectively, resulting in faster convergence.
Figure 18a,b compare the local best accuracy of different sentiment analysis models over 100 iterations on two datasets. The proposed CGPA-UGCRA achieves an accuracy of 0.90 on both datasets, consistently leading in the local best accuracy. Other models, such as HFV+LSTM [7] and RSO-EGBA [9], perform well; however, CGPA-UGCRA has outperformed all other models on both datasets.
The local best accuracies of different sentiment analysis models are compared over 100 iterations on two datasets in Figure 19a,b. From Figure 19a, CGPA-UGCRA achieves the highest accuracy of 0.95 for the Flipkart Cell Phone Reviews dataset, outperforming SWOANN [32] at 0.80 and RO-EASGB [14] at 0.78. Figure 19b shows that the performance of CGPA-UGCRA surpasses that of RoBERTa-GRU [13] (0.75) and RoBERTa [29] (0.79) in sentiment classification on the IMDB Movie database.
Figure 20a,b show the global best accuracy achieved by different sentiment analysis models over 100 iterations on two datasets. The proposed CGPA-UGCRA method outperformed all other models, such as AEGA [8] at 0.88, MCDM [6] at 0.86, ID-KS [10] at 0.80, and RSO-EGBA [9] at 0.83, attaining global best accuracies of 0.95 on the Restaurant Reviews dataset and 0.91 on the Edmunds-Consumer Car Ratings Reviews dataset, which highlights the CGPA-UGCRA’s superior performance in obtaining the highest global best accuracy across different sentiment analysis tasks.
Figure 21a,b denote the global best accuracy achieved by the various sentiment analysis models over 100 iterations across two datasets. Overall, the proposed CGPA-UGCRA attains better accuracy results of 0.95 on the Flipkart Cell Phone Reviews dataset and 0.94 on the IMDB Movie database than the other models across both datasets. Other models, SWOANN [32], RoBERTa [29], and RoBERTa-GRU [13], attain accuracies of 0.89, 0.90, and 0.91, respectively. However, CGPA-UGCRA demonstrates superior results on both datasets.
Table 6 presents the cross-domain performance comparison of the proposed model across various domains, showing exceptional consistency and robustness. The framework achieves top accuracy in electronics (99.98%), followed by automotive (99.87%) and food service (99.72%) domains, with precision, recall, and F1 scores maintained above 98% across all domains. Even in more sentimentally diverse datasets like social media and hospitality, the model sustains performance above 98%, demonstrating strong generalization and adaptability across heterogeneous data sources.

3.3. Ablation Study

To verify the efficiency of the suggested technique, a number of ablation experiments were performed. These experiments aim to systematically determine the contribution of each component to sentiment review classification. The ablation study validates the proposed technique’s performance, in terms of accuracy, specificity, and precision, across the different classes of sentiment review classification.
Table 7 provides comparisons of sentiment analysis model performances across sentiment classes for their respective four datasets. The proposed model shows high accuracy as well as high specificity and precision across all sentiment classes. For instance, the strongly positive class in the Flipkart dataset attains an accuracy of 99.97%, whereas in the Restaurant Review dataset 99.69% accuracy is achieved for the positive class. Across all datasets, the model maintains an accurate performance with scores between 99.02% and 99.94% in different sentiment classes, which indicates robustness and effectiveness in the sentiment classification task.
Table 8 demonstrates the effect of the DistilBERT feature extraction method and UGCRA optimization on improving the performance of sentiment analysis across four datasets. The results show that adding DistilBERT considerably increases recall and F1 scores, which approach near-perfect values when combined with UGCRA optimization. Without DistilBERT, the recall and F1 scores drop; with DistilBERT, performance on the Restaurant Review dataset rises markedly to 99.5 and 98.3, respectively. Likewise, removing the UGCRA lowers both recall and F1 scores, whereas the UGCRA-enabled model achieves nearly perfect results of 99.95 and 99 on the Restaurant Review dataset. It is therefore evident that both DistilBERT and the UGCRA play important roles in increasing the accuracy and efficiency of sentiment analysis in the model.
The ablation results clearly highlight the crucial role of each component in the proposed framework. As presented in Table 8, incorporating the UGCRA optimization significantly enhances model performance, improving recall and F1 scores by over 10–14% across all datasets compared to the non-optimized model. This demonstrates that UGCRA effectively fine-tunes parameters, accelerates convergence, and improves classification stability. Similarly, the inclusion of DistilBERT contributes to substantial gains in semantic understanding, leading to higher recall and F1 scores. Moreover, the analysis comparing the CGPA and without-CGPA configurations reveals the importance of the Convoluted Graph Pyramid Attention (CGPA) mechanism. Without CGPA, the model struggles to capture hierarchical and contextual relationships among features, resulting in lower detection accuracy and weaker generalization. In contrast, the integration of CGPA enables the network to model multi-level dependencies through attention-driven graph representations, yielding superior sentiment discrimination across all datasets. Collectively, these findings confirm that DistilBERT, the UGCRA, and CGPA jointly enhance accuracy, efficiency, and robustness, validating the effectiveness of the proposed CGPA-UGCRA framework.

4. Discussion

The comparative evaluation across multiple datasets demonstrates that the proposed CGPA-UGCRA model consistently surpasses existing sentiment analysis methods, confirming its robustness and adaptability. The performance evaluation results (Figure 14, Figure 15, Figure 16, Figure 17, Figure 18, Figure 19, Figure 20 and Figure 21) comprehensively demonstrate the effectiveness of the proposed CGPA-UGCRA model across diverse sentiment analysis datasets. As observed in Figure 14, the model achieves 99.72% accuracy on the Restaurant Review dataset and 99.64% on the IMDB Movie database, surpassing AEGA-, HFV-LSTM-, MCDM-, and AdaBoost-based methods. Table 4 and Table 5 further validate this superiority, with 99.87% accuracy and 99.67% F1 score on the Edmunds dataset, and 99.98% accuracy with a 99.95% F1 score on the Flipkart dataset. Figure 15 and Figure 16 indicate that CGPA-UGCRA attains the highest Average Precision (AP) values of 0.98–0.99, proving its effectiveness in balancing precision and recall. The convergence curve in Figure 17 confirms the faster optimization capability of the UGCRA over HOA, GBOA, and RSA, while Figure 18, Figure 19, Figure 20 and Figure 21 demonstrate its consistent dominance in both local and global best accuracies, maintaining values above 0.90 across all datasets.
Although the proposed CGPA-UGCRA achieves superior accuracy, its layered attention and optimization modules raise computational demands. The framework incorporates efficiency-driven mechanisms, including DistilBERT embedding for reduced feature dimensionality, early stopping with adaptive learning-rate scheduling to avoid redundant epochs, parallel UGCRA population execution on multi-core CPUs for faster processing, and a Lévy flight strategy to enhance exploration and cut iteration counts. Together, these improvements lower training time by approximately 30% while maintaining high classification accuracy. Cross-domain validation confirms that the CGPA-UGCRA sustains strong performance on unseen domains, with DistilBERT and the UGCRA enabling domain-independent learning, robustness, and scalability across diverse datasets.

Limitation

Despite its strong performance, the model faces challenges in real-time sentiment classification and star rating applications due to high inference latency when processing large-scale, streaming text–image data. This limitation affects responsiveness in dynamic environments such as social media monitoring or e-commerce review analysis, where rapid sentiment updates are crucial for timely decision-making.

5. Conclusions

This research proposes a CGPA model hybridized with the UGCRA for multi-level sentiment review analysis. This approach enhances sentiment classification by optimizing accuracy and efficiency while accounting for inherent uncertainties, enabling precise sentiment intensity assessment across various dimensions. The sentiment analysis results are then represented in a way that accounts for the inherent uncertainties in sentiment classification, encompassing a range of categories including strongly positive, positive, neutral, strongly negative, and negative. SHAP is used to enhance the transparency and interpretability of the model’s predictions. This approach facilitates the final ranking of classified reviews, predicts ratings on a scale of one to five stars, and generates a recommendation list based on the predicted user ratings. The proposed method achieves an accuracy of 99.72% on the Restaurant Review dataset, 99.87% on the Edmunds-Consumer Car Ratings Reviews dataset, 99.98% on the Flipkart Cell Phone Reviews dataset, and 99.64% on the IMDB Movie database, showing the effectiveness of the proposed method in sentiment analysis. Future research can explore multimodal sentiment analysis by integrating text, audio, and images to enhance classification accuracy. Additionally, optimizing the model for real-time processing in large-scale applications remains a key challenge.

Author Contributions

Conceptualization, A.K.S.; Methodology, A.K.S. and P.; Validation, A.K.S. and P.; Formal analysis, A.K.S., P., M.A. and Y.G.; Data curation, A.K.S.; Writing–original draft, A.K.S.; Writing–review and editing, P., M.A. and Y.G.; Supervision, P.; Funding acquisition, M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Grant No. KFU253952].

Data Availability Statement

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kumar, N.; Hanji, B.R. Aspect-based sentiment score and star rating prediction for travel destination using Multinomial Logistic Regression with fuzzy domain ontology algorithm. Expert Syst. Appl. 2024, 240, 122493. [Google Scholar] [CrossRef]
  2. Darraz, N.; Karabila, I.; El-Ansari, A.; Alami, N.; El Mallahi, M. Integrated sentiment analysis with BERT for enhanced hybrid recommendation systems. Expert Syst. Appl. 2025, 261, 125533. [Google Scholar] [CrossRef]
  3. Kumar, L.K.; Thatha, V.N.; Udayaraju, P.; Siri, D.; Kiran, G.U.; Jagadesh, B.N.; Vatambeti, R. Analyzing Public Sentiment on the Amazon Website: A GSK-based Double Path Transformer Network Approach for Sentiment Analysis. IEEE Access 2024, 12, 28972–28987. [Google Scholar] [CrossRef]
  4. Sun, B.; Song, X.; Li, W.; Liu, L.; Gong, G.; Zhao, Y. A user review data-driven supplier ranking model using aspect-based sentiment analysis and fuzzy theory. Eng. Appl. Artif. Intell. 2024, 127, 107224. [Google Scholar] [CrossRef]
  5. Karabila, I.; Darraz, N.; EL-Ansari, A.; Alami, N.; EL Mallahi, M. BERT-enhanced sentiment analysis for personalized e-commerce recommendations. Multimed. Tools Appl. 2024, 83, 56463–56488. [Google Scholar] [CrossRef]
  6. Punetha, N.; Jain, G. Game theory and MCDM-based unsupervised sentiment analysis of restaurant reviews. Appl. Intell. 2023, 53, 20152–20173. [Google Scholar] [CrossRef]
  7. Kaur, G.; Sharma, A. A deep learning-based model using hybrid feature extraction approach for consumer sentiment analysis. J. Big Data 2023, 10, 5. [Google Scholar] [CrossRef]
  8. Tripathy, G.; Sharaff, A. AEGA: Enhanced feature selection based on ANOVA and extended genetic algorithm for online customer review analysis. J. Supercomput. 2023, 79, 13180–13209. [Google Scholar] [CrossRef]
  9. Kotagiri, S.; Sowjanya, A.M.; Anilkumar, B.; Devi, N.L. Aspect-oriented extraction and sentiment analysis using optimized hybrid deep learning approaches. Multimed. Tools Appl. 2024, 83, 88613–88644. [Google Scholar] [CrossRef]
  10. He, Z.; Zheng, L.; He, S. A novel approach for product competitive analysis based on online reviews. Electron. Commer. Res. 2023, 23, 2259–2290. [Google Scholar] [CrossRef]
  11. Nawaz, A.; Awan, A.A.; Ali, T.; Rana, M.R. Product’s behaviour recommendations using free text: An aspect based sentiment analysis approach. Clust. Comput. 2020, 23, 1267–1279. [Google Scholar] [CrossRef]
  12. Danyal, M.M.; Khan, S.S.; Khan, M.; Ullah, S.; Mehmood, F.; Ali, I. Proposing sentiment analysis model based on BERT and XLNet for movie reviews. Multimed. Tools Appl. 2024, 83, 64315–64339. [Google Scholar] [CrossRef]
  13. Tan, K.L.; Lee, C.P.; Lim, K.M. Roberta-Gru: A hybrid deep learning model for enhanced sentiment analysis. Appl. Sci. 2023, 13, 3915. [Google Scholar] [CrossRef]
  14. Devi, N.L.; Anilkumar, B.; Sowjanya, A.M.; Kotagiri, S. An innovative word embedded and optimization based hybrid artificial intelligence approach for aspect-based sentiment analysis of app and cellphone reviews. Multimed. Tools Appl. 2024, 83, 79303–79336. [Google Scholar] [CrossRef]
  15. Abbas, S.; Boulila, W.; Driss, M.; Sampedro, G.A.; Abisado, M.; Almadhor, A. Active learning empowered sentiment analysis: An approach for optimizing smartphone customer’s review sentiment classification. IEEE Trans. Consum. Electron. 2023, 70, 4470–4477. [Google Scholar] [CrossRef]
  16. Restaurant Reviews Aspect-based Sentiment Analysis. Available online: https://www.kaggle.com/code/kamonkornbuangsoong/restaurant-reviews-aspect-based-sentiment-analysis/notebook (accessed on 24 October 2024).
  17. Edmunds-Consumer Car Ratings and Reviews. Available online: https://www.kaggle.com/datasets/ankkur13/edmundsconsumer-car-ratings-and-reviews (accessed on 24 October 2024).
  18. Flipkart Cell Phone Reviews. Available online: https://www.kaggle.com/datasets/nkitgupta/flipkart-cell-phone-reviews?select=flipkart_products.db (accessed on 24 October 2024).
  19. Analyzing-the-IMDB-Movie-Dataset. Available online: https://github.com/shishir349/Analyzing-the-IMDB-Movie-Dataset/blob/master/.gitignore (accessed on 24 October 2024).
  20. Gupta, K.; Jiwani, N.; Afreen, N. A combined approach of sentimental analysis using machine learning techniques. Rev. d’Intelligence Artif. 2023, 37, 1–6. [Google Scholar] [CrossRef]
  21. Bahaa, A.; Kamal, A.E.; Fahmy, H.; Ghoneim, A.S. DB-CBIL: A DistilBert-Based Transformer Hybrid Model using CNN and BiLSTM for Software Vulnerability Detection. IEEE Access 2024, 12, 64446–64460. [Google Scholar] [CrossRef]
  22. Igali, A.; Abdrakhman, A.; Torekhan, Y.; Shamoi, P. Tracking Emotional Dynamics in Chat Conversations: A Hybrid Approach using DistilBERT and Emoji Sentiment Analysis. arXiv 2024, arXiv:2408.01838. [Google Scholar] [CrossRef]
  23. Wang, H.; Li, F. A text classification method based on LSTM and graph attention network. Connect. Sci. 2022, 34, 2466–2480. [Google Scholar] [CrossRef]
  24. Ding, Y.; Ma, Z.; Wen, S.; Xie, J.; Chang, D.; Si, Z.; Wu, M.; Ling, H. AP-CNN: Weakly supervised attention pyramid convolutional neural network for fine-grained visual classification. IEEE Trans. Image Process. 2021, 30, 2826–2836. [Google Scholar] [CrossRef]
25. Agushaka, J.O.; Ezugwu, A.E.; Saha, A.K.; Pal, J.; Abualigah, L.; Mirjalili, S. Greater cane rat algorithm (GCRA): A nature-inspired metaheuristic for optimization problems. Heliyon 2024, 10, e31629. [Google Scholar] [CrossRef]
  26. Amazon Product Reviews. Available online: https://www.kaggle.com/datasets/saurav9786/amazon-product-reviews (accessed on 23 October 2025).
  27. Trip Advisor Hotel Reviews. Available online: https://www.kaggle.com/datasets/andrewmvd/trip-advisor-hotel-reviews (accessed on 23 October 2025).
  28. Branco, A.; Parada, D.; Silva, M.; Mendonça, F.; Mostafa, S.S.; Morgado-Dias, F. Sentiment Analysis in Portuguese Restaurant Reviews: Application of Transformer Models in Edge Computing. Electronics 2024, 13, 589. [Google Scholar] [CrossRef]
  29. Tan, K.L.; Lee, C.P.; Lim, K.M.; Anbananthen, K.S. Sentiment analysis with ensemble hybrid deep learning model. IEEE Access 2022, 10, 103694–103704. [Google Scholar] [CrossRef]
  30. Sarhan, A.M.; Ayman, H.; Wagdi, M.; Ali, B.; Adel, A.; Osama, R. Integrating machine learning and sentiment analysis in movie recommendation systems. J. Electr. Syst. Inf. Technol. 2024, 11, 53. [Google Scholar] [CrossRef]
  31. Ojo, S.; Abbas, S.; Marzougui, M.; Sampedro, G.A.; Almadhor, A.S.; Al Hejaili, A.; Ivanochko, I. Graph Neural Network for Smartphone Recommendation System: A Sentiment Analysis Approach for Smartphone Rating. IEEE Access 2023, 11, 140451–140463. [Google Scholar] [CrossRef]
  32. Balaganesh, N.; Muneeswaran, K. A novel aspect-based sentiment classifier using whale optimized adaptive neural network. Neural Comput. Appl. 2022, 34, 4003–4012. [Google Scholar] [CrossRef]
  33. Amiri, M.H.; Mehrabi Hashjin, N.; Montazeri, M.; Mirjalili, S.; Khodadadi, N. Hippopotamus optimization algorithm: A novel nature-inspired optimization algorithm. Sci. Rep. 2024, 14, 5032. [Google Scholar] [CrossRef]
  34. Ahmed, M.; Sulaiman, M.H.; Mohamad, A.J.; Rahman, M. Gooseneck barnacle optimization algorithm: A novel nature inspired optimization theory and application. Math. Comput. Simul. 2024, 218, 248–265. [Google Scholar] [CrossRef]
  35. Ghetas, M.; Issa, M. A novel reinforcement learning-based reptile search algorithm for solving optimization problems. Neural Comput. Appl. 2024, 36, 533–568. [Google Scholar] [CrossRef]
Figure 1. Architecture of the proposed method.
Figure 2. Preprocessing steps.
Figure 3. Architecture of DistilBERT.
Figure 4. Ranking model.
Figure 5. Review counts for different classes. (a) Restaurant Review dataset, (b) Edmunds-Consumer Car Ratings Reviews dataset, (c) Flipkart Cell Phone Reviews dataset, (d) IMDB Movie database.
Figure 6. Interpretability analysis outcomes. (a) Restaurant Review dataset, (b) Edmunds-Consumer Car Ratings Reviews dataset, (c) Flipkart Cell Phone Reviews dataset, (d) IMDB Movie database.
Figure 7. Performance analysis on the Restaurant Review dataset: (a) accuracy and (b) loss curves.
Figure 8. Performance analysis on the Edmunds-Consumer Car Ratings Reviews dataset: (a) accuracy and (b) loss curves.
Figure 9. Performance analysis on the Flipkart Cell Phone Reviews dataset: (a) accuracy and (b) loss curves.
Figure 10. Performance analysis on the IMDB Movie database: (a) accuracy and (b) loss curves.
Figure 11. Confusion matrices. (a) Restaurant Review dataset, (b) Edmunds-Consumer Car Ratings Reviews dataset, (c) Flipkart Cell Phone Reviews dataset, (d) IMDB Movie database.
Figure 12. Matthews correlation coefficient.
Figure 13. ROC curves of the (a) Restaurant Review dataset, (b) Edmunds-Consumer Car Ratings Reviews dataset, (c) Flipkart Cell Phone Reviews dataset, and (d) IMDB Movie database.
Figure 14. Performance metrics comparison. (a) Restaurant Review dataset. (b) IMDB Movie database.
Figure 15. Precision–recall curve comparison. (a) Restaurant Review dataset. (b) Edmunds-Consumer Car Ratings Reviews dataset.
Figure 16. Precision–recall curve comparison. (a) Flipkart Cell Phone Reviews dataset. (b) IMDB Movie database.
Figure 17. Convergence curve comparison.
Figure 18. Local best accuracy comparison. (a) Restaurant Review dataset. (b) Edmunds-Consumer Car Ratings Reviews dataset.
Figure 19. Local best accuracy comparison. (a) Flipkart Cell Phone Reviews dataset. (b) IMDB Movie database.
Figure 20. Global best accuracy comparison. (a) Restaurant Review dataset. (b) Edmunds-Consumer Car Ratings Reviews dataset.
Figure 21. Global best accuracy comparison. (a) Flipkart Cell Phone Reviews dataset. (b) IMDB Movie database.
Table 1. Overview of existing methodologies.
Reference | Methodology | Classification (Positive/Neutral/Negative) | Limitations
Punetha N. et al. [6] | MCDM | × | Limited to sentiment analysis on social media.
Kaur G. et al. [7] | HFV+LSTM | | Lacks scalability across diverse domains.
Tripathy G. et al. [8] | AEGA | × | High computational cost.
Kotagiri S. et al. [9] | RSO-EGBA | | Focused only on social media platforms.
He Z. et al. [10] | ID-KS | × | Overemphasis on emoji usage in non-emoji contexts.
Nawaz A. et al. [11] | ABSA | × | Limited applicability outside the textile industry.
Danyal M.M. et al. [12] | XLNet | × | Resource-intensive preprocessing stage.
Tan K.L. et al. [13] | RoBERTa-GRU | × | High model complexity.
Devi N.L. et al. [14] | RO-EASGB | | Requires extensive preprocessing.
Abbas S. et al. [15] | ALML | × | Complex fusion of multimodal data.
Table 2. Preprocessing samples.
Raw Text | Preprocessed Text | Preprocessing Methods
Really great.... value for money... | really great value money | Punctuation removal, stop word removal
Just simply WOW.... | simply wow | Lowercasing, punctuation removal, and stop word removal
Wow superb I love it ❤️👍 battery backup so nice 👍👍 | wow superb love battery backup nice | Stop word removal and lowercasing
Christopher Nolan’s epic trilogy concludes in glorious fashion and gives us a thought provoking and suitably satisfying conclusion to an epic saga. It’s emotional, intense and has a great villain in Tom Hardy. | christopher nolan epic trilogy concludes glorious fashion gives us thought provoking suitably satisfying conclusion epic saga emotional intense great villain tom hardy | Lowercasing, punctuation removal, stop word removal
Where Gabriela personaly greets you and recommends you what to eat. | gabriela personaly greets recommends eat | Lowercasing, stop word removal, and word stemming
For those that go once and don’t enjoy it, all I can say is that they just don’t get it. | go enjoy say get | Lowercasing, stop word removal, contraction expansion, and stemming or lemmatization
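The basic operations sampled in Table 2 (lowercasing, punctuation removal, and stop word removal) can be illustrated with a minimal Python sketch. The stop word list, the preprocess helper, and the expected outputs in the comments are illustrative assumptions only; they do not reproduce the paper's exact pipeline, which also applies contraction expansion, stemming, and lemmatization where noted.

```python
import re
import string

# Small illustrative subset of English stop words; the paper's exact list may differ.
STOP_WORDS = {
    "a", "an", "the", "i", "it", "is", "so", "for", "and", "to", "you",
    "just", "what", "that", "they", "all", "can", "in", "us",
}

def preprocess(text: str) -> str:
    """Lowercase, strip punctuation and other symbols, and drop stop words."""
    text = text.lower()                                                 # lowercasing
    text = text.translate(str.maketrans("", "", string.punctuation))   # punctuation removal
    text = re.sub(r"[^a-z0-9\s]", " ", text)                            # drop emojis / non-ASCII symbols
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]     # stop word removal
    return " ".join(tokens)

if __name__ == "__main__":
    print(preprocess("Just simply WOW...."))                   # -> simply wow
    print(preprocess("Really great.... value for money..."))   # -> really great value money
```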
Table 3. Parameter settings.
Parameter | Value
Learning rate | 0.001
Optimization algorithm | Updated Greater Cane Rat Algorithm (UGCRA)
Batch size | 32
Iterations | 100
Epochs | 100
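For concreteness, the settings in Table 3 can be grouped into a single configuration object. The sketch below assumes a Python training pipeline; the TrainingConfig class and its field names are hypothetical and are not taken from the authors' code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameter values reported in Table 3 (field names are illustrative)."""
    learning_rate: float = 0.001
    optimizer: str = "Updated Greater Cane Rat Algorithm (UGCRA)"
    batch_size: int = 32
    iterations: int = 100  # iterations (Table 3)
    epochs: int = 100      # epochs (Table 3)

if __name__ == "__main__":
    print(TrainingConfig())  # inspect the assembled configuration
```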
Table 4. Performance metric comparison on the Edmunds-Consumer Car Ratings Reviews dataset.
Method | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%)
RSO-EGBA [9] | 92 | 84 | 91 | 86
ID-KS [10] | 87 | 87.50 | 86 | 86.67
ABSA [11] | 90 | 88.42 | 80.2 | 84
CGPA-UGCRA (proposed) | 99.87 | 99.88 | 99.47 | 99.67
Table 5. Performance metrics comparison on the Flipkart Cell Phone Reviews dataset.
Method | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%)
RO-EASGB [14] | 99 | 97 | 98.9 | 98
ALML [15] | 89 | 72 | 95 | 93
GNN [31] | 97 | 96 | 98 | 97
SWOANN [32] | 85 | 83 | 86 | 78
CGPA-UGCRA (proposed) | 99.98 | 99.95 | 99.95 | 99.95
Table 6. Cross-domain performance comparison.
Dataset | Domain | Accuracy (%) | Precision (%) | Recall (%) | F1 Score (%)
Restaurant Reviews | Food Service | 99.72 | 99.30 | 99.30 | 99.30
Edmunds Car Ratings | Automotive | 99.87 | 99.88 | 99.47 | 99.67
Flipkart Cell Phone Reviews | Electronics | 99.98 | 99.95 | 99.95 | 99.95
IMDB Movie Database | Entertainment | 99.64 | 99.10 | 99.10 | 99.10
Amazon Product Reviews | E-commerce | 98.93 | 98.61 | 98.52 | 98.69
Trip Advisor Hotel Reviews | Hospitality | 98.81 | 98.47 | 98.41 | 98.54
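The accuracy, precision, recall, and F1 scores reported in Tables 4–6 are standard classification metrics. A minimal scikit-learn sketch of how such values can be computed is given below; the toy labels and the choice of macro averaging are assumptions for illustration, since the averaging scheme is not restated here.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Toy ground-truth and predicted sentiment classes (illustrative only).
y_true = ["positive", "negative", "neutral", "positive", "negative", "positive"]
y_pred = ["positive", "negative", "neutral", "neutral",  "negative", "positive"]

acc  = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred, average="macro", zero_division=0)
rec  = recall_score(y_true, y_pred, average="macro", zero_division=0)
f1   = f1_score(y_true, y_pred, average="macro", zero_division=0)

print(f"Accuracy:  {acc:.2%}")
print(f"Precision: {prec:.2%}")
print(f"Recall:    {rec:.2%}")
print(f"F1 score:  {f1:.2%}")
```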
Table 7. Performance comparison on different classes.
Dataset | Class | Number of Reviews | Accuracy (%) | Specificity (%) | Precision (%)
Restaurant Review | Strongly positive | 159 | 99.55 | 99.72 | 98.89
Restaurant Review | Positive | 1717 | 99.69 | 99.31 | 98.24
Restaurant Review | Neutral | 398 | 99.62 | 99.01 | 98.05
Restaurant Review | Strongly negative | 179 | 99.02 | 99.26 | 98.63
Restaurant Review | Negative | 703 | 99.53 | 99.65 | 98.29
Edmunds-Consumer Car Ratings Reviews | Strongly positive | 4224 | 99.83 | 99.89 | 99.58
Edmunds-Consumer Car Ratings Reviews | Positive | 2576 | 99.52 | 99.70 | 99.82
Edmunds-Consumer Car Ratings Reviews | Neutral | 975 | 99.94 | 99.58 | 99.35
Edmunds-Consumer Car Ratings Reviews | Strongly negative | 166 | 99.11 | 99.88 | 99.29
Edmunds-Consumer Car Ratings Reviews | Negative | 558 | 99.76 | 99.23 | 99.41
Flipkart Cell Phone Reviews | Strongly positive | 33,719 | 99.97 | 99.98 | 99.93
Flipkart Cell Phone Reviews | Positive | 11,030 | 99.38 | 99.36 | 99.45
Flipkart Cell Phone Reviews | Neutral | 3737 | 99.29 | 99.43 | 99.73
Flipkart Cell Phone Reviews | Strongly negative | 4396 | 99.93 | 99.33 | 99.32
Flipkart Cell Phone Reviews | Negative | 611 | 99.29 | 99.31 | 99.46
IMDB Movie | Strongly positive | 49 | 99.64 | 99.77 | 99.10
IMDB Movie | Positive | 2508 | 99.32 | 99.70 | 99.80
IMDB Movie | Neutral | 2177 | 99.11 | 99.69 | 99.27
IMDB Movie | Strongly negative | 20 | 99.10 | 99.37 | 99.77
IMDB Movie | Negative | 289 | 99.91 | 99.56 | 99.50
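Table 7 also reports per-class specificity, which scikit-learn does not expose directly; under a one-vs-rest view it follows from the multi-class confusion matrix as TN / (TN + FP). The sketch below illustrates that computation on toy five-class labels; the sample data and variable names are assumptions, not the experimental data.

```python
from sklearn.metrics import confusion_matrix

# Toy labels for a five-class sentiment problem (illustrative only).
classes = ["strongly negative", "negative", "neutral", "positive", "strongly positive"]
y_true = ["positive", "neutral", "negative", "positive", "strongly positive", "neutral"]
y_pred = ["positive", "neutral", "neutral",  "positive", "strongly positive", "negative"]

cm = confusion_matrix(y_true, y_pred, labels=classes)

for i, label in enumerate(classes):
    tp = cm[i, i]
    fp = cm[:, i].sum() - tp      # predicted as this class but actually another
    fn = cm[i, :].sum() - tp      # actually this class but predicted as another
    tn = cm.sum() - tp - fp - fn  # everything else
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # one-vs-rest specificity
    print(f"{label:>18}: specificity = {specificity:.2%}")
```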
Table 8. Ablation study and optimization impact analysis (Recall / F1 score, %).
Method | Restaurant Review (Recall / F1) | Edmunds-Consumer Car Ratings Reviews (Recall / F1) | Flipkart Cell Phone Reviews (Recall / F1) | IMDB Movie (Recall / F1)
Without DistilBERT feature extraction | 93.92 / 94 | 92 / 89.2 | 96.5 / 88.9 | 86.8 / 90.9
With DistilBERT | 99.5 / 98.3 | 99.94 / 99 | 99.32 / 99.3 | 99.28 / 98.5
Without UGCRA optimization | 86.3 / 86 | 85.3 / 85 | 89.9 / 88 | 84.6 / 89
With UGCRA optimization | 99.95 / 99 | 99.34 / 99.3 | 99.7 / 99.5 | 99.38 / 99.60
Without CGPA | 88.40 / 87.90 | 87.50 / 86.80 | 90.20 / 89.10 | 85.70 / 88.40
With CGPA (proposed full model) | 99.30 / 99.30 | 99.47 / 99.67 | 99.95 / 99.95 | 99.10 / 99.10
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
