Abstract
This research proposes a composite machine learning (ML) framework for real-time response to negative online reviews, grounded in the psychological principle of negative reinforcement. By integrating K-means clustering, which groups reviews by thematic similarity, with Bidirectional Encoder Representations from Transformers (BERT)-based sentiment analysis, which assesses emotional tone, the system identifies high-risk clusters requiring marketing intervention. Customized response strategies are designed based on cluster sentiment intensity, and their effectiveness is evaluated via sentiment transformation functions. The proposed model provides a practical and adaptive approach to digital marketing, enabling brands to respond rapidly, reduce dissatisfaction, and enhance consumer trust in a data-driven environment.
1. Introduction
In the era of digital interaction, online reviews have become a powerful form of consumer voice, shaping public perception and directly influencing purchasing decisions. While positive reviews can enhance brand image, negative online reviews can spread rapidly, harming trust and customer retention if not addressed promptly [1]. Traditional digital marketing models lack the responsiveness and psychological nuance to handle such feedback effectively.
The emergence of ML and natural language processing (NLP) has created new avenues for real-time content understanding and engagement. State-of-the-art models like BERT have transformed sentiment analysis with deep contextual comprehension [2]. Similarly, clustering techniques such as K-means help identify common themes in user complaints [3]. Yet, few systems integrate psychological principles, like negative reinforcement, into digital response strategies.
This research proposes a novel framework that combines BERT-based sentiment analysis, K-means clustering, and emotional transformation modeling rooted in negative reinforcement theory [4].
Specifically, this research aims to construct a composite ML system that detects and classifies negative online reviews, groups feedback into thematic clusters using K-means, analyzes emotional polarity via BERT, applies personalized, emotionally intelligent response strategies, and evaluates their effectiveness using sentiment transformation metrics.
Research Questions
The research questions are as follows:
- How can ML models be integrated to enable real-time emotional analysis of online reviews?
- Can clustering by theme improve the targeting of sentiment-aware marketing responses?
- How effective is a negative reinforcement model in altering consumer sentiment post-response?
- What quantitative measures best represent emotional recovery and brand trust enhancement?
This article is organized as follows: Section 2 presents the literature review results on digital marketing, sentiment analysis, clustering, and negative reinforcement. Section 3 describes the experimental design and theoretical modeling. Section 4 explains the implementation of the developed system. Section 5 concludes the study.
2. Literature Review
2.1. Digital Marketing in the Age of Online Reviews
Online reviews are a cornerstone of digital consumer engagement, providing both challenges and opportunities for brand management. Research shows that negative reviews can drastically influence purchasing decisions, especially when left unanswered [5]. Therefore, modern digital marketing must shift from passive content delivery to active, real-time sentiment engagement.
2.2. The Risk and Amplification of Negative Feedback
Social media and review platforms amplify the reach of dissatisfaction. Crises such as the United Airlines passenger removal incident in 2017 [6] and Starbucks’ racial profiling scandal [7] demonstrate how delayed or tone-deaf responses can intensify public backlash. Automated feedback systems must be emotionally attuned and psychologically responsive to prevent escalation.
2.3. Sentiment Analysis and NLP Techniques
Sentiment analysis decodes emotional intent in language. Earlier models, such as support vector machine (SVM) and lexicon-based classifiers, were limited in context awareness. BERT’s deep bidirectional transformers enable superior semantic comprehension, making it a state-of-the-art approach for sentiment classification. It is particularly effective for fine-tuning on domain-specific tasks such as review-level emotion detection.
2.4. Clustering for Thematic Detection: K-Means
K-means is a foundational clustering algorithm used for topic discovery in unlabeled data. Applied to term frequency–inverse document frequency (TF-IDF) or BERT-embedded vectors, it groups texts by thematic similarity, revealing latent concerns such as delivery delays or pricing dissatisfaction. This provides a structured basis for tailored response strategies.
2.5. Negative Reinforcement Theory in Marketing
Negative reinforcement, first formalized by Skinner, refers to increasing a desired behavior by removing an unpleasant stimulus. In marketing, this involves removing sources of dissatisfaction (e.g., by offering apologies or compensation) to reinforce consumer loyalty. Recent interdisciplinary studies emphasize the role of affective computing in emotionally calibrated communication [8].
2.6. Research Gap and Motivation
Although BERT and K-means are widely used in NLP, few frameworks synthesize ML with psychological theory for marketing automation. Existing models often lack real-time adaptability and emotional nuance. This study seeks to address this by developing an emotionally aware, learning-based feedback system for digital marketing.
3. Methodology
3.1. Overview of System Architecture
We developed a composite ML framework designed for real-time emotional analysis and response to negative online reviews. The system consists of the following major components:
- Data collection and preprocessing;
- Text embedding and feature representation;
- K-means clustering for topic segmentation;
- Sentiment classification via BERT;
- Sentiment transformation function modeling;
- System evaluation and feedback learning.
The overall process proposed in this research can be expressed abstractly as Equation (1):

$$r_i \;\rightarrow\; c_i \;\rightarrow\; s_i \;\rightarrow\; \Delta E_i \tag{1}$$

where $r_i$ is the i-th user review, $c_i$ is the cluster label assigned via K-means, $s_i$ is the sentiment output, and $\Delta E_i$ is the emotional shift post-intervention.
3.2. Data Collection and Preprocessing
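Online review data is collected using web scraping APIs and filtered to retain textual fields. Each review is tokenized and cleaned via standard preprocessing: lowercasing, punctuation removal, and stop word filtering. Let the cleaned corpus be represented as Equation (2):

$$\mathcal{D} = \{ r_1, r_2, \dots, r_N \}, \quad r_i \in \mathcal{T}^{*} \tag{2}$$

where $N$ is the number of reviews, $\mathcal{T}$ is the space of normalized text tokens, and each cleaned review $r_i$ is a sequence of tokens from $\mathcal{T}$.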
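The preprocessing steps named above can be sketched in a few lines. This is a minimal illustration only: the stop-word list `STOP_WORDS`, the function name `preprocess`, and the sample reviews are assumptions introduced here, not part of the original pipeline.

```python
import string

# Hypothetical, deliberately tiny stop-word list (assumption for illustration);
# a production system would rely on a fuller list.
STOP_WORDS = {"a", "an", "the", "is", "was", "and", "or", "to", "of", "it", "this"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(review):
    """Lowercase, remove punctuation, split on whitespace, drop stop words."""
    text = review.lower().translate(_PUNCT_TABLE)
    return [token for token in text.split() if token not in STOP_WORDS]


# Hypothetical sample reviews.
reviews = ["The dress was too SMALL, and it arrived late!", "Love this top."]
corpus = [preprocess(r) for r in reviews]
print(corpus)
```

Each element of `corpus` is a list of normalized tokens, which corresponds to one cleaned review in the corpus described above.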
3.3. Text Embedding and Feature Representation
The BERT model used in this study is based on the BERT-base architecture, whose hidden dimension of 768 provides a balanced design in terms of the following:
- Expressiveness: sufficient dimensionality to capture subtle linguistic dependencies;
- Computational efficiency: compact enough for effective training and inference;
- Transformer structure: the 768-dimensional embedding is evenly distributed across 12 attention heads (i.e., 64 units per head).
Each token in the input sentence, including the special [CLS] token summarizing the sentence semantics, is converted into a 768-dimensional vector. The BERT-base architecture is illustrated in Figure 1.
Figure 1.
BERT-base architecture [2].
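For this research, each preprocessed review is transformed into a high-dimensional semantic vector using BERT, as expressed in Equation (5), producing a document embedding matrix in the form of Equation (6):

$$(\mathbf{h}_1, \dots, \mathbf{h}_n) = \mathrm{BERT}(t_1, \dots, t_n), \qquad \mathbf{x} = \mathrm{Agg}(\mathbf{h}_1, \dots, \mathbf{h}_n) \in \mathbb{R}^{768} \tag{5}$$

$$X = [\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N]^{\top} \in \mathbb{R}^{N \times 768} \tag{6}$$

To prevent excessive dimensionality from hindering clustering, dimensionality-reduction techniques such as principal component analysis (PCA) [9] or t-distributed stochastic neighbor embedding (t-SNE) [10] were applied, as formulated in Equation (7):

$$X' = \mathrm{Reduce}(X) \in \mathbb{R}^{N \times d'}, \quad d' < 768 \tag{7}$$

where $t_i$ denotes the i-th token, $\mathbf{h}_i$ is its contextual embedding, and $\mathbf{x}$ represents the aggregated semantic representation of the entire review (for example, the [CLS] vector). In Equation (6), $\mathbf{x}_j$ is the vector of the j-th review and $N$ is the number of reviews. The resulting matrix $X$ may be optionally reduced to $X'$ to accelerate clustering and mitigate the curse of dimensionality.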
3.4. K-Means Clustering for Topic Segmentation
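To group semantically similar reviews, K-means clustering is applied to the (optionally reduced) embedding matrix $X'$. Assuming $K$ clusters, the objective function is defined in Equation (8), where $C_i$ denotes cluster i, and each review is assigned a label $c_j$ as expressed in Equation (9):

$$J = \sum_{i=1}^{K} \sum_{\mathbf{x}_j \in C_i} \lVert \mathbf{x}_j - \boldsymbol{\mu}_i \rVert^{2} \tag{8}$$

$$c_j = \arg\min_{i \in \{1, \dots, K\}} \lVert \mathbf{x}_j - \boldsymbol{\mu}_i \rVert^{2} \tag{9}$$

Here, $\mathbf{x}_j$ is the embedded vector of the j-th review, $C_i$ the i-th cluster, and $\boldsymbol{\mu}_i$ its centroid. The objective minimizes intra-cluster variance, ensuring that semantically similar reviews are grouped together. Each $c_j$ is determined by selecting the centroid with the smallest Euclidean distance to $\mathbf{x}_j$.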
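The assignment and update steps of K-means can be illustrated with a small pure-Python sketch of Lloyd's algorithm. The two-dimensional toy points stand in for reduced review embeddings and are purely hypothetical; the function names and the random initialization are assumptions made here for illustration, not the implementation used in the study.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: initialize from k random points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point takes the label of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
        if new_labels == labels:  # assignments unchanged, so the run has converged
            break
        labels = new_labels
    # Within-cluster sum of squared distances (the quantity K-means minimizes).
    objective = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, objective


# Hypothetical 2-D stand-ins for reduced review embeddings: two separated groups.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids, objective = kmeans(points, k=2)
print(labels, objective)
```

On well-separated data such as this toy set, the first three points and the last three points end up in different clusters; which cluster receives label 0 depends on the initialization.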
3.5. Sentiment Classification via BERT
The sentiment polarity of each review was estimated using a fine-tuned BERT classifier. Each input review was tokenized and encoded through BERT’s transformer layers to capture contextual dependencies among words. The hidden representation corresponding to the [CLS] token in the final layer, denoted $\mathbf{h}_{\text{[CLS]}}$, serves as a compact sentence-level embedding.
The encoded vector was fed into a fully connected layer followed by a softmax activation to produce a probability distribution over sentiment categories, as defined in Equation (10):

$$\mathbf{p}_i = \mathrm{softmax}\left(W \mathbf{h}_{\text{[CLS]}} + \mathbf{b}\right) \tag{10}$$

where $\mathbf{p}_i = (p_{i,\text{neg}}, p_{i,\text{neu}}, p_{i,\text{pos}})$ is the probability vector for the i-th review, $W \in \mathbb{R}^{3 \times d}$ and $\mathbf{b} \in \mathbb{R}^{3}$ are trainable parameters, and $d = 768$ is the embedding dimension. Each $p_{i,c}$ represents the likelihood of the review being negative, neutral, or positive, satisfying $\sum_{c} p_{i,c} = 1$.
To convert categorical probabilities into a continuous sentiment measure, a scalar sentiment score was computed for each review, as shown in Equation (11):

$$s_i = p_{i,\text{pos}} - p_{i,\text{neg}} \tag{11}$$

where $p_{i,\text{pos}}$ and $p_{i,\text{neg}}$ are the predicted probabilities of the positive and negative classes for the i-th review. The score $s_i \in [-1, 1]$ reflects the degree of emotional polarity: positive values indicate favorable sentiment, negative values reflect unfavorable sentiment, and values near 0 correspond to neutral emotion. This continuous representation enables fine-grained sentiment comparison and supports subsequent quantitative analyses, including emotional-shift computation ($\Delta E$) and cluster validation.
The use of $s_i$ rather than discrete sentiment labels preserves intra-class emotional variance, providing a statistically interpretable metric for tracking sentiment recovery or degradation in system-generated responses.
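The mapping from classifier outputs to a scalar score can be sketched as follows. The sketch assumes that the score is the positive-class probability minus the negative-class probability, which is one form consistent with the description above; the logits are hypothetical values, and the label order (0 = negative, 1 = neutral, 2 = positive) follows the annotation scheme used in the implementation section.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sentiment_score(probs):
    """Assumed scalar score: P(positive) - P(negative), which lies in [-1, 1].

    probs is ordered as (negative, neutral, positive).
    """
    return probs[2] - probs[0]


# Hypothetical classifier-head outputs for one clearly negative review.
logits = [2.0, 0.5, -1.0]
probs = softmax(logits)
score = sentiment_score(probs)
print(probs, score)
```

Because the score is continuous, two reviews that share the label "negative" can still be distinguished by how strongly negative they are.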
3.6. Sentiment Transformation Modeling
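Each cluster $C_k$ corresponds to a customized response strategy $R_k$, which aims to improve sentiment. Let $s_i^{\text{before}}$ and $s_i^{\text{after}}$ represent the sentiment scores of review i before and after the intervention. The sentiment transformation function (STF) is defined in Equation (12):

$$\Delta E_i = s_i^{\text{after}} - s_i^{\text{before}} \tag{12}$$

The expected switching effectiveness is then modeled via Equation (13):

$$\mathbb{E}\left[\Delta E \mid C_k, R_k\right] = \mu_k \tag{13}$$

where $\mu_k$ is the average sentiment shift of the strategy $R_k$ on cluster $C_k$. Finally, the system selects the response template that maximizes emotional restoration, as shown in Equation (14):

$$R_k^{*} = \arg\max_{R} \; \mathbb{E}\left[\Delta E \mid C_k, R\right] \tag{14}$$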
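The selection rule described above can be sketched as picking, for each cluster, the strategy with the highest mean observed shift. Everything in this sketch is hypothetical: the cluster names, the strategy names, and the (before, after) score pairs in `history` are invented for illustration, and the shift is assumed to be the after-score minus the before-score.

```python
from statistics import mean


def sentiment_shift(before, after):
    """Assumed shift definition: score after the response minus score before."""
    return after - before


# Hypothetical logged (before, after) sentiment scores, keyed by (cluster, strategy).
history = {
    ("delivery", "apology"): [(-0.8, -0.3), (-0.6, -0.2)],
    ("delivery", "refund"): [(-0.7, 0.1), (-0.9, -0.1)],
    ("sizing", "apology"): [(-0.5, -0.1)],
    ("sizing", "exchange"): [(-0.4, 0.3)],
}


def expected_shift(cluster, strategy):
    """Average sentiment shift of one strategy on one cluster."""
    return mean(sentiment_shift(b, a) for b, a in history[(cluster, strategy)])


def best_strategy(cluster):
    """Strategy with the largest average shift among those tried on the cluster."""
    candidates = [s for (c, s) in history if c == cluster]
    return max(candidates, key=lambda s: expected_shift(cluster, s))


for cluster in ("delivery", "sizing"):
    print(cluster, best_strategy(cluster))
```

In a deployed system, the averages would be re-estimated as new responses are logged, so the preferred strategy for a cluster can change over time.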
3.7. System Evaluation and Feedback Learning
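The BERT classifier is fine-tuned during training using the cross-entropy loss shown in Equation (15):

$$\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c} y_{i,c} \log p_{i,c} \tag{15}$$

where $y_{i,c}$ is the one-hot ground-truth label and $p_{i,c}$ is the predicted probability of class c for the i-th review. For evaluation, this research adopted Accuracy and F1-score for sentiment classification, the Silhouette score for clustering, and Mean ΔE and Success Rate for sentiment conversion.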
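The classification and conversion indicators can be computed with a few helper functions. This is a sketch under stated assumptions: Success Rate is taken here to mean the fraction of reviews with a strictly positive shift, the label and score lists are hypothetical, and the Silhouette score is omitted for brevity.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """Per-class F1: harmonic mean of precision and recall for one label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def shift_metrics(before, after):
    """Mean shift and success rate (assumed: share of strictly positive shifts)."""
    shifts = [a - b for b, a in zip(before, after)]
    mean_shift = sum(shifts) / len(shifts)
    success_rate = sum(s > 0 for s in shifts) / len(shifts)
    return mean_shift, success_rate


# Hypothetical labels: 0 = negative, 1 = neutral, 2 = positive.
y_true = [0, 0, 1, 2, 2, 2]
y_pred = [0, 1, 1, 2, 2, 0]
acc = accuracy(y_true, y_pred)
f1_negative = f1_for_label(y_true, y_pred, 0)
f1_positive = f1_for_label(y_true, y_pred, 2)

# Hypothetical sentiment scores before and after a response.
mean_shift, success_rate = shift_metrics([-0.8, -0.2, 0.5], [-0.2, -0.3, 0.6])
print(acc, f1_negative, f1_positive, mean_shift, success_rate)
```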
4. Implementation
To validate the proposed ML framework, a sentiment-aware and cluster-guided response system has been implemented, using the dataset of women’s e-commerce clothing reviews [11]. The system includes a sequential pipeline:
- Data preprocessing and cleaning;
- BERT-base text embedding;
- K-means clustering for topic segmentation;
- Fine-tuned sentiment classification;
- Response strategy simulation;
- Evaluation using emotional transformation metrics.
This dataset contains 23,486 customer reviews of women’s clothing e-commerce websites. This research selects features from columns in the dataset for experimental use, as shown in Table 1.
Table 1.
Features and descriptions selected from dataset.
For sentiment annotation, star ratings were mapped to three classification labels: 1 or 2 stars → Negative (label 0), 3 stars → Neutral (label 1), and 4 or 5 stars → Positive (label 2).
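The rating-to-label mapping above is simple enough to state directly in code. The function name `rating_to_label` is an assumption introduced for illustration.

```python
def rating_to_label(stars):
    """Map a 1-5 star rating to a sentiment label: 0 negative, 1 neutral, 2 positive."""
    if stars not in (1, 2, 3, 4, 5):
        raise ValueError(f"star rating must be an integer from 1 to 5, got {stars!r}")
    if stars <= 2:
        return 0  # 1 or 2 stars: negative
    if stars == 3:
        return 1  # 3 stars: neutral
    return 2      # 4 or 5 stars: positive


mapped = [rating_to_label(s) for s in range(1, 6)]
print(mapped)
```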
For K-means clustering, this research set K = 4 and used the first two principal components (PCA 1 and PCA 2); the resulting Silhouette score was 0.0885. The clustering results are shown in Figure 2. For sentiment classification, the performance indicators were as follows: Accuracy = 0.6950, F1-score (label 0) = 0.6863, F1-score (label 2) = 0.8306, and area under the curve (AUC) = 0.8295. The confusion matrix and the receiver operating characteristic (ROC) curves are shown in Figures 3 and 4, respectively.
Figure 2.
Clustering result on BERT embeddings.
Figure 3.
Confusion matrix of sentiment classification.
Figure 4.
ROC curves of sentiment classification.
Finally, post-response sentiment adjustments were simulated to assess the effectiveness of the response strategies. The simulation assumes that, after a well-handled response, the following hold:
- Negative reviews improve by +0.4 to +0.7 (shift toward positive);
- Neutral reviews improve by +0.1 to +0.3;
- Positive reviews increase slightly, by +0.0 to +0.1 (reinforcement).
ΔE is then calculated, yielding the histogram shown in Figure 5. The mean value of ΔE is 0.2616, and the percentage of users with improved sentiment is 100.00%.
Figure 5.
Distribution of sentiment shifts after brand response.
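The simulated adjustment can be sketched by drawing one shift per review from the range assumed for its sentiment class. Uniform sampling, the fixed seed, and the class mix below are assumptions made for illustration; the source does not state the sampling distribution, and this sketch is not expected to reproduce the reported mean.

```python
import random

# Assumed shift ranges per label, taken from the stated simulation assumptions:
# 0 = negative, 1 = neutral, 2 = positive.
SHIFT_RANGES = {0: (0.4, 0.7), 1: (0.1, 0.3), 2: (0.0, 0.1)}


def simulate_shifts(labels, seed=42):
    """Draw one post-response shift per review, uniformly within its class range."""
    rng = random.Random(seed)
    return [rng.uniform(*SHIFT_RANGES[label]) for label in labels]


# Hypothetical class mix: 10 negative, 15 neutral, 75 positive reviews.
labels = [0] * 10 + [1] * 15 + [2] * 75
shifts = simulate_shifts(labels)

mean_shift = sum(shifts) / len(shifts)
improved_share = sum(s > 0 for s in shifts) / len(shifts)
print(round(mean_shift, 4), improved_share)
```

Because every assumed range is non-negative, the share of improved reviews in such a simulation is close to 100% by construction, so the mean shift is the more informative of the two indicators.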
5. Conclusions
We developed an integrated machine learning framework designed for real-time, emotionally calibrated responses to negative online reviews. By combining semantic clustering (via K-means), contextual sentiment classification (via BERT), and a sentiment transformation model grounded in negative reinforcement theory, the system models the emotional feedback loop and quantifies sentiment recovery through interpretable metrics. Validation on e-commerce review data, using simulated post-response sentiment shifts, illustrated its potential for enhancing consumer sentiment post-intervention.
The contributions of this research are threefold: (1) it offers a novel fusion of natural language processing techniques and behavioral psychology for adaptive digital marketing; (2) it introduces Sentiment Shift (ΔE) as a quantitative proxy for emotional recovery, enabling outcome-driven optimization; and (3) it validates the potential of psychologically informed AI systems to drive real-time engagement and trust restoration at scale.
Future research should explore domain generalization across industries, incorporate reinforcement learning for dynamic strategy optimization, and integrate real user feedback to enhance learning fidelity. Furthermore, expanding the framework to multilingual contexts and cross-cultural sentiment interpretation would broaden its applicability in global digital marketing ecosystems.
Author Contributions
Conceptualization, C.-H.L. and Y.H.; methodology, C.-H.L., Y.H. and Y.L.; investigation, T.-S.L.; resources, T.-S.L. and Y.L.; writing—original draft preparation, C.-H.L.; writing—review and editing, Y.H. and Y.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
https://drive.google.com/drive/folders/10J0PxvD1k-hUtHTpmasOGGSKdwchQRK0?usp=drive_link (accessed on 13 December 2025).
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Liu, T.-S. Negative Reinforcement Mode of Digital Marketing Services for the Design and Implementation. Master’s Thesis, Takming University of Science and Technology, Taipei, Taiwan, 28 May 2015.
- Danish, M.; Amjad, M.; Ahmad, T. Comparative Sentiment Analysis of Cryptocurrency Apps Using BERT. In Proceedings of the 7th International Conference on Contemporary Computing and Informatics, Greater Noida, India, 18–20 September 2024.
- Zhao, Y.; Zhou, X. K-means Clustering Algorithm and Its Improvement Research. J. Phys. Conf. Ser. 2021, 1873, 12074.
- Widrick, S.; Fram, E. Identifying Negative Products: Do Customers Like to Purchase Your Products. J. Prod. Brand Manag. 1992, 1, 43–50.
- EXPRESS: The Impact of Negative Reviews on Online Search and Purchase Decisions. Available online: https://www.researchgate.net/publication/372380179_EXPRESS_The_Impact_of_Negative_Reviews_on_Online_Search_and_Purchase_Decisions (accessed on 20 November 2025).
- United Airlines: Passenger Forcibly Removed from Flight. Available online: https://www.bbc.com/news/world-us-canada-39554421 (accessed on 6 May 2025).
- Starbucks Ordered to Pay $25m to Ex-Employee in Racial Discrimination Case. Available online: https://www.bbc.com/news/world-us-canada-65906962 (accessed on 6 May 2025).
- Klekere, E. Affective Computing for Managing Crisis Communication. In Proceedings of the IEEE International Conference on Affective Computing and Intelligent, Cambridge, MA, USA, 10–13 September 2023.
- Sehgal, S.; Singh, H.; Agarwal, M.; Bhasker, V.; Shantanu. Data analysis using principal component analysis. In Proceedings of the International Conference on Medical Imaging, m-Health and Emerging Communication Systems, Greater Noida, India, 7–8 November 2014.
- Tarekegn, G.B.; Tai, L.-C.; Lin, H.-P.; Tesfaw, B.A.; Juang, R.-T.; Hsu, H.-C. Applying t-Distributed Stochastic Neighbor Embedding for Improving Fingerprinting-Based Localization System. IEEE Sens. Lett. 2023, 7, 6005004.
- Women’s E-Commerce Clothing Reviews. Available online: https://www.kaggle.com/datasets/nicapotato/womens-ecommerce-clothing-reviews (accessed on 20 November 2025).
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).