Article

Text Analytics on YouTube Comments for Food Products

by Maria Tsiourlini, Katerina Tzafilkou, Dimitrios Karapiperis and Christos Tjortjis *

School of Science and Technology, International Hellenic University, 57001 Thessaloniki, Greece

* Author to whom correspondence should be addressed.
Information 2024, 15(10), 599; https://doi.org/10.3390/info15100599
Submission received: 9 July 2024 / Revised: 13 September 2024 / Accepted: 19 September 2024 / Published: 30 September 2024
(This article belongs to the Special Issue 2nd Edition of Information Retrieval and Social Media Mining)

Abstract:
YouTube is a popular social media platform in the contemporary digital landscape. The primary focus of this study is to explore the underlying sentiment in user comments about food-related videos on YouTube, specifically within two pivotal food categories: plant-based and hedonic products. We labeled comments using sentiment lexicons such as TextBlob, VADER, and Google’s Sentiment Analysis (GSA) engine. Comment sentiment was classified using advanced Machine-Learning (ML) algorithms, namely Support Vector Machines (SVM), Multinomial Naive Bayes, Random Forest, Logistic Regression, and XGBoost. The evaluation of these models encompassed key macro average metrics, including accuracy, precision, recall, and F1 score. The results from GSA showed a high accuracy level, with SVM achieving 93% accuracy in the plant-based dataset and 96% in the hedonic dataset. In addition to sentiment analysis, we delved into user interactions within the two datasets, measuring crucial metrics, such as views, likes, comments, and engagement rate. The findings illuminate significantly higher levels of views, likes, and comments in the hedonic food dataset, but the plant-based dataset maintains a superior overall engagement rate.

1. Introduction

Driven by a substantial pool of potential consumers, stakeholders in the food industry, including suppliers and retailers, are increasingly motivated to harness social media platforms, amplifying interactive engagements [1]. In this context, the ever-evolving landscape of digital content consumption has witnessed the widespread traction of visual-oriented social media platforms, such as YouTube, Instagram, and TikTok. Among these platforms, YouTube stands out as a transformative force, shaping how individuals engage with a myriad of topics, including food. YouTube has become a hub for culinary exploration, providing a diverse array of content, ranging from cooking tutorials to food reviews. The platform’s influence extends beyond mere viewership, with users actively participating through comments, likes, and shares, thereby creating a dynamic and interactive community. Consequently, research interest revolves around investigating the role of YouTube within the food sector. Since the food sector concerns several categories of food products, and different products might trigger different emotions or motivations to consume, we distinguish between plant-based and hedonic food products. This way, we can evaluate emotional reactions towards different categories. To briefly describe the terms, plant-based products are foods designed to avoid the use of animal-derived counterparts, such as meat, milk, and eggs. The motivation behind consuming plant-based products includes concerns about animal welfare, health, and environmental impact. Hedonic products are foods chosen primarily for the sensory pleasure they provide, such as ice cream, potato chips, and candies. These treats are consumed for the emotional enjoyment they bring, often leading to spontaneous purchases due to their perceived reward value.
In our study, we aim to contribute to the evolving understanding of user sentiments within YouTube food videos. Our research involves three objectives. Primarily, we aim to understand user sentiment in a variety of YouTube videos related to plant-based and hedonic food products. Secondly, we want to determine the most accurate Machine-Learning (ML) algorithm for detecting sentiment in YouTube food videos. The focus of this objective is to explore variations in the ML models’ performance and evaluate their usefulness in predicting sentiment through hybrid or other approaches. Lastly, the study seeks to explore user engagement levels in YouTube videos related to food, examining metrics such as views, likes, comments, and engagement rate. In particular, we identify the most appropriate data-labeling tool by comparing two well-known libraries. Additionally, we investigate the most accurate ML algorithm for predicting sentiment. Lastly, we explore the interactions of users with food videos on YouTube. The purpose of this study is to address the following research questions:
  • RQ1: What is the most appropriate sentiment analysis tool for our study when it comes to data labeling, among TextBlob, VADER, and GSA?
  • RQ2: What is the most accurate ML algorithm to detect sentiment in YouTube food videos?
  • RQ3: How are user engagement levels reflected on YouTube food videos, particularly concerning views, likes, comments, and engagement rate?
The main innovation of this study lies in its context, as previous research efforts in the area are scarce. In particular, this study focuses on the identification of sentiment in YouTube comments about food products, comparing between plant-based and hedonic products. The findings might hold significant implications for advancing our understanding of user sentiments in YouTube food videos, with a specific emphasis on the domains of plant-based and hedonic food products. Our work seeks to address critical gaps in the current literature, specifically in the context of plant-based foods, where research efforts thus far are notably limited. Understanding public sentiment toward a specific food category can greatly influence marketing strategies, product development, and content creation.

2. Theoretical Background

In the theoretical background section, the study examines aspects related to plant-based and hedonic products. Furthermore, the research highlights the significance of YouTube comments as authentic user-generated content and reviews relevant prior works within the academic landscape.

2.1. Plant-Based Products

In recent years, plant-based diets have caught the attention of the general public [2]. Due to growing concerns about either animal welfare [3] or how diet affects our health and the environment, a collective aspiration has emerged to reduce the consumption of animal-based products. Evidently, consumers are demonstrating a growing interest in replacing, reducing, or even completely eliminating animal-based products, such as meat, milk, and eggs. Nevertheless, the extensive utilization of animal products in various traditional food cultures generates cultural, culinary, and sensory conflicts, thereby highlighting the challenges associated with the ongoing transition [4]. The term “plant-based” signifies a diet that avoids consuming animal products [5]. At its core, the “plant-based” concept encompasses food items specifically crafted to closely imitate their animal-derived counterparts [6]. Plant-based foods (PBFs) generally lower the risk of developing cardiovascular diseases due to their rich nutrient content, including fiber and antioxidants, while having lower levels of saturated fats and cholesterol compared to foods sourced from animals [7].
The global consumption of plant-based products has doubled from 6.7% in 2008–2011 to 13.1% in 2017–2019, with plant-based milk and meat being the most popular options [8]. Particularly, plant-based beverages have experienced remarkable growth. They offer a lactose- and cholesterol-free alternative, effectively serving as wholesome substitutes for dairy milk [9]. Plant proteins are also seeing significant traction. The global plant-protein market is expected to grow from USD 10.3 billion in 2020 to USD 15.6 billion by 2026 [10]. Plant-based products’ increasing presence in supermarkets and discount stores, and their presumed potential to support a transformation towards more sustainable food systems, underline the relevance of this food market segment.

2.2. Hedonic Products

The desire to eat for pleasure rather than out of prolonged food deprivation is called hedonic hunger [11]. Hedonic foods, encompassing indulgent treats like ice cream, potato chips, and candies, are chosen by consumers primarily for the sensory pleasure they offer, leading to spontaneous purchases driven by the perceived heightened reward value associated with these foods [12]. When individuals buy pleasurable items, they generally undergo emotional reactions, such as experiencing joy while consuming a delightful dessert, and they are less price-sensitive when they want to make a hedonic purchase [13].
Dhar and Wertenbroch [14] discovered in their research that consumers commonly associate hedonic food with enjoyment and pleasure, emphasizing that hedonic consumption is typically motivated by individuals seeking pleasurable experiences. Also, the consumption of hedonic goods is characterized by a subjective, affective, and multi-sensory emotional experience, involving tastes, sounds, scents, tactile impressions, and visual images. This highlights the subjective aspects over objective considerations [15]. Additionally, consumers have been shown to place considerable importance on the “taste” and “appearance” of hedonic food products, further emphasizing the sensory and aesthetic factors driving the selection of such items. Moreover, individuals with lower health concerns exhibit a greater inclination toward purchasing hedonic foods. This tendency is reinforced by the association of hedonic foods with an unhealthy image, aligning with people’s desires [16]. Also, pleasurable products can evoke a sense of guilt, prompting consumers to engage in reflection and adopt more altruistic behaviors as a way to offset this feeling.

2.3. YouTube Comments as User Generated Content

The number of videos on the YouTube platform is growing by 100% every year [17]. Considering that it enables users to sign up, create channels, edit their profiles with personal information, add images, write comments, and share videos on other social networks, YouTube is currently one of the most comprehensive social networks available [18]. A notable hallmark of YouTube is its practice of making videos publicly accessible to all, including non-subscribers; however, the privilege of commenting is reserved exclusively for registered users of the platform. Through the act of uploading videos onto the platform, participating in discussions through posts, and disseminating shared content, consumers wield the power to shape marketing strategies, communication initiatives, and even the purchasing choices of individuals spanning virtually every conceivable product category [19]. Users also have the ability to leave comments on videos. These comments encompass a spectrum of viewpoints, including opinions, questions about the video’s content, expressions of gratitude towards the video creator, or even discontent directed at the video itself or its creator [20]. This mechanism of commenting essentially contributes to the pool of User-Generated Content (UGC). UGC, also referred to as electronic Word-of-Mouth (eWOM), operates similarly to traditional word-of-mouth [21], with the distinction that it spreads through digital platforms. UGC encompasses original materials created and uploaded to the internet by non-media individuals.
This form of content exerts a significant impact on people’s consumption patterns, often finding its way onto social media platforms such as Facebook, YouTube, Twitter, and Instagram. Online consumers frequently turn to user-generated content as a crucial resource to aid them in making informed purchase choices [22]. The quote, “A brand is no longer what we tell the consumer it is—it is what consumers tell each other it is” (Scott Cook, co-founder of Intuit), captures how user-driven discussions are shaping the changing nature of brand perception.

2.4. Sentiment Analysis

Sentiment analysis, also known as Opinion Mining, is conducted at the document, sentence, and aspect levels [23]. The goal of sentiment analysis is to determine whether text generated by users conveys a positive, negative, or neutral opinion [24]. Three primary approaches for sentiment analysis have been recognized: (a) lexicon-based, (b) ML, and (c) hybrid approaches. Lexicon-based approaches were first used for sentiment analysis. There are two such approaches in this category: (a) the dictionary-based approach and (b) the corpus-based approach. Predefined dictionaries like WordNet and SentiWordNet rely upon dictionary-based sentiment classification. On the other hand, corpus-based sentiment analysis involves performing a statistical or semantic analysis of document content, rather than using predefined dictionaries [25].
ML approaches categorize the polarity of sentiments (such as negative, positive, and neutral) by leveraging both training and testing datasets. These approaches can be divided into three primary categories of learning: (a) supervised learning, (b) unsupervised learning, and (c) semi-supervised learning. By harnessing ML techniques, these methodologies can decipher intricate patterns inherent to specific domains within textual content, thus yielding more accurate results [26]. The most commonly used methods in ML are Support Vector Machines (SVM) and the Naïve Bayes classifier. Naïve Bayes is effective when applied to well-formed text corpora, while SVMs perform well with datasets of low dimensionality [27]. Studies have shown that combining both methods yields higher accuracy than using either one alone. Hybrid models combine both lexicon-based and ML approaches.
The rise of social media platforms has significantly altered the worldwide scene, gradually replacing traditional means of communication, spreading ideas, and even the way people approach learning on their own [28]. Businesses and organizations extensively utilize sentiment analysis to identify customer opinions on social media platforms. Between 2008 and 2022, the number of published papers including the concept of “sentiment analysis in social networks” grew at a geometric rate of 34% year by year [29].
The most popular social media platform for extracting information is Twitter [27], and a significant number of academic papers use Twitter as their database. In our research, we use YouTube as the source for customer comments. In the literature, a limited number of relevant research efforts conducted sentiment analysis using YouTube (e.g., [28,30]).

2.5. Related Work

The food industry has attracted significant attention from researchers. Rajeswari et al. [31] analyzed consumer reviews for organic and regular food products, concluding that sentiment scores for organic products exceeded those for regular products. In a similar vein, Meza and Yamanaka [32] investigated the dissemination of information about local and organic foods on YouTube. Their findings revealed that viewers generally portrayed sustainable food positively within a broader context, occasionally drawing comparisons with artificially produced alternatives and using these comparisons to promote organic options. In the realm of social media platforms, Lim et al. [33] directed their attention toward analyzing emotions expressed in comments on Food and Beverages Facebook fan pages. Their conclusion highlighted that the sentiment scores provided in these comments do not always accurately reflect the overall mood and sentiment of the entire message due to the analysis’s focus on the word level.
Tzafilkou et al. [34] focused on the emotional states of viewers by analyzing their facial expressions and subjective assessments while watching food video campaigns. Their findings highlighted that different food types and media elicit varied emotional responses, with sadness frequently emerging as a dominant emotion. However, their study primarily relied on visual and subjective data, which may not fully capture the nuances of consumer sentiment, particularly in text-based environments like YouTube comments. Similarly, Pastor et al. [35] explored the presence of food products on children’s YouTube channels, uncovering a predominance of non-essential and nutritionally poor products. While this research is insightful, it does not address the sentiment or emotional response to these products, focusing instead on their prevalence and nutritional aspects.
In another recent study [36], the authors investigated the link between online consumer behavior and purchase intent, emphasizing facial expressions during food video campaigns. Their study found that Neural Networks and Random Forest models effectively predicted purchase intent, with emotions like sadness and surprise being significant predictors. While their approach is robust, it is limited by its reliance on facial emotion data rather than textual sentiment, which is central to our study. In the realm of plant-based products, Shamoi et al. [37] examined public sentiment towards vegan products and found a growing acceptance and positive sentiment, albeit with some lingering fears.
Thao [38] provided insights into consumer attitudes towards vegetarian food, revealing that a majority of opinions were neutral, with quality receiving the most positive feedback. While these studies contribute to understanding sentiment towards plant-based products, they do not delve deeply into the specific sentiment analysis methods or their effectiveness.
Dalayya et al. [39] explored perceptions of plant-based diets for cancer prevention and management, highlighting public preferences, but without directly comparing sentiment analysis methods. Bhuiyan et al. [40] proposed an attention-based approach using CNN and LSTM models, achieving a high accuracy of 98.45%. However, their focus on food review datasets with CNN and LSTM models does not account for the complexities of sentiment in video comments, which is central to our study.
Gunawan et al. [41] emphasized the effectiveness of SVM for sentiment analysis in food reviews, a finding that aligns with our use of SVM, but it does not address the hybrid approach’s added value. Panagiotou and Gkatzionis [42] developed a food-related emotion lexicon to assess emotions directly, while Liapakis [43] proposed a sentiment lexicon-based analysis for the Greek food and beverage industry. While these lexicons are useful, they do not incorporate the latest advancements in sentiment analysis techniques or hybrid models that combine multiple methodologies.
There are several other recent studies that explored sentiment analysis in food-related content. For instance, the authors of [44] introduced a novel approach that enhances aspect-based sentiment analysis by incorporating affective knowledge from SenticNet into a Graph Convolutional Network (GCN). In [45], the authors applied Aspect-based Sentiment Classification (ASC) to predict the corresponding emotion of a specific target of a sentence related to food reviews.
According to recent literature on social media sentiment analysis tasks, the combination of supervised machine-learning models and text dictionaries like TextBlob and VADER tends to reveal higher levels of accuracy [46,47]. Similarly, in [48] the authors suggest a hybrid sentiment analysis approach where the lexicon-based methods are used with deep-learning models to improve sentiment accuracy.

3. Materials and Methods

In this section, we provide a detailed presentation of our research endeavors. Specifically, we present the sentiment analysis tasks, as well as the methodology employed to analyze users’ engagement on YouTube.

3.1. Data Collection for Sentiment Analysis

The first step of our research was to collect the data, which served as the foundation for subsequent steps. To comprehensively address the research questions, we had to gather two distinct types of data: YouTube video metrics and YouTube video comments. To build a comprehensive dataset, videos related to plant-based and hedonic food products were manually compiled based on the specific criteria outlined in Table 1 and Table 2. In September 2023, 83 videos were finalized for analysis—24 focused on hedonic products and 59 on plant-based products. The decision to include more videos for plant-based products was driven by two key factors on the YouTube platform: (a) many videos for plant-based products have fewer than 100 comments, disqualifying them from our selection, and (b) even if some videos exceed 100 comments, they still did not accumulate as many comments as hedonic food videos. As a result, we chose to collect more videos about plant-based products to create a rich dataset, even though it is still smaller than the dataset for hedonic food products.
The video compilation was completed by manually recording key information for each video, such as views, comments, likes, and engagement rate. These metrics form the basis for evaluating engagement levels. From the chosen videos, only top-level comments were exported, totaling 188,011 comments. Each comment in the dataset included essential details, like the author’s name, comment, posting time, likes, and replies. Ultimately, two CSV files were generated, one for each food category.

3.2. Data Preprocessing

The initial phase of data preprocessing begins with text cleaning and normalization. This process encompasses the removal of special characters, punctuation, stop words, and extraneous information. Tokenization further dissects the text into individual words or sentences, handling matters such as contractions and hyphenated words. Removing stop words filters out common and trivial words that add little to the overall context. Additionally, noise and irrelevant data are eliminated through the removal of non-alphanumeric characters, excessive whitespace, and repeated characters [49,50].
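As a simple illustration of these cleaning steps, the sketch below lowercases a comment and strips URLs, HTML tags, usernames, punctuation, non-alphabetic characters, and stop words. The helper name, regular expressions, and the tiny stand-in stop-word list are our own, not code from the study:

```python
import re
import string

# Tiny stand-in stop-word list; a real pipeline would use NLTK's English stop words.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "or", "to", "of"}

def clean_comment(text: str) -> str:
    text = text.lower()                                   # lowercasing
    text = re.sub(r"https?://\S+", "", text)              # remove URLs
    text = re.sub(r"<[^>]+>", "", text)                   # remove HTML tags
    text = re.sub(r"@\w+", "", text)                      # remove usernames
    text = text.translate(str.maketrans("", "", string.punctuation))  # punctuation
    text = re.sub(r"[^a-z\s]", "", text)                  # non-alphabetic characters
    tokens = [w for w in text.split() if w not in STOP_WORDS]  # stop-word removal
    return " ".join(tokens)

print(clean_comment("Check THIS out: https://example.com <b>so good</b> @user!!"))
# → check this out so good
```

Note that the order matters: URLs and tags are removed before punctuation stripping, which would otherwise break their delimiters.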
To extract valuable insights from textual data, a systematic data cleaning approach was implemented. This encompassed procedures such as lowercasing, lemmatization, converting emojis to word representations, and eliminating noise like URLs, HTML tags, usernames, stop words, duplicates, and non-alphabetic characters. This resulted in a refined dataset containing 93,411 comments for the hedonic dataset and 17,873 for the plant-based dataset. Next, the comments were categorized into sentiment classes (“Positive”, “Negative”, or “Neutral”). Three tools were evaluated for this purpose: two dictionary-based tools, TextBlob [51] and VADER (shorthand for Valence Aware Dictionary for Sentiment Reasoning) [52], and the sentiment analysis engine of the Natural Language API offered by Google Cloud [53], denoted by GSA. VADER, which was used via the Python package vaderSentiment, considered various elements, such as uppercase vs. lowercase letters, emojis, punctuation, smileys, and slang [54]. Meanwhile, TextBlob provided polarity scores within the range of −1 to 1: a score of −1 implies a negative sentiment, 0 signifies a neutral sentiment, and 1 represents a positive sentiment [55]. GSA also scores between −1.0 (negative) and 1.0 (positive), corresponding to the overall emotional leaning of the submitted text. GSA additionally generates a magnitude score, which indicates the overall strength of emotion within the given text. We submitted text to Google using the Python programming language version 3.10.
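To illustrate the labeling step, the sketch below maps a polarity or compound score to one of the three sentiment classes. The ±0.05 thresholds follow the common VADER convention; they are our assumption, not values stated in the paper:

```python
def label_from_score(score: float, pos_th: float = 0.05, neg_th: float = -0.05) -> str:
    """Map a polarity/compound score in [-1, 1] to a sentiment class."""
    if score >= pos_th:
        return "Positive"
    if score <= neg_th:
        return "Negative"
    return "Neutral"

# With vaderSentiment installed, scoring a comment looks like:
#   from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
#   analyzer = SentimentIntensityAnalyzer()
#   score = analyzer.polarity_scores("This recipe is amazing!")["compound"]
#   label = label_from_score(score)

print(label_from_score(0.6), label_from_score(-0.4), label_from_score(0.0))
# → Positive Negative Neutral
```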
Following data labeling, tokenization was performed using the NLTK (Natural Language Toolkit) library. This tool breaks down sentences into a sequence of words, removing punctuation and special characters. This process results in a structured representation of each comment, where words are isolated and can be individually analyzed. To prepare the datasets for ML model training and evaluation, a crucial step involved dividing them into two distinct sets: a training set and a testing set. This was accomplished using the train_test_split function from the scikit-learn library. This function facilitated the random shuffling of the dataset and allocated 80% of the data to the training set, with the remaining 20% constituting the testing set. The same process was applied to both food categories.
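The 80/20 split can be sketched with scikit-learn as follows; the comments are toy examples, and the `random_state` value is an illustrative choice for reproducibility:

```python
from sklearn.model_selection import train_test_split

# Toy data standing in for labeled comments (not from the study's dataset).
comments = ["love this recipe", "too salty for me", "ok i guess",
            "amazing dessert", "never again"]
labels = ["Positive", "Negative", "Neutral", "Positive", "Negative"]

# Randomly shuffle and allocate 80% to training, 20% to testing.
X_train, X_test, y_train, y_test = train_test_split(
    comments, labels, test_size=0.20, shuffle=True, random_state=42
)
print(len(X_train), len(X_test))  # → 4 1
```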

3.3. Feature Extraction Using Frequency-Inverse Document Frequency

In this research phase, Term Frequency-Inverse Document Frequency (TF-IDF) vectorization was employed to convert tokenized comments into a numerical format suitable for sentiment analysis. TF-IDF assesses word importance within a specific text category, with TF indicating significance in a document and IDF characterizing its ability to differentiate in text classification [56]. Vital for sentiment analysis and ML, TF-IDF features capture semantic importance, aiding in sentiment pattern identification. The TfidfVectorizer from the scikit-learn library executed the TF-IDF vectorization process, converting tokenized comments into numerical representations based on TF-IDF values, encapsulating term importance relative to the entire dataset.

3.4. Model Training and Testing

The success of sentiment prediction relies on the careful selection of models, each with unique characteristics adaptable to diverse datasets. Our selected ML algorithms—SVM, Random Forest, Naive Bayes, Logistic Regression, and XGBoost—were chosen for their widespread use in sentiment analysis on unorganized social media data [30,57,58,59,60]. For all models, a meticulous hyperparameter fine-tuning process was undertaken to maximize predictive performance. Utilizing Grid Search, we systematically explored hyperparameter combinations by specifying ranges for each. This approach ensured robustness and mitigated overfitting risks through 5-fold cross-validation. The dataset was divided into five folds, with the model being iteratively trained on four folds, and tested on the last one. After completing Grid Search and cross-validation, optimal hyperparameters leading to the best predictive performance were selected for subsequent model training.
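The tuning procedure can be sketched as follows for one of the models; the pipeline, parameter grid, and toy data are illustrative, not the configuration used in the study:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Toy labeled comments, five per class so that 5-fold stratified CV is possible.
texts = ["love it", "great stuff", "so good", "amazing", "best ever",
         "hate it", "awful", "so bad", "terrible", "worst ever",
         "its okay", "meh", "fine i guess", "average", "nothing special"]
labels = ["pos"] * 5 + ["neg"] * 5 + ["neu"] * 5

# TF-IDF features feeding a linear SVM; Grid Search explores the C range
# with 5-fold cross-validation, scoring by macro F1.
pipe = Pipeline([("tfidf", TfidfVectorizer()), ("svm", LinearSVC())])
grid = GridSearchCV(pipe, {"svm__C": [0.1, 1, 10]}, cv=5, scoring="f1_macro")
grid.fit(texts, labels)

print(grid.best_params_)
```

`GridSearchCV` refits the pipeline on the full training data with the best-scoring hyperparameters, which is the model then carried forward to testing.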

3.5. Evaluation of the Models

In the evaluation phase, sentiment analysis classifiers underwent a comprehensive assessment using key macro average metrics: accuracy, precision, recall, and F1 score. Selected for their broad applicability across diverse contexts [60,61,62], these metrics ensure a robust evaluation framework, transcending specific domains. Accuracy, precision, recall, and F1 score were applied to assess the models’ performance. The accuracy is the ratio between the correctly classified samples and the total number of samples in the evaluation dataset. Precision represents the percentage of retrieved samples that are relevant and is determined by the ratio of correctly classified samples to all samples assigned to that class. Precision values range from 0 to 1, where a score of 1 indicates that all samples in the class were correctly predicted, while a score of 0 indicates that no correct predictions were made in the class. The recall metric represents the proportion of positive samples that are accurately identified. It is computed by dividing the number of correctly classified positive samples by the total number of samples assigned to the positive class. The recall value ranges from 0 to 1, with 1 indicating perfect prediction of the positive class and 0 indicating incorrect prediction of all positive class samples. The F1 score is the harmonic mean of precision and recall, meaning that it penalizes the extreme values of either. This metric is not symmetric among the classes. The F1 score ranges from 0 to 1, where a value of 1 indicates maximum precision and recall, while 0 indicates zero precision and/or recall [63]. In simple terms, accuracy provides an overall assessment of correct classifications, precision focuses on the accuracy of positive predictions, recall measures the ability to capture all positive instances, and the F1 score offers a balanced assessment, considering both precision and recall. 
Table 3 provides a quick breakdown of terms for easy understanding, while Table 4 presents the formulas for each metric.
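Macro-averaged versions of these metrics can be computed with scikit-learn, as in the sketch below (toy labels and predictions, not the paper's results):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Toy ground-truth labels and model predictions for six comments.
y_true = ["Positive", "Negative", "Neutral", "Positive", "Neutral", "Negative"]
y_pred = ["Positive", "Negative", "Positive", "Positive", "Neutral", "Negative"]

accuracy = accuracy_score(y_true, y_pred)
# Macro averaging: compute each metric per class, then take the unweighted mean.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```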

3.6. Engagement Metrics and User Interaction

In our research, we investigated user engagement levels in YouTube videos related to food, specifically focusing on metrics such as views, likes, comments, and engagement rate. We also employed the Mann–Whitney Test to explore potential differences between two datasets. Additionally, descriptive statistics were used to gain insights into central tendencies and variability.

3.6.1. Mann–Whitney Test

In our research, we sought to uncover potential disparities in user engagement with content across two distinct datasets, considering comments, likes, views, and engagement rate. Due to the diverse nature of these metrics and varying sample sizes [65], we opted for the Mann–Whitney U test as our statistical method. According to Kasuya [66], this non-parametric test is widely utilized in behavioral studies, offering a suitable approach to explore potential variations in the distribution of engagement metrics. Mann–Whitney was also chosen since it does not require the data to be normally distributed. For each engagement metric, we formulated clear null and alternative hypotheses:
  • H0: There is no significant difference in the distributions of engagement metrics between the two datasets.
  • H1: There is a significant difference in the distributions of engagement metrics between the two datasets.
To assess the statistical significance, we set an alpha of 0.05. The non-parametric test was independently applied to each engagement metric, producing U statistics and corresponding p-values in our analysis. The decision to reject or retain the null hypothesis depended on the comparison of these p-values to our chosen significance level.
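A minimal sketch of this test with SciPy, using made-up view counts rather than the study's data:

```python
from scipy.stats import mannwhitneyu

# Hypothetical per-video view counts for the two categories (not real data).
views_hedonic = [120_000, 98_000, 450_000, 210_000, 330_000]
views_plant_based = [15_000, 22_000, 8_000, 31_000, 12_000]

# Two-sided Mann-Whitney U test; alpha = 0.05 as in the study.
u_stat, p_value = mannwhitneyu(views_hedonic, views_plant_based,
                               alternative="two-sided")
reject_h0 = p_value < 0.05
print(f"U={u_stat}, p={p_value:.4f}, reject H0: {reject_h0}")
```

The same call is repeated for each engagement metric, and H0 is rejected whenever the resulting p-value falls below the chosen significance level.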

3.6.2. Descriptive Statistics

To comprehensively analyze user engagement metrics, we employed various descriptive statistical measures within the dataset. For each engagement metric, the following descriptive statistics were computed:
Measures of Central Tendency [67,68]
  • Mean: Represents the average value, providing a central point around which data clusters.
  • Median: Calculated as the middle value when data is sorted, offering a robust measure of central tendency, especially in the presence of outliers.
Measures of Dispersion
  • Standard Deviation (Std): Quantifies variation or dispersion in the dataset, revealing insights into the spread of values around the mean [69].
  • Variance (Var): Indicates how spread out values are, complementing standard deviation in assessing overall variability [70].
  • Range: Calculated as the difference between maximum and minimum values, providing a straightforward measure of the overall dataset spread [71].
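These measures can be computed with Python's standard library alone, as in this sketch (illustrative like counts, not the study's data):

```python
import statistics

# Hypothetical like counts for a handful of videos.
likes = [150, 320, 95, 410, 280, 150, 530]

mean = statistics.mean(likes)          # central tendency: average
median = statistics.median(likes)      # central tendency: robust to outliers
std = statistics.stdev(likes)          # dispersion: sample standard deviation
var = statistics.variance(likes)       # dispersion: sample variance
rng = max(likes) - min(likes)          # dispersion: range

print(round(mean, 2), median, round(std, 2), round(var, 2), rng)
```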

3.6.3. Statistical Analysis and Data Visualization

To enhance our dataset understanding, we conducted correlation and temporal analyses. Correlation analysis assessed relationships between comment length and two engagement metrics, ‘Comment Likes’ and ‘Reply Count’, using the Pearson correlation coefficient. Temporal patterns were investigated by extracting components such as year, day of the week, and hour of the day. Comment activity trends over distinct years were examined, quantifying and visualizing patterns to understand how engagement evolved across time periods. This combination of statistical calculations and visualizations offers a comprehensive insight into the dataset’s dynamics.
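The Pearson coefficient used in the correlation analysis can be computed directly from its definition, as in this standard-library-only sketch with toy values:

```python
import math

# Toy pairs: comment length (characters) vs. likes received (not real data).
comment_length = [12, 45, 33, 80, 25, 60]
comment_likes = [3, 10, 7, 22, 5, 15]

n = len(comment_length)
mean_x = sum(comment_length) / n
mean_y = sum(comment_likes) / n

# Pearson r = covariance / (std_x * std_y), here in unnormalized sum form.
cov = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(comment_length, comment_likes))
r = cov / math.sqrt(sum((x - mean_x) ** 2 for x in comment_length)
                    * sum((y - mean_y) ** 2 for y in comment_likes))
print(f"Pearson r = {r:.3f}")
```

A value of r near 1 indicates that longer comments tend to attract more likes in this toy sample; values near 0 or −1 would indicate no or inverse linear association, respectively.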

4. Results

In this section, we present the empirical findings of our study. Our primary aim is to present the research outcomes, with each result directly linked to the core objectives of our study. First, we delve into the performance and comparison of the ML algorithms and then we move on to the engagement metrics task.

4.1. Comparison of Sentiment Analysis Tools: TextBlob, VADER, and GSA

In sentiment analysis, accurate data labeling is crucial. This study compares three widely used tools: TextBlob, VADER, and GSA. VADER, known for its effectiveness in social media sentiment analysis [72], evaluates individual words and sentences and provides sentiment scores within the context of social media [73], as described by Hutto and Gilbert [52]. TextBlob is likewise widely used in sentiment analysis tasks [48,74]. The labeling process categorizes data into positive, negative, and neutral classes, applied consistently to both datasets. Sentiment labels for each tool are illustrated in Table 5 and Table 6.
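The three-way labeling step can be sketched as follows. The polarity score would come from TextBlob (`TextBlob(text).sentiment.polarity`) or VADER's compound score; the ±0.05 cutoffs are VADER's conventional thresholds, used here for illustration, and the input scores are hypothetical.

```python
# Sketch of mapping a polarity score to a three-class sentiment label.
# The +/-0.05 cutoffs follow VADER's conventional thresholds; the study's
# exact thresholds are not stated, so treat these as assumptions.
def label_sentiment(score, pos=0.05, neg=-0.05):
    """Map a polarity score in [-1, 1] to a sentiment class."""
    if score >= pos:
        return "positive"
    if score <= neg:
        return "negative"
    return "neutral"

scores = [0.71, -0.42, 0.0]  # hypothetical polarity scores
labels = [label_sentiment(s) for s in scores]
print(labels)  # ['positive', 'negative', 'neutral']
```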
After labeling the comments, we compared the tools. The results are depicted in Figure 1 and Figure 2, shedding light on the performance of each tool in sentiment classification.
In both datasets, TextBlob, VADER, and GSA provided similar results. Positive comments predominated, with negative ones being a minority. A notable difference emerged in the number of comments labeled as neutral. TextBlob assigned more comments as neutral than VADER and GSA, especially in the hedonic dataset, where neutral comments closely rivaled the positive ones.
To determine the tool that is more aligned with human understanding, we manually labeled a sample of 300 comments in each dataset. Table 7 and Table 8 illustrate parts of our datasets, highlighting in bold instances where the tools seemed closer to human understanding.
For a deeper understanding, Figure 3 and Figure 4 demonstrate the results of the manually labeled comments by comparing the performance of the tools used with the ground truth labels.
For comments that were classified differently by these tools, GSA appears closest to human understanding, likely owing to its extensive training on large volumes of text. We therefore considered GSA the most appropriate sentiment analysis tool for our study. In the plant-based dataset, GSA labeled 61.2% of the comments as positive, 26.8% as neutral, and 7.3% as negative. In the hedonic dataset, GSA labeled 54.5% of the comments as positive, 32.6% as neutral, and 12.9% as negative.
Measuring the accuracy of these classifications compared to the ground truth, as indicated by human labeling, we observe in Figure 3 and Figure 4 that GSA exhibits almost perfect performance, with only a very small number of truly positive comments being misclassified as neutral. In the plant-based dataset, TextBlob clearly has the worst performance, while VADER exhibits better scores, though still inferior compared to GSA, as Figure 3 suggests. In the hedonic dataset, whose scores are shown in Figure 4, VADER exhibits much lower accuracy, especially in the cases of negative sentiments.

4.2. Performance and Comparison of the ML Algorithms

In this section, we discuss the procedures applied to both the plant-based and hedonic datasets. Choosing between micro, macro, and weighted averages as a performance metric is a common challenge in ML; our study compared all three for model analysis. Table 9 illustrates these metrics for the Random Forest model. Macro- and weighted-average values present a similar picture, with a slight divergence: macro-average values are slightly lower, suggesting potential class imbalance, while the weighted average, which accounts for class distribution, tends to be slightly higher. Using the weighted average, the F1 score appears strong; however, the model does not classify negative comments with great confidence, which is why we consider the macro-average of 0.81 a better measure. Similar results were obtained from the other models.
Accordingly, we calculated the evaluation metrics in both datasets using the macro-average, assigning equal significance to each class and accounting for the dataset imbalance. According to Hamid et al. [75], a slightly imbalanced dataset has a distribution such as 60:40, 55:45, or 70:30 (majority:minority). Furthermore, Opitz [76] notes a growing adoption of ‘macro’ metrics in recent years, and Guo et al. [77] highlight an increasing trend in utilizing macro-average indicators for sentiment analysis evaluation.
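The difference between the two averaging schemes can be illustrated with scikit-learn on a synthetic, imbalanced label set (not our actual results): the rare negative class drags the macro average down, while the weighted average stays closer to the dominant classes.

```python
# Synthetic imbalanced three-class example contrasting macro- and
# weighted-averaged F1; labels are illustrative, not the study's data.
from sklearn.metrics import f1_score

y_true = ["pos"] * 6 + ["neu"] * 3 + ["neg"]
y_pred = ["pos"] * 6 + ["neu", "neu", "pos"] + ["neu"]

macro = f1_score(y_true, y_pred, average="macro")        # each class weighs equally
weighted = f1_score(y_true, y_pred, average="weighted")  # weighted by class support
print(f"macro F1 = {macro:.2f}, weighted F1 = {weighted:.2f}")
```

Here the never-predicted `neg` class scores an F1 of 0, pulling the macro average well below the weighted one, which mirrors why the macro-average is the stricter, fairer choice for imbalanced sentiment classes.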

4.2.1. Performance and Comparison of the ML Algorithms in Plant-Based Dataset

First, let us examine the performance metrics for the plant-based dataset. In Table 10, we present the performance values obtained from all five ML algorithms.
Our results indicate that the Support Vector Machine, Logistic Regression, XGBoost, and Random Forest models all outperformed the Naïve Bayes classifier, which consistently demonstrated the lowest scores across all evaluated metrics (accuracy, precision, recall, and F1 score), making it the least effective model in this comparison.
Our analysis revealed that the Support Vector Machine and Logistic Regression models achieved the highest accuracy among the considered models, both scoring 0.93. The F1 score, which combines precision and recall, provides a balanced evaluation of a model’s performance: a higher F1 score implies a better trade-off between the two. F1 remains a popular metric among researchers, and in multiclass cases the micro/macro averaging procedure offers flexibility, enabling customization for ad hoc optimization in diverse contexts [78]. On F1 score, the Support Vector Machine outperformed the other models with 0.89, indicating that it maintained a good balance between precision and recall in its predictions. Overall, our results suggest that the Support Vector Machine excels in both accuracy and F1 score, making it a suitable choice for sentiment analysis of YouTube comments on plant-based products.
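A minimal sketch of such a classifier in scikit-learn follows; the toy comments, default TF-IDF settings, and linear kernel are assumptions for illustration, not the study's exact corpus or hyperparameters.

```python
# Minimal TF-IDF + linear SVM sentiment classifier sketch; the comments
# and labels below are toy data, not the study's corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

comments = ["I love this oat milk", "tastes awful, never again",
            "great plant-based recipe", "worst yogurt I have tried",
            "absolutely delicious butter", "this was disgusting"]
labels = ["positive", "negative", "positive",
          "negative", "positive", "negative"]

# TF-IDF turns each comment into a high-dimensional sparse vector,
# which the linear SVM then separates with a maximum-margin hyperplane.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(comments, labels)

preds = model.predict(["delicious recipe", "awful taste"])
print(list(preds))
```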

4.2.2. Performance and Comparison of the ML Algorithms in Hedonic Dataset

Just like we did for the previous dataset, Table 11 demonstrates the performance metrics for all five ML algorithms for the hedonic dataset.
Our analysis of the hedonic dataset, consistent with our findings on the plant-based dataset, reveals that all other machine-learning models significantly outperformed the Naïve Bayes classifier, suggesting that the more advanced models are better suited to handling complex datasets.
Furthermore, our evaluation has unveiled that the Support Vector Machine and Logistic Regression models stand out as top performers, both achieving an impressive accuracy of 96%. Additionally, when considering the F1 score, both the Support Vector Machine and Logistic Regression models have proven to be highly effective, with an F1 score of 94%. This score indicates that they maintain a strong equilibrium between precision and recall in their predictions. To determine which one of the two performed better in our dataset, we used their confusion matrices (Figure 5) to compare the true positive and true negative values of each model in Table 12 and Table 13.
The performance of the two models is quite similar, with subtle differences. If we were to choose one model, Support Vector Machine emerges as the preferable choice due to its slightly higher combined count of True Positive and True Negative values, indicating a marginally stronger overall performance.
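The comparison just described, summing each model's correctly classified counts, amounts to comparing the diagonals of the two confusion matrices; the matrices below are illustrative placeholders, not the values from Table 12 and Table 13.

```python
# Comparing two models by the diagonal sum of their confusion matrices
# (correct predictions); the counts are placeholders, not Tables 12-13.
import numpy as np

svm_cm = np.array([[120,   6,   4],     # rows: true class
                   [  9, 310,  11],     # cols: predicted class
                   [  5,  14, 521]])
logreg_cm = np.array([[118,   7,   5],
                      [ 10, 308,  12],
                      [  6,  15, 519]])

svm_correct = int(np.trace(svm_cm))        # diagonal sum = correct predictions
logreg_correct = int(np.trace(logreg_cm))
print(svm_correct, logreg_correct)
```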

4.3. Engagement Metrics

The Mann–Whitney U tests show significant differences in user engagement metrics, rejecting the null hypothesis (Table 14). Views, comments, likes, and engagement rate all exhibit consistent disparities, indicating non-random variations. These findings enhance quantitative insights into user behavior, emphasizing the importance of exploring underlying factors. For a deeper understanding, descriptive statistics are calculated for the plant-based dataset in Table 15, and for the hedonic dataset in Table 16.
The hedonic dataset boasts a mean view count of 10 million, indicating broad reach and potential virality, with views ranging from 1.68 million to 68.75 million. Active audience participation is shown through a mean of 6579 comments and a substantial mean like count of 170,862, reflecting a positive response. The engagement rate suggests moderate interaction, ranging from 0.70% to 4.06%. In contrast, the plant-based dataset has a lower mean view count of 432,891, indicating a narrower reach. Although comments are fewer, with a mean of 510, like counts range from 1109 to 113,000, indicating varying popularity. The mean engagement rate is notable at 3.90%, with a range from 1.31% to 9.08%, showcasing a more actively engaged audience than in the hedonic dataset. After examining both datasets, we calculated the overall engagement rate from total views, comments, and likes, using the formula in Equation (1); the results are depicted in Table 17. It is essential to consider the dataset sizes (59 videos for plant-based and 24 for hedonic) before interpreting engagement metrics.
ER = (likes + comments) / (total views) × 100　(1)
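As a quick check, Equation (1) can be applied to the rounded dataset totals quoted in the text; because the hedonic like count (4.1 million) is rounded, the computed value is about 1.76%, one rounding step away from the 1.77% reported in Table 17.

```python
# Equation (1) applied to the rounded dataset totals quoted in the text.
def engagement_rate(likes, comments, total_views):
    """ER = (likes + comments) / total views * 100, per Equation (1)."""
    return (likes + comments) / total_views * 100

plant_er = engagement_rate(812_411, 30_097, 25_500_000)        # ~3.30%
hedonic_er = engagement_rate(4_100_000, 157_914, 241_400_000)  # ~1.76%
print(f"plant-based: {plant_er:.2f}%, hedonic: {hedonic_er:.2f}%")
```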
The plant-based dataset achieved 25.5 million views, 30,097 comments, and 812,411 likes, resulting in a commendable 3.30% engagement rate. In contrast, the hedonic dataset garnered 241.4 million views, 157,914 comments, and 4.1 million likes, with a relatively lower 1.77% engagement rate. Despite higher absolute engagement in hedonic videos, the plant-based dataset maintained a superior rate, suggesting a more consistent and impactful connection per video. The hedonic dataset achieved impressive views across a smaller set of 24 videos, while the plant-based dataset, with 59 videos and 25.5 million views, still achieved a noteworthy 3.30% engagement rate. This emphasizes that plant-based content not only attracted attention but also fostered a more engaged audience per video. Beyond quantitative metrics, we analyzed the effect of comment length (the number of characters in each comment) on viewer engagement. In both datasets (Figure 6), the correlation coefficients between comment length and comment likes, as well as reply count, are low, suggesting a weak linear connection. In the plant-based dataset, the coefficients are 0.03 for comment likes and 0.10 for reply count; in the hedonic dataset, they are even smaller, at 0.02 and 0.06, respectively.
Moving from comment length correlations, we shift to another engagement measure: total comments over the years. Examining trends in both datasets, Table 18 presents insights into user interaction dynamics for the plant-based dataset, while Table 19 does the same for the hedonic dataset. The plant-based dataset showed an upward trend from 2018 to 2021, followed by a decline in 2022 and 2023. Similarly, the hedonic dataset peaked in 2020 and declined in subsequent years.
Lastly, beyond numerical counts, in Figure 7 we delve into user activity patterns within each dataset. In the plant-based dataset, activity rises in late morning and peaks between 14:00 and 18:00, with consistently higher activity in the middle of the week. In contrast, the hedonic dataset sees a peak in activity during the afternoon and evening, especially on Sundays.

5. Discussion

This study explored viewers’ sentiment in YouTube videos on food products by applying different ML algorithms. Moreover, it examined how users interact with food-related YouTube videos, considering views, likes, comments, and engagement rate.
The comparison of TextBlob, VADER, and GSA revealed that all three tools produced similar distributions of positive and negative sentiments in the plant-based and hedonic datasets, although TextBlob labeled a noticeably higher number of comments as neutral. To determine the most appropriate tool, we manually labeled a sample of comments and compared each tool’s output against this ground truth. This cross-checking showed that GSA’s sentiment classifications were most closely aligned with human judgment, and GSA was therefore selected as the preferred tool for our study.
In evaluating ML algorithms, the Support Vector Machine (SVM) and Logistic Regression models performed well on both datasets, with SVM exhibiting the highest accuracy (93%) and F1 score (89%) for the plant-based dataset, and similar results observed for the hedonic dataset. The SVM model outperformed other ML algorithms in the sentiment analysis case for several reasons, such as its ability to handle high-dimensional data, which is common in text-based sentiment analysis where each unique word or phrase can represent a separate dimension. Furthermore, SVM can maximize the margin between classes and leverage kernel functions for capturing complex data relationships in sentiment analysis tasks.
The Mann–Whitney U test revealed significant differences in user engagement metrics between the plant-based and hedonic datasets, indicating distinct user behavior patterns. Descriptive statistics highlighted the broad reach and active audience of the hedonic dataset, while the plant-based dataset showed higher engagement levels. Temporal trends and user activity patterns offer insights for content creators to optimize engagement in the competitive landscape of YouTube food videos.
Overall, our findings, demonstrating a prevalence of positive sentiments, are in line with Shamoi et al.’s examination of public sentiment towards plant-based products [37], highlighting a generally favorable reception of plant-based food. In contrast, Dalayya et al. [39] explored public perceptions of plant-based diets for cancer prevention and management, revealing that the public’s inclination towards plant-based diets was not as pronounced as previously assumed. The selection of SVM as the primary sentiment analysis model conforms to established practices in sentiment analysis, consistent with the conclusions of Gunawan et al. [41]. They have previously discussed the effectiveness of ML algorithms, particularly SVMs, in capturing sentiments across various contexts, including food reviews.

5.1. Future Work and Limitations

The study encounters certain limitations. First, although we used GSA for data labeling, it did not always categorize comments the way a human would, which may have influenced our results. Second, despite the application of high-performing classifiers, none attained absolute accuracy. Future research could enhance prediction accuracy by exploring advanced classifiers such as neural networks; deep-learning models, in particular, could capture context that traditional models miss, and alternative feature extraction techniques may further refine sentiment predictions. Third, the dataset itself had limitations: the plant-based dataset contained fewer comments, which could affect our results, and a larger sample might give a clearer picture. We also focused on specific food products: the plant-based dataset was built from comments about plant-based milk, butter, and yogurt, while the hedonic category included pizza, burgers, and cake. The selection of videos was based on these inclusion criteria, and the resulting manual approach might introduce selection bias. Future automated approaches, or the exploration of other types of food videos, could further validate our results.
Also, this study offers preliminary evidence on the role of engagement metrics in sentiment analysis. A deeper analysis is needed to examine the association between these engagement metrics and user sentiment in social media content.
To further enhance the model’s interpretability and gain deeper insights into its decision-making process, incorporating SHAP (SHapley Additive exPlanations) (https://github.com/shap/shap, accessed on 15 June 2024) as a future research direction is promising. By applying SHAP values, we can quantify the contribution of each feature to the model’s output, thereby providing valuable explanations for the model’s predictions. This would not only improve the model’s transparency but also aid in identifying potential biases and improving its overall performance.

5.2. Conclusions

This study explored user sentiment across various YouTube videos promoting plant-based and hedonic food products. It also attempted to identify the most effective ML algorithm for detecting sentiment in these videos. Lastly, the study investigated user engagement metrics in food-related YouTube videos, including views, likes, comments, and engagement rate. The comparison between sentiment analysis tools revealed that GSA, with its closer alignment to human judgment, is preferable for analyzing user sentiments in YouTube food-related content. Additionally, the prevalence of positive sentiments across both datasets suggests a generally favorable user sentiment towards food-related content on the platform. The consistent performance of the SVM model highlights its effectiveness in sentiment analysis tasks, indicating its potential for broader application in content analysis. Moreover, the observed differences in user engagement metrics between plant-based and hedonic datasets imply distinct user behavior patterns, emphasizing the importance of tailored content strategies.
The findings offer practical insights into user sentiments in YouTube food videos about plant-based and hedonic food products. Addressing gaps in the existing research, particularly regarding plant-based foods, provides crucial understanding for marketing strategies, product development, and content creation. By understanding the sentiments expressed by users towards different food types, manufacturers can gain valuable insights into consumer preferences and demands. Also, understanding the sentiments prevalent among viewers can guide the creation of engaging and relevant content that resonates with their audience. Based on our results, content creators may choose to produce more content centered around plant-based cooking, thus catering to the preferences of their viewers and potentially expanding their audience reach. Overall, understanding public sentiment can inform tailored marketing, product innovation, and content strategies, benefiting businesses and content creators in the food industry.

Author Contributions

Conceptualization: M.T. and K.T.; writing—original draft preparation: M.T.; writing—review and editing: C.T., K.T. and D.K.; project supervision: C.T. and D.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available upon request.

Conflicts of Interest

The authors declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. This manuscript is according to the guidelines and complies with the Ethical Standards.

References

  1. Luo, N.; Wu, S.; Liu, Y.; Feng, Z. Mapping social media engagement in the food supply chain. Technol. Forecast. Soc. Chang. 2023, 192, 122547. [Google Scholar] [CrossRef]
  2. Patra, T.; Rinnan, A.; Olsen, K. The physical stability of plant-based drinks and the analysis methods thereof. Food Hydrocoll. 2021, 118, 106770. [Google Scholar] [CrossRef]
  3. Kopplin, C.S.; Rausch, T.M. Above and beyond meat: The role of consumers’ dietary behavior for the purchase of plant-based food substitutes. Rev. Manag. Sci. 2021, 16, 1335–1364. [Google Scholar] [CrossRef]
  4. Onwezen, M.C. The application of systematic steps for interventions towards meat-reduced diets. Trends Food Sci. Technol. 2022, 19, 443–451. [Google Scholar] [CrossRef]
  5. Aschemann-Witzel, J.; Futtrup-Gantriisa, R.; Fraga, P.; Perez-Cueto, F.J.A. Plant-based food and protein trend from a business perspective: Markets, consumers, and the challenges and opportunities in the future. Crit. Rev. Food Sci. Nutr. 2020, 61, 3119–3128. [Google Scholar] [CrossRef]
  6. Martin, C.; Langé, C.; Marette, S. Importance of additional information, as a complement to information coming from packaging, to promote meat substitutes: A case study on a sausage based on vegetable proteins. Food Qual. Prefer. 2021, 87, 104058. [Google Scholar] [CrossRef]
  7. Kahleová, H.; Levin, S.; Barnard, N.D. Cardio-Metabolic benefits of Plant-Based diets. Nutrients 2017, 9, 848. [Google Scholar] [CrossRef]
  8. Alae-Carew, C.; Green, R.; Stewart, C.; Cook, B.; Dangour, A.D.; Scheelbeek, P.F.D. The role of plant-based alternative foods in sustainable and healthy food systems: Consumption trends in the UK. Sci. Total. Environ. 2022, 807, 151041. [Google Scholar] [CrossRef]
  9. Acquah, J.B.; Amissah, J.G.N.; Affrifah, N.S.; Wooster, T.J.; Danquah, A.O. Consumer perceptions of plant based beverages: The Ghanaian consumer’s perspective. Future Foods 2023, 7, 100229. [Google Scholar] [CrossRef]
  10. Yang, Q.; Eikelboom, E.; Linden, E.V.; de Vries, R.; Venema, P. A mild hybrid liquid separation to obtain functional mungbean protein. LWT 2022, 154, 112784. [Google Scholar] [CrossRef]
  11. Chmurzynska, A.; Mlodzik-Czyzewska, M.A.; Radziejewska, A.; Wiebe, D.J. Hedonic Hunger Is Associated with Intake of Certain High-Fat Food Types and BMI in 20- to 40-Year-Old Adults. J. Nutr. 2021, 151, 820–825. [Google Scholar] [CrossRef] [PubMed]
  12. Otterbring, T.; Folwarczny, M.; Gidlöf, K. Hunger effects on option quality for hedonic and utilitarian food products. Food Qual. Prefer. 2023, 103, 104693. [Google Scholar] [CrossRef]
  13. Wakefield, K.L.; Inman, J.J. Situational price sensitivity: The role of consumption occasion, social context and income. J. Retail. 2003, 79, 199–212. [Google Scholar] [CrossRef]
  14. Dhar, R.; Wertenbroch, K. Consumer Choice between Hedonic and Utilitarian Goods. J. Mark. Res. 2000, 37, 60–71. [Google Scholar] [CrossRef]
  15. Cramer, L.; Antonides, G. Endowment effects for hedonic and utilitarian food products. Food Qual. Prefer. 2011, 22, 3–10. [Google Scholar] [CrossRef]
  16. Loebnitz, N.; Grunert, K.G. Impact of self-health awareness and perceived product benefits on purchase intentions for hedonic and utilitarian foods with nutrition claims. Food Qual. Prefer. 2018, 64, 221–231. [Google Scholar] [CrossRef]
  17. Fitriani, W.R.; Mulyono, A.B.; Hidayanto, A.N.; Munajat, Q. Reviewer’s communication style in YouTube product-review videos: Does it affect channel loyalty? Heliyon 2020, 6, e04880. [Google Scholar] [CrossRef]
  18. Castillo-Abdul, B.; Romero-Rodríguez, L.M.; Larrea-Ayala, A. Kid influencers in Spain: Understanding the themes they address and preteens’ engagement with their YouTube channels. Heliyon 2020, 6, e05056. [Google Scholar] [CrossRef]
  19. Oh, C.; Roumani, Y.; Nwankpa, J.K.; Hu, H.-F. Beyond likes and tweets: Consumer engagement behavior and movie box office in social media. Inf. Manag. 2017, 54, 25–37. [Google Scholar] [CrossRef]
  20. Kavitha, K.M.; Shetty, A.; Abreo, B.; D’Souza, A.; Kondana, A. Analysis and Classification of User Comments on YouTube Videos. Procedia Comput. Sci. 2020, 177, 593–598. [Google Scholar] [CrossRef]
  21. Manap, K.H.A.; Adzharudin, N.A. The Role of User Generated Content (UGC) in Social Media for Tourism Sector. The 2013 WEI International Academic Conference Proceedings 2013. Available online: https://www.westeastinstitute.com/wp-content/uploads/2013/07/Khairul-Hilmi-A-Manap.pdf (accessed on 10 October 2023).
  22. Bahtar, A.Z.; Muda, M. The Impact of User—Generated Content (UGC) on Product Reviews towards Online Purchasing—A Conceptual Framework. Procedia Econ. Financ. 2016, 37, 337–342. [Google Scholar] [CrossRef]
  23. Ganganwar, V.; Rajalakshmi, R. Implicit aspect extraction for sentiment Analysis: A survey of Recent approaches. Procedia Comput. Sci. 2019, 165, 485–491. [Google Scholar] [CrossRef]
  24. Dang, C.N.; García, M.N.M.; Prieta, F.D.L. Sentiment analysis Based on Deep Learning: A comparative study. Electronics 2020, 9, 483. [Google Scholar] [CrossRef]
  25. Xu, Q.; Chang, V.; Jayne, C. A systematic review of social media-based sentiment analysis: Emerging trends and challenges. Decis. Anal. J. 2022, 3, 100073. [Google Scholar] [CrossRef]
  26. Birjali, M.; Kasri, M.; Beni-Hssane, A. A comprehensive survey on sentiment analysis: Approaches, challenges and trends. Decis. Anal. J. 2021, 3, 100073. [Google Scholar] [CrossRef]
  27. Drus, Z.; Khalid, H. Sentiment Analysis in Social Media and its Application: Systematic Literature review. Procedia Comput. Sci. 2019, 161, 707–714. [Google Scholar] [CrossRef]
  28. Chalkias, I.; Tzafilkou, K.; Karapiperis, D.; Tjortjis, C. Learning Analytics on YouTube Educational Videos: Exploring Sentiment Analysis Methods and Topic Clustering. Electronics 2023, 12, 3949. [Google Scholar] [CrossRef]
  29. Rodríguez-Ibánez, M.; Casánez-Ventura, A.; Castejón-Mateos, F.; Cuenca-Jiménez, P.-M.M. A Review on Sentiment Analysis from Social Media Platforms. 2023. Available online: https://www.synopsys.com/glossary/what-is-dast.html (accessed on 5 October 2023).
  30. Anastasiou, P.; Tzafilkou, K.; Karapiperis, D.; Tjortjis, C. YouTube Sentiment Analysis on Healthcare Product Campaigns: Combining Lexicons and Machine Learning Models. Available online: https://doi.ieeecomputersociety.org/10.1109/IISA59645.2023.10345900 (accessed on 5 October 2023).
  31. Rajeswari, B.; Madhavan, S.; Venkatesakumar, R.; Riasudeen, S. Sentiment analysis of consumer reviews—A comparison of organic and regular food products usage. Rajagiri Manag. J. 2020, 14, 55–167. [Google Scholar] [CrossRef]
  32. Meza, X.V.; Yamanaka, T. Food Communication and its Related Sentiment in Local and Organic Food Videos on YouTube. J. Med. Internet Res. 2020, 22, 16761. [Google Scholar] [CrossRef]
  33. Lim, K.H.; Lim, T.M.; Tan, K.S.N.; Tan, L.P. Sentiment Analysis on Mixed Language Facebook Comments: A Food and Beverages Case Study. In Fundamental and Applied Sciences in Asia; Springer: Singapore, 2023. [Google Scholar] [CrossRef]
  34. Tzafilkou, K.; Panavou, F.R.; Economides, A.A. Facially Expressed Emotions and Hedonic Liking on Social Media Food Marketing Campaigns:Comparing Different Types of Products and Media Posts. In Proceedings of the 2022 17th International Workshop on Semantic and Social Media Adaptation & Personalization (SMAP), Corfu, Greece, 3–4 November 2022. [Google Scholar] [CrossRef]
  35. Pastor, E.M.; Vizcaíno-Laorga, R.; Atauri-Mezquida, D. Health-related food advertising on kid YouTuber vlogger channels. Heliyon 2021, 7, e08178. [Google Scholar] [CrossRef]
  36. Tzafilkou, K.; Economides, A.A.; Panavou, F.R. You Look like You’ll Buy It! Purchase Intent Prediction Based on Facially Detected Emotions in Social Media Campaigns for Food Products. Computers 2023, 12, 88. [Google Scholar] [CrossRef]
  37. Shamoi, E.; Turdybay, A.; Shamoi, P.; Akhmetov, I.; Jaxylykova, A.; Pak, A. Sentiment analysis of vegan related tweets using mutual information for feature selection. PeerJ Comput. Sci. 2022, 8, e1149. [Google Scholar] [CrossRef]
  38. Thao, T.T.H. Exploring Consumer Opinions on Vegetarian Food by Sentiment Analysis Method. 2022. Available online: https://journalofscience.ou.edu.vn/index.php/econ-en/article/view/2256/1787 (accessed on 1 October 2023).
  39. Dalayya, S.; Elsaid, S.T.F.A.; Ng, K.H.; Song, T.L.; Lim, J.B.Y. Sentiment Analysis to Understand the Perception and Requirements of a Plant-Based Food App for Cancer Patients. Hum. Behav. Emerg. Technol. 2023, 2023, 8005764. [Google Scholar] [CrossRef]
  40. Bhuiyan, M.R.; Mahedi, M.H.; Hossain, N.; Tumpa, Z.N.; Hossain, S.A. An Attention Based Approach for Sentiment Analysis of Food Review Dataset. In Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kharagpur, India, 1–3 July 2020. [Google Scholar] [CrossRef]
  41. Gunawan, L.; Anggreainy, M.S.; Wihan, L.; Santy; Lesmana, G.Y.; Yusuf, S. Support vector machine based emotional analysis of restaurant reviews. Procedia Comput. Sci. 2023, 216, 479–484. [Google Scholar] [CrossRef]
  42. Thao, T.T.H. Lexicon development to measure emotions evoked by foods: A review. Meas. Food 2022, 7, 100054. [Google Scholar] [CrossRef]
  43. Liapakis, A. A Sentiment Lexicon-Based Analysis for Food and Beverage Industry reviews. The Greek Language Paradigm. 2020. Available online: https://aircconline.com/abstract/ijnlc/v9n2/9220ijnlc03.html (accessed on 8 October 2023).
  44. Liang, B.; Su, H.; Gui, L.; Cambria, E.; Xu, R. Aspect-based sentiment analysis via affective knowledge enhanced graph convolutional networks. Knowl.-Based Syst. 2022, 235, 107643. [Google Scholar] [CrossRef]
  45. Xiao, L.; Xue, Y.; Wang, H.; Hu, X.; Gu, D.; Zhu, Y. Exploring fine-grained syntactic information for aspect-based sentiment classification with dual graph neural networks. Neurocomputing 2022, 471, 48–59. [Google Scholar] [CrossRef]
  46. Motz, A.; Ranta, E.; Sierra Calderon, A.; Adam, Q.; Alzhouri, F.; Ebrahimi, D. Live Sentiment Analysis Using Multiple Machine Learning and Text Processing Algorithms. Knowl.-Based Syst. 2022. Available online: https://www.sciencedirect.com/science/article/pii/S1877050922006287 (accessed on 5 October 2023). [CrossRef]
  47. Khan, R.; Rustam, F.; Kanwal, K.; Mehmood, A.; Sang Choi, G. US Based COVID-19 Tweets Sentiment Analysis Using TextBlob and Supervised Machine Learning Algorithms. 2021. Available online: https://ieeexplore.ieee.org/abstract/document/9445207/authors#authors (accessed on 5 October 2023).
  48. Aljedaani, W.; Rustam, F.; Wiem Mkaouer, M.; Ghallab, A.; Rupapara, V.; Bernard Washington, P.; Lee, E.; Ashraf, I. Sentiment Analysis on Twitter Data Integrating TextBlob and Deep Learning Models: The Case of US Airline Industry. 2022. Available online: https://www.sciencedirect.com/science/article/abs/pii/S0950705122009017 (accessed on 5 October 2023).
  49. Edwin, F.; Joseph, O.; Godwin, O. Data Preprocessing Techniques for NLP in BI. 2024. Available online: https://www.researchgate.net/publication/379652291_Data_preprocessing_techniques_for_NLP_in_BI (accessed on 30 September 2023).
  50. Hemmatian, F.; Sohrabi, M.K. “D” A Survey on Classification Techniques for Opinion Mining and Sentiment Analysis. 2019. Available online: https://doi.org/10.1109/ICAIS50930.2021.9396049 (accessed on 30 September 2023).
  51. Textblob: Simplified Text Processing. Available online: https://textblob.readthedocs.io/en/dev/ (accessed on 30 September 2023).
  52. Hutto, C.J.; Gilbert, E.E. VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. In Proceedings of the Eighth International Conference on Weblogs and Social Media (ICWSM-14), Ann Arbor, MI, USA, 1–4 June 2014. [Google Scholar]
  53. Sentiment Analysis Natural Language API Google Cloud. Available online: https://cloud.google.com/natural-language/docs/analyzing-sentiment (accessed on 30 September 2023).
  54. Rosenberg, E.; Tarazona, C.; Mallor, F.; Eivazi, H.; Pastor-Escuredo, D.; Fuso-Nerini, F.; Vinuesa, R. Sentiment Analysis on Twitter Data Towards Climate Action. 2023. Available online: https://doi.org/10.21203/rs.3.rs-2434092/v1 (accessed on 5 November 2023).
  55. Lokanan, M. The tinder swindler: Analyzing public sentiments of romance fraud using machine learning and artificial intelligence. J. Econ. Criminol. 2023, 2, 100023. [Google Scholar] [CrossRef]
  56. Liang, M.; Niu, T. Research on Text Classification Techniques Based on Improved TF-IDF Algorithm and LSTM Inputs. Procedia Comput. Sci. 2022, 208, 460–470. [Google Scholar] [CrossRef]
  57. Cam, H.; Cam, A.V.; Demirel, U.; Ahmed, S. Sentiment analysis of financial Twitter posts on Twitter with the machine learning classifiers. Heliyon 2024, 10, e23784. [Google Scholar] [CrossRef]
  58. Ghosal, S.; Jain, A. Depression and Suicide Risk Detection on Social Media using fastText Embedding and XGBoost Classifier. Procedia Comput. Sci. 2023, 218, 1631–1639. [Google Scholar] [CrossRef]
  59. Hidayat, T.H.J.; Ruldeviyani, Y.; Aditama, A.R.; Madya, G.R.; Nugraha, A.W.; Adisaputra, M.W. Sentiment analysis of twitter data related to Rinca Island development using Doc2Vec and SVM and logistic regression as classifier. Procedia Comput. Sci. 2022, 197, 660–667. [Google Scholar] [CrossRef]
  60. Fitri, V.A.; Andreswari, R.; Hasibuan, M.A. Sentiment Analysis of Social Media Twitter with Case of Anti-LGBT Campaign in Indonesia using Naïve Bayes, Decision Tree, and Random Forest Algorithm. Procedia Comput. Sci. 2019, 161, 765–772. [Google Scholar] [CrossRef]
  61. Halawani, H.T.; Mashraqi, A.M.; Badr, S.K.; Alkhalaf, S. Automated sentiment analysis in social media using Harris Hawks optimisation and deep learning techniques. Alex. Eng. J. 2023, 80, 433–443. [Google Scholar] [CrossRef]
  62. Zulfiker, M.S.; Kabir, N.; Biswas, A.A.; Zulfiker, S.; Uddin, M.S. Analyzing the public sentiment on COVID-19 vaccination in social media: Bangladesh context. Array 2022, 15, 100204. [Google Scholar] [CrossRef]
  63. Hicks, S.A.; Strümke, I.; Thambawita, V.; Hammou, M.; Riegler, M.A.; Halvorsen, P.; Parasa, S. On evaluation metrics for medical applications of artificial intelligence. Sci. Rep. 2022, 12, 59–79. [Google Scholar] [CrossRef]
  64. Hossin, M.; Sulaiman, M.N. A Review on Evaluation Metrics for Data Classification Evaluations. IJDKP 2015, 5, 1–11. [Google Scholar] [CrossRef]
  65. McClenaghan, E. Mann-Whitney U Test: Assumptions and Example. 2022. Available online: https://www.technologynetworks.com/informatics/articles/mann-whitney-u-test-assumptions-and-example-363425 (accessed on 20 October 2023).
  66. Kasuya, E. Mann—Whitney U test when variances are unequal. Anim. Behav. 2001, 61, 1247–1249. [Google Scholar] [CrossRef]
  67. Sethuraman, M. Measures of central tendency: Median and mode. J. Pharmacol. Pharmacother. 2011, 2, 214–215. [Google Scholar] [CrossRef]
  68. Sethuraman, M. Measures of central tendency: The mean. J. Pharmacol. Pharmacother. 2011, 2, 140–142. [Google Scholar] [CrossRef]
  69. Roberson, Q.M.; Sturman, M.C.; Simons, T. Does the Measure of Dispersion Matter in Multilevel Research? A Comparison of the Relative Performance of Dispersion Indexes. Organ. Res. Methods 2007, 10, 564–588. [Google Scholar] [CrossRef]
  70. Gawali, S. Dispersion of Data: Range, IQR, Variance, Standard Deviation. 2021. Available online: https://www.analyticsvidhya.com/blog/2021/04/dispersion-of-data-range-iqr-variance-standard-deviation/ (accessed on 20 October 2023).
  71. Sethuraman, M. Measures of dispersion. J. Pharmacol. Pharmacother. 2011, 2, 315–316. [Google Scholar] [CrossRef]
  72. Garay, J.; Yap, R.; Sabellano, M.J.G. An analysis on the insights of the anti-vaccine movement from social media posts using k-means clustering algorithm and VADER sentiment analyzer. IOP Conf. Ser. Mater. Sci. Eng. 2019, 482, 012043. [Google Scholar] [CrossRef]
  73. Elbagir, S.; Yang, J. Twitter Sentiment Analysis Using Natural Language Toolkit and VADER Sentiment. 2019. Available online: https://www.iaeng.org/publication/IMECS2019/IMECS2019_pp12-16.pdf (accessed on 5 October 2023).
  74. Diyasa, G.S.M.; Mandenni, N.M.I.M.; Fachrurrozi, M.I.; Pradika, S.I.; Manab, K.R.N.; Sasmita, N.R. Twitter Sentiment Analysis as an Evaluation and Service Base on Python Textblob. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1125, 012034. [Google Scholar] [CrossRef]
  75. Hamid, M.H.A.; Yusoff, M.; Mohamed, A. Survey on highly imbalanced multi-class data. IJACSA 2022, 13. Available online: https://thesai.org/Publications/ViewPaper?Volume=13&Issue=6&Code=IJACSA&SerialNo=27 (accessed on 10 October 2023). [CrossRef]
  76. Opitz, J. From Bias and Prevalence to Macro F1, Kappa, and MCC: A Structured Overview of Metrics for Multi-Class Evaluation. 2022. Available online: https://api.semanticscholar.org/CorpusID:253270558 (accessed on 10 October 2023).
  77. Guo, X.; Yu, W.; Wang, X. An overview on fine-grained text Sentiment Analysis: Survey and challenges. J. Phys. Conf. Ser. 2022, 1757, 012038. [Google Scholar] [CrossRef]
  78. Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 6. [Google Scholar] [CrossRef]
Figure 1. Comparison of TextBlob, VADER, and GSA on the plant-based dataset.
Figure 2. Comparison of TextBlob, VADER, and GSA on hedonic dataset.
Figure 3. Comparison of TextBlob, VADER, and GSA using the samples from the plant-based dataset.
Figure 4. Comparison of TextBlob, VADER, and GSA using the samples from the hedonic dataset.
Figure 5. Confusion matrices of Support Vector Machine and Logistic Regression in hedonic dataset.
Figure 6. Correlations between comment length and likes, and between comment length and reply count, in the plant-based (a) and hedonic (b) datasets.
Figure 7. User Activity for comments in plant-based (a) and hedonic (b) datasets. Note, 0 = Monday, 1 = Tuesday, 2 = Wednesday, 3 = Thursday, 4 = Friday, 5 = Saturday, 6 = Sunday, while in the right visualizations 0 = midnight, 1 = 1:00 a.m., etc.
Table 1. Inclusion criteria for plant-based products.

| Inclusion Criteria | Plant-Based Products |
| --- | --- |
| Video content | The chosen videos focus on plant-based products, specifically milk, butter, and yogurt. |
| Video type | The selected videos are of the "how-to" or tutorial-style format, where the preparation of plant-based food products is demonstrated. |
| Comments | All included videos have a minimum of 100 comments, ensuring a substantial dataset for sentiment analysis. |
| Language | The videos selected for analysis are exclusively in English, ensuring linguistic consistency. |
| Year | The chosen videos were uploaded within the last 5–6 years, aligning them with current trends in plant-based eating. |
| Duration | All selected videos have a duration of less than 20 min, facilitating efficient analysis. |
Table 2. Inclusion criteria for hedonic products.

| Inclusion Criteria | Hedonic Products |
| --- | --- |
| Video content | The chosen videos focus on hedonic products, specifically pizza, burgers, and cakes. |
| Video type | The selected videos are of the "how-to" or tutorial-style format, where the preparation of hedonic food products is demonstrated. |
| Comments | All included videos have a minimum of 100 comments, ensuring a substantial dataset for sentiment analysis. |
| Language | The videos selected for analysis are exclusively in English, ensuring linguistic consistency. |
| Year | The chosen videos were uploaded within the last 5–6 years, aligning them with current trends in hedonic eating. |
| Duration | All selected videos have a duration of less than 16 min, facilitating efficient analysis. |
Table 3. Evaluation of model terms.

| Term | Definition |
| --- | --- |
| TP (True Positive) | Instances correctly predicted as positive. |
| TN (True Negative) | Instances correctly predicted as negative. |
| FP (False Positive) | Instances incorrectly predicted as positive. |
| FN (False Negative) | Instances incorrectly predicted as negative. |
Table 4. Formulas for evaluation metrics [64].

| Metric | Formula |
| --- | --- |
| Accuracy | (TP + TN) / (TP + TN + FP + FN) |
| Precision | TP / (TP + FP) |
| Recall | TP / (TP + FN) |
| F1 score | 2 × (Precision × Recall) / (Precision + Recall) |
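The formulas in Table 4 translate directly into code. A minimal sketch (the confusion counts below are illustrative, not taken from the paper's results):

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the Table 4 metrics from raw confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Illustrative counts for a single (binary, one-vs-rest) class
acc, prec, rec, f1 = classification_metrics(tp=80, tn=90, fp=10, fn=20)
```

For the three-class problem studied here, these quantities are computed per class and then macro-averaged, as in the reported results.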
Table 5. Sentiment labels in plant-based dataset.

| Comment | TextBlob | VADER | GSA |
| --- | --- | --- | --- |
| try thank easy peasy | positive | positive | positive |
| interested try shelf like milk | positive | positive | positive |
| thankuuuuuu want try | neutral | neutral | positive |
| many day store fridge | positive | neutral | positive |
| must costly | neutral | neutral | negative |
| use milk instead water | neutral | neutral | neutral |
| look fantastic go make simple clean | positive | positive | positive |
| add vanilla | neutral | neutral | neutral |
| thank much video love recipe almond milk best | positive | positive | positive |
| amaze | neutral | positive | positive |
| awesome video thank upload | positive | positive | positive |
Table 6. Sentiment labels in hedonic dataset.

| Comment | TextBlob | VADER | GSA |
| --- | --- | --- | --- |
| love work love pizza | positive | positive | positive |
| that's heaven | neutral | neutral | positive |
| amazing burger easy inexpensive | positive | positive | positive |
| give heart attack | neutral | neutral | negative |
| cant get good nope | positive | neutral | negative |
| delicious love flavour vanilla nice recipe | positive | positive | positive |
| make cake home amaze thanks recipe best | positive | positive | positive |
| good cake recipe keep fridge minute pls reply thanks | positive | positive | positive |
| recreate ur recipe soooo perfect thanks much | positive | neutral | positive |
| become go cake deliciousness | neutral | positive | positive |
| wow look awesome thanks share | positive | positive | positive |
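Tables 5 and 6 show how the three lexicon-based tools can disagree on the same preprocessed comment. As a rough, hypothetical illustration of how such lexicon-based labeling works, the sketch below scores words against a tiny invented lexicon and maps the average score to the three labels; the real tools (TextBlob, VADER, GSA) use far larger lexicons plus grammatical and intensity rules:

```python
# Toy lexicon and thresholds; both are invented for illustration only.
LEXICON = {"love": 0.8, "amazing": 0.9, "awesome": 0.9, "delicious": 0.8,
           "thank": 0.4, "costly": -0.5, "attack": -0.6, "nope": -0.4}

def label(comment, pos=0.05, neg=-0.05):
    """Average the word scores and map to positive/neutral/negative."""
    words = comment.lower().split()
    score = sum(LEXICON.get(w, 0.0) for w in words) / max(len(words), 1)
    if score >= pos:
        return "positive"
    if score <= neg:
        return "negative"
    return "neutral"

print(label("love work love pizza"))   # -> positive
print(label("give heart attack"))      # -> negative
```

Comments containing no lexicon words score zero and fall into the neutral band, which is one reason lexicon-based labelers disagree so often on short, slangy YouTube comments.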
Table 7. Comparison of classification results on plant-based dataset.

| Comment | TextBlob | VADER | GSA |
| --- | --- | --- | --- |
| like | neutral | neutral | positive |
| can pressure cancer | neutral | negative | negative |
| look amazing yum | positive | positive | positive |
| amaze | neutral | positive | positive |
| thank definitely try | neutral | neutral | positive |
| look yummy | neutral | positive | positive |
| look amaze long last fridge | negative | neutral | positive |
| great video cant wait try tonight | positive | positive | positive |
| look delicious | positive | positive | positive |
| look soo good omg | positive | positive | positive |
Table 8. Comparison of classification results on hedonic dataset.

| Comment | TextBlob | VADER | GSA |
| --- | --- | --- | --- |
| yes pizza | neutral | neutral | positive |
| really love pizza | positive | positive | positive |
| amaze look delicious | positive | positive | positive |
| lose yeast | neutral | neutral | negative |
| fan pizza congratulation | neutral | positive | positive |
| yummy vanilla cake recipe | neutral | positive | positive |
| im person dont like pizza | neutral | neutral | negative |
| favourite pizza top mine chicken | negative | positive | positive |
| wow look tasty | negative | positive | positive |
| use plain self raise flour | negative | negative | neutral |
Table 9. Classification report of Random Forest in plant-based dataset.

| Class | Precision | Recall | F1 Score | Support |
| --- | --- | --- | --- | --- |
| Negative | 0.88 | 0.45 | 0.60 | 330 |
| Neutral | 0.86 | 0.97 | 0.91 | 1173 |
| Positive | 0.93 | 0.94 | 0.93 | 2070 |
| Accuracy |  |  | 0.90 | 3573 |
| Macro avg. | 0.89 | 0.79 | 0.81 | 3573 |
| Weighted avg. | 0.90 | 0.90 | 0.90 | 3573 |
Table 10. Performance values for the five models in the plant-based dataset.

| Model | Accuracy | Precision | Recall | F1 Score |
| --- | --- | --- | --- | --- |
| Support Vector Machine | 0.93 | 0.91 | 0.87 | 0.89 |
| Random Forest | 0.90 | 0.89 | 0.79 | 0.81 |
| Naïve Bayes | 0.81 | 0.75 | 0.67 | 0.70 |
| Logistic Regression | 0.93 | 0.90 | 0.85 | 0.87 |
| XGBoost | 0.91 | 0.89 | 0.83 | 0.85 |
Table 11. Performance values for the five models in hedonic dataset.

| Model | Accuracy | Precision | Recall | F1 Score |
| --- | --- | --- | --- | --- |
| Support Vector Machine | 0.96 | 0.95 | 0.93 | 0.94 |
| Random Forest | 0.92 | 0.92 | 0.86 | 0.88 |
| Naïve Bayes | 0.79 | 0.78 | 0.72 | 0.74 |
| Logistic Regression | 0.96 | 0.95 | 0.93 | 0.94 |
| XGBoost | 0.92 | 0.91 | 0.86 | 0.88 |
Table 12. TP, TN, FN, and FP of each class in SVM model in hedonic dataset.

| Class | −1 | 0 | 1 |
| --- | --- | --- | --- |
| TP | 2054 | 6419 | 9573 |
| TN | 16,109 | 12,207 | 8592 |
| FP | 149 | 211 | 276 |
| FN | 370 | 25 | 241 |
Table 13. TP, TN, FN, and FP of each class in Logistic Regression model in hedonic dataset.

| Class | −1 | 0 | 1 |
| --- | --- | --- | --- |
| TP | 1994 | 6416 | 9543 |
| TN | 16,108 | 11,937 | 8590 |
| FP | 150 | 301 | 278 |
| FN | 430 | 28 | 271 |
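The per-class counts in Tables 12 and 13 can be derived from a multi-class confusion matrix in a one-vs-rest fashion: for each class, TP is the diagonal cell, FN the rest of that class's row, FP the rest of its column, and TN everything else. A sketch with an illustrative 3 × 3 matrix (not the paper's actual counts):

```python
def per_class_counts(cm):
    """Derive per-class TP, TN, FP, FN from an n x n confusion matrix
    (rows = true class, columns = predicted class)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    stats = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # rest of the true-k row
        fp = sum(cm[i][k] for i in range(n)) - tp   # rest of the predicted-k column
        tn = total - tp - fn - fp
        stats.append({"TP": tp, "TN": tn, "FP": fp, "FN": fn})
    return stats

cm = [[50, 3, 2],    # true negative comments
      [4, 80, 6],    # true neutral comments
      [1, 5, 90]]    # true positive comments
stats = per_class_counts(cm)
```

A useful sanity check on such tables: the FP counts summed over all classes must equal the FN counts summed over all classes (each misclassified instance is one class's FN and another's FP).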
Table 14. Mann–Whitney U test results.

| Variable | Statistic | p-Value | Result |
| --- | --- | --- | --- |
| Views | 1411.0 | 1.71 × 10⁻¹² | Reject the null hypothesis; there is a significant difference. |
| Comments | 1413.0 | 1.48 × 10⁻¹² | Reject the null hypothesis; there is a significant difference. |
| Likes | 1397.0 | 4.66 × 10⁻¹² | Reject the null hypothesis; there is a significant difference. |
| Engagement Rate | 241.0 | 2.79 × 10⁻⁶ | Reject the null hypothesis; there is a significant difference. |
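The Mann–Whitney U statistic behind Table 14 can be computed without distributional assumptions by counting pairwise wins between the two groups. A minimal sketch on hypothetical per-video values (an actual analysis would typically call scipy.stats.mannwhitneyu, which also supplies the p-value):

```python
def mann_whitney_u(x, y):
    """U statistic by direct pair counting (ties add 0.5); equivalent to
    the rank-sum formulation. Returns the smaller of U1 and U2."""
    u1 = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
             for xi in x for yj in y)
    return min(u1, len(x) * len(y) - u1)

# Hypothetical per-video view counts for the two groups (not the paper's data)
plant = [21_000, 171_000, 433_000, 250_000]
hedonic = [1_680_000, 6_270_000, 10_100_000, 68_700_000]
u = mann_whitney_u(plant, hedonic)  # 0.0: the groups separate completely
```

A U near zero, as in this toy example, indicates near-total separation between the groups, which is what the very small p-values in Table 14 reflect.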
Table 15. Descriptive statistics for plant-based dataset.

| Plant-Based | Views | Comments | Likes | Engagement Rate (%) |
| --- | --- | --- | --- | --- |
| mean | 4.33 × 10⁵ | 510.12 | 1.38 × 10⁴ | 3.90 |
| median | 1.71 × 10⁵ | 307.00 | 5.98 × 10³ | 3.42 |
| std | 5.49 × 10⁵ | 510.50 | 2.00 × 10⁴ | 1.64 |
| min | 2.10 × 10⁴ | 104.00 | 1.11 × 10³ | 1.31 |
| max | 2.50 × 10⁶ | 2174.00 | 1.13 × 10⁵ | 9.08 |
| var | 3.02 × 10¹¹ | 260,611.14 | 4.01 × 10⁸ | 2.69 |
| range | 2.48 × 10⁶ | 2070.00 | 1.12 × 10⁵ | 7.77 |
Table 16. Descriptive statistics for hedonic dataset.

| Hedonic | Views | Comments | Likes | Engagement Rate (%) |
| --- | --- | --- | --- | --- |
| mean | 1.01 × 10⁷ | 6579.75 | 1.71 × 10⁵ | 2.29 |
| median | 6.27 × 10⁶ | 5544.00 | 1.39 × 10⁵ | 2.10 |
| std | 1.40 × 10⁷ | 4978.95 | 1.35 × 10⁵ | 0.77 |
| min | 1.68 × 10⁶ | 1798.00 | 3.33 × 10⁴ | 0.70 |
| max | 6.87 × 10⁷ | 25,168.00 | 6.22 × 10⁵ | 4.06 |
| var | 1.96 × 10¹⁴ | 24,789,956.63 | 1.83 × 10¹⁰ | 0.59 |
| range | 6.71 × 10⁷ | 23,370.00 | 5.88 × 10⁵ | 3.36 |
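The rows of Tables 15 and 16 are standard descriptive statistics and can be reproduced with Python's standard library. A sketch on hypothetical per-video view counts (not the study's data):

```python
import statistics

views = [21_000, 171_000, 433_000, 900_000, 2_500_000]

mean = statistics.mean(views)
median = statistics.median(views)
std = statistics.stdev(views)           # sample standard deviation
var = statistics.variance(views)        # std squared
value_range = max(views) - min(views)   # the tables' "range" row
```

In practice one would run this per column (views, comments, likes, engagement rate) over the videos in each dataset.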
Table 17. Overall engagement rate of the two datasets.

| Dataset | Views | Comments | Likes | Engagement Rate |
| --- | --- | --- | --- | --- |
| Plant-Based | 25,540,596 | 30,097 | 812,411 | 3.30% |
| Hedonic | 241,400,820 | 157,914 | 4,100,700 | 1.77% |
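The overall rates in Table 17 are consistent with defining engagement as (comments + likes) / views, expressed as a percentage. A sketch using the dataset totals from Table 17 (small rounding differences from the published rates are possible):

```python
def engagement_rate(views, comments, likes):
    """Engagement as the share of viewers who commented or liked, in percent."""
    return (comments + likes) / views * 100

plant = engagement_rate(25_540_596, 30_097, 812_411)
hedonic = engagement_rate(241_400_820, 157_914, 4_100_700)
```

Despite the hedonic videos' roughly tenfold advantage in raw views, likes, and comments, the plant-based videos convert a larger share of their audience into interactions, which is the paper's headline engagement finding.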
Table 18. Total number of comments by year in plant-based dataset.

| Year | Comments |
| --- | --- |
| 2018 | 1135 |
| 2019 | 3003 |
| 2020 | 4658 |
| 2021 | 4847 |
| 2022 | 1907 |
| 2023 | 1461 |
Table 19. Total number of comments by year in hedonic dataset.

| Year | Comments |
| --- | --- |
| 2018 | 8941 |
| 2019 | 15,011 |
| 2020 | 31,047 |
| 2021 | 28,186 |
| 2022 | 12,242 |
| 2023 | 11,876 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Tsiourlini, M.; Tzafilkou, K.; Karapiperis, D.; Tjortjis, C. Text Analytics on YouTube Comments for Food Products. Information 2024, 15, 599. https://doi.org/10.3390/info15100599

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
