1. Introduction
Investment plays a key role in global business, and every party involved seeks a favorable outcome. A promising new product can therefore have a positive impact on the businesses associated with it. As companies strive to provide products or services that align with client demands, customer satisfaction becomes a fundamental requirement. Applying risk mitigation strategies during the launch of a new product reduces potential hazards and lowers the likelihood of failure, which in turn is likely to attract additional investors. An inadequate understanding of the situation, however, can yield unfavorable consequences. In a dynamic, continuously changing stock market, identifying and monitoring past or potential changes that may serve as indicators would help to limit risk.
Over the past decade, technological advances have produced a wave of new platforms, including streaming services. The sector's rapid growth has called for data science and artificial intelligence to uncover additional insights and to keep pace with technological developments and their impact on industries. The user base of streaming platforms, particularly Netflix, surged during 2013, which expanded the company's market presence. This expansion was paralleled by recognition of Netflix's original programming, which received nominations and wins at several award ceremonies. Netflix was selected as the case study for this paper because of its growing popularity. In addition to analyzing sentiment in movie reviews, our research primarily centers on exploring stock values through data analysis, and we employ natural language processing (NLP) to filter the movies used in the exploration phase.
Numerous artificial intelligence (AI) technologies, including sentiment analysis [1,2], have been widely employed. Companies, producers, and small craft owners no longer need to conduct street surveys to gain insights into potential future products or to gather feedback on previous offerings; instead, sentiment analysis of reviews has become a prevalent approach [3]. With widespread internet access and the proliferation of online platforms dedicated to user-generated evaluations, manufacturers can now obtain critical feedback on how well their products or services perform and where they could be improved.
Sentiment analysis is employed in various domains. A case study [3] examined customer satisfaction in the restaurant industry through the analysis of reviews. Sentiment analysis has also been applied to social and health situations such as the COVID-19 pandemic, as in a research paper [4] that captured the prevailing mood of individuals during that period. Its use has further extended to stock market analysis. In a notable work [5], the authors employed LSTM and sentiment analysis to examine the correlation between news headlines and stock trends; notably, the evaluation relied on graphical visualization plots and deliberately avoided absolute predictions. A previous work [6] examined professional third-party reviews (TPRs) to identify factors that impact the financial value of a company, using a dataset of films released in the United States between February 2005 and April 2006. According to its authors, reviews may, under particular conditions, influence investors positively even when they are negative in nature. Furthermore, the distinct value of the relative valence of TPRs, which is the measure used by the authors, aligns with the overarching observation that stock prices fluctuate in response to unexpected news that alters expectations.
This investigation employs natural language processing (NLP) techniques [1,7], a prominent subfield of data science with many applications, together with sentiment analysis [8,9], which aims to determine the subjectivity and polarity of a given text, in our case movie reviews. The analysis uses TextBlob version 0.16.0, a Python-based tool that implements several NLP methodologies [10,11]. The remainder of this paper is structured as follows:
Section 2 describes the datasets collected and outlines our approach, which is centered around natural language processing (NLP) and sentiment analysis.
Section 3 presents the results of the sentiment analysis and the investigation of stock values during the release periods of movies, followed by a discussion subsection that examines and analyzes these results. Section 4 concludes the paper.
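As a rough illustration of how a polarity score can be turned into a review class, the sketch below maps a score in [-1.0, 1.0] (the range TextBlob reports for polarity) to a label. The function name, the zero threshold, and the presence of a neutral class are assumptions made for illustration and are not taken from the paper; the scores are hypothetical.

```python
from collections import Counter


def label_review(polarity: float, neutral_band: float = 0.0) -> str:
    """Map a polarity score in [-1.0, 1.0] to a review class.

    neutral_band is a hypothetical tolerance around zero; scores whose
    absolute value does not exceed it are labeled "neutral".
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1.0, 1.0]")
    if polarity > neutral_band:
        return "positive"
    if polarity < -neutral_band:
        return "negative"
    return "neutral"


# Hypothetical polarity scores. In practice each value would come from
# TextBlob(review_text).sentiment.polarity for one movie review.
example_polarities = [0.8, -0.35, 0.0, 0.12, -0.6]

labels = [label_review(p) for p in example_polarities]
class_counts = Counter(labels)
```

Widening `neutral_band` would move weakly opinionated reviews into the neutral class, which may matter when positive and negative counts are compared.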
3. Results and Discussion
Figure 3 and Figure 4 illustrate the number of reviews per class within the timeframe from ‘06-06-2019’ to ‘06-12-2019’. Reactions are concentrated in the latter months of the year, from October onward. Figure 3, which includes the price plot, shows an increase in the number of reactions as a sequence of green-marked peaks. Notably, the third green-marked area coincides with a gradual increase in the close price, which follows the rise in the number of reactions after a time lag, from around October 2019 to December 2019.
The negative review figure (Figure 4) shows four distinct regions highlighted in red. In these periods, the price often declined, with a time delay, after a sequence of peaks in the negative review plot. After a brief interval, the price increased again, which may be attributed to the influence of favorable reviews.
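The per-class review counts plotted in such figures could be produced along the following lines. This is a minimal sketch under stated assumptions: the function name, the (date, label) record layout, and the sample records are hypothetical, and the paper's actual preprocessing may differ.

```python
from collections import Counter
from datetime import date


def count_reviews_per_day(reviews, label, start, end):
    """Count reviews of one class per calendar day within [start, end].

    reviews is an iterable of (date, label) pairs; days with no matching
    review are simply absent from the result.
    """
    counts = Counter(
        day
        for day, review_label in reviews
        if review_label == label and start <= day <= end
    )
    return dict(sorted(counts.items()))


# Hypothetical labeled reviews, not data from the study.
reviews = [
    (date(2019, 10, 1), "positive"),
    (date(2019, 10, 1), "negative"),
    (date(2019, 10, 1), "positive"),
    (date(2019, 10, 2), "negative"),
    (date(2019, 5, 30), "positive"),  # falls outside the window below
]

window_start, window_end = date(2019, 6, 6), date(2019, 12, 6)
positive_per_day = count_reviews_per_day(reviews, "positive", window_start, window_end)
negative_per_day = count_reviews_per_day(reviews, "negative", window_start, window_end)
```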
Figure 5, Figure 6 and Figure 7 illustrate the count of reviews per class over a six-month period from ‘06-01-2020’ to ‘06-06-2020’, plotted against the closing price. In the total review plot (Figure 5), the final region, marked in light orange, shows a decrease in reaction count, and the price plot also declines around ‘06-2020’.
In the present study, we aimed to investigate a visual correlation between social media and price. Figure 7 shows a decline in the price plot after a prominent peak in the negative review plot between ‘03-2020’ and ‘04-2020’, as indicated by the red background. After the later part of ‘04-2020’, however, the price increased again.
During the same period in which the negative review plot peaked, the plots for positive reviews (Figure 6) and for the total count of reviews also showed a significant peak, surpassing the count of negative reviews in the second green-marked area. This suggests a potential link between these trends and the subsequent increase in price, as depicted in the figures. A similar pattern appears around ‘02-2020’, where the positive review and reaction plots show comparatively lower peaks, followed by a price increase near the first green-marked region.
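The time delay between review peaks and price movements is read visually from the plots. One way to quantify such a delay, offered here only as a sketch and not as the method used in this study, is to correlate review counts on day t with prices on day t + k for several lags k. All names and series below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(review_counts, prices, lag):
    """Correlate review counts at day t with prices at day t + lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(review_counts, prices)
    return pearson(review_counts[:-lag], prices[lag:])


def best_lag(review_counts, prices, max_lag):
    """Return the lag in 0..max_lag with the highest correlation."""
    return max(
        range(max_lag + 1),
        key=lambda k: lagged_correlation(review_counts, prices, k),
    )


# Hypothetical daily series in which the price echoes the review count
# two days later; these are not values from the study.
review_counts = [1, 5, 2, 1, 6, 2, 1, 1]
prices = [10, 10, 11, 15, 12, 11, 16, 12]

delay = best_lag(review_counts, prices, max_lag=3)
```

A high correlation at some lag would still not establish that reviews drive the price; it only summarizes the delayed co-movement that the plots suggest.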
Building on the visual patterns observed in the plots (Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7), we calculated correlation coefficients after merging the financial data with the extracted sentiment features. The correlations between the close price and the features were noteworthy: Rotten Tomatoes features ranged between 0.26 and 0.44, while IMDb-derived features spanned a broader range from −0.17 to 0.47.
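The text does not state which correlation coefficient was computed. Assuming the commonly used Pearson coefficient, the correlation between a sentiment feature and the close price over the merged days can be written as follows, where the symbols $x_t$ (feature value on day $t$), $p_t$ (close price on day $t$), and $n$ (number of merged days) are introduced here for illustration only:

```latex
r_{xp} = \frac{\sum_{t=1}^{n} (x_t - \bar{x})(p_t - \bar{p})}
              {\sqrt{\sum_{t=1}^{n} (x_t - \bar{x})^2}\;\sqrt{\sum_{t=1}^{n} (p_t - \bar{p})^2}},
\qquad
\bar{x} = \frac{1}{n}\sum_{t=1}^{n} x_t,
\quad
\bar{p} = \frac{1}{n}\sum_{t=1}^{n} p_t
```

Under this definition the coefficient lies between −1 and 1, so the reported values would indicate weak to moderate linear association rather than a strong one.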
Discussion
One notable observation is that the relationship appears stronger when the number of reactions is higher. The data presented in Table 4 show a positive correlation between the count of reactions and the peaks observed in the price plot, which is particularly evident during periods of high reaction counts. These findings suggest a potential relationship between the reviews and the price plot, specifically in periods when peaks in the review count are followed by peaks in the price plot.
The price trended upward following the peaks of favorable reviews, as indicated by the green areas in Figure 3 and Figure 6. Peaks in the negative review plots were followed by a decline in price, a trend particularly evident in Figure 4 and Figure 7, where the areas outlined in red mark a notable decrease. The negative count is smaller than the positive count, yet the red peaks were consistently followed by a decline in values. Further investigation is therefore required to interpret these findings fully. This could involve incorporating additional data from different sources, such as news datasets, and exploring whether the negative impact carries more significance than the positive impact.
These observations suggest a potential relationship between the quality and number of reviews and the movement of the price plot. The same pattern was evident in the reaction figure (Figure 5), where a decrease in review count corresponded to a decrease in price. This raises the question of whether a lack of interest, positive or negative, can itself influence changes in value. Substantiating this claim, however, would require a more comprehensive and extensive study.
While [3] focuses on sentiment analysis of reviews for a specific restaurant to evaluate customer satisfaction in a narrowly defined context, and [4] analyzes reviews from a broader platform to assess public mood during a particular health crisis and its relation to the stock market, our work takes a different direction: it explores the connection between sentiment expressed in reviews of a company’s products (in this case, movies) and the stock price of the production company, Netflix. In contrast to [5], which applies LSTM to headlines and stock trends for its prediction model, an approach that reflects a more direct link to the stock market, our study investigates whether the sentiment extracted from product reviews can reveal a subtler, potentially more complex relationship with financial performance. The authors of [6], meanwhile, analyze professional third-party reviews to identify factors that impact the financial value of a company, using a dataset of films released between February 2005 and April 2006. Our study instead uses reviews from well-known, publicly accessible platforms such as Rotten Tomatoes and IMDb, covering a more recent and extended period from 2014 to 2020 while focusing on one production company.
In summary, our study offers a broader and more current dataset than comparable work involving movies, while being more specific than studies based on general social media platforms. Moreover, it goes beyond measuring customer satisfaction and instead investigates whether product reviews can have a meaningful impact on stock price movements, an approach that is both more targeted and innovative.
4. Conclusions
This research applied sentiment analysis to datasets of movie reviews. The analysis was performed with the TextBlob tool and was further enhanced by including the star rating. The aim was to calculate a sentiment score that enables visual analysis of the relationship between movie reviews and price trends over five discrete time periods from 2017 to 2021. The objective was to investigate the potential correlation between the public’s reception of new products, particularly movies, and the visual fluctuations in a company’s stock price.
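How the star rating enters the sentiment score is not spelled out here. The sketch below shows one plausible combination, a weighted average of the text polarity and a rescaled star rating. The function names, the linear rescaling, and the equal weighting are all assumptions made for illustration and should not be read as the formula used in this study.

```python
def rescale_stars(stars: float, max_stars: int = 5) -> float:
    """Map a star rating in [1, max_stars] linearly onto [-1.0, 1.0]."""
    if max_stars < 2:
        raise ValueError("max_stars must be at least 2")
    if not 1 <= stars <= max_stars:
        raise ValueError("stars must lie in [1, max_stars]")
    return 2.0 * (stars - 1) / (max_stars - 1) - 1.0


def combined_score(polarity: float, stars: float,
                   max_stars: int = 5, text_weight: float = 0.5) -> float:
    """Blend text polarity with the rescaled star rating.

    text_weight is a hypothetical parameter: 1.0 uses the text polarity
    alone, 0.0 uses the star rating alone.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must lie in [-1.0, 1.0]")
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0.0, 1.0]")
    star_component = rescale_stars(stars, max_stars)
    return text_weight * polarity + (1.0 - text_weight) * star_component


# Hypothetical review: mildly positive text with a top star rating.
score = combined_score(polarity=0.4, stars=5)
```

The `max_stars` parameter is exposed because review platforms may use different rating scales; a rating on a ten-point scale would be passed with `max_stars=10`.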
The research indicated an association between the temporal movement of price plots and the evaluations of a product over five discrete time intervals. This implies that the introduction of a highly anticipated product, together with a significant volume of evaluations and responses on credible platforms, has the potential to influence the volatility of the associated company’s stock price. To make the study more robust, it is advisable to incorporate a variety of datasets from reputable platforms, which would increase the number of assessments available and yield more reliable outcomes. Furthermore, applying this research to multiple companies in different industries could help identify the probable relationship between the timing of a product’s release, the response of the target audience, and the fluctuation of the company’s stock price. Such work could provide empirical evidence for the proposition that substantial responses and evaluations on social media platforms have the potential to impact the volatility of stock values.