2. Materials and Methods
This research adopts a systematic approach to evaluate the influence of keyword usage on the efficiency of fishery product campaigns through YouTube, emphasizing supply chain marketing strategies [35]. The research follows a three-stage process: sample selection, data collection, and analytical procedures, including both summary-level statistics (descriptive) and linear regression modeling using the R software (version 4.1.0). This approach ensures a comprehensive understanding of the connection between features of YouTube videos and how viewers interact with them.
The sample selection focuses on identifying YouTube channels that provide reliable fishery-related content, ensuring that the data accurately represent the seafood supply chain marketing landscape. The selection process involved targeting channels associated with sustainable fisheries, particularly those endorsed by recognized non-profit organizations such as FoodTank [36], which highlights organizations contributing to sustainable fisheries and responsible seafood marketing. To achieve a representative dataset, sixteen fishery-related organizations with an active YouTube presence and significant engagement in seafood supply chain awareness campaigns were selected [37]. These organizations have Marine Stewardship Council (MSC) certifications or actively promote sustainable fishing practices, aligning with global seafood sustainability efforts.
Data collection was conducted using VIDIQ [38] software, a widely used YouTube analytics tool that evaluates content performance by extracting video metadata, audience interaction metrics, and keyword effectiveness. The dataset included all videos published by the selected YouTube channels, ensuring that the analysis captures longitudinal trends in seafood marketing [6]. The primary dependent variable in this study is total video views (VIEWS), serving as a key performance indicator (KPI) of audience engagement. Additional independent variables include video duration, title word count, description word count, and interaction metrics (e.g., likes and shares). These metrics allow for an in-depth assessment of YouTube marketing strategies and their effectiveness in promoting sustainable fisheries.
The final stage of the methodology involves statistical analysis using R software, focusing on descriptive statistics and linear regression models. Descriptive statistics summarize the dataset’s characteristics, such as the distribution of video attributes, engagement metrics, and keyword frequency. Linear regression models examine the relationship between independent variables (e.g., keyword usage, video duration, engagement metrics) and dependent variables (e.g., video views, likes). This approach identifies the statistical significance of specific video characteristics in influencing audience engagement, providing insights into how keyword optimization affects YouTube’s algorithmic ranking and content visibility. The analysis also accounts for multicollinearity and model validation, ensuring the robustness of the results.
By integrating sample selection, data gathering, and statistical modeling, this research adds to the wider discourse on digital marketing in the seafood supply chain, offering empirical insights for content creators, marketers, and policymakers in sustainable fisheries and responsible seafood consumption campaigns.
2.2. Data Collection
Within this framework, data were gathered using the VIDIQ software [38], which is designed to provide metrics for evaluating and exploiting videos, whether by the producers of each channel or, more generally, by the platform.
No restrictions were used for the selection of videos; all videos published by each channel up to the reporting date were included in the sample.
Dependent variables are those that are hypothesized to change in response to changes in the independent variables. In other words, these are the variables that describe the phenomenon the researcher is interested in studying.
In this paper, the focus of interest is on understanding the pattern of total views of a YouTube video; therefore, the primary dependent variable of the research is the number of views of each video in the population, as described in the previous section. For easy reference and also for inclusion in the relevant algebraic relationships below, we call this dependent variable VIEWS. We note that this variable is not accompanied by a unit of measurement; it is a pure number. In parallel with the dependent variable VIEWS, we create the variable LOG(VIEWS), which is obtained as the natural logarithm of the VIEWS variable for each particular video in the population. That is:
$$\mathrm{LOG(VIEWS)} = \ln(\mathrm{VIEWS})$$
The transformation of this dependent variable is necessary because, as will be shown below, it converts the quantitative regression model from a multiplicative to an additive one, which facilitates both the analysis and the interpretation of the resulting conclusions.
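The multiplicative-to-additive step can be sketched with a short derivation. The notation here is an assumption made for illustration: a constant $a$, coefficients $\beta_i$, explanatory variables $x_i$, and an error term $e$, with each factor assumed to act on VIEWS multiplicatively. Taking the natural logarithm of both sides turns the product into a sum:

$$\mathrm{VIEWS} = \exp(a)\cdot\prod_{i}\exp(\beta_i x_i)\cdot\exp(e) \quad\Longrightarrow\quad \ln(\mathrm{VIEWS}) = a + \sum_{i}\beta_i x_i + e$$

Under this assumption, the right-hand side is an ordinary linear regression in LOG(VIEWS).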
In addition to the views of each video in the population, this paper also considers the prediction of potential user interaction with these videos. To this end, the number of users who have expressed a positive attitude towards each video (“Likes”) is included in the study as a second primary dependent variable. Accordingly, for easy reference and inclusion in the relevant algebraic relationships below, we call this dependent variable YTLIKES. Like the VIEWS variable, the YTLIKES variable is not accompanied by a unit of measurement, being a pure number.
It should be noted that the definition of the above variables as dependent variables relates to the specific context of the present study and describes their use in the relevant models. Within a different methodological/research framework, these variables could be independent, potentially facilitating the investigation of ‘downstream’ phenomena, such as the wider acceptance of, and interaction with, these channels on social media, a question that is outside the current research framework.
However, remaining within the present research framework, it is easy to see that the number of users expressing a positive attitude towards each video (YTLIKES) presupposes the existence of users to whom each specific video has been shown (VIEWS). Consequently, the variable VIEWS will also be used as an independent variable when modeling the variable YTLIKES.
In conclusion, in the context of the present study, two primary dependent variables, VIEWS and YTLIKES, are used, and the (dependent) transformed variable LOG(VIEWS), the natural logarithm of VIEWS, is additionally created.
In contrast to dependent variables, independent variables are used as (potential) determinants of the phenomenon under study or, more practically, of the dependent variables. In other words, the aim of the research is to examine whether changes in the independent variables (including the presence or absence of an item) lead to or explain (or at least correlate with) changes in the dependent variables.
Given the information that can be efficiently collected from YouTube, the independent variables in this study are presented in Table 1.
The function used for duration (seconds) is presented below:
In conclusion, 8 primary independent variables were created in the context of this research, as outlined above. To address the specific characteristics of fishery-related YouTube videos, the chosen independent variables—such as title word count, keyword presence, and video duration—were selected not only for their algorithmic relevance but also for their strategic applicability in communicating sustainability and traceability in the seafood supply chain. These structural characteristics reflect how fishery organizations shape narratives, engage consumers, and position products within global digital ecosystems. For example, longer descriptions may allow space for sustainability certifications or traceability information, which are vital for ethical seafood consumption. Similarly, the use of targeted keywords and metadata helps align video visibility with consumer search behaviors related to fishery topics. This tailored approach ensures the structural variables used in the model align with domain-specific communication strategies rather than generic video features.
2.3. Stages of Analysis
Having described the independent, dependent, and auxiliary variables used in the analysis, this section outlines the steps involved in conducting it. The process can be divided into two distinct stages. In Stage A, the effects of macro-characteristics, such as the age of each video or the number of words in the title or description, are evaluated. In Stage B, if statistically and managerially significant effects are identified, the analysis then focuses on evaluating the impact of specific keywords.
2.3.2. Keywords’ Impact Analysis
Having examined the relationship between macro-features and views, in Stage B, we move on to examine the relationship between the presence (or absence) of specific keywords and views. Specifically, keywords are included in the model as follows:
Isolation of the words included in the titles (tokenization). For example, the (hypothetical) titles “Great fishing today!” and “What a great day for fishing on the lake” are decomposed into their constituent words and recombined into a unique word vector as follows:
“Great, fishing, today, what, a, day, for, on, the, lake”.
At the same time, the frequencies of occurrence of each word are recorded, forming the above vector as follows:
“Great (2), fishing (2), today (1), what (1), a (1), day (1), for (1), on (1), the (1), lake (1)”.
The resulting vector is filtered for (a) unimportant words and (b) words that appear only a few times. Trivial words are those that do not add ‘meaning’ to the sentence but are used as conjunctions, articles, etc. For example, in the vector above, the words ‘a’, ‘on’, and ‘the’ clearly do not add meaning for the reader, although they occur very frequently in speech. The full list of trivial words used [33] is provided in Table A1. Similarly, words that appear fewer than ten (10) times in the entire population of videos examined were excluded: although such words may describe a very specific video that enjoys particularly high (or low) levels of views, a word that appears very infrequently creates theoretical and practical problems in building a reliable (robust) and, most importantly, useful business model. Continuing the example of the vector above and applying only the trivial-word filter (no word in a two-title example can reach the ten-occurrence threshold), the vector would be formulated as follows:
“Great (2), fishing (2), today (1), day (1), lake (1)”.
It now becomes obvious that the above words can be used as categorical variables in a multiple linear regression model.
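The tokenization, counting, and filtering steps above can be sketched in a few lines. The study itself used R; the Python below is only an illustration, and the function names, the regular expression, and the short `STOP_WORDS` set are assumptions made for this sketch (the actual trivial-word list is the one in Table A1).

```python
import re
from collections import Counter

# Hypothetical stop-word list for this sketch only; the study's list is in Table A1.
STOP_WORDS = {"a", "an", "and", "for", "of", "on", "the", "what"}


def tokenize(title):
    """Lower-case a title and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", title.lower())


def keyword_counts(titles, stop_words=STOP_WORDS, min_count=10):
    """Count word occurrences over all titles, then drop trivial and rare words."""
    counts = Counter(word for title in titles for word in tokenize(title))
    return {w: n for w, n in counts.items()
            if w not in stop_words and n >= min_count}


def dummy_row(title, keywords):
    """Return 1/0 indicators marking which keywords occur in one title."""
    present = set(tokenize(title))
    return [1 if k in present else 0 for k in keywords]


titles = ["Great fishing today!", "What a great day for fishing on the lake"]
# min_count=1 disables the rarity filter so the two-title example stays visible.
counts = keyword_counts(titles, min_count=1)
keywords = sorted(counts)
rows = [dummy_row(t, keywords) for t in titles]
```

With the default `min_count=10`, the same two titles would yield no keywords at all, which is why the threshold only becomes meaningful on the full population of titles.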
In this particular case, the model used has the form:
$$\mathrm{VIEWS} = a + \sum_{i} \beta_i x_i + e$$
where,
VIEWS is the number of views of a video;
$a$ is the constant term of the model for the case where all $x_i$ equal zero (0);
$\beta_i$ is the marginal effect coefficient from the existence of the keyword $x_i$;
$x_i$ is the auxiliary variable (dummy variable) that takes the value one (1) when the specific keyword is present in the video title and zero (0) when it is not present (boolean);
$e$ is the statistical error.
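A regression on keyword dummies of this kind can be illustrated with a small, dependency-free sketch. The study itself used R; this Python function, its name `fit_ols`, and the toy data are assumptions for illustration only, and the numbers are invented rather than taken from the paper. It estimates the constant and one coefficient per keyword by solving the normal equations.

```python
def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X holds one row of 0/1 keyword dummies per video; an intercept column
    is added here. Returns [a, beta_1, ..., beta_k].
    """
    design = [[1.0] + [float(v) for v in row] for row in X]
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(design, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot_row = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot_row][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear dummies?)")
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Invented toy data: columns are dummies for two hypothetical keywords.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [1, 0], [0, 0]]
# Views constructed exactly as 100 + 50*x1 + 20*x2, so the fit is exact.
y = [100, 150, 120, 170, 150, 100]
a, beta_1, beta_2 = fit_ols(X, y)
```

In practice a statistics package would be used instead, since it also reports standard errors, the F-test, and the adjusted coefficient of determination, none of which this sketch computes.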
Some observations are important for the completeness of the analysis at this point. Although, technically, the analysis is not limited by the language of the text/title, in this research only videos whose titles are in English are investigated. This choice was made, on the one hand, so that the trivial words could be better understood and, on the other hand, so that the conclusions could be processed. Since many of the videos on YouTube have English titles, we consider that this does not limit the analysis or, above all, the applicability of any conclusions in different contexts.
The model proposed above uses residual views as the dependent variable, but the logarithm of views (LOG(VIEWS)) could also be used. In this case, the structure of the model does not change (the general assumptions of multiple linear regression apply), but the numerical values of the coefficients ($a$ and $\beta_i$) change, as does the way they are interpreted. Specifically, the terms are interpreted as follows:
For the case where all $x_i$ equal zero (0), the remaining views are described by $e^{a}$.
The effect of each keyword on the remaining views is given by the marginal coefficient of the model as the multiplicative factor $e^{\beta_i}$.
The standard significance tests of a multiple linear regression model (F-test) as well as the adjusted coefficient of determination (adjusted $R^2$) remain valid and can be applied as usual.
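The back-transformation of log-model coefficients can be shown with a brief numerical sketch. The function name and the coefficient values below are hypothetical, chosen only to show the arithmetic, and are not results from the study.

```python
import math


def interpret_log_model(a, betas):
    """Map coefficients of a LOG(VIEWS) model back to the views scale.

    Returns the baseline exp(a) (all dummies zero) and, per keyword,
    the multiplicative factor exp(beta) applied when the keyword is present.
    """
    baseline = math.exp(a)
    multipliers = {keyword: math.exp(beta) for keyword, beta in betas.items()}
    return baseline, multipliers


# Hypothetical coefficients for illustration only.
baseline, multipliers = interpret_log_model(
    a=5.0, betas={"fishing": 0.5, "lake": -0.1}
)
# Percentage change in views associated with each keyword.
percent_change = {k: (m - 1.0) * 100.0 for k, m in multipliers.items()}
```

With these invented values, a coefficient of 0.5 corresponds to a factor of about 1.65 (roughly 65% more views when the keyword is present), and a negative coefficient corresponds to a factor below one.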
4. Discussion
This research offers important perspectives on the elements that affect YouTube video performance, particularly regarding video characteristics, keyword optimization, and user engagement. By analyzing a dataset of 2882 videos across multiple channels, the study examined how video age, duration, metadata structure, and keyword selection impact viewership and interaction. This section interprets the results in light of the existing literature, assessing their alignment with prior research while highlighting potential implications for content creators and digital marketers. The confirmed hypotheses reinforce established theories on search engine optimization and audience engagement, whereas the rejected hypotheses suggest the need for a more nuanced understanding of video consumption behaviors.
The results confirm H1, demonstrating that video age positively correlates with accumulated views, aligning with prior research emphasizing that older videos benefit from sustained exposure and algorithmic recommendations over time [41,42]. The statistical analysis reveals that for every additional active day, the number of views increases, reinforcing the premise that evergreen content gains traction through continuous engagement and search engine indexing. YouTube’s algorithm prioritizes videos with proven engagement, often resurfacing older, high-performing content in recommendations [43]. However, this effect may be influenced by initial engagement metrics: videos that perform well early tend to continue gaining traction.
Contrary to expectations, H2 was not statistically confirmed, indicating that video duration alone does not significantly impact viewership. While prior studies have suggested that shorter videos might yield higher retention rates due to declining attention spans [44], the current findings suggest that factors such as content quality, engagement triggers, and audience intent may moderate this relationship [45]. YouTube’s recommendation algorithm increasingly favors longer videos with high watch time and retention, as they contribute more to overall session duration, a key ranking factor [46]. Therefore, while shorter videos may attract initial clicks, longer content that retains viewers for extended periods can perform equally well or better in terms of engagement.
The statistical results do not confirm H3 or H4, implying that title and description length do not have a direct linear influence on video viewership. While prior research suggests that concise and keyword-rich titles improve CTR [47], the current study finds no strong evidence that longer titles negatively impact views. Similarly, despite the expectation that lengthy descriptions might dilute keyword effectiveness, reducing discoverability [48], the data do not support this hypothesis. These findings suggest that optimizing video metadata requires a balance: titles should be engaging and relevant, but not necessarily short, while descriptions should be informative without overwhelming the viewer.
This study confirms H5 and H6, emphasizing the critical role of keywords in determining video visibility and engagement. The analysis demonstrates that keyword-optimized content significantly affects view counts, corroborating research on YouTube SEO and search behavior [49]. Specific keywords such as “sustainability”, “fisheries”, and “conservation” yielded higher engagement, aligning with previous findings that niche-specific, high-relevance keywords enhance discoverability [50]. Conversely, generic or highly competitive keywords may dilute a video’s ranking potential due to oversaturation. These results underscore the need for content creators to employ targeted keyword strategies, integrating trending terms while maintaining relevance to their audience.
The results confirm H7, highlighting a strong correlation between views and user interactions, particularly likes. This finding aligns with studies emphasizing engagement as a primary driver of YouTube’s recommendation algorithm [51]. The analysis shows that highly viewed videos tend to receive more likes, reinforcing prior research on social proof and engagement metrics [52]. Videos that actively encourage audience interaction (through calls to action, community engagement, and high-quality content) often outperform those that rely solely on organic discovery [53]. These results highlight the necessity of fostering viewer interaction to sustain visibility and maximize reach.
5. Conclusions
This study presents meaningful theoretical insights into areas such as digital marketing, search engine optimization, and user interaction on social media. It confirms that video longevity plays a critical role in increasing views over time, aligning with previous research on algorithmic ranking factors [54]. However, the absence of a statistically significant impact of video duration, title length, and description length on viewership challenges the existing literature that emphasizes content length as a determinant of audience retention [55]. This suggests that YouTube’s recommendation system may prioritize other engagement-driven factors, such as keyword relevance and user interaction. The research also highlights the importance of keyword selection in improving video visibility, reinforcing prior studies on SEO and digital content strategies [56]. These insights expand theoretical models on content discoverability, emphasizing that keyword optimization may serve as a stronger predictor of video success than structural video attributes.
From a practical perspective, this study provides actionable insights for content creators, marketers, and organizations using YouTube as a promotional platform. The confirmation of H5 and H6 suggests that strategic keyword use in video titles significantly influences viewership, supporting best practices in digital marketing that advocate for SEO-driven content strategies [57]. Organizations in the fishery and sustainability sector can leverage this finding to optimize their video campaigns by incorporating high-impact keywords aligned with audience search behaviors. Furthermore, the confirmed relationship between views and user interactions (H7) underscores the need for engagement-driven content strategies. Given that videos with higher numbers of likes receive increased visibility and organic reach, marketers should prioritize community engagement through compelling storytelling and interactive features [58]. This aligns with broader digital marketing research emphasizing the role of engagement metrics in social media algorithm rankings [59].
Although this study offers important contributions, it also presents certain limitations that open pathways for future research. First, the sample consists of videos from a specific thematic domain (fisheries and environmental sustainability), raising concerns about the generalizability of findings to other industries. The effects of keywords and engagement metrics may vary across different content genres, such as entertainment, education, or product marketing. Future research should extend the analysis to diverse thematic categories to examine whether similar keyword-driven patterns hold across industries [60]. Moreover, the analysis relies on cross-sectional data gathered at a specific moment, limiting insights into longitudinal trends. A time-series analysis of video performance over extended periods would offer a deeper understanding of the evolution of keyword effectiveness and engagement patterns [61].
Another key limitation is the exclusion of qualitative video characteristics, such as production quality, storytelling techniques, and audience demographics. While the study confirms the statistical significance of keyword selection and engagement metrics, it does not account for content quality factors that may drive user interest [62]. For instance, videos with compelling narratives and high production value may experience higher engagement regardless of keyword optimization. Future studies should incorporate qualitative content analysis methods to explore the impact of storytelling, visual aesthetics, and brand positioning on video success. Additionally, research could examine the role of psychographic segmentation in understanding how different audience personas interact with various types of video content [63].
Looking ahead, subsequent studies should investigate the cause-and-effect dynamics between subscriber count and video views in the context of fisheries. While this study acknowledges that a channel’s subscriber base may influence video reach, the direction of causality remains unclear. A larger-scale, longitudinal dataset would help establish whether an increase in video views leads to higher subscriber growth or vice versa [64]. Moreover, further investigation into the early-life performance of videos is warranted. Since video longevity is a confirmed predictor of views, researchers should examine factors that accelerate early video traction, such as promotional strategies, cross-platform marketing, and influencer collaborations [65]. Understanding how videos gain momentum in their initial stages would be particularly beneficial for marketers seeking to optimize content distribution strategies.
To sum up, this research offers essential perspectives on the role of keyword optimization, engagement metrics, and video longevity in determining YouTube video success. While it confirms the importance of strategic keyword selection and user interactions, it also challenges traditional assumptions regarding video duration and metadata structure. These findings offer both theoretical contributions to the digital marketing literature and practical implications for content creators aiming to enhance the discoverability of fishery-related videos. However, this study’s limitations highlight the need for broader, cross-industry research incorporating qualitative and demographic factors. Future investigations should explore longitudinal video performance trends, audience segmentation, and the interplay between content quality and algorithmic visibility. By filling these research gaps, academics and professionals can further refine content marketing strategies and improve the effectiveness of digital campaigns across diverse platforms.