To Boost Sales in E-Commerce: More or More Aligned? Dual Dimensions of Hashtag Strategy and the Moderating Role of Influencers’ Adaptive Self-Presentation
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
- Alignment measure is not semantic. The Hashtag–Content Alignment Ratio is computed via exact string matching: if the hashtag text appears verbatim in the caption, it counts as aligned. This captures textual repetition, not semantic coherence. A hashtag thematically relevant to the post but phrased differently is misaligned; a redundant verbatim repeat is aligned. The construct label "semantic alignment" is not defensible given this operationalization.
- MBTI classifier is unvalidated for this context. The classifier was trained on PersonalityCafe, an English-language self-report forum. It is applied here to Korean commercial Instagram captions — a different language, register, and communicative purpose. No accuracy, precision, recall, or F1 metrics are reported on any validation set. The entire moderator variable rests on a tool whose performance in this domain is unknown.
- Robustness tables contain apparent copy-paste errors. R² values in Tables 7 and 8 (nano-influencer subsample, N=780) are identical to Tables 3 and 6 (full sample, N=932). This is statistically impossible across different samples. Several coefficients also replicate exactly. The tables require full audit and correction before the robustness evidence can be interpreted.
- J/P binarization is theoretically inconsistent. The paper theorizes self-presentation as adaptive and continuous — grounded in Fleeson's density distribution model — and operationalizes it as a binary per-event window. This collapses within-person variation into a dichotomy and contradicts the adaptive logic on which the framework is built.
- Endogeneity is unresolved. J-type classification may proxy for broader differences in influencer professionalism, campaign planning, or product selection — not just communication style. Influencer fixed effects absorb time-invariant confounders but do not address within-person variation in strategic effort correlated with the J/P measure. No instrumental variable, lagged predictor, or quasi-experimental strategy is attempted.
- Interaction effects are asserted without demonstration. The claim that hashtag frequency reverses sign for J-type influencers is stated in the text but not supported by marginal effects calculations or interaction plots. The coefficient signs support the claim directionally, but the total marginal effect at J=1 needs explicit computation and visualization for readers to evaluate its substantive magnitude.
- The nano-influencer robustness check is not a meaningful robustness check. At N=780 out of 932 total observations, the subsample represents 84% of the full dataset. This is effectively the same sample with a small number of observations removed. It provides no evidence about whether findings generalize across influencer tiers.
- The main analysis model fit is poor. Within-R² values of 0.079, 0.076, and 0.101 across the three frequency-based models indicate that hashtag frequency and its interaction with J/P explain very little of the within-influencer variation in log sales. This warrants discussion — either the model is underspecified, or frequency is a weak commercial signal regardless of moderation.
- Two citations are incorrect. Reference 43 (Cheng, Ioannou & Serafeim, 2014) examines CSR and access to debt financing; it has no bearing on adaptive performance, negotiation, or leadership effectiveness, as claimed. Reference 2 (Dereń, brand24.com) is a commercial marketing blog and should not be used to support an empirical claim in a peer-reviewed article. Both must be replaced.
- "Semantic alignment" must be redefined or remeasured. Either replace the string-matching ratio with a genuine semantic similarity measure — cosine similarity over sentence embeddings, LDA-based topic overlap, or similar — or reframe the construct explicitly as textual co-occurrence and revise all theoretical language accordingly. The current treatment is indefensible as written.
- The seven-day pre-event window lacks justification. No theoretical rationale or empirical basis is provided for this choice. A sensitivity analysis across alternative windows (e.g., 3-day, 10-day, 14-day) is necessary to confirm that findings are not an artifact of this specific aggregation decision.
- Classifier performance must be reported. At minimum: accuracy and F1-score on a held-out sample, with explicit discussion of cross-language and cross-domain validity. Without this, the J/P variable cannot be treated as a measured construct.
- Marginal effects must be reported for H3 and H4. Interaction plots with confidence bands, along with the total marginal effects of hashtag frequency at J=0 and J=1, are required. The text currently interprets the interaction qualitatively without quantitative support.
- The alignment analysis should be repositioned as primary. Relegating H2 and H4 to an "additional analysis" section misrepresents the paper's contribution. The semantic dimension is, at least theoretically, as central as frequency, and the alignment results are stronger. The current structure works against the paper's own argument.
Author Response
Dear Reviewer 1,
We sincerely thank you for the rigorous, detailed, and constructive evaluation of our manuscript. The comments addressed several of the most fundamental aspects of the study—including the operationalization of the alignment construct, the validation of the MBTI-based classifier, the design of the robustness check, the interpretation of interaction effects, and the overall positioning of the empirical analyses. We have treated each point with the seriousness it deserves and have undertaken substantial methodological, structural, and theoretical revisions across the manuscript.
Comment 1: Alignment measure is not semantic. The Hashtag–Content Alignment Ratio is computed via exact string matching — if the hashtag text appears verbatim in the caption, it counts as aligned. This captures textual repetition, not semantic coherence. A hashtag thematically relevant to the post but phrased differently is misaligned; a redundant verbatim repeat is aligned. The construct label “semantic alignment” is not defensible given this operationalization.
Response 1:
We thank the reviewer for pointing out the discrepancy between the original operationalization and the use of the term “semantic alignment.” We agree that the previous Hashtag–Content Alignment Ratio, which relied on exact string matching between hashtags and caption text, primarily captured lexical overlap rather than true semantic coherence.
In response to this concern, we substantially revised the measurement approach and re-estimated all analyses accordingly. Specifically, we replaced the original exact string-matching ratio with an embedding-based semantic similarity measure using multilingual Sentence-BERT representations (Reimers & Gurevych, 2019). For each post, semantic embeddings were generated separately for the caption text and the hashtag set, and cosine similarity between the two embeddings was calculated to capture semantic relatedness at the contextual level. This revised operationalization allows semantically related expressions to be recognized even when exact lexical repetition is absent. Consistent with this methodological revision, we also updated the construct terminology throughout the manuscript from “Hashtag–Content Alignment Ratio” to “Hashtag–Content Semantic Alignment” (or “semantic similarity” where appropriate). Importantly, all hypotheses, regression analyses, robustness checks, interaction analyses, and related interpretations were fully re-estimated using the revised semantic similarity measure, and the main findings remained substantively consistent.
Mention exactly where in the revised manuscript this change can be found – page 9, 4th paragraph, and line 409~437.
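The similarity step described in Response 1 can be illustrated with a minimal sketch. It assumes the caption and the hashtag set have already been converted to fixed-length embedding vectors (the response names multilingual Sentence-BERT; the encoding step is not shown). The function name and the toy three-dimensional vectors are hypothetical and are not taken from the manuscript.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: an all-zero embedding has no direction.
        return 0.0
    return dot / (norm_u * norm_v)


# Toy stand-ins for a caption embedding and a hashtag-set embedding (hypothetical values).
caption_vec = [1.0, 2.0, 2.0]
hashtag_vec = [2.0, 4.0, 4.0]
alignment = cosine_similarity(caption_vec, hashtag_vec)
```

Because cosine similarity depends only on direction, the two toy vectors above, which differ only in length, are treated as fully aligned. Unlike exact string matching, the score does not require any shared tokens between caption and hashtags.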
Comment 2: MBTI classifier is unvalidated for this context. The classifier was trained on PersonalityCafe, an English-language self-report forum. It is applied here to Korean commercial Instagram captions — a different language, register, and communicative purpose. No accuracy, precision, recall, or F1 metrics are reported on any validation set. The entire moderator variable rests on a tool whose performance in this domain is unknown.
Response 2:
To address concerns regarding construct validity and predictive performance, we conducted an independent supervised validation experiment using a fine-tuned DistilBERT model on the MBTI 500 dataset, a publicly available corpus of MBTI-labelled Reddit posts. The classifier achieved acceptable predictive performance on a held-out test set (Accuracy = 73.75%, Precision = 0.6898, Recall = 0.6787, F1-score = 0.6842), and these validation metrics are now reported in the revised manuscript (Table X). We further clarified in the manuscript that the J/P variable should not be interpreted as a direct psychometric measure of stable personality traits, but rather as a linguistic operationalization of self-presentational tendencies manifested through textual expression in social media contexts.
Mention exactly where in the revised manuscript this change can be found – page 10, Table 1, and line 472.
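As a quick consistency check on the metrics quoted in Response 2, the sketch below recomputes the F1-score as the harmonic mean of the reported precision and recall. Only the three reported values come from the response; the function name is hypothetical, and the check says nothing about how the authors averaged metrics across classes.

```python
def f1_from_precision_recall(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Values as reported in Response 2 for the held-out test set.
reported_precision = 0.6898
reported_recall = 0.6787
reported_f1 = 0.6842

recomputed_f1 = f1_from_precision_recall(reported_precision, reported_recall)
is_consistent = round(recomputed_f1, 4) == reported_f1
```

The recomputed value agrees with the reported F1-score to four decimal places, so the three figures are at least internally consistent.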
Comment 3: Robustness tables contain apparent copy-paste errors. R² values in Tables 7 and 8 (nano-influencer subsample, N=780) are identical to Tables 3 and 6 (full sample, N=932). This is statistically impossible across different samples. Several coefficients also replicate exactly. The tables require full audit and correction before the robustness evidence can be interpreted.
Response 3:
To address this issue, we removed the previous nano-influencer robustness analysis and replaced it with a new time-based robustness check using temporally separated subsamples. Specifically, the sample was divided into earlier campaigns (2018–2019) and later campaigns (2020), which also approximately captures changing social commerce conditions surrounding the COVID-19 period. All robustness models were fully re-estimated using the corresponding temporal subsamples, resulting in distinct coefficient estimates and R² values across periods. The revised results are now reported in Tables 7 and 8.
Mention exactly where in the revised manuscript this change can be found – page 18, 1st paragraph, and line 619~623 & page 19, 1st paragraph, and line 625~645.
Comment 4: J/P binarization is theoretically inconsistent. The paper theorizes self-presentation as adaptive and continuous — grounded in Fleeson's density distribution model — and operationalizes it as a binary per-event window. This collapses within-person variation into a dichotomy and contradicts the adaptive logic on which the framework is built.
Response 4:
We appreciate the reviewer’s insightful comment regarding the potential inconsistency between our theoretical framing of self-presentation as adaptive and continuous and our event-level binary operationalization of the J/P dimension.
Our theoretical framework does not assume that influencers possess fixed categorical personalities. Rather, following William Fleeson’s density distribution perspective, we conceptualize self-presentation as a dynamic and context-sensitive process in which personality-relevant expressions fluctuate across situations and interaction episodes. Accordingly, the purpose of our operationalization is not to classify influencers into stable personality “types,” but rather to identify the contextually dominant expressive orientation reflected in a specific promotional event.
From this perspective, binary operationalization does not deny the underlying continuity of self-presentation. Instead, it captures whether a relatively more Judging-oriented versus Perceiving-oriented presentation mode was dominant within a given influencer-event observation. We have clarified this logic in both the theoretical background and methods sections by emphasizing that adaptive self-presentation may vary continuously across situations while still exhibiting temporarily dominant expressive orientations within specific interaction episodes.
In addition, prior social media personality prediction studies have commonly operationalized MBTI-related linguistic expressions as categorical outcomes in noisy text-based environments to improve interpretability and classification stability (e.g., Al-Fallooji & Al-Azawei, 2022). Consistent with this stream of research, our binary indicator should therefore be interpreted as an event-level dominant presentation mode rather than a fixed dispositional personality classification.
Mention exactly where in the revised manuscript this change can be found – page 6, 5th paragraph, and line 269~293.
Comment 5: Endogeneity is unresolved. J-type classification may proxy for broader differences in influencer professionalism, campaign planning, or product selection — not just communication style. Influencer fixed effects absorb time-invariant confounders but do not address within-person variation in strategic effort correlated with the J/P measure. No instrumental variable, lagged predictor, or quasi-experimental strategy is attempted.
Response 5:
To address this concern, the revised manuscript has been substantially strengthened by clarifying the event-level panel structure of the data, expanding the discussion of control variables, and explicitly explaining the inclusion of both influencer fixed effects and time fixed effects. In addition, we clarified that the models incorporate a broad set of behavioral, engagement-related, and campaign-level controls to partially account for differences in influencers’ content management practices, follower engagement patterns, promotional intensity, and group-buying campaign execution.
Mention exactly where in the revised manuscript this change can be found – page 11 (all paragraphs, lines 475~498), and page 13 (3rd paragraph, lines 520~525).
Comment 6: Interaction effects are asserted without demonstration. The claim that hashtag frequency reverses sign for J-type influencers is stated in the text but not supported by marginal effects calculations or interaction plots. The coefficient signs support the claim directionally, but the total marginal effect at J=1 needs explicit computation and visualization for readers to evaluate its substantive magnitude.
Response 6:
The reviewer’s concern regarding the interpretation of interaction effects has been addressed by additionally reporting marginal effects and interaction plots with confidence intervals for both H3 and H4. Specifically, we calculated the total marginal effects of hashtag frequency and hashtag–content alignment separately for J = 0 (P-type self-presentation) and J = 1 (J-type self-presentation) using the margins command in Stata. We further visualized these effects using interaction plots with 95% confidence intervals. The results show that the marginal effect becomes substantially stronger under J-type self-presentation, supporting the proposed moderating mechanism. These additional analyses provide quantitative evidence for the interaction effects and allow readers to directly evaluate the substantive magnitude and direction of the moderation effects.
Mention exactly where in the revised manuscript this change can be found – page 16, Figure 1, and line 594 & page 18, Figure 2, and line 616.
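The total marginal effect requested by the reviewer can be sketched as follows, assuming a linear interaction specification of the form log(sales) = b_x·X + b_j·J + b_xj·(X × J) + controls, where X is hashtag frequency (or alignment) and J is the binary moderator. Under that assumption the marginal effect of X is b_x + b_xj·J, and its standard error follows from the variances and covariance of the two coefficients. All coefficient and variance values below are hypothetical placeholders, not the manuscript's estimates, and the function is an illustration rather than a reproduction of Stata's margins command.

```python
import math


def marginal_effect(b_x, b_xj, var_x, var_xj, cov_x_xj, j, z=1.96):
    """Marginal effect of X at moderator value j, with a normal-approximation CI.

    effect   = b_x + b_xj * j
    variance = Var(b_x) + j^2 * Var(b_xj) + 2 * j * Cov(b_x, b_xj)
    """
    effect = b_x + b_xj * j
    variance = var_x + (j ** 2) * var_xj + 2.0 * j * cov_x_xj
    se = math.sqrt(variance)
    return effect, effect - z * se, effect + z * se


# Hypothetical estimates chosen only to illustrate a sign reversal.
b_x, b_xj = -0.2, 0.5
var_x, var_xj, cov_x_xj = 0.01, 0.04, -0.01

effect_p, low_p, high_p = marginal_effect(b_x, b_xj, var_x, var_xj, cov_x_xj, j=0)
effect_j, low_j, high_j = marginal_effect(b_x, b_xj, var_x, var_xj, cov_x_xj, j=1)
```

With these placeholder numbers the effect is negative at J = 0 and positive at J = 1. Whether a reversal is substantively meaningful depends on the confidence interval at J = 1, which is why the interaction coefficient alone is not enough to support the claim.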
Comment 7: The nano-influencer robustness check is not a meaningful robustness check. At N=780 out of 932 total observations, the subsample represents 84% of the full dataset. This is effectively the same sample with a small number of observations removed. It provides no evidence about whether findings generalize across influencer tiers.
Response 7: (refer to Comment 3)
To address this issue, we removed the previous nano-influencer robustness analysis and replaced it with a new time-based robustness check using temporally separated subsamples. Specifically, the sample was divided into earlier campaigns (2018–2019) and later campaigns (2020), which also approximately captures changing social commerce conditions surrounding the COVID-19 period. All robustness models were fully re-estimated using the corresponding temporal subsamples, resulting in distinct coefficient estimates and R² values across periods. The revised results are now reported in Tables 7 and 8.
Mention exactly where in the revised manuscript this change can be found – page 18, 1st paragraph, and line 619~623, Table 6 & page 19, 1st paragraph, and line 625~643, Table 7.
Comment 8: The main analysis model fit is poor. Within-R² values of 0.079, 0.076, and 0.101 across the three frequency-based models indicate that hashtag frequency and its interaction with J/P explain very little of the within-influencer variation in log sales. This warrants discussion — either the model is underspecified, or frequency is a weak commercial signal regardless of moderation.
Response 8:
The revised manuscript explicitly notes that fixed-effects models in panel-data research often exhibit relatively lower explanatory power because time-invariant between-unit variation is absorbed by the fixed effects themselves (Ozili, 2023). The revised discussion further clarifies that, in social science research, models with relatively low R-squared values may still provide meaningful insights when the estimated coefficients are theoretically consistent and statistically significant. Accordingly, the interpretation of the results places greater emphasis on the statistical significance, theoretical consistency, and robustness of the estimated coefficients rather than on the absolute magnitude of the R-squared values.
Mention exactly where in the revised manuscript this change can be found – page 15 (4th paragraph, lines 583~589).
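The point about fixed effects absorbing between-unit variation can be made concrete with a small sketch of the within transformation. It assumes a single regressor and influencer fixed effects only; the data, group labels, and function names are hypothetical. Each variable is demeaned within influencer, and the within-R² is computed on the demeaned data, so large differences in average sales between influencers contribute nothing to it.

```python
from collections import defaultdict


def within_demean(values, groups):
    """Subtract each group's mean from its members (the within transformation)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def within_slope_and_r2(y, x, groups):
    """One-regressor fixed-effects slope and within-R² on demeaned data."""
    yd = within_demean(y, groups)
    xd = within_demean(x, groups)
    sxy = sum(a * b for a, b in zip(xd, yd))
    sxx = sum(a * a for a in xd)
    syy = sum(b * b for b in yd)
    slope = sxy / sxx
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(xd, yd))
    return slope, 1.0 - ss_res / syy


# Two hypothetical influencers with very different average outcomes.
groups = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [10.0, 11.5, 11.0, 1.0, 1.5, 3.0]

slope, r2_within = within_slope_and_r2(y, x, groups)
```

In this toy example the gap between influencers A and B dominates the raw variation in y, yet it is removed before the within-R² is computed. A low within-R² therefore describes only how much of the event-to-event variation inside each influencer the regressors explain.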
Comment 9: Two citations are incorrect. Reference 43 (Cheng, Ioannou & Serafeim, 2014) examines CSR and access to debt financing; it has no bearing on adaptive performance, negotiation, or leadership effectiveness, as claimed. Reference 2 (Dereń, brand24.com) is a commercial marketing blog and should not be used to support an empirical claim in a peer-reviewed article. Both must be replaced.
Response 9:
We thank the reviewer for identifying the citation inconsistencies. We have replaced the previously miscited reference (Cheng et al., 2014) with literature directly related to adaptability, leadership effectiveness, and negotiation performance. In addition, we removed the non-peer-reviewed marketing blog citation and replaced it with peer-reviewed social media and hashtag research. The revised references now more accurately support the corresponding theoretical claims.
Mention exactly where in the revised manuscript this change can be found – page 7, 2nd paragraph, and line 301~309.
Comment 10: "Semantic alignment" must be redefined or remeasured. Either replace the string-matching ratio with a genuine semantic similarity measure — cosine similarity over sentence embeddings, LDA-based topic overlap, or similar — or reframe the construct explicitly as textual co-occurrence and revise all theoretical language accordingly. The current treatment is indefensible as written.
Response 10:
We agree that the original string-matching ratio was insufficient to capture true semantic similarity between hashtags and caption content. In response, we fully redefined and remeasured the construct using an embedding-based semantic similarity approach. Specifically, instead of relying on exact lexical overlap, we employed multilingual Sentence-BERT embeddings and calculated cosine similarity between caption text embeddings and hashtag embeddings to capture contextual semantic relatedness at the sentence level. This revised approach enables semantically related expressions to be recognized even when different lexical forms are used, thereby substantially improving construct validity relative to the previous operationalization. Consistent with this revision, we updated the construct definition, measurement description, theoretical explanations, and terminology throughout the manuscript. In addition, all empirical analyses—including the main regressions, interaction models, robustness checks, and marginal effects analyses—were fully re-estimated using the revised semantic similarity measure. The revised results remained substantively consistent with the original findings, providing stronger support for the proposed theoretical relationships using a more appropriate semantic operationalization.
Mention exactly where in the revised manuscript this change can be found – page 9, 4th paragraph, and line 409~437.
Comment 11: The seven-day pre-event window lacks justification. No theoretical rationale or empirical basis is provided for this choice. A sensitivity analysis across alternative windows (e.g., 3-day, 10-day, 14-day) is necessary to confirm that findings are not an artifact of this specific aggregation decision.
Response 11:
The revised manuscript now provides additional justification for the use of the seven-day pre-event window. Specifically, the seven-day aggregation period was established based on prior studies [71–72], which suggest that promotional exposure and follower engagement in influencer group-buying campaigns tend to intensify during the week preceding the sales event. The revised text further clarifies that influencers continuously engage in promotional postings during this period and that both commercial and non-commercial posts may contribute to shaping consumer purchase behavior. Accordingly, the seven-day window is interpreted not as an arbitrary specification choice, but as a theoretically and operationally grounded aggregation period reflecting both prior literature and the actual structure of influencer group-buying campaigns.
Mention exactly where in the revised manuscript this change can be found – page 8(3rd paragraph, line 370~379)
Comment 12: Classifier performance must be reported. At minimum: accuracy and F1-score on a held-out sample, with explicit discussion of cross-language and cross-domain validity. Without this, the J/P variable cannot be treated as a measured construct.
Response 12:
We thank the reviewer for highlighting concerns regarding cross-language and cross-domain validity. To improve compatibility between the English-language MBTI classification model and the Korean Instagram caption data used in this study, Korean captions were first translated into English using the googletrans library prior to classification. The translated captions were then aggregated and processed for MBTI-based linguistic inference. In addition, we conducted a separate validation experiment using a fine-tuned DistilBERT classifier on the MBTI 500 dataset and reported held-out test set performance metrics, including accuracy and F1-score. While these steps improve the methodological rigor of the classification procedure, we acknowledge that linguistic and contextual nuances may not be perfectly preserved across translation and platform contexts. Accordingly, the J/P variable in this study should be interpreted as a linguistic operationalization of self-presentational tendencies rather than a direct psychometric measurement of stable personality traits.
Mention exactly where in the revised manuscript this change can be found – page 10, 3rd paragraph, and line 439~472.
Comment 13: Marginal effects must be reported for H3 and H4. Interaction plots with confidence bands, along with the total marginal effects of hashtag frequency at J=0 and J=1, are required. The text currently interprets the interaction qualitatively without quantitative support.
Response 13:
The reviewer’s concern regarding the interpretation of interaction effects has been addressed by additionally reporting marginal effects and interaction plots with confidence intervals for both H3 and H4. Specifically, we calculated the total marginal effects of hashtag frequency and hashtag–content alignment separately for J = 0 (P-type self-presentation) and J = 1 (J-type self-presentation) using the margins command in Stata. We further visualized these effects using interaction plots with 95% confidence intervals. The results show that the marginal effect becomes substantially stronger under J-type self-presentation, supporting the proposed moderating mechanism. These additional analyses provide quantitative evidence for the interaction effects and allow readers to directly evaluate the substantive magnitude and direction of the moderation effects.
Mention exactly where in the revised manuscript this change can be found – page 16, Figure 1, and line 594 & page 18, Figure 2, and line 616.
Comment 14: The alignment analysis should be repositioned as primary. Relegating H2 and H4 to an "additional analysis" section misrepresents the paper's contribution. The semantic dimension is, at least theoretically, as central as frequency, and the alignment results are stronger. The current structure works against the paper's own argument.
Response 14:
We fully agree with the reviewer. The previous structural choice—placing the semantic (similarity) analysis in an “Additional Analysis” section—was inconsistent with the dual-dimensional framework developed in Section 2 and understated the empirical centrality of the semantic results (β = 0.8260 for similarity vs. β = −0.1648 for frequency in our estimates). In the revised manuscript, the empirical core of the paper has therefore been restructured around two parallel primary analyses, as detailed below.
- Section headings and table captions throughout the manuscript have been updated to reflect two co-equal primary analyses:
– Section 3.3.1: Empirical Model for Primary Analysis I (Hashtag Frequency)
– Section 3.3.2: Empirical Model for Primary Analysis II (Hashtag–Posting Similarity)
– Section 4.1: Primary Analysis I: Hashtag Frequency and Sales Performance
– Section 4.2: Primary Analysis II: Hashtag–Posting Similarity and Sales Performance
– Section 5.1 / 5.2: Robustness Check for Primary Analysis I / II
– Table 4 / 5 / 6 / 7 captions and Appendix A / B titles have been revised accordingly.
- A new introductory paragraph at the head of Section 4 explicitly frames both analyses as co-equal.
Mention exactly where in the revised manuscript this change can be found – pages 13 / 14 / 15 / 16 / 17 / 18 / 19 / 20 / 25 / 26, 4th paragraph on page 14, and lines 506 / 527 / 549 / 559 / 592 / 596 / 613 / 619 / 624 / 625 / 646 / 815 / 818.
Reviewer 2 Report
Comments and Suggestions for Authors
The effects of hashtags in e-commerce have been investigated before. This study, however, examines how the two dimensions of hashtag strategy (semantic alignment and frequency) interact with influencers’ adaptive, MBTI-based expressive styles to influence sales performance in Instagram group-buying campaigns. This topic could make interesting and important theoretical and empirical contributions.
Detailed comments and suggestions follow:
- Does greater hashtag semantic alignment positively affect sales? Why? Could the effect also be negative? Sometimes more is less.
- The development of moderating effects needs to be improved with the support of theories and the associated literature.
- The statement of H3 is unclear. It must be revised so that it clearly explains the moderating effect of Self-Presentation on Hashtag Strategy.
- The sample needs to be explained in more detail, especially how the data are matched in terms of t. Is there more than one commodity or more than one brand in the content published by influencers? Do the commodities belong to the same industry? The answers might point to some very important missing control variables. Are there any other control variables specifically related to group buying?
- Why are Hypotheses 1 and 3, and Hypotheses 2 and 4, tested in two separate models? Why are H2 and H4 tested without including hashtag frequency? This might create an endogeneity problem.
- A robustness check using a different measurement of hashtag semantic alignment is encouraged, since this metric was generated by the authors, which might result in endogeneity.
- In your Discussions, you should compare your results with current hashtag research and eWOM research to clearly explain your contributions.
Author Response
Dear Reviewer 2,
We sincerely thank you for the thoughtful and constructive evaluation of our manuscript. The reviewer's comments highlighted several critical aspects of the study that warranted further clarification—including the theoretical mechanisms underlying the proposed moderating effects, the formulation of the moderation hypotheses, the transparency of the sample structure and control-variable design, and the explicit positioning of our findings against the broader hashtag and eWOM literatures. We have treated each point with the seriousness it deserves and have undertaken substantial theoretical, methodological, and discursive revisions across the manuscript.
Comment 1: Does greater hashtag semantic alignment positively affect sales? Why? Could the effect also be negative? Sometimes more is less.
Response 1:
The revised text explains that semantically aligned hashtags may enhance message fluency, reduce interpretive ambiguity, and strengthen perceived authenticity when hashtags are meaningfully connected to the core narrative and communicative intent of the post. At the same time, we also acknowledge that excessively strategic or overly optimized hashtag alignment may generate perceptions of artificiality or over-curation, thereby weakening authenticity and persuasive effectiveness.
Mention exactly where in the revised manuscript this change can be found – page 4 (1st paragraph, lines 169~172).
Comment 2: The development of moderating effects needs to be improved with the support of theories and the associated literature.
Response 2:
We substantially revised Section 2.2.2 to provide a clearer theoretical mechanism explaining why self-presentational style conditions the effectiveness of hashtag strategies.
Specifically, we strengthened the moderation logic by incorporating additional theoretical grounding from the processing fluency, regulatory fit, and cue congruence literature. The revised manuscript now explains that audiences do not interpret hashtags in isolation, but instead evaluate hashtag cues within the broader communicative context established by the influencer’s self-presentational style. Drawing on prior research, we argue that persuasive effectiveness increases when communication elements are perceived as stylistically coherent, cognitively consistent, and congruent with audience expectations, whereas stylistic incongruence may generate cognitive disfluency, perceived clutter, and reduced persuasive effectiveness.
Building on these mechanisms, we further clarified why high hashtag frequency is more likely to function effectively under Judging-oriented self-presentation, where audiences perceive dense hashtag usage as congruent with structured and goal-directed communication. In contrast, under Perceiving-oriented self-presentation, excessive hashtag use may create stylistic mismatch and greater cognitive processing burden. We also strengthened the theoretical development of H4 by elaborating how semantic hashtag–content alignment enhances narrative coherence, communicative intentionality, and cognitive integration under more structured self-presentational contexts.
To support these revisions, we added additional citations from the processing fluency, persuasion, and cue congruence literature, including Alter and Oppenheimer (2009), Lee and Aaker (2004), and Park et al. (2013).
Mention exactly where in the revised manuscript this change can be found – page 7, 2nd paragraph, and line 301~309.
Comment 3: The statement of H3 is unclear. It must be revised so that it clearly explains the moderating effect of Self-Presentation on Hashtag Strategy.
Response 3:
We thank the reviewer for this important observation. We agree that, although the development paragraph preceding H3 discussed moderation, the H3 statement itself was framed as a simple conditional effect rather than as a moderation hypothesis. An interaction hypothesis should explicitly identify the moderator, the focal relationship, the direction of moderation, and the comparison baseline. We have therefore rewritten H3 (and harmonized H4 in the same form).
Mention exactly where in the revised manuscript this change can be found – page 7 / 8, paragraph 5th / 3rd, and line 330 to 333 / 352 to 355.
Comment 4: The sample needs to be explained in more detail, especially how the data are matched in terms of t. Is there more than one commodity or more than one brand in the content published by the influencers? Do the commodities belong to the same industry? The answers might point to some very important missing control variables. Are there any other control variables specifically related to group-buying?
Response 4:
The revised manuscript substantially expands the description of the sample structure, event-level matching process, and control-variable design in Sections 3.1 and 3.2.5. Specifically, the revised text now clarifies that the unit of analysis is the repeated promotional event level and that social media activities generated during the seven-day pre-event period were aggregated and matched to each event-level sales outcome.
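The matching step described above can be illustrated with a minimal sketch. The record fields, the example values, and the window boundary convention (the seven days before the event start, excluding the start day itself) are assumptions made for illustration, not the manuscript's actual data or code.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical records; field names and values are illustrative only.
posts = [
    {"influencer": "A", "posted": date(2023, 3, 1), "hashtags": 5},
    {"influencer": "A", "posted": date(2023, 3, 6), "hashtags": 3},
    {"influencer": "A", "posted": date(2023, 3, 20), "hashtags": 9},
    {"influencer": "B", "posted": date(2023, 3, 5), "hashtags": 2},
]
events = [
    {"event_id": 1, "influencer": "A", "start": date(2023, 3, 8), "sales": 120},
    {"event_id": 2, "influencer": "B", "start": date(2023, 3, 8), "sales": 45},
]

WINDOW = timedelta(days=7)  # assumed length of the pre-event period


def aggregate_pre_event(posts, events, window=WINDOW):
    """Aggregate each influencer's posts in the pre-event window and
    attach the totals to the matching event-level sales outcome."""
    by_influencer = defaultdict(list)
    for post in posts:
        by_influencer[post["influencer"]].append(post)

    rows = []
    for event in events:
        window_start = event["start"] - window
        # Assumed convention: window_start <= posted < event start.
        in_window = [
            p for p in by_influencer[event["influencer"]]
            if window_start <= p["posted"] < event["start"]
        ]
        rows.append({
            "event_id": event["event_id"],
            "n_posts": len(in_window),
            "hashtag_count": sum(p["hashtags"] for p in in_window),
            "sales": event["sales"],
        })
    return rows


rows = aggregate_pre_event(posts, events)
```

Each resulting row corresponds to one promotional event, so repeated events by the same influencer yield repeated observations.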
The revised manuscript also provides additional explanation regarding the operational context of influencer group-buying campaigns, in which both commercial and non-commercial posts may simultaneously influence consumer purchase behavior during the promotional period. Furthermore, the sales dataset includes detailed event- and product-level information, including product names, category classifications, prices, quantities sold, and promotional campaign information.
To reduce concerns regarding omitted variable bias, the empirical models incorporate a broad set of content-related, engagement-related, and campaign-level control variables. In particular, variables such as promotion ratio, campaign incentives, and the number of transactions were included to partially account for differences in promotional intensity, campaign scale, and group-buying operational strategies. In addition, influencer fixed effects and time fixed effects were incorporated to account for unobserved time-invariant influencer heterogeneity and common temporal shocks.
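As a rough illustration of what influencer fixed effects absorb, the sketch below applies the within transformation (demeaning by influencer) to hypothetical event-level outcomes. The function name and data are assumptions for illustration. The sketch covers only the influencer dimension, not time fixed effects or the control variables.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean from its observations (within transformation)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        totals[group] += value
        counts[group] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Hypothetical outcomes: influencer B carries a constant +100 offset,
# standing in for unobserved time-invariant heterogeneity.
groups = ["A", "A", "A", "B", "B", "B"]
sales = [10.0, 12.0, 14.0, 110.0, 112.0, 114.0]

within = demean_by_group(sales, groups)
```

After demeaning, the two influencers' series coincide, which shows that a constant influencer-level difference is removed. Variation that changes from event to event within an influencer is not removed by this step.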
Mention exactly where in the revised manuscript this change can be found – page 8 (all paragraphs, line 359~388), and page 11 (all paragraphs, line 475~498).
Comment 5: Why are hypotheses 1 & 3 and hypotheses 2 & 4 tested in two separate models? Why do you test H2 & H4 without adding the hashtag frequency? This might generate an endogeneity problem.
Response 5:
We thank the reviewer for raising this important point regarding the model structure. The hypotheses were intentionally organized into two separate model groups because they examine two conceptually distinct dimensions of hashtag strategy. Specifically, H1 and H3 focus on the quantitative dimension of hashtag usage, operationalized through hashtag frequency, whereas H2 and H4 focus on the semantic quality dimension of hashtag strategy, operationalized through hashtag–content semantic alignment.
The purpose of separating these analyses was to distinguish the effects of structural hashtag intensity from the effects of semantic consistency between hashtags and post content. Although related, these constructs capture different theoretical mechanisms and were therefore analyzed in separate models to improve interpretability and reduce conceptual overlap. Importantly, the semantic alignment models additionally controlled for hashtag frequency, allowing the estimated semantic alignment effects to be interpreted above and beyond the quantitative intensity of hashtag usage.
We have clarified this distinction and the rationale for the separate model structure in the revised manuscript.
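One way to write the two model groups described above is sketched below. The notation is assumed for illustration and is not taken from the manuscript: $Y_{it}$ is the sales outcome of influencer $i$ at event $t$, $\mathit{HF}_{it}$ is hashtag frequency, $\mathit{SA}_{it}$ is hashtag–content semantic alignment, $J_{it}$ is an indicator for Judging-oriented self-presentation, $X_{it}$ is the vector of controls, and $\alpha_i$ and $\tau_t$ are influencer and time fixed effects.

```latex
\begin{align}
Y_{it} &= \beta_1 \mathit{HF}_{it} + \beta_2 J_{it}
        + \beta_3 \,(\mathit{HF}_{it} \times J_{it})
        + \gamma^{\top} X_{it} + \alpha_i + \tau_t + \varepsilon_{it}
        && \text{(H1, H3)} \\
Y_{it} &= \delta_1 \mathit{SA}_{it} + \delta_2 J_{it}
        + \delta_3 \,(\mathit{SA}_{it} \times J_{it})
        + \delta_4 \mathit{HF}_{it}
        + \theta^{\top} X_{it} + \alpha_i + \tau_t + u_{it}
        && \text{(H2, H4)}
\end{align}
```

Under this sketch, $\beta_3$ and $\delta_3$ carry the moderation hypotheses, and $\delta_4$ is the hashtag frequency control that allows $\delta_1$ and $\delta_3$ to be read net of the quantity of hashtags used.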
Mention exactly where in the revised manuscript this change can be found – page 8, 1st paragraph, and line 335~350.
Comment 6: The robustness check for a different measurement of hashtag semantic alignment is encouraged. Since this is the metrics that the authors generated which might result in endogeneity.
Response 6:
We thank the reviewer for raising this important concern regarding the potential omission of hashtag frequency in the semantic alignment models. In response, we clarified the model specification and confirmed that hashtag frequency was included as a control variable in all semantic alignment analyses. Specifically, the semantic alignment models controlled for the total number of hashtags used during the focal posting window, allowing the estimated effect of Hashtag–Content Semantic Alignment to be interpreted above and beyond the structural intensity of hashtag usage itself.
This specification helps mitigate potential endogeneity concerns arising from the possibility that posts with higher hashtag frequency may mechanically exhibit higher semantic alignment scores. By simultaneously including both hashtag frequency and semantic alignment in the regression models, the analyses isolate the semantic quality dimension of hashtag strategy from the quantitative usage dimension. The revised manuscript now explicitly clarifies this modeling strategy in both the methods and results sections.
Mention exactly where in the revised manuscript this change can be found – page 20, Table 7, and line 646.
Comment 7: In your Discussions, you should compare your results with current hashtag research and eWOM research to clearly explain your contributions.
Response 7:
We agree that an explicit comparison with the broader hashtag and eWOM literatures is essential to articulate our contribution. In the previous version, our discussion focused largely on the internal logic of our framework without sufficient dialogue with prior empirical findings. In the revised manuscript, we have added a new dedicated subsection at the head of the Discussion—Section 6.1, “Positioning the Findings within Hashtag and eWOM Research”—and have renumbered the remaining subsections accordingly. The new subsection compares our findings against three streams of literature.
Mention exactly where in the revised manuscript this change can be found – page 21, paragraph 1st to 4th, and line 658 to 690.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Good works
