Peer-Review Record

Audio Features and Crowdfunding Success: An Empirical Study Using Audio Mining

J. Theor. Appl. Electron. Commer. Res. 2024, 19(4), 3176-3196; https://doi.org/10.3390/jtaer19040154

by Miao Miao¹, Yudan Wang²,*, Jingpeng Li³, Yushi Jiang³ and Qiang Yang²,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Reviewer 4: Anonymous

Submission received: 24 August 2024 / Revised: 11 November 2024 / Accepted: 13 November 2024 / Published: 18 November 2024

(This article belongs to the Topic Interactive Marketing in the Digital Era)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you for the opportunity to review this manuscript.

Upon reviewing your manuscript, I found the study's contribution to the field of crowdfunding and audio feature analysis to be significant. However, there are areas where further clarification and elaboration would greatly enhance the quality and readability of your work. Below are specific points that require attention:

1. You mention the use of Linguistic Inquiry and Word Count (LIWC) software for calculating text content features but lack detail on how these features are used to assess the success of crowdfunding projects. It is essential to expand on this section by explaining how the textual analysis outcomes contribute to understanding project performance.

2. The methodology includes various technical steps and tools, such as the use of Laplacian filters to detect edge sharpness for video quality assessment and Google Colab notebooks for image feature recognition. For reproducibility, it is imperative that you provide detailed descriptions of the techniques employed and their implementation specifics.

3. While several controlled variables are included in the study, there is no explicit discussion on why these variables were chosen and how they influence the results. Please add an explanation regarding the rationale behind selecting these controls and discuss their roles within the crowdfunding context.

4. In the conclusion, you highlight that your study extends the research perspective on crowdfunding performance and opens new avenues for exploring the prerequisites for crowdfunding success. To strengthen this aspect, further develop the discussion by linking your findings with signal theory and other relevant theories. Address how the study's insights on leveraging audio features to enhance unstructured information could inform future research.

These recommendations aim to improve the clarity and robustness of your manuscript. Addressing them will likely increase the manuscript's impact and its contribution to the field.

Good luck with your future research!

Comments on the Quality of English Language

The text is easy to follow.

Author Response

Comments 1: You mention the use of Linguistic Inquiry and Word Count (LIWC) software for calculating text content features but lack detail on how these features are used to assess the success of crowdfunding projects. It is essential to expand on this section by explaining how the textual analysis outcomes contribute to understanding project performance.

Response 1: We greatly appreciate your suggestions. In the original text, we have already described in detail how LIWC (Linguistic Inquiry and Word Count) can be used to extract and analyze the textual content of speakers. As you suggested, however, we did indeed lack a detailed presentation of how textual content features might interfere with our study’s results. By reviewing relevant literature, we have added supplementary information to the original text to explain how the textual content features used by speakers can influence the performance of crowdfunding campaigns.

The sentence is as follows: “For instance, when project initiators employ wording that conveys excitement, charm, and positivity in their crowdfunding videos, it can significantly enhance the final funding performance.” (see page 11, line 413-415)

Reference: Anglin, A.H.; Pidduck, R.J. Choose Your Words Carefully: Harnessing the Language of Crowdfunding for Success. Bus. Horiz. 2022, 65, 43-58.

Comments 2: The methodology includes various technical steps and tools, such as the use of Laplacian filters to detect edge sharpness for video quality assessment and Google Colab notebooks for image feature recognition. For reproducibility, it is imperative that you provide detailed descriptions of the techniques employed and their implementation specifics.

Response 2: Thanks for your comments. The Laplacian operator is commonly employed in image processing for edge detection. As a second-order differential operator, the Laplacian can emphasize areas of abrupt change in an image, effectively highlighting edges. By computing the second derivative of pixel values in an image, the Laplacian filter is able to accentuate local maxima or minima within the image, aiding in the localization of edges.

In video quality assessment, edge sharpness is a critical factor in evaluating video quality. Utilizing the Laplacian filter can assist in assessing whether the details in video frames are clear and visible, serving as a reference metric for video quality. Sharper edges generally indicate higher video quality.
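The idea described above, scoring a frame by how strongly a Laplacian filter responds, can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the kernel is the common 4-neighbour Laplacian, the score is the variance of the filter response (a widely used sharpness proxy), and the two 6x6 frames are hypothetical toy inputs. In practice a library such as OpenCV would normally do the filtering.

```python
from statistics import pvariance

# 4-neighbour discrete Laplacian kernel (second-order difference in x and y).
LAPLACIAN_KERNEL = [
    [0, 1, 0],
    [1, -4, 1],
    [0, 1, 0],
]


def laplacian_response(gray):
    """Apply the 3x3 Laplacian kernel to the interior pixels of a grayscale frame."""
    height, width = len(gray), len(gray[0])
    response = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    total += LAPLACIAN_KERNEL[ky][kx] * gray[y + ky - 1][x + kx - 1]
            response.append(total)
    return response


def sharpness_score(gray):
    """Variance of the Laplacian response; larger values suggest sharper edges."""
    return pvariance(laplacian_response(gray))


# Hypothetical toy frames: a hard vertical edge versus a gradual linear ramp.
sharp_frame = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
blurry_frame = [[0, 51, 102, 153, 204, 255] for _ in range(6)]

sharp = sharpness_score(sharp_frame)
blurry = sharpness_score(blurry_frame)
print(sharp > blurry)
```

The linear ramp has zero second derivative, so its score is zero, while the hard edge produces large positive and negative responses on either side of the transition.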

Google Colab (Collaboratory) is a free cloud-based service for Jupyter notebooks that allows you to write and execute Python code. It is particularly well-suited for tasks involving machine learning and image processing. Within Google Colab, you can leverage pre-installed libraries such as OpenCV, TensorFlow, or PyTorch for identifying image features.

In the original text, we provided a brief introduction to the functionality of the two analytical tools in assessing image quality. Following your suggestion, we have expanded upon the initial description, offering a more detailed explanation of the operational specifics of Google Colab Notebook in identifying image features.

The sentence is as follows: “In Google Colab, images are processed by loading them into memory, preprocessing (e.g., resizing, normalization), applying image processing or ML techniques, and visualizing the results using libraries like OpenCV and Matplotlib.” (see page 11, line 442-445)

Reference: Carneiro, T.; Medeiros Da Nobrega, R.V.; Nepomuceno, T.; Bian, G.; De Albuquerque, V.H.C.; Filho, P.P.R. Performance Analysis of Google Colaboratory as a Tool for Accelerating Deep Learning Applications. IEEE Access. 2018, 6, 61677-61685.

Comments 3: While several controlled variables are included in the study, there is no explicit discussion on why these variables were chosen and how they influence the results. Please add an explanation regarding the rationale behind selecting these controls and discuss their roles within the crowdfunding context.

Response 3: Thank you very much for your advice. In this research, the control variables include textual content features and visual features. Textual content features encompass function words, cognitive words, emotional words, and social process words. For visual features, we selected image quality as the variable. We have already provided a detailed explanation for choosing image quality as a control variable. However, the rationale behind selecting function words, cognitive words, emotional words, and social process words was relatively brief. Therefore, following your suggestion, we have individually explained the reasons and underlying principles for choosing these four textual content features. Additionally, we have supplemented the potential roles these control variables may play in the crowdfunding process.

The sentences are as follows: “Firstly, functional words, typically encompassing elements that facilitate the construction of sentence structures such as prepositions, conjunctions, and auxiliaries, play a pivotal role. Despite conveying minimal lexical information individually, functional words are indispensable for facilitating the listener's comprehension of both the syntactic organization and the logical interrelations within utterances. Secondly, cognitive words, including nouns and verbs, convey the core message of the discourse directly. The selection of appropriate cognitive words can render a speech more concrete and vivid, thereby aiding the audience in constructing understanding and facilitating the retention of the presented content. Thirdly, emotional words refer to those terms that carry strong emotional connotations, such as "love", "fear", and "anger". The judicious use of affective vocabulary can enhance the persuasiveness and resonance of a speech; however, overutilization might come across as exaggerated or insincere. Fourthly, social process words encompass lexical items that pertain to interpersonal interactions, such as pronouns like "we" and "you," along with verbs indicating requests, commands, or suggestions. The utilization of these social process terms fosters a sense of community or personal connection, thereby enhancing the interactivity and engagement within a speech.” (see page 11, line 417-433)

Reference: Mahowald, K.; Fedorenko, E.; Piantadosi, S.T.; Gibson, E. Info/Information Theory: Speakers Choose Shorter Words in Predictive Contexts. Cognition. 2013, 126, 313-318.

Pennebaker, J.W.; Francis, M.E. Cognitive, Emotional, and Language Processes in Disclosure. Cognition and Emotion. 1996, 10, 601-626.

Landis, T. Emotional Words: What's so Different From Just Words? Cortex. 2006, 42, 823-830.

Kumpulainen, K. The Nature of Peer Interaction in the Social Context Created by the Use of Word Processors. Learn Instr. 1996, 6, 243-261.
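To make the four textual control variables concrete, the sketch below computes the share of transcript words that fall into each category, which is how dictionary-based tools of this kind typically report their scores. It is an assumption-laden illustration rather than the authors' procedure: the word lists are tiny hypothetical stand-ins (the actual LIWC dictionaries are proprietary and much larger), and `category_shares` is a made-up helper name.

```python
import re

# Tiny hypothetical word lists for illustration only; they are NOT the
# LIWC dictionaries, which are proprietary and far larger.
CATEGORIES = {
    "function": {"the", "and", "of", "to", "in", "a", "is"},
    "cognitive": {"think", "know", "because", "understand"},
    "emotional": {"love", "fear", "anger", "excited"},
    "social": {"we", "you", "our", "together"},
}


def category_shares(transcript):
    """Return each category's share of all word tokens in the transcript."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    if not tokens:
        return {name: 0.0 for name in CATEGORIES}
    return {
        name: sum(token in words for token in tokens) / len(tokens)
        for name, words in CATEGORIES.items()
    }


shares = category_shares("We love this project and we think you will love it")
for name, share in shares.items():
    print(f"{name}: {share:.3f}")
```

Each share could then enter a regression as a control variable alongside the audio features of interest.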

Comments 4: In the conclusion, you highlight that your study extends the research perspective on crowdfunding performance and opens new avenues for exploring the prerequisites for crowdfunding success. To strengthen this aspect, further develop the discussion by linking your findings with signal theory and other relevant theories. Address how the study's insights on leveraging audio features to enhance unstructured information could inform future research.

Response 4: Thank you so much for your comments. We have provided a deeper supplementation in the theoretical contribution section. We emphasize the innovation of this study in enriching the application of signaling theory within research on unstructured information. Additionally, we highlight that integrating signaling theory with other theories and concepts from the field of psychology can significantly enhance both the applicability and explanatory power of signaling theory.

The sentences are as follows: “From a broader perspective, our research explores the importance of leveraging unstructured information, such as audio features, within digital environments. This provides a novel and significant viewpoint on how to transcend traditional research frameworks and investigate a wider variety of signal types and their applications across diverse scenarios. Additionally, our study integrates signaling theory with many concepts and theories related to cognitive processing in psychology, elucidating the interplay between these disciplines. This interdisciplinary application of knowledge offers a more compelling explanation of the stimuli that signaling theory can address from richer and more diverse information sources, greatly expanding its applicability.” (see page 16, line 578-586)

Reviewer 2 Report

Comments and Suggestions for Authors

Well done on your paper - it is both an interesting and original topic and you make a compelling argument.

Suggest to make limited additions / adjustments:

good abstract
row 20. This is not a correct statement, crowdfunding can have high transaction costs and high asymmetry. what makes crowdfunding different from other fundraising transactions is the direct contact between supply and demand or funding, without the presence of a financial intermediary that typically is responsible for the screening and monitoring of projects. crowdfunding is a financial innovation. Interesting to note that crowdfunding can be a mechanism that competes with or complements more traditional financial services, such as microfinance as it can target entrepreneurs. See for example: Glisovic, Jasmina and Gonzalez, Henry and Saltuk, Yasemin and de Mariz, Frederic, Volume Growth and Valuation Contraction, Global Microfinance Equity Valuation Survey 2012 (May 3, 2012). Available at SSRN: https://ssrn.com/abstract=2625221 or http://dx.doi.org/10.2139/ssrn.2625221
row 25. suggest to add the overall size of crowdfunding to balance your argument, it's a small niche in financial innovation
row 29. Consider to add a broader comment first, to emphasize the role of technology across industries, in particular in finance. this allows to connect your paper with a broader stream of research. For example: https://www.researchgate.net/publication/375866096_Fintech_for_Impact_How_Can_Financial_Innovation_Advance_Inclusion
row 48 - row 581. consider gender - would that be a predictor of success (male vs female voice), could there be a risk of prejudice integrated in the type of voice? See Prof Zuboff and Surveillance Capitalism on the risks of contributing data. interesting that you mentioned those limitations.
row 65. justify why you chose Kickstarter
literature review - well done
row 314. great chart. those don't seem to be “controlled variables”, but rather “tested variables”, correct?
what control variables are you using to test your hypothesis, such as country, sector, size of projects?
row 381. be careful with the use of “controlled variables”. in statistics, the control variables are those additional variables used in your tests to confirm/infirm your hypothesis. using the word “controlled” may confuse the reader
row 421. test: are you correlating the variables with amount raised, or with success of the campaign (with a binary outcome)? amount raised depends on the initial size of the project, and may not be the best approach
row 505. nice addition to existing literature

Author Response

Comments 1: row 20. This is not a correct statement. Crowdfunding can have high transaction costs and high asymmetry. what makes crowdfunding different from other fundraising transactions is the direct contact between supply and demand or funding, without the presence of a financial intermediary that typically is responsible for the screening and monitoring of projects. crowdfunding is a financial innovation. Interesting to note that crowdfunding can be a mechanism that competes with or complements more traditional financial services, such as microfinance as it can target entrepreneurs. See for example: Glisovic, Jasmina and Gonzalez, Henry and Saltuk, Yasemin and de Mariz, Frederic, Volume Growth and Valuation Contraction, Global Microfinance Equity Valuation Survey 2012 (May 3, 2012). Available at SSRN: https://ssrn.com/abstract=2625221 or http://dx.doi.org/10.2139/ssrn.2625221

Response 1: Thank you so much for your comments. Following your advice, we carefully read the article "Volume Growth and Valuation Contraction." Through reading and discussion, we identified certain inaccuracies in the original definition of crowdfunding. Consequently, we have revised the definition of crowdfunding, combining your guidance with insights from the referenced literature.

The sentences are as follows: “As a financial innovation, crowdfunding directly connects the supply and demand sides of funding, eliminating the need for financial intermediaries that typically screen and monitor projects in traditional fundraising transactions. Crowdfunding is perceived as a mechanism that competes with or complements more conventional financial services, such as microcredit, because it offers entrepreneurs more accessible and expedited services.” (see page 1, line 22-27)

References: Chaney, D. A Principal–Agent Perspective On Consumer Co-Production: Crowdfunding and the Redefinition of Consumer Power. Technological Forecasting & Social Change. 2019, 141, 74-84.

Glisovic, J.; González, H.; Saltuk, Y.; de Mariz, F. Volume Growth and Valuation Contraction, Global Microfinance Equity Valuation Survey 2012. Global Microfinance Equity Valuation Survey. 2012.

Comments 2: row 25. Suggest to add the overall size of crowdfunding to balance your argument, it’s a small niche in financial innovation

Response 2: Thanks for your comments. In the original text, we provided data related to the growth rate of crowdfunding. We also highlighted the significant breakthroughs in fundraising speed and success rates of crowdfunding projects in recent years. Following your suggestion, we have augmented this section with specific figures detailing the size of the crowdfunding market, thereby further emphasizing the importance of crowdfunding as a fundraising mechanism.

The sentence is as follows: “The global crowdfunding market was valued at USD 19.86 billion in 2023 and is projected to expand from USD 22.12 billion in 2024 to USD 72.88 billion by 2032.” (see page 1, line 28-30)

References: Crowdfunding Market Size to Grow USD 72.88 Billion by 2032, at a CAGR of 16.1% From 2024 to 2032. Available online: https://www.globenewswire.com/news-release/2024/08/12/2928427/0/en/Crowdfunding-Market-size-to-grow-USD-72-88-Billion-by-2032-at-a-CAGR-of-16-1-from-2024-to-2032-Polaris-Market-Research.html (accessed on 23/10/2024).
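As a quick consistency check on the market figures quoted above, the compound annual growth rate implied by the 2024 and 2032 values can be recomputed directly. The numbers come from the quoted sentence and the cited press release title; the variable names are illustrative.

```python
start_value = 22.12  # USD billion, 2024 (from the quoted sentence)
end_value = 72.88    # USD billion, 2032 (from the quoted sentence)
years = 2032 - 2024

# Growth rate implied by the two endpoints.
implied_cagr = (end_value / start_value) ** (1 / years) - 1

# Value reached when compounding the cited 16.1% rate from the 2024 figure.
projected_2032 = start_value * (1 + 0.161) ** years

print(f"implied CAGR: {implied_cagr:.2%}")
print(f"2032 value at 16.1%: {projected_2032:.2f}")
```

The implied rate rounds to the 16.1% stated in the reference title, so the quoted endpoints and the cited growth rate agree up to rounding.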

Comments 3: row 29. Consider to add a broader comment first, to emphasize the role of technology across industries, in particular in finance. this allows to connect your paper with a broader stream of research. For example: https://www.researchgate.net/publication/375866096_Fintech_for_Impact_How_Can_Financial_Innovation_Advance_Inclusion

Response 3: Thank you very much for your advice. We have read the article you recommended, "Fintech for Impact: How Can Financial Innovation Advance Inclusion," and subsequently adjusted the introduction to technological development in our original text. We adopted a broader concept, emphasizing the impetus that technological advancements bring to the financial sector. As an innovative financial model within this sector, crowdfunding, with the aid of new technologies, can convey information and signals to potential investors in more engaging and multimodal ways, thereby further facilitating the success of crowdfunding initiatives.

The sentences are as follows: “The burgeoning field of fintech has enhanced inclusivity and convenience in fundraising for individuals and small businesses. Within the innovative financial model of crowdfunding, technological advancements have enabled project initiators to evolve from traditional text and images to richer, more engaging multimodal forms of information.” (see page 1, line 35-38)

References: de Mariz, F. Fintech for Impact: How Can Financial Innovation Advance Inclusion? 2020.

Zhao, K.; Zhou, L.; Zhao, X. Multi-Modal Emotion Expression and Online Charity Crowdfunding Success. Decis. Support Syst. 2022, 163, 113842.

Comments 4: row 48 - row 58. Consider gender - would that be a predictor of success (male vs female voice), could there be a risk of prejudice integrated in the type of voice? See Prof Zuboff and Surveillance Capitalism on the risks of contributing data. interesting that you mentioned those limitations.

Response 4: Thank you so much for your comments. We highly agree with the point you raised, that voice gender can potentially impact the success rate of crowdfunding. Firstly, from a socio-cultural perspective, perceptions of male and female voices may vary across different cultural and social backgrounds. For instance, in some cultures, male voices might be perceived as symbols of authority or credibility, whereas in others, female voices may be seen as more empathetic or evocative of sympathy. These perceptual differences could indirectly influence the audience's views and trust levels towards crowdfunding projects.

Secondly, from the standpoint of communication style and persuasiveness, males and females may differ in their communication styles, which could affect their persuasiveness. Some research suggests that women tend to use more emotional expressions and storytelling techniques, whereas men might lean towards logical reasoning and data-supported arguments. These differing communication approaches could appeal to distinct types of audiences, thereby impacting the success rates of crowdfunding campaigns.

Thirdly, considering the specific types of projects, different crowdfunding project categories may require speakers with varying styles. For example, technology products or business plans might favor voices that sound professional and authoritative, while art or charity projects could benefit from voices that convey emotional expression and empathy. Thus, selecting the appropriate voice gender for the speaker can also influence the success of specific types of projects.

However, owing to technical constraints during our data collection and analysis, we were unable to include voice gender as a potential variable in this study. Therefore, we have addressed this limitation in the final section of our paper, acknowledging the constraints and suggesting avenues for future research.

The sentences are as follows: “For example, listeners might assess the credibility of speakers differently based on vocal gender, which could lead to varying levels of engagement from gender-preferential audiences across different crowdfunding initiatives. These multifaceted influences offer promising avenues for future research, enriching and expanding the application of existing theories in the field of marketing and providing more comprehensive and scientifically grounded guidance for the success rates of fundraising projects. Future experiments, without infringing upon data privacy, could utilize richer datasets through appropriate channels to conduct more accurate analyses and predictions of behavioral patterns and outcomes.” (see page 17, line 652-660)

References: Martín-Santana, J.D.; Muela-Molina, C.; Reinares-Lara, E.; Rodríguez-Guerra, M. Effectiveness of Radio Spokesperson's Gender, Vocal Pitch and Accent and the Use of Music in Radio Advertising. BRQ Business Research Quarterly. 2015, 18, 143-160.

Zuboff, S. The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power; PublicAffairs: New York, 2018.

Comments 5: row 65. Justify why you chose Kickstarter.

Response 5: Thank you so much for your comments. In the Data Selected section, we have already elaborated on the advantages of Kickstarter. Choosing Kickstarter as the source of data for crowdfunding projects was an excellent decision. As one of the largest crowdfunding platforms globally, Kickstarter offers a vast array of project cases spanning multiple domains including creativity, technology, and design, making it an ideal data source for studying crowdfunding behaviors, models, and outcomes.

Firstly, Kickstarter hosts a diverse range of projects, from artistic works to technological innovations, providing rich case studies for research. Secondly, although not all data are publicly available, Kickstarter provides a wealth of accessible information, such as project descriptions, funding targets, and fundraising progress. Thirdly, Kickstarter boasts an active community of supporters and creators, which aids in understanding user behavior and interaction patterns. Lastly, the platform features numerous successful crowdfunding examples, which can be analyzed to identify the factors contributing to project success.

Considering these points, we have made appropriate additions to the introduction section.

The sentences are as follows: “Kickstarter is a prestigious crowdfunding platform renowned for its authority and representativeness within the industry. It frequently garners extensive media coverage and serves as a supporter or partner for numerous creative and entrepreneurial endeavors.” (see page 2, line 74-77)

References: Blaseg, D.; Schulze, C.; Skiera, B. Consumer Protection On Kickstarter. Marketing Science (Providence, R.I.). 2020, 39, 211-233.

Comments 6: row 314. Great chart. those don’t seem to be “controlled variables”, but rather “tested variables”, correct?

What control variables are you using to test your hypothesis, such as country, sector, size of projects?

Response 6: Thank you so much for your comments. We apologize for the typographical error in our manuscript, where “control variables” was incorrectly written as “controlled variables”. In our study, the control for textual content features and image features was not achieved through preemptive measures but rather during the data analysis phase, where these were treated as control variables. Therefore, we have corrected all instances in the full manuscript where “controlled variables” was mistakenly used, changing it to “control variables”.

Comments 7: row 381. Be careful with the use of “controlled variables”. in statistics, the control variables are those additional variables used in your tests to confirm/infirm your hypothesis. using the word “controlled” may confuse the reader.

Response 7: Thank you so much for your comments. Consistent with the previous modification suggestion, we apologize for the typographical error. All variables mentioned in the text are “control variables”, not “controlled variables”. Therefore, we have corrected all instances where this error occurred, changing them to “control variables”.

Comments 8: row 421. Test: are you correlating the variables with amount raised, or with success of the campaign (with a binary outcome)? amount raised depends on the initial size of the project, and may not be the best approach.

Response 8: Thank you for your valuable feedback on our research. We understand your perspective and agree that associating variables with the binary success metric of crowdfunding projects could offer a different angle to evaluate the impact of vocal characteristics. Our choice to use the total amount of funds raised as the dependent variable was primarily due to data availability and resource constraints at the initial stage of our study. Specifically:

First, concerning data availability: At the outset of our research, we had access to the actual total amounts raised by each crowdfunding project. Success, typically defined as reaching or exceeding the funding goal, was a binary outcome that was not consistently documented in our dataset, making it challenging to accurately analyze using this metric.

Second, in terms of study design: This research aimed to explore the specific effects of vocal characteristics on the amount of funds raised. As a continuous variable, the total amount of funds raised provides richer information, allowing us to capture the nuanced impacts of vocal characteristics on the degree of funding obtained. In contrast, the binary success metric, while important, only indicates whether a project reached its goal, without revealing the extent to which it surpassed the target.

Third, regarding resource limitations: Given the constraints of time and manpower, working with continuous variables required fewer preprocessing steps compared to handling categorical variables, which influenced our initial design decisions.

Therefore, we have explicitly acknowledged this limitation in the discussion section and suggested that future research could consider using funding success (i.e., reaching or exceeding the funding goal) as another important dependent variable to further validate the role of vocal characteristics. Additionally, we recommend that subsequent studies, where data permits, combine the project's funding goal to investigate variations in funding outcomes.

The sentences are as follows: “Thirdly, due to limitations in data collection, our analysis focused on the impact of various vocal features on the amount of crowdfunding received, without examining whether these features affect the overall success or failure of the campaigns. Future research could provide richer supplementation in this regard. Additionally, considering that the amount of crowdfunding is influenced by the scale of the project, this represents a limitation of our current study. Subsequent research could further categorize and classify crowdfunding projects, conducting data analysis within more homogeneous sample groups to enhance the rigor of the results.” (see page 17, line 642-649)
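The reviewer's point about project size can be illustrated with a small sketch that derives two alternative outcome measures from the same raw fields: a binary success flag and a goal-normalized funding ratio. The `Campaign` class, its field names, and the two example campaigns are hypothetical and are not drawn from the study's dataset.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    goal: float     # funding target (hypothetical field name)
    pledged: float  # total amount raised (hypothetical field name)


def is_success(campaign):
    """Binary outcome: did the campaign reach or exceed its goal?"""
    return campaign.pledged >= campaign.goal


def funding_ratio(campaign):
    """Amount raised relative to the goal, which removes the effect of project size."""
    return campaign.pledged / campaign.goal


# Hypothetical examples: the large project raises more money but fails.
small = Campaign(goal=1_000, pledged=1_500)
large = Campaign(goal=100_000, pledged=50_000)

for name, c in [("small", small), ("large", large)]:
    print(name, c.pledged, is_success(c), funding_ratio(c))
```

Ranking the two campaigns by amount raised and by success gives opposite answers, which is why the choice of dependent variable matters when project sizes differ widely.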

Comments 9: row 505. nice addition to existing literature.

Response 9: Thank you so much for your comments. We have reviewed new literature and incorporated it into the original text.

References: Chen, J.; Du, M.; Yang, X. How Emotional Cues Affect the Financing Performance in Rewarded Crowdfunding? - An Insight into Multimodal Data Analysis. Electron. Commer. Res. 2024.

Moy, N.; Chan, H.F.; Torgler, B.; Chialvo, D.R. How Much is Too Much? The Effects of Information Quantity On Crowdfunding Performance. PLoS ONE. 2018, 13, e0192012.

Shneor, R.; Vik, A.A. Crowdfunding Success: A Systematic Literature Review 2010–2017. Balt. J. Manag. 2020, 15, 149-182.

Reviewer 3 Report

Comments and Suggestions for Authors

The study uses audio analysis techniques to examine data from 4,500 crowdfunding campaigns on the Kickstarter platform between 2013 and 2016, investigating the impact of audio features on crowdfunding success rates. Authors find that moderate speech rate, loudness, pitch, and emotional arousal significantly enhance crowdfunding success, while excessively high or low audio features have negative effects. This research offers practical guidance for project initiators in developing promotional strategies and for platforms in optimizing user experience.

I recommend adding to the abstract a description of both the hypotheses and the results of the paper.

This study fills a research gap in the analysis of voice mining as a precursor to crowdfunding success. The study supplements existing research findings based on surveys and experiments by using innovative machine-learning metrics to measure the voice features in real crowdfunding platform promotional videos. Also extending the application of sound data and voice mining technology to the crowdfunding literature from a methodological perspective.

Hypotheses are clearly defined in the text of the article, and all the facts from the researched area are explained at the end of the article. It also suggests future research in the area.

Author Response

Comments 1: The study uses audio analysis techniques to examine data from 4,500 crowdfunding campaigns on the Kickstarter platform between 2013 and 2016, investigating the impact of audio features on crowdfunding success rates. Authors find that moderate speech rate, loudness, pitch, and emotional arousal significantly enhance crowdfunding success, while excessively high or low audio features have negative effects. This research offers practical guidance for project initiators in developing promotional strategies and for platforms in optimizing user experience.

Response 1: Many thanks to you for the positive comments.

Comments 2: I recommend adding to the abstract a description of both the hypotheses and the results of the paper.

Response 2: Thank you so much for your comments. As you advocated, we have supplemented the abstract with descriptions of the study's hypotheses and findings.

The sentences are as follows: “Grounded in the signaling theory, we posited four hypotheses suggesting that speech rate, loudness, pitch, and emotional arousal would each exhibit an inverted U-shaped relationship with crowdfunding success rates. Through data analysis, we found that moderate levels of speech rate, loudness, pitch, and emotional arousal significantly enhanced crowdfunding success, whereas extremes in these vocal characteristics had a detrimental effect.” (see page 1, line 9-13)
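For readers who want the hypothesized shape in symbols, an inverted U-shaped relationship is commonly expressed with a quadratic term. The specification below is a generic sketch and is not necessarily the exact model estimated in the manuscript; the symbols are assumptions introduced here: $x_i$ is one vocal feature (speech rate, loudness, pitch, or emotional arousal) for campaign $i$, $\mathbf{c}_i$ is a vector of control variables, and $x^{*}$ is the turning point.

```latex
\text{Success}_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^{2} + \boldsymbol{\gamma}^{\top}\mathbf{c}_i + \varepsilon_i,
\qquad \beta_2 < 0,
\qquad x^{*} = -\frac{\beta_1}{2\beta_2}
```

Under this sketch, an inverted U requires a negative coefficient on the squared term and a turning point $x^{*}$ that lies inside the observed range of $x_i$, so that both the rising and the falling parts of the curve are supported by the data.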

Comments 3: This study fills a research gap in the analysis of voice mining as a precursor to crowdfunding success. The study supplements existing research findings based on surveys and experiments by using innovative machine-learning metrics to measure the voice features in real crowdfunding platform promotional videos. It also extends the application of sound data and voice mining technology to the crowdfunding literature from a methodological perspective.

Response 3: Many thanks to you for the positive comments.

Comments 4: Hypotheses are clearly defined in the text of the article, and all the facts from the researched area are explained at the end of the article. It also suggests future research in the area.

Response 4: Many thanks to you for the positive comments.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

The manuscript titled "Audio Features and Crowdfunding Success: An Empirical Study Using Audio Mining" explores the impact of auditory elements on crowdfunding success by analyzing the audio characteristics of various campaign videos. The findings indicate a U-shaped relationship between success and several features, such as volume and speed.

While the study addresses an intriguing topic and is well-written and structured, I believe it would benefit from further refinement. Below are my detailed suggestions for improvement.

Major Concerns

1. Hypothesis Development The distinction between Hypotheses 2 and 3 is unclear. Hypothesis 2 examines "loudness," which is related to volume, while Hypothesis 3 focuses on "pitch," which also pertains to the perceived "highness or lowness of sound" (see line 240). If there is a meaningful difference between these two concepts, it is not sufficiently highlighted in the text. I recommend either merging these hypotheses or providing a clearer explanation to distinguish them.

2. Sample: Time Period and Source The sample period (2013-2016) is somewhat outdated. While the auditory characteristics examined may not be highly time-sensitive, expanding the dataset to include more recent campaigns would enhance the generalizability of the results. Additionally, while Kickstarter is the source of the data, it is not the only relevant platform. I encourage the authors to discuss this limitation and, if possible, extend the analysis to include other crowdfunding platforms to improve the external validity of the findings.

3. Language Considerations The manuscript does not address the language used in the campaign videos. It is likely that most of the videos are in English, but it would be useful to confirm this assumption and, if applicable, discuss how videos in other languages were handled. Language could significantly influence the independent variables, such as pitch or speed, potentially affecting the results. A discussion of how language and dialect were accounted for in the analysis would be a valuable addition.

4. Campaigns The summary statistics indicate that some campaigns in the sample raised no funds (min = 0). It would be helpful to provide greater transparency regarding the distribution of these zero-fund campaigns by including a detailed table in the appendix. Additionally, I recommend conducting a robustness check that excludes unfunded campaigns to assess whether the findings hold. Campaigns that raise no funds may have unique characteristics that differentiate them systematically from those that achieve some level of success.

5. Mechanisms and Channels The paper currently lacks an investigation into the underlying mechanisms driving the U-shaped relationship between the independent variables and campaign success. What are the possible explanations for these relationships? Conducting tests to explore these mechanisms would enhance the depth of the analysis.

6. Cultural Factors The study does not account for cultural or linguistic differences, which could influence how sound cues are perceived by individuals from different backgrounds. While it may be challenging to identify the cultural or linguistic attributes of backers, the authors could focus on the country of origin of the companies to address this issue. I suggest re-running the analysis while excluding firms from (i) each country individually, (ii) each language group, and (iii) each cultural group, to test the robustness of the findings. The authors may find Melitz and Toubal (2014) helpful for language categorizations and Hofstede (2011) for cultural characteristics. Additionally, investigating heterogeneity in the results based on similar cultural or linguistic groups would strengthen the paper.

Minor Concerns

The term "Controlled variables" should be replaced with "Control variables" (see line 382).
The summary statistics table would benefit from additional information, including the number of observations and the median value for each variable. Additionally, the manuscript does not provide sufficient details about how the authors assessed the distributional form of the data. I recommend including graphical evidence of these distributions in the appendix.
The estimated coefficients for all control variables should be introduced in Tables 3 and 4. Furthermore, the authors should discuss whether the signs and significance levels of these variables align with previous literature or theoretical expectations.

Author Response

Response 1: Thank you so much for your comments. We understand your questions regarding the distinction between Hypothesis 2 and Hypothesis 3 and value your feedback. To clarify the difference between these concepts, we have provided a detailed description at the beginning of Hypothesis 3 in the revised version. Below is a brief explanation of the distinction between loudness and pitch:

Loudness refers to the perceived strength of sound, primarily determined by the amplitude of sound waves. Higher loudness indicates that the sound is perceived as stronger. In human perception, loudness is a nonlinear characteristic, meaning that even though changes in sound wave amplitude may be linear, the perceived changes in loudness can be nonlinear.

Pitch refers to the perception of the frequency of sound, usually associated with the frequency of sound waves. High-frequency sounds are perceived as having a higher pitch, while low-frequency sounds are perceived as having a lower pitch. Although pitch is directly related to frequency, the perceived changes in pitch are not always proportional to changes in frequency.

While both loudness and pitch are dimensions describing sound characteristics, they involve different physical properties of sound (amplitude and frequency, respectively). Therefore, in our study, we examined the impact of loudness and pitch on the success rate of crowdfunding separately, anticipating that these two sound characteristics might influence listeners' perceptions and responses through different mechanisms.
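The distinction above can be illustrated with synthetic tones. The following is a minimal sketch, not the feature-extraction pipeline used in the manuscript: it assumes pure sine waves, uses root-mean-square (RMS) amplitude as a physical stand-in for loudness, and uses a crude zero-crossing count as a stand-in for pitch. The function names and parameter values are hypothetical.

```python
import math


def sine_tone(freq_hz, amplitude, sample_rate=8000, duration_s=1.0):
    """Generate a pure sine tone as a list of samples (illustrative only)."""
    n = int(sample_rate * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]


def rms(samples):
    """Root-mean-square amplitude: a physical correlate of loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_freq(samples, sample_rate=8000):
    """Rough frequency estimate from sign changes: a crude correlate of pitch."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    duration_s = len(samples) / sample_rate
    # A sine wave crosses zero twice per cycle.
    return crossings / (2 * duration_s)


if __name__ == "__main__":
    quiet_low = sine_tone(200, 0.2)   # low pitch, low amplitude
    loud_low = sine_tone(200, 0.8)    # same pitch, four times the amplitude
    quiet_high = sine_tone(400, 0.2)  # same amplitude, double the frequency

    for name, tone in [("quiet_low", quiet_low),
                       ("loud_low", loud_low),
                       ("quiet_high", quiet_high)]:
        print(name, round(rms(tone), 3), round(zero_crossing_freq(tone), 1))
```

In this sketch, raising the amplitude changes the RMS value but leaves the estimated frequency unchanged, and raising the frequency does the reverse, which is why the two features can be entered as separate regressors. Perceived loudness and pitch in real speech are more complex than these two proxies.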

To further clarify the distinctions between these hypotheses, we have added more detailed explanations in the relevant sections of the text, ensuring that readers can clearly understand why we separately test the potential nonlinear effects of loudness and pitch on the success rate of crowdfunding campaigns.

The sentence is as follows: “Unlike loudness, which pertains to the strength of sound (amplitude of sound waves), pitch focuses on the highness or lowness of sound (frequency of the sound).” (see page 7, line 258-259)

Reference: Fletcher, H. Loudness, Pitch and the Timbre of Musical Tones and their Relation to the Intensity, the Frequency and the Overtone Structure. The Journal of the Acoustical Society of America. 1934, 6, 59-69.

Comments 2: Time Period and Source The sample period (2013-2016) is somewhat outdated. While the auditory characteristics examined may not be highly time-sensitive, expanding the dataset to include more recent campaigns would enhance the generalizability of the results. Additionally, while Kickstarter is the source of the data, it is not the only relevant platform. I encourage the authors to discuss this limitation and, if possible, extend the analysis to include other crowdfunding platforms to improve the external validity of the findings.

Response 2: Thank you for your insightful comment on the time period of our sample. We acknowledge the importance of using up-to-date data to enhance the generalizability of our results. To address this, we have expanded our dataset to include more recent crowdfunding campaigns from January 2023 to June 2023. This additional analysis is part of our robustness checks, described in Section 4.5 of the manuscript. Specifically, lines 506 to 524 detail our methodology and findings, ensuring that our auditory characteristic analysis remains relevant and reliable. We appreciate your suggestion, which has helped improve the applicability and strength of our study.

We take very seriously your mention of the limitations involved in using data exclusively from the Kickstarter platform. Here is an elaboration on the rationale for choosing Kickstarter as our data source and the challenges faced when extending the analysis to multiple platforms:

Firstly, Kickstarter, being one of the earliest crowdfunding platforms, has accumulated a substantial amount of historical data. This data, gathered over a long period, is relatively mature and stable, providing a sufficient sample size for research purposes. Secondly, projects on the Kickstarter platform follow a consistent format and set of standards, which helps reduce variable confusion caused by rule differences across platforms, making our research findings more reliable. Additionally, Kickstarter enjoys a high level of recognition and credibility among users, implying that projects hosted on this platform possess good representativeness, reflecting the general situation in the crowdfunding domain.

However, as you pointed out, collecting data from a single platform does introduce certain limitations, especially concerning the external validity of our findings. We have addressed this limitation in the discussion section, suggesting that future research could consider a multi-platform comprehensive analysis to gain a more holistic understanding of how vocal characteristics impact crowdfunding success across different platform contexts.

The sentence is as follows: “Firstly, given that all the data collected for this study originated from the Kickstarter platform, there are inherent limitations to the generalizability of the findings when applied to other crowdfunding platforms. Future research can collect data from a broader range of crowdfunding platforms to further expand this study.” (see page 17, line 652-657)

Comments 3: Language Considerations The manuscript does not address the language used in the campaign videos. It is likely that most of the videos are in English, but it would be useful to confirm this assumption and, if applicable, discuss how videos in other languages were handled. Language could significantly influence the independent variables, such as pitch or speed, potentially affecting the results. A discussion of how language and dialect were accounted for in the analysis would be a valuable addition.

Response 3: Thank you so much for your comment; we agree with it. First, during the data selection process, we confirmed that all crowdfunding videos used for analysis were in English. This information has been added to the data selection section of the paper to ensure that readers understand the linguistic context of our study. The sentence is as follows: “In all crowdfunding videos under examination, the project initiators used the English language.” (see page 9, line 354-355)

Second, regarding the impact of accents and dialects on vocal characteristics such as pitch or loudness, we recognize this as a complex and intriguing topic. However, due to the limitations of current technology and resources, we were unable to measure and statistically account for accents and dialects in the videos scientifically and reasonably. Consequently, we have detailed this as a limitation in the discussion section and suggested that future research could explore this direction to gain a more comprehensive understanding of how different accents and dialects affect the success rate of crowdfunding campaigns.

Third, selecting multilingual crowdfunding videos is indeed a factor worth considering because different languages can influence vocal characteristics. For example, different languages have distinct phonetic characteristics that might affect the manifestation of vocal traits such as pitch and loudness. The cultural connotations carried by different languages could influence listeners' perceptions and reactions. Moreover, the grammatical structures and vocabulary of different languages might impact the cognitive processing of listeners, thereby affecting their interpretation and evaluation of the speaker's vocal characteristics. To eliminate additional variables brought by language differences, this study controlled for the language variable by selecting English-language videos.

The sentence is as follows: “Lastly, due to the limitations of current audio analysis methods, our study did not comprehensively cover known voice features such as language, accent, sound quality, gender, and aesthetics, among others.” (see page 17, line 650-652)

Reference: Melitz, J.; Toubal, F. Native Language, Spoken Language, Translation and Trade. J. Int. Econ. 2014, 93, 351-363.

Comments 4: Campaigns The summary statistics indicate that some campaigns in the sample raised no funds (min = 0). It would be helpful to provide greater transparency regarding the distribution of these zero-fund campaigns by including a detailed table in the appendix. Additionally, I recommend conducting a robustness check that excludes unfunded campaigns to assess whether the findings hold. Campaigns that raise no funds may have unique characteristics that differentiate them systematically from those that achieve some level of success.

Response 4: Thank you for your thoughtful comment. To address your concerns regarding the zero-fund campaigns, we have provided a detailed table in the appendix which outlines the distribution of these campaigns. Additionally, we performed a robustness check by excluding the unfunded campaigns to ensure the reliability of our findings. The results confirmed that our conclusions remain consistent and robust. For a detailed description of these modifications and findings, please refer to Section 4.5 of the manuscript, specifically lines 506 to 524. We appreciate your valuable suggestions, as they have significantly enhanced the transparency and robustness of our study.

Comments 5: Mechanisms and Channels The paper currently lacks an investigation into the underlying mechanisms driving the U-shaped relationship between the independent variables and campaign success. What are the possible explanations for these relationships? Conducting tests to explore these mechanisms would enhance the depth of the analysis.

Response 5: Thank you for your attention to our research work and for providing valuable suggestions. Your point about the lack of exploration into the underlying mechanisms driving the relationship between our independent variables and campaign success is indeed very important and valuable.

Our research aims to explore the relationship between the vocal characteristics of speakers in crowdfunding videos (such as speaking rate, loudness, pitch, and emotional arousal) and the success of crowdfunding campaigns using real-world data. This approach focuses on observational studies designed to uncover associations between these variables in real-world settings. However, as you correctly pointed out, to deeply understand the mechanisms behind these relationships, experimental designs may be necessary.

Given that our research design primarily relies on observational data collected under natural conditions, we did not include an experimental component. This is because controlling for all other variables that might affect the success of crowdfunding campaigns is extremely challenging under natural conditions. Therefore, exploring the underlying mechanisms is more aligned with future research directions, especially under experimental conditions where isolating and testing causal relationships between different variables can be better achieved.

Nonetheless, in our study, we proposed an explanatory framework based on signaling theory, suggesting that the vocal characteristics of speakers might influence investors' cognitive and emotional responses through information transmission and emotional arousal. Specifically, the speaking rate, loudness, and pitch of the speaker can be seen as signals of information strength and credibility, which can affect investors' judgments about the authenticity of the project. Emotional arousal levels reflect the intensity of emotions expressed by the speaker, capable of resonating emotionally with potential backers and influencing their decision-making behavior.

In the revised version, we have strengthened the articulation of the signaling theory-based explanatory mechanisms and emphasized that future research could verify these mechanisms through experimental methods. We believe this will help readers better understand our findings and provide guidance for subsequent research.

The sentences are as follows: “Within the framework of signaling theory, speech rate can be considered a signal that conveys attributes of the initiator, such as their level of confidence and professionalism.” (see page 5, line 187-189)

“Investors, as signal receivers, rely on the clarity and comprehensibility of the signals for processing information.” (see page 6, line 207-208)

“Loudness serves as a signal that can reflect the initiator's emotional intensity and level of engagement.” (see page 6, line 221-222)

“Appropriate loudness can serve as an effective signal, conveying the initiator's enthusiasm and sincerity towards the project, thereby influencing the investor's cognitive and emotional responses.” (see page 6, line 240-242)

“According to signaling theory, pitch can serve as an indicator of the initiator's emotional state, allowing investors to evaluate the initiator's sincerity and the authenticity of the project based on these signals.” (see page 7, line 280-282)

“Emotional arousal is a significant non-verbal signal transmitted through voice elements such as music and speech. According to signaling theory, emotional arousal can reflect the initiator's emotional state and their level of commitment to the project. Investors use these signals to assess the feasibility of the project and the credibility of the initiator.” (see page 7, line 297-301)

Comments 6: Cultural Factors The study does not account for cultural or linguistic differences, which could influence how sound cues are perceived by individuals from different backgrounds. While it may be challenging to identify the cultural or linguistic attributes of backers, the authors could focus on the country of origin of the companies to address this issue. I suggest re-running the analysis while excluding firms from (i) each country individually, (ii) each language group, and (iii) each cultural group, to test the robustness of the findings. The authors may find Melitz and Toubal (2014) helpful for language categorizations and Hofstede (2011) for cultural characteristics. Additionally, investigating heterogeneity in the results based on similar cultural or linguistic groups would strengthen the paper.

Response 6: Thank you very much for your careful guidance. Your suggestion to reanalyze the data while considering cultural and linguistic differences based on the company's country of origin would indeed help in testing the robustness of our findings and provide a broader perspective. Studies such as those by Melitz and Toubal (2014) on language classification and Hofstede (2011) on cultural dimensions are critical references for future research.

However, given the realities of our current study, implementing this suggestion poses several challenges:

Firstly, there are limitations in data acquisition. Obtaining information about the cultural or linguistic attributes of backers is a formidable task. Even when focusing solely on the country of origin of the project initiators, it remains difficult to gather enough data to represent all potential backers.

Secondly, there is a risk of insufficient sample sizes. If we were to segment the sample according to different cultural or linguistic groups, our existing dataset might yield subgroups that are too small, undermining the effectiveness of statistical analyses.

Lastly, complexity increases. Introducing more categorical variables would significantly complicate the model, requiring more resources and technical capabilities to handle and interpret the data, which exceeds the capacity of our current study.

Given these challenges, our current research design indeed falls short in adequately accounting for the influences of cultural or linguistic differences. We have explicitly pointed this out in the discussion section and listed it as one of the limitations of our study. We agree that future research could delve deeper into these factors by employing more detailed cultural and linguistic classifications to further investigate their impact on the success of crowdfunding projects.

Reference: Melitz, J.; Toubal, F. Native Language, Spoken Language, Translation and Trade. J. Int. Econ. 2014, 93, 351-363.

Minor Concerns:

Comments 1: The term "Controlled variables" should be replaced with "Control variables" (see line 382).

Response 1: Thank you so much for your comments. We apologize for the typographical error in our manuscript, where "control variables" was incorrectly written as "controlled variables." Therefore, we have corrected all instances where this error occurred, changing them to "control variables."

Comments 2: The summary statistics table would benefit from additional information, including the number of observations and the median value for each variable. Additionally, the manuscript does not provide sufficient details about how the authors assessed the distributional form of the data. I recommend including graphical evidence of these distributions in the appendix.

Response 2: Thank you for your valuable suggestions. I have updated Table 2 to include the median values for each variable, as well as the number of observations. Additionally, I have provided a detailed description of the distributional form assessment in Section 4.4 of the manuscript. I appreciate your feedback and believe these enhancements will improve the clarity and completeness of the analysis.

Comments 3: The estimated coefficients for all control variables should be introduced in Tables 3 and 4. Furthermore, the authors should discuss whether the signs and significance levels of these variables align with previous literature or theoretical expectations.

Response 3: Thank you for your insightful comments. We have made the necessary revisions by introducing the estimated coefficients for all control variables in Tables 3 and 4. Additionally, we have included a discussion on whether the signs and significance levels of these variables align with previous literature or theoretical expectations. We appreciate your feedback, as it has contributed to enhancing the depth and relevance of our study.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

My previous comments have been addressed.

Author Response

Comments 1: My previous comments have been addressed.

Response 1: Thank you very much for your valuable feedback on my manuscript. I am pleased to see that you believe my previous revisions have addressed all of your initial comments. This is a great encouragement for me, and I will continue to strive to improve the quality of my research. Once again, thank you for your support and guidance!

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

I would like to thank the Authors for addressing most of my previous concerns about their paper.

I have two further suggestions that I believe could enhance the presentation of the paper:

Time Sample

I appreciate the efforts in conducting a robustness check using data for 2023. However, the ideal approach would have been to include all available data from 2013 to 2023. Could the Authors clarify why this strategy was not pursued? If this decision was due to data availability, I suggest explicitly noting this limitation in the robustness check section. Alternatively, if the data is available, performing the robustness test with the full sample would be advisable.

Mechanisms and Channels

I understand that the Authors' approach is observational, and I appreciate their discussion of the signaling theory as the most likely mechanism. However, it may still be feasible to test this mechanism using a proxy for high versus low signals from the company behind the idea presented in the video. Potential proxies for measurement could include: (a) tenure or age of the company at the time of the video; (b) presence of filed patents; and (c) revenues relative to the sample median. By setting the median as a threshold for one or more of these variables, a binary indicator could be created, equal to 1 if the value is above the median and 0 otherwise; the analysis could then assess how the results vary with estimations on the split sample. Data for points (a) and (c), in particular, should be relatively easy to obtain. I suggest that the Authors consider this analysis to substantiate their expectations around signaling theory.
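The median-split procedure described above can be sketched as follows. This is an illustrative outline only, not an analysis performed in the manuscript: the field name `firm_age_years`, the flag name `high_signal`, and the toy records are hypothetical, and the estimation step that would be run on each subsample is deliberately left out.

```python
from statistics import median


def add_high_signal_flag(campaigns, proxy_key, flag_key="high_signal"):
    """Return copies of the records with a 0/1 flag: 1 if the proxy is above the sample median."""
    threshold = median(c[proxy_key] for c in campaigns)
    flagged = [{**c, flag_key: 1 if c[proxy_key] > threshold else 0}
               for c in campaigns]
    return flagged, threshold


def split_by_flag(campaigns, flag_key="high_signal"):
    """Split flagged records into (high-signal, low-signal) subsamples."""
    high = [c for c in campaigns if c[flag_key] == 1]
    low = [c for c in campaigns if c[flag_key] == 0]
    return high, low


if __name__ == "__main__":
    # Hypothetical toy data; real proxies would come from company records.
    campaigns = [
        {"id": "a", "firm_age_years": 1.0},
        {"id": "b", "firm_age_years": 2.5},
        {"id": "c", "firm_age_years": 4.0},
        {"id": "d", "firm_age_years": 7.0},
    ]
    flagged, threshold = add_high_signal_flag(campaigns, "firm_age_years")
    high, low = split_by_flag(flagged)
    print(threshold, [c["id"] for c in high], [c["id"] for c in low])
```

The same regression would then be estimated separately on the two subsamples and the coefficients on the vocal features compared. Records exactly at the median fall into the low group here, which is one of several reasonable conventions.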

Author Response

Comments 1: Time Sample. I appreciate the efforts in conducting a robustness check using data for 2023. However, the ideal approach would have been to include all available data from 2013 to 2023. Could the Authors clarify why this strategy was not pursued? If this decision was due to data availability, I suggest explicitly noting this limitation in the robustness check section. Alternatively, if the data is available, performing the robustness test with the full sample would be advisable.

Response 1: Thank you very much for your valuable feedback on my manuscript. We fully understand and agree with your suggestion to use all available data from 2013 to 2023 for the robustness check. However, due to the following reasons, we are currently unable to implement this approach:

1. Data Availability Constraints: Despite our best efforts to obtain as much data as possible, some key variables are missing or unreliable in the earlier years, which prevents us from acquiring complete data for the entire period from 2013 to 2023.

2. Time Constraints: Our research is nearing completion, and re-collecting and processing data for the entire period from 2013 to 2023 would require significant time and resources, potentially delaying the submission of the revised manuscript. Especially considering that the editor has given us only 5 days for revisions, it would be extremely challenging to gather such a large sample and conduct additional analyses within such a short timeframe.

3. Representativeness of the Current Sample: We have used data from 2013-2016 and 2023, which provides a representative time span and effectively captures recent trends. The robustness check conducted with this data still yields results that are highly credible and convincing.

In addition, we acknowledge the limitations of our current dataset and have highlighted the constraints it imposes on the generalizability of our findings. We strongly encourage future studies to incorporate larger-scale samples to verify these results more comprehensively (see page 17, line 671-674).

We hope these explanations are understood and appreciated. Thank you again for your thoughtful suggestions, which have been invaluable in strengthening our work.

Comments 2: Mechanisms and Channels. I understand that the Authors' approach is observational, and I appreciate their discussion of the signaling theory as the most likely mechanism. However, it may still be feasible to test this mechanism using a proxy for high versus low signals from the company behind the idea presented in the video. Potential proxies for measurement could include: (a) tenure or age of the company at the time of the video; (b) presence of filed patents; and (c) revenues relative to the sample median. By setting the median as a threshold for one or more of these variables, a binary indicator could be created, equal to 1 if the value is above the median and 0 otherwise; the analysis could then assess how the results vary with estimations on the split sample. Data for points (a) and (c), in particular, should be relatively easy to obtain. I suggest that the Authors consider this analysis to substantiate their expectations around signaling theory.

Response 2: Thank you very much for your valuable feedback on our manuscript. Your suggestion to use proxy variables for high versus low signals from the company to test the signaling theory mechanism is highly appreciated and indeed valuable. We fully agree with this approach. However, due to the challenges in data acquisition and the scope limitations of our current study, we are currently unable to implement this suggestion. While data on the company’s tenure and revenue is relatively easy to obtain, data on patent filings is more difficult to acquire as it is often scattered across different databases and requires significant time and effort to integrate and validate. Additionally, our study primarily focuses on the content of the videos and their impact on interactive marketing, and introducing additional company-specific variables might blur the focus and coherence of the research.

The sentences are as follows: “Additionally, we did not incorporate proxy variables for the strength of signals from the company, such as the company's tenure, patent filings, and revenue levels, to further test the signaling theory mechanism. Future research could address this limitation by incorporating these company-specific variables and using more comprehensive datasets to further validate and extend the conclusions of this study. Doing so would not only enhance the internal validity of the research but also provide a deeper understanding of the role of signaling theory in interactive marketing.” (see page 17, line 664-670)

Author Response File: Author Response.pdf
