Detecting Indicators for Startup Business Success: Sentiment Analysis Using Text Data Mining
Abstract
1. Introduction
2. Literature Review
2.1. UGC Analysis
2.2. Sentiment Analysis with Social Network Analysis
2.3. Textual Analysis
3. Research Questions
4. Methodology
4.1. Data Sampling
- Active Twitter profile. Profiles with no activity in the three months before they used #Startups were excluded.
- Complete profile. The Twitter user profile had both a profile photo and a cover picture.
- No retweets. Retweets of the same tweet about #startup, “start-up”, “startups”, and “start-ups” were removed (i.e., treated as duplicate content).
- Public profiles. Only public profiles and English-language tweets using #Startups were included.
- Minimum 80 characters. Tweets had to be at least 80 characters long (including spaces) and use the #Startups hashtag. Tweets without the “#” or with a malformed label such as “# startups” were therefore omitted.
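The sampling criteria above can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `Tweet` record and all of its field names are hypothetical, "three months" is approximated as 90 days, and duplicate content is detected by comparing whitespace- and case-normalized text.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Matches "#Startups" in any letter case, but not "# startups" or "#startupsweekend".
HASHTAG = re.compile(r"(?<!\w)#startups\b", re.IGNORECASE)
ACTIVITY_WINDOW = timedelta(days=90)  # assumption: "three months" taken as 90 days
MIN_LENGTH = 80                       # characters, spaces included


@dataclass
class Tweet:  # hypothetical record; field names are illustrative only
    text: str
    lang: str
    is_retweet: bool
    is_public: bool
    has_profile_photo: bool
    has_cover_picture: bool
    created_at: datetime
    previous_activity_at: Optional[datetime]  # author's last activity before this tweet


def meets_criteria(tweet: Tweet) -> bool:
    """Return True if a single tweet satisfies every sampling criterion."""
    if tweet.previous_activity_at is None:
        return False
    active = tweet.created_at - tweet.previous_activity_at <= ACTIVITY_WINDOW
    return (
        active
        and tweet.has_profile_photo
        and tweet.has_cover_picture
        and not tweet.is_retweet
        and tweet.is_public
        and tweet.lang == "en"
        and len(tweet.text) >= MIN_LENGTH
        and HASHTAG.search(tweet.text) is not None
    )


def filter_sample(tweets: List[Tweet]) -> List[Tweet]:
    """Keep qualifying tweets, dropping later copies of the same text."""
    seen = set()
    kept = []
    for tweet in tweets:
        key = " ".join(tweet.text.lower().split())
        if meets_criteria(tweet) and key not in seen:
            seen.add(key)
            kept.append(tweet)
    return kept
```

Keeping each criterion as a separate boolean term makes it easy to report how many tweets each rule removes, which is useful when documenting the final sample size.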
4.2. Topic Identification Using LDA
4.3. Sentiment Analysis
4.4. Textual Analysis
5. Results Analysis
5.1. Latent Dirichlet Allocation (LDA) Model
5.2. Topic Sentiment Identification
5.3. Textual Analysis Results
6. Discussion
7. Conclusions
7.1. Theoretical Implications
7.2. Practical Implications
Author Contributions
Funding
Conflicts of Interest
References
- Zutshi, A.; Grilo, A.; Jardim-Gonçalves, R. A dynamic agent-based modeling framework for digital business models: Applications to Facebook and a popular Portuguese online classifieds website. In Digital Enterprise Design & Management; Springer: Cham, Switzerland, 2014; pp. 105–117.
- Baum, J.A.; Silverman, B.S. Picking winners or building them? Alliance, intellectual, and human capital as selection criteria in venture financing and performance of biotechnology startups. J. Bus. Ventur. 2004, 19, 411–436.
- Saura, J.R.; Reyes-Menendez, A.; Alvarez-Alonso, C. Do online comments affect environmental management? Identifying factors related to environmental management and sustainability of hotels. Sustainability 2018, 10, 3016.
- Baum, J.A.; Calabrese, T.; Silverman, B.S. Don’t go it alone: Alliance network composition and startups’ performance in Canadian biotechnology. Strateg. Manag. J. 2000, 21, 267–294.
- Anderson, M.; Magruder, J. Learning from the crowd: Regression discontinuity estimates of the effects of an online review database. Econ. J. 2012, 122, 957–989.
- Jia, S. Leisure Motivation and Satisfaction: A Text Mining of Yoga Centres, Yoga Consumers, and Their Interactions. Sustainability 2018, 10, 4458.
- Islam, M.; Fremeth, A.; Marcus, A. Signaling by early stage startups: US government research grants and venture capital funding. J. Bus. Ventur. 2018, 33, 35–51.
- Kopera, S.; Wszendybył-Skulska, E.; Cebulak, J.; Grabowski, S. Interdisciplinarity in Tech Startups Development–Case Study of ‘Unistartapp’ Project. Found. Manag. 2018, 10, 1–10.
- Hagen, C.; Bergh, N.S.; Christensen, S. Startups Seeking Business Angel Financing-From the Entrepreneur’s Perspective. Master’s Thesis, NTNU, Trondheim, Norway, 2018.
- Taylor, B.D.; McNair, D.E. Virtual School Startups: Founder Processes in American K-12 Public Virtual Schools. Int. Rev. Res. Open Distrib. Learn. 2018, 19.
- Wouters, M.; Anderson, J.C.; Kirchberger, M. New-Technology Startups Seeking Pilot Customers: Crafting a Pair of Value Propositions. Calif. Manag. Rev. 2018, 19.
- Herráez, B.; Bustamante, D.; Saura, J.R. Information classification on social networks. Content analysis of e-commerce companies on Twitter. Rev. Espac. 2017, 38, 16.
- Saura, J.R.; Palos-Sanchez, P.R.; Correia, M.B. Digital Marketing Strategies Based on the E-Business Model: Literature Review and Future Directions. In Organizational Transformation and Managing Innovation in the Fourth Industrial Revolution; IGI Global: Hershey, PA, USA, 2019; pp. 86–103.
- Saura, J.R.; Palos-Sanchez, P.R.; Rios Martin, M.A. Attitudes to environmental factors in the tourism sector expressed in online comments: An exploratory study. Int. J. Environ. Res. Public Health 2018, 15, 553.
- Saura, J.R.; Palos-Sanchez, P.; Reyes-Menendez, A. Marketing a través de Aplicaciones Móviles de Turismo (M-Tourism). Un estudio exploratorio. Int. J. World Tourism 2017, 4, 8.
- Fukugawa, N. Is the impact of incubator’s ability on incubation performance contingent on technologies and life cycle stages of startups? Evidence from Japan. Int. Entrep. Manag. J. 2018, 14, 457–478.
- Reyes-Menendez, A.; Saura, J.R.; Alvarez-Alonso, C. Understanding #WorldEnvironmentDay User Opinions in Twitter: A Topic-Based Sentiment Analysis Approach. Int. J. Environ. Res. Public Health 2018, 15, 2537.
- Ye, Q.; Law, R.; Gu, B.; Chen, W. The influence of user-generated content on traveler behavior: An empirical investigation on the effects of e-word-of-mouth to hotel online bookings. Comput. Hum. Behav. 2011, 27, 634–639.
- Palos-Sanchez, P.; Saura, J.R.; Martin-Velicia, F. A study of the effects of Programmatic Advertising on users’ Concerns about Privacy overtime. J. Bus. Res. 2019, 96, 61–72.
- Lee, T.Y.; Bradlow, E.T. Automated marketing research using online customer reviews. J. Mark. Res. 2011, 48, 881–894.
- Büschken, J.; Allenby, G.M. Sentence-based text analysis for customer reviews. Mark. Sci. 2016, 35, 953–975.
- Hao, H.; Zhang, K.; Wang, W.; Gao, G. A tale of two countries: International comparison of online doctor reviews between China and the United States. Int. J. Med. Inform. 2017, 99, 37–44.
- Miller, M.; Banerjee, T.; Muppalla, R.; Romine, W.; Sheth, A. What are people tweeting about Zika? An exploratory study concerning symptoms, treatment, transmission, and prevention. JMIR Public Health Surveil. 2017, 3, e38.
- Liu, X.; Burns, A.C.; Hou, Y. An investigation of brand-related user-generated content on Twitter. J. Advert. 2017, 46, 236–247.
- Wang, F.; Zhai, Y. Social structure and evolvement of WeChat groups: A case study based on text mining. J. China Soc. Sci. Technol. Inform. 2016, 35, 617–629.
- Liang, Y.; Liu, Y.; Chen, C.; Jiang, Z.G. Extracting topic-sensitive content from textual documents: A hybrid topic model approach. Eng. Appl. Artif. Intell. 2018, 70, 81–91.
- Arora, A.; Fosfuri, A.; Rønde, T. Waiting for the Payday? The Market for Startups and the Timing of Entrepreneurial Exit (No. w24350); National Bureau of Economic Research: Cambridge, MA, USA, 2018.
- Bennett, D.; Yábar, D.P.B.; Saura, J.R. University Incubators May Be Socially Valuable, but How Effective Are They? A Case Study on Business Incubators at Universities. In Entrepreneurial Universities; Innovation, Technology, and Knowledge Management; Peris-Ortiz, M., Gómez, J., Merigó-Lindahl, J., Rueda-Armengot, C., Eds.; Springer: Cham, Switzerland, 2017.
- Palos-Sanchez, P.; Martin-Velicia, F.; Saura, J.R. Complexity in the Acceptance of Sustainable Search Engines on the Internet: An Analysis of Unobserved Heterogeneity with FIMIX-PLS. Complexity 2018, 1–19.
- Saura, J.R.; Palos-Sánchez, P.; Cerdá Suárez, L.M. Understanding the Digital Marketing Environment with KPIs and Web Analytics. Future Internet 2017, 9, 76.
- Hasan, A.; Moin, S.; Karim, A.; Shamshirband, S. Machine Learning-Based Sentiment Analysis for Twitter Accounts. Math. Comput. Appl. 2018, 23, 11.
- Blei, D.M. Probabilistic topic models. Commun. ACM 2012, 55, 77–84.
- Garbuio, M.; Lin, N. Artificial Intelligence as a Growth Engine for Health Care Startups: Emerging Business Models. Calif. Manag. Rev. 2018.
- Pak, A.; Paroubek, P. Twitter as a corpus for sentiment analysis and opinion mining. In Proceedings of the LREC, Valletta, Malta, 17–23 May 2010.
- Kuo, T.-T.; Hung, S.-C.; Lin, W.-S.; Peng, N.; Lin, S.-D.; Lin, W.-F. Exploiting latent information to predict diffusions of novel topics on social networks. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers-Volume 2, Association for Computational Linguistics, Jeju Island, Korea, 8–14 July 2012; pp. 344–348.
- Honeycutt, C.; Herring, S.C. Beyond microblogging: Conversation and collaboration via Twitter. In Proceedings of the 42nd Hawaii International Conference on System Sciences, Hawaii, HI, USA, 5–8 January 2009; pp. 1–10.
- Boyd, D.; Golder, S.; Lotan, G. Tweet, tweet, retweet: Conversational aspects of retweeting on twitter. In Proceedings of the IEEE 43rd Hawaii International Conference on Social Systems (HICSS), Kauai, HI, USA, 5–8 January 2010.
- Bologna, G.; Hayashi, Y. A Rule Extraction Study from SVM on Sentiment Analysis. Big Data Cognit. Comput. 2018, 2, 6.
- Vásquez, G.A.; Escamilla, E.M. Best Practice in the Use of Social Networks Marketing Strategy as in SMEs. Procedia Soc. Behav. Sci. 2014, 148, 533–542.
- Saito, K.; Nakano, R.; Kimura, M. Prediction of information diffusion probabilities for independent cascade model. In Knowledge-Based Intelligent Information and Engineering Systems; Springer: Berlin/Heidelberg, Germany, 2008; pp. 67–75.
- Jiang, B.; Liang, J.; Sha, Y.; Li, R.; Liu, W.; Ma, H.; Wang, L. Retweeting behavior prediction based on one-class collaborative filtering in social networks. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, Tuscany, Italy, 17–21 July 2016; ACM: New York, NY, USA, 2016; pp. 977–980.
- Reyes-Menendez, A.; Palos-Sanchez, P.R.; Saura, J.R.; Martin-Velicia, F. Understanding the Influence of Wireless Communications and Wi-Fi Access on Customer Loyalty: A Behavioral Model System. Wirel. Commun. Mob. Comput. 2018.
- Kwon, S. Gerontechnology: Research, Practice, and Principles in the Field of Technology and Aging; Springer Publishing Company, LLC: New York, NY, USA, 2017.
- Ramirez-Andreotta, M.; Brody, J.; Lothrop, N.; Loh, M.; Beamer, P.; Brown, P. Improving Environmental Health Literacy and Justice through Environmental Exposure Results Communication. Int. J. Environ. Res. Public Health 2016, 13, 690.
- Rosa, H.; Carvalho, J.P.; Astudillo, R.; Batista, F. Detecting user influence in twitter: Pagerank vs. katz, a case study. In Proceedings of the Seventh European Symposium on Computational Intelligence and Mathematics, Cádiz, Spain, 7–10 October 2015.
- Palomino, M.; Taylor, T.; Göker, A.; Isaacs, J.; Warber, S. The Online Dissemination of Nature–Health Concepts: Lessons from Sentiment Analysis of Social Media Relating to “Nature-Deficit Disorder”. Int. J. Environ. Res. Public Health 2016, 13, 142.
- Palos-Sanchez, P.R. Drivers and Barriers of the Cloud Computing in SMEs: The Position of the European Union. Harv. Deusto Bus. Res. 2017, 6, 116–132.
- Reyes-Menendez, A.; Saura, J.R.; Palos-Sanchez, P.; Alvarez-Garcia, J. Understanding User Behavioral Intention to adopt a Search Engine that promotes Sustainable Water Management. Symmetry 2018, 10, 584.
- Palos-Sanchez, P.; Saura, J.R.; Reyes-Menendez, A.; Esquivel, I.V. Users Acceptance of Location-Based Marketing Apps in Tourism Sector: An Exploratory Analysis. J. Spat. Organ. Dyn. 2018, 6, 258–270.
- Gosh, D.; Guha, R. What are we ‘tweeting’ about obesity? Mapping tweets with topic modeling and geographic information system. Cartogr. Geogr. Inform. Sci. 2013, 40, 90–102.
- Saura, J.R.; Reyes-Menendez, A.; Palos-Sanchez, P. Un Análisis de Sentimiento en Twitter con Machine Learning: Identificando el sentimiento sobre las ofertas de #BlackFriday. Revista Espacios 2018, 39, 16.
- Palos-Sánchez, P.R.; Arenas-Márquez, F.J.; Aguayo-Camacho, M. Determinants of Adoption of Cloud Computing Services by Small, Medium and Large Companies. J. Theor. Appl. Inf. Technol. 2017, 95.
- Palos Sánchez, P.R. Estudio organizacional del cloud computing en empresas emprendedoras. Rev. 3c Tecnol. 2017, 6.
Characteristics | [18] | [5] | [20] | [21] | [22] | [23] | [3] | [24] | This Research |
---|---|---|---|---|---|---|---|---|---|
Online Rating | √ | √ | - | √ | √ | - | - | - | - |
Comments | - | √ | √ | √ | √ | √ | √ | √ | √ |
LDA | - | - | - | √ | √ | √ | √ | √ | √ |
Social Interactions | - | - | - | √ | √ | √ | √ | √ | √ |
Topic Frequency | - | - | √ | √ | - | √ | √ | √ | √ |
Characteristics | [34] | [35] | [36] | [17] | [37] | [3] | This Research |
---|---|---|---|---|---|---|---|
Neural Connection Analysis | √ | - | √ | - | √ | - | - |
Textual Analysis | - | √ | √ | √ | √ | √ | √ |
Time | - | √ | - | - | - | - | √ |
Hashtags, URLs, or Mentions | - | √ | - | √ | - | √ | √ |
Topic | - | √ | √ | √ | - | √ | √ |
Classification of Information | √ | - | √ | √ | - | √ | √ |
Characteristics | [43] | [44] | [14] | [40] | [41] | [45] | [37] | [6] | This Research |
---|---|---|---|---|---|---|---|---|---|
Classification into Nodes | √ | √ | - | √ | √ | √ | √ | - | - |
Categorization | √ | √ | √ | √ | √ | - | - | √ | √ |
Word Count | √ | √ | √ | - | - | √ | √ | √ | √ |
Keywords | - | - | √ | - | - | √ | √ | √ | √ |
Topic Name | Topic Description |
---|---|
Business Angels | Relationship with investors or business angels to obtain financing for startups. |
Business Plans | Information about how to prepare a startup business plan adapted to the startup’s ecosystem. |
Startup Project | Information about the startup’s foundation, creation, management, and team structure. |
Startup Methodology | Lean startup method for the development of successful startups. Guidelines to structure the projects. |
Startup Incubators | Information about startup incubators or accelerators that offer acceleration and promotion programs as part of their training. |
Startup Jobs | Job profiles and job offers in startups. Specialist profiles for developers or digital marketing. |
Startup Founders | Information for startup CEOs (Chief Executive Officers) and team leaders. |
Technology-Based Startup | Startups that develop or improve the technologies on which their business model is based, seeking innovation and excellence in sustainable business processes and quality. |
Startup Geolocation | Location of startups and information about them. Main startup locations identified. |
Startup Tools | Tools that startups use to organize team management and collaboration between the startups’ team members. |
Startup Frameworks and Programming Languages | Programming languages and frameworks that are usually used in startups to develop their projects. |
Topic Name * | Words in the Topic | Sentiment |
---|---|---|
Business Angels | invest in startups, funding for startups, investor for a startup, startup capital, money for a startup, raise capital for a startup, investing in startups, angel’s startup, venture capital startups | Negative |
Business Plans | market a startup, finance a startup, write a business plan, business idea, startups businesses, business plan startup, equity | Neutral |
Startup Project | build a startup, fund a startup, startup costs, startup business, tech startups, small business startup, early stage startups | Neutral |
Startup Methodology | lean startup, education startups, startups books, startups academics reports, sustainable startup, crowd-funding startups | Positive |
Startup Incubators | startup institute, startups academy, startup hub | Neutral |
Startup Jobs | work for a startup, startup jobs, startup intern, startup hiring | Negative |
Startup Founders | leader, startup manager, entrepreneur, entrepreneurship, leadership | Positive |
Technology-Based Startup | Machine learning, IoT (Internet of Things), AI (Artificial Intelligence), Big Data, Social Network Analysis, Cryptocurrencies, Bitcoin, Digital Marketing, SEO (Search Engine Optimization), SEM (Search Engine Marketing), Social Media Optimization, Neuromarketing | Positive |
Startup Geolocation | India, Berlin, Boston, USA, UK, Israel, Dallas, China, Silicon Valley | Neutral |
Startup Tools | Slack, Trello, Google Analytics, Docker, GitHub | Positive |
Startup Frameworks and Programming Languages | PHP, Python, Java, Go, JS, Node.js, Angular.js, Django, MySQL, PostgreSQL, HTML, CSS | Negative |
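The topic-to-sentiment assignments in the table above can be grouped by label with a few lines of code. The sketch below copies the labels from that table; the function name is illustrative, and the reading that the resulting groups correspond to the N1, N2, and N3 nodes is an assumption based on which topics each of those tables lists.

```python
from collections import defaultdict
from typing import Dict, List

# Sentiment label per topic, copied from the topic/sentiment table.
TOPIC_SENTIMENT: Dict[str, str] = {
    "Business Angels": "Negative",
    "Business Plans": "Neutral",
    "Startup Project": "Neutral",
    "Startup Methodology": "Positive",
    "Startup Incubators": "Neutral",
    "Startup Jobs": "Negative",
    "Startup Founders": "Positive",
    "Technology-Based Startup": "Positive",
    "Startup Geolocation": "Neutral",
    "Startup Tools": "Positive",
    "Startup Frameworks and Programming Languages": "Negative",
}


def group_by_sentiment(topic_sentiment: Dict[str, str]) -> Dict[str, List[str]]:
    """Invert a topic -> sentiment mapping into sentiment -> list of topics."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for topic, sentiment in topic_sentiment.items():
        groups[sentiment].append(topic)
    return dict(groups)


if __name__ == "__main__":
    for sentiment, topics in group_by_sentiment(TOPIC_SENTIMENT).items():
        print(f"{sentiment}: {', '.join(topics)}")
```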
N1 (Positive Topics) | Key Factors | Weighted Percentage | Count |
---|---|---|---|
Startup Tools | Tools for management in startups are essential due to the large number of processes that are carried out at the same time. Organizational tools help startup teams to communicate better between different processes. | 2.18 | 452 |
Technology-Based Startup | Artificial Intelligence is the future for innovative startups. Machine learning is one of the technologies that leads innovation in startups. | 1.98 | 245 |
Startup Founders | The leadership of the CEO in a startup is considered key to its success. Teamwork and highly specialized profiles are key to success in a startup. | 0.97 | 212 |
Startup Methodology | The creation and development process in a startup is unique. It must include special procedures for the particular business model used by the startup. | 0.98 | 190 |
N2 (Neutral Topics) | Key Factors | Weighted Percentage | Count |
---|---|---|---|
Business Plans | Business plans in startups define the viability of the products or services offered. It is of critical importance before receiving an investment. | 2.06 | 310 |
Startup Project | Startup projects should be sustainable, exponential, and innovative. In addition, they should be based on technological breakthroughs. | 1.43 | 259 |
Startup Incubators | Startup incubators are an opportunity to start projects with the help of mentors and funding. Startup accelerators are important for small startups that need help to develop their ideas and business plans. | 1.31 | 237 |
Startup Geolocation | The location of a startup can help its success. Ecosystems and locations with many startups can help projects succeed because of the surrounding community. | 0.97 | 179 |
N3 (Negative Topics) | Key Factors | Weighted Percentage | Count |
---|---|---|---|
Startup Frameworks and Programming Languages | Although they are important for the development of startups, there may be problems when trying to find adequately specialized professionals in programming. | 2.11 | 382 |
Startup Jobs | Startup jobs are usually low quality with low salaries, although the work environment is dynamic. | 1.36 | 275 |
Business Angels | High financial charges when there is a need for investment. The return demanded is too high. | 1.22 | 254 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Saura, J.R.; Palos-Sanchez, P.; Grilo, A. Detecting Indicators for Startup Business Success: Sentiment Analysis Using Text Data Mining. Sustainability 2019, 11, 917. https://doi.org/10.3390/su11030917