Predicting Business Innovation Intention Based on Perceived Barriers: A Machine Learning Approach

Abstract: In the Industry 4.0 scenario, innovation emerges as a clear driver for the economic development of societies. This effect is particularly true for the least developed countries. Nevertheless, there is a lack of studies that analyze this phenomenon in these nations. In this context, this study aims to examine the impact of perceived barriers to innovation to predict companies' innovative intentions in an emerging economy. This study is a preliminary effort to use data mining and symmetry-based learning concepts, especially classification, to assist the identification of strategies to incentivize intention to innovate in companies. Using the decision tree classification technique, we analyzed a sample of Chilean companies (N = 5876). The sample was divided into large enterprises (LEs) and small and medium enterprises (SMEs). In the group of large companies, the barriers that most impact the intention to innovate are innovation cost, lack of demand for innovations, and lack of qualified personnel. Alternatively, in the group of small-medium companies, the barriers that most impact the intention to innovate are lack of own funds, lack of demand for innovations, and lack of information about technology. These results show how the perceptions of barriers are significant to predict the intentions of innovation in Chilean companies. Furthermore, the perceptions of these barriers are contingent on the organizational sizes. These findings contribute to understanding the effect of contingencies on innovative intention in an emerging economy.
C.R.-C. and B.H.-R.; data curation, C.R.-C.; writing - original draft preparation, C.R.-C., B.H.-R. and P.R.-C.; writing - review and editing, C.R.-C. and B.H.-R.; visualization, C.R.-C. and B.H.-R.; supervision, C.R.-C.; project administration, C.R.-C.; funding acquisition, C.R.-C. All authors have


Introduction
The fourth Industrial Revolution refers to a change in manufacturing logic toward an increasingly decentralized and self-regulating approach to value creation, a process supported by concepts and new technologies that help companies meet future production requirements [1]. Innovation is key to this process of adopting concepts and new technologies, as exemplified by the adoption of the Internet of Things as an enabler of service innovation [2]. Additionally, both application-pull and technology-push directions for Industry 4.0 are related to innovation capabilities [3]. In this scenario, innovation emerges as a driver for the economic development of societies. This effect is particularly true for the least developed countries. Given that developed nations have a high intensity of knowledge, Industry 4.0 is used in these territories as a tool for further development of the knowledge economy. In contrast, in developing countries, Industry 4.0 is seen as a goal in itself [4]. Moreover, the development prospects of Industry 4.0 in the global economy indicate that in some future scenarios, this phenomenon may become a competitive advantage for developing countries compared to developed nations, or at least a source of competitive parity. However, in other future scenarios, emerging economies will not be winners [5]. Despite the importance of innovation for Industry 4.0, as far as we know, there is a lack of studies that analyze business innovation in developing countries.
In this context, it is crucial to understand the barriers that prevent the fostering of innovation. Researchers have used different theoretical lenses to explain the obstacles to innovation. For example, scholars have explained these barriers from the perspective of absorptive capacity [6], dynamic capabilities [7], organizational learning [8], organizational routines [9], and social capital [10]. Most of these studies agree that the availability of resources can help explain the perception of barriers to innovation, since resources may shape capabilities and motivations. For instance, authors claim that resource-rich organizations are more capable of innovating, so they will encounter fewer barriers to innovation [11,12]. This perception of higher capabilities also promotes an aggregate sense of self-enhancement that motivates organizations to explore new opportunities. Nevertheless, the availability of resources depends on the munificence of the countries. Munificence describes "the extent to which an environment supports the sustained growth of a firm" [13]. Developed countries are more munificent than developing countries. This means that in developed countries there are more opportunities to pursue and more resources available to pursue them [14]. In developing countries, this situation differs because organizations tend to be "deprived of superior technology and the supporting infrastructure often found in developed countries" [15]. Organizations are open systems that interact with the environment and depend on the availability of environmental resources [16]. Thus, we can presume that organizations in developing countries will perceive the barriers differently from organizations in developed countries. However, current studies do not account for the contingent dynamics that affect emerging countries because most of the empirical studies have assessed the barriers in developed countries (cf. [17,18]).
As such, we asked how organizations in developing countries perceive barriers to innovation. To address this question, we examined the perception of barriers in organizations from Chile, a developing country. Consequently, this study aims to analyze the impact of perceived barriers to innovation to predict the innovative intention of companies in an emerging economy. We drew on a contingency theoretical lens that argues that mechanisms and barriers change according to internal and external contingencies [19]. Thus, we compared how perceptions of the barriers to innovation differ between large organizations, which possess more resources, and small and medium organizations, which possess fewer resources. We used a machine learning approach to assess this contingent model, in particular, the decision tree classification technique. A decision tree is a hierarchical structure consisting of nodes and directed edges that "solves a classification problem by asking a series of carefully crafted questions about the attributes of the test record" [20]. In general, this study procedure is an instance of the use of data mining and symmetry-based learning concepts for classification and subsequent prediction.
This study contributes to understanding and comparing the effects of the critical perceived barriers on the intention to innovate across different realities and environments, such as those of large enterprises and small and medium enterprises in an emerging country like Chile. These findings will help policymakers and managers make adequate decisions to avoid or eliminate these obstacles and thereby achieve better business performance and sustainable results.
The organization of this paper is as follows. In Section 2, we describe the secondary data gathered from the Chilean Innovation Survey and the method used to analyze the data about perceived barriers to innovation. The results of applying the decision tree method to predict the intention to innovate in large organizations and SMEs are presented in Section 3. Section 4 includes the discussion, implications for practitioners and policymakers, and future research. The last section presents the main conclusions of this study.

Data
We used archival data as secondary sources of information [21]. The data came from the Innovation Survey (EI-10) of the Ministry of Economy [22]. This survey was based on the Oslo Manual, 3rd edition [23]. The data collected by this survey is part of the information included in the business innovation statistics and indicators database of the OECD, developed by the Working Party of National Experts on Science and Technology Indicators (NESTI) [24,25]. The innovation questions cover the two-year period 2015-2016. The survey was self-administered through the web page of the Innovation Division of the Ministry of Economy. The analysts checked whether the online surveys were answered. In the case of surveys without answers, the analysts sent a reminder by e-mail. Alternatively, the analysts contacted the organizations by phone, offering help for answering the survey when needed. The mean number of contacts between the analyst and the informant was three. However, reaching the organizations could take up to seven tries [25]. The sample comprises 5876 Chilean organizations that had sales over US$90,000 per year. The demographic characteristics of the organizations in our sample are shown in Table 1. Sales per year defined the size of the organization. Small organizations have sales over US$90,000 and under US$880,000 and represent 41% of the sample. Medium organizations have sales over US$880,000 and under US$3,500,000 and represent 23% of the sample. Finally, large organizations have sales of over US$3,500,000 and represent 36% of the sample. The average organization age was 18 years (s.d. = 15.3). The organizations encompassed a broad range of industries, covering natural resources (13%), manufacturing (19%), electricity and water (2%), construction (10%), trade (16%), transport (6%), telecommunication (3%), and services (31%).
Table 2 shows these companies' intention to innovate in the next two years, by innovation type. For the study, we classified a company as intending to innovate if it planned to perform at least one innovation type.
Symmetry 2020, 12, 1381
This study considered the obstacles or barriers perceived by the companies of the sample as attributes to predict the intention to innovate. These obstacles were based on the guidelines of the Oslo Manual [23]. Table 3 lists the twelve obstacles considered in the analysis. Furthermore, given the possible differences between company sizes, we included size as an attribute in the SME analysis. The scores detail the perceived importance of each obstacle by company size and in total. We categorized the importance of the barriers as null, low, medium, and high, and for the calculation, we coded it as 0, 1, 2, and 3, respectively.
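The coding scheme and the class-label rule described above can be sketched as follows. This is only an illustration: the barrier names, innovation-type names, and function names are hypothetical, since the survey's actual variable names are not given in the text.

```python
# Ordinal coding of perceived importance, as described in the text.
IMPORTANCE_CODES = {"null": 0, "low": 1, "medium": 2, "high": 3}


def encode_barriers(responses):
    """Map each barrier's importance label to its ordinal code (0 to 3)."""
    return {barrier: IMPORTANCE_CODES[label] for barrier, label in responses.items()}


def intends_to_innovate(intentions_by_type):
    """Class label: True if the company plans at least one innovation type."""
    return any(intentions_by_type.values())


# Hypothetical survey record (field names are assumptions for illustration).
company = {
    "lack_of_own_funds": "high",
    "innovation_cost": "medium",
    "lack_of_demand": "null",
}
encoded = encode_barriers(company)
label = intends_to_innovate({"product": False, "process": True})
```

Here `encoded` holds the attribute values that a classifier would receive, and `label` is the binary target.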

Machine Learning Approach
We used a machine learning approach to achieve the objectives of this study. In particular, the prediction of innovation intention was conducted using decision trees. A decision tree is an inverted tree-shaped model made up of a set of nodes that decide which class a record belongs to based on its attribute values. This learning algorithm identifies a model that best fits the relationship between the attribute set and the class label of the input data. We chose this nonparametric technique for several reasons, following Tan, Steinbach and Kumar [20]. First, decision trees do not require prior assumptions about the type of probability distributions satisfied by the class and other attributes. Second, constructing decision trees is computationally inexpensive and fast, even when the training set is large. Third, decision trees, especially smaller ones, are relatively easy to interpret. Fourth, decision tree algorithms are quite robust to the presence of noise, especially when methods for avoiding overfitting are employed. Fifth, the accuracy of decision trees is not adversely affected by strongly correlated (redundant) attributes, and irrelevant attributes can be removed during preprocessing. Ultimately, the technique is useful for predictive modeling, i.e., anticipating the class label of unknown records, in this case, the innovation intention of small and medium-sized enterprises (SMEs) and large organizations given their perceived barriers.
In the decision trees method, an algorithm is used to divide a dataset into categories belonging to the response variable. For the implementation, we used RapidMiner. Specifically, we employed the C4.5 algorithm, which builds decision trees from a collection of training data using the notion of information entropy [26] (see Figure 1).
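To make the notion of information entropy concrete, the following minimal sketch computes entropy, information gain, and the gain ratio (the split measure associated with C4.5) for one candidate attribute. It is not the RapidMiner implementation; the function names and the toy data are assumptions for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on attribute `values`."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain normalized by the entropy of the split itself."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


# Toy data (not from the survey): barrier codes 0-3 and intention labels.
barrier = [0, 0, 3, 3]
intention = [1, 1, 0, 0]
gain = information_gain(barrier, intention)
ratio = gain_ratio(barrier, intention)
```

In this toy case the attribute separates the two classes perfectly, so the gain equals the full entropy of the labels. An attribute with a single value yields no gain at all.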
We used a grid optimization strategy to set the parameters. In particular, the algorithm was optimized over the split and stopping criteria. The split criteria evaluated were gain ratio, information gain, Gini index, and accuracy. Regarding the stopping criterion, the maximum depth was assessed with possible values ranging from 2 to 25. For SMEs, the procedure selected information gain as the split criterion and three as the maximum depth. For large companies, the procedure selected accuracy as the split criterion and five as the maximum depth.
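The grid described above covers 4 split criteria and 24 depth values, i.e., 96 candidate configurations. A minimal sketch of such an exhaustive search is shown below. The scoring function is a synthetic placeholder, not the study's model or its results; in practice it would train a tree with the given settings and return a cross-validated performance estimate. All names are hypothetical.

```python
from itertools import product

CRITERIA = ["gain_ratio", "information_gain", "gini_index", "accuracy"]
MAX_DEPTHS = range(2, 26)  # 2 to 25 inclusive


def grid_search(score_fn):
    """Evaluate every (criterion, max_depth) pair and return the best one."""
    best = None
    for criterion, depth in product(CRITERIA, MAX_DEPTHS):
        score = score_fn(criterion, depth)
        if best is None or score > best[0]:
            best = (score, criterion, depth)
    return best


def placeholder_score(criterion, depth):
    """Synthetic stand-in for a real evaluation; values are arbitrary."""
    base = 1.0 if criterion == "gain_ratio" else 0.5
    return base - 0.01 * abs(depth - 6)


best_score, best_criterion, best_depth = grid_search(placeholder_score)
```

With a real scoring function, the returned pair plays the role of the selected split criterion and maximum depth.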
Additionally, to guard against overfitting, all analyses were performed using 10-fold cross-validation. In general, the cross-validation procedure consists of two phases. The first phase trains a model, and the second phase applies the trained model and measures its performance. In the case of 10-fold cross-validation, the procedure divides the data sample into ten subsets of equal size. Of the ten subsets, the method holds out a single subset as test data, and the remaining nine subsets are used as training data. This process is repeated ten times, so that each of the ten subsets is used exactly once as test data. Lastly, the procedure averages the results of the ten iterations to produce a single estimate.
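The fold rotation described above can be sketched in a few lines. This is a simplified illustration, not the RapidMiner operator: it splits indices in order without shuffling or stratification, and `fit_and_score` is a hypothetical callback that would train on the training indices and return a performance score on the test indices.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_validate(n_samples, fit_and_score, k=10):
    """Average the score returned by `fit_and_score` over the k folds."""
    scores = [fit_and_score(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(25, k=10))
```

Each record appears in exactly one test fold, so every observation contributes once to the averaged estimate.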

Results
In the case of SME companies, prediction results in Table 4 indicate that the method performs well in identifying the positive cases (innovation intention of SME companies), with a sensitivity of 81.90% ± 3.25%. The values of the other calculated criteria are specificity 46.45% ± 4.71%, precision 59.91% ± 2.11%, and accuracy 63.95% ± 2.44%. According to the results, the following points guide the intention of innovation for SME companies. Firstly, when the perception of lack of own funds as an obstacle to innovation is nil, the perception of lack of information on technology arises as a discriminating variable; in this group, companies that perceive this obstacle as null do not declare an intention to innovate. Secondly, when there is a perception of a lack of own funds as an obstacle to innovation, the perception of lack of demand predicts their intention to innovate. The companies in this group that perceive the barrier "not necessary due to lack of demand for innovations" as critical indicate that they have no intention to innovate. In contrast, companies that perceive this barrier as medium or less declare a higher intention to innovate. Figure 2 shows these results.

Table 4 (excerpt). Class recall: 46.44%, 81.90%.
In the case of large companies, prediction results in Table 5 indicate that the method performs well in identifying the positive cases (innovation intention of large companies), with a sensitivity of 87.11% ± 2.39%. The values of the other calculated criteria are specificity 40.62% ± 3.86%, precision 65.24% ± 1.71%, and accuracy 66.71% ± 2.39%. According to the results, the following perceptions guide the intention of innovation for large companies. Firstly, when the perception of the very high cost of innovation as an obstacle to innovation is nil, the perceptions of several other barriers emerge as essential to predict, for the most part, the lack of intention to innovate. Secondly, when there is a perception that the very high cost of innovation is an obstacle to innovation, but the perceived importance of the barrier "not necessary due to lack of demand for innovations" is not high, companies have a greater intention to innovate. Figure 3 shows these results.
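The four criteria reported for both groups follow from the counts of a binary confusion matrix, with "intends to innovate" as the positive class. The sketch below uses the standard definitions; the counts are invented for illustration and do not come from Tables 4 or 5.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # recall of the positive class
        "specificity": tn / (tn + fp),  # recall of the negative class
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Illustrative counts only (not the study's data).
metrics = classification_metrics(tp=80, fp=50, tn=50, fn=20)
```

A model can reach high sensitivity while specificity stays low, as in this example, when it tends to predict the positive class.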

Discussions
Data science is rising as an interdisciplinary field that mixes statistics, data mining, machine learning, and analytics to understand and explain how to generate prediction models. Consequently, the value and effectiveness of data science for problems involving social data are being recognized in the social disciplines. This is particularly so in the management discipline, since data science methods allow scholars, among other benefits, to get better answers to existing questions, and more immediate and accurate results are expected when evaluating existing theories [27]. In this light, this study has used a machine learning approach applied to business innovation data as a way to explore the barriers that exist for this behavior, and its results confirm the usefulness of data science for evaluating conceptual propositions in business. In this vein, we believe that the analysis findings could reveal the nonlinear effects of independent variables on the intention to innovate, as stated in [28].
Our findings show how organizational resources moderate the perception of barriers to innovation. Small and medium organizations (SMEs) perceive these barriers differently than large organizations. For SMEs, the perceived lack of own funds appeared as a relevant discriminant variable for the intention to innovate. Nevertheless, in both cases (when organizations do or do not perceive the lack of own funds as a barrier), the absorptive capacity of these organizations explained their intention to innovate. Absorptive capacity describes the organizational ability to scan changes in the environment. In our study, we can recognize this capacity in organizations that can identify the lack of information about technology or demand. In other words, when SMEs perceive the lack of resources as a barrier yet can recognize environmental needs, they show a higher intention to innovate.
Alternatively, in the case of large organizations, the cost of innovation is a more relevant discriminant variable than the perception of own funds, since these organizations have their own resources. When large organizations perceived the high cost of innovation as nil, several barriers related to technology and market uncertainty emerged as predictors of the intention to innovate. In contrast, when these organizations perceived a high cost of innovation yet recognized that there was a demand for innovation, they also showed an interest in innovating. Thus, as with SMEs, we can suggest that organizations that can scan environmental needs, or that have higher absorptive capacity, will show a higher intention to innovate. This last result is in agreement with previous studies [10,29,30].
These results are especially interesting for both governments and managers. The findings could be used in the development of public policy for supporting and encouraging innovation in large, medium, and small-sized companies. These government policies can help countries remain competitive in a global market and improve firms' competitiveness and sustainability, with direct implications for employment and a country's economic viability. The results may also provide insights for managers who are attempting to encourage innovation, especially as Industry 4.0 becomes more widespread in developed countries and world-class industries.
Understanding these perceived barriers can help decision-makers foster an innovative culture by supporting new technology strategies or avoiding an attitude of resistance to new ideas. The rules of business are changing with the inclusion of Industry 4.0, where the consumer market is looking for smart products and services that are more personalized to satisfy unique needs. However, many traditional industries continue to operate under marketing strategies that have proven ineffective. Companies that are transitioning into Industry 4.0 need to plan new marketing actions [27], and predicting business innovation intention based on perceived barriers is a first step in that direction.
This study is not without limitations, which suggest avenues for future research. The main limitation is the heterogeneity of our sample, which reduces the accuracy of the study results. Our sample includes a wide variety of industries that moderate the organizations' intention to innovate. Thus, future research could examine the barriers to innovation within specific industries. Another limitation is that the organizations in our study are in Chile. As such, the results can be contingent on the country's characteristics. However, the sample data are based on an OECD survey. Thus, future studies could benefit from geographically diverse research settings. Finally, future studies could explore other machine learning techniques to improve the predictive capacity of the models, such as random forests or support vector machines, or add different attributes to the learning process.

Conclusions
Innovation emerges as a fundamental driver for the economic development of societies, even more so in the current Industry 4.0 era. However, there is a lack of studies that analyze business innovation in developing countries compared to the empirical research available for developed countries. Thus, this study investigates the impact of perceived barriers to business innovation to predict companies' innovative intentions in an emerging economy like Chile. Drawing on contingency theory and applying a machine learning approach, the research concludes that organizational resources moderate the perception of barriers to innovation. SMEs perceive these barriers differently than large organizations. Notably, the perceived lack of own funds appears as a relevant discriminant variable for the intention to innovate for SMEs. For larger companies, in turn, the cost of innovation is the most pertinent discriminant variable, since these organizations have their own resources. Additionally, large firms and SMEs that can scan the environment, recognize market needs, and have higher absorptive capacity will show a higher intention to innovate.
These findings can help organizations in other emerging economies predict innovation intention based on perceived barriers. Understanding these perceived barriers can help decision-makers foster an innovative culture by supporting new technology strategies or avoiding an attitude of resistance to new ideas. For example, countries and companies that are transitioning into Industry 4.0 could establish new actions and policies on innovation intention that address the perceived barriers.
The study's results from Chile cannot be generalized entirely. However, since these barriers are based on an international standard, the Oslo Manual, the methodology applied in this study can be used to predict the business innovation intentions of large, medium, and small enterprises in other developing countries and to verify the moderating effect of organizational resources.