An Application of Statistical Methods in Data Mining Techniques to Predict ICT Implementation of Enterprises

Globalization, Industry 4.0, and the dynamics of the modern business environment caused by the pandemic have created immense challenges for enterprises across industries. Achieving and maintaining competitiveness requires enterprises to adapt to the new business paradigm that characterizes the framework of the global economy. In this paper, the applications of various statistical methods in data mining are presented. The sample included data from 214 enterprises. The structured survey used for the collection of data included questions regarding ICT implementation intentions within enterprises. The main goal was to present the application of statistical methods that are used in data mining, ranging from simple/basic methods to algorithms that are more complex. First, linear regression, binary logistic regression, a multicollinearity test, and a heteroscedasticity test were conducted. Next, a classifier decision tree/QUEST (Quick, Unbiased, Efficient, Statistical Tree) algorithm and a support vector machine (SVM) were presented. Finally, to provide a contrast to these classification methods, a feed-forward neural network was trained on the same dataset. The results demonstrate how algorithms used for data mining can provide important insight into existing relationships that are present in large datasets. These findings expand the current body of literature.


Introduction
The globalization of markets has introduced numerous challenges to enterprises across the globe. Due to the constant and dynamic changes in the global market, competition has intensified [1][2][3][4][5]. The widespread application of modern information communication technologies (ICTs) disrupted the existing competitive relations between enterprises [6,7]. In addition, customers and consumers are becoming more educated and informed about products, services, quality, and post-purchase support [8][9][10]. There is a continuous fragmentation and segmentation of markets [11,12], which can significantly affect established business models. Therefore, the vast majority of enterprises have to capture the main driving factors of technological development and apply this knowledge to their advantage [13,14].
Furthermore, Industry 4.0 has introduced a new framework for conducting business and a need for the digitization of the business process [15,16]. Industry 4.0 is characterized by technologies including cyber-physical systems, 3D manufacturing, big data analytics, the Internet of Things, the Internet of People, cloud-based solutions, the Internet of Value, artificial neural networks, and other state-of-the-art technological solutions [17][18][19]. In addition, the next industrial revolution, Industry 5.0, is slowly gaining traction. This revolution is based on the same technological principles as Industry 4.0, but in addition to focusing on profits as the main driver of business, it also focuses on sustainable development [20,21].
Enterprises have to be aware that intellectual capital has a key role in achieving and maintaining competitiveness in the globalized market [22][23][24]. Intellectual capital not only

Modern Business Environment
The modern business environment is affected by several main factors. Enterprises encounter obstacles in their efforts to attain and preserve competitiveness as a result of globalization, the dynamic development of advanced ICTs, post-pandemic economic mechanisms [40,41], the challenges of sustainability and sustainable development [42,43], and the concept of Society 5.0 [44].
The fast development of ICTs has significantly contributed to market, capital, and workforce globalization. This has led to intensified competitive relations between enterprises across industries. Enterprises have been forced to realize that ICT implementation is occurring slowly but surely, becoming an imperative for increasing competitiveness. ICTs can be implemented in many business processes, including manufacturing, distribution, logistics, marketing, quality management, human resource management (HRM) [45], customer relationship management (CRM) [46], accounting [47], and warehousing [48].
ICTs are the cornerstone of Industry 4.0, and for the purposes of this study, one technological concept can be highlighted: big data analytics (big data and data mining) [49]. The modern business environment relies on a constant and high volume of data from various sources. Enterprises have to tap into these data distribution channels and extract valuable information [50]. In addition to the challenges of Industry 4.0, few enterprises have been left unscathed by the globally widespread negative effects of the pandemic [51]. The pandemic has accelerated the digitization of business models and increased the reliance on the Internet as a medium for strategic business planning [52]. This trend will continue, leaving traditional business models obsolete or with inadequate competitive abilities [52]. In addition, the challenges of sustainable development, which is the cornerstone of Industry 5.0 and Society 5.0, have to be taken into consideration [53,54]. Industry 5.0 and the emerging concept of Society 5.0, in addition to sustainability in business and sustainable development goals, involve the utilization of data in enterprises, government offices, social security-based services, and other organizations, as well as the advanced implementation of social security systems and smart manufacturing [55][56][57]. Additionally, intellectual capital is a concept that represents a key resource for improving business performance and effectively applying advanced technologies [58,59].
When considering all the aforementioned factors as a whole, it is important to take note of one more key factor: market assessment. This refers to an enterprise's ability to evaluate its external business environment [60][61][62]. In the value economy, which characterizes new globalized markets, enterprises are increasingly focusing on intellectual capital and the innovation derived from it [63]. Additionally, appropriately assessing the market and responding with agility to dynamic changes is imperative for enterprise development [64]. ICTs play an important role in determining what is needed and expected in a specific market. ICTs provide the necessary tools for creating and distributing instruments such as surveys and forms for data collection. These data can be further used to understand customer preferences and behavior [65]. Thus, market assessment has the potential to influence whether enterprises will implement and use ICTs in their business. Additionally, business performance and ICT application are related. Better business performance can lead to increased financial resources for new ICT implementation, and enterprise growth is expected to be accompanied by technological growth [66]. Therefore, an increase in business performance can affect ICT implementation. Overall, without proper mechanisms to recognize changes and trends in the market, enterprises cannot adapt or survive in the long term.

Knowledge Gap and Hypotheses
The objective of this study was to demonstrate the application of statistical methods for data mining. The research framework and dataset involve the analysis of ICT implementation and application in enterprises. The paper addresses two crucial knowledge gaps. Firstly, there is a lack of studies analyzing ICT implementation in enterprises from transitional countries. Furthermore, existing studies that partially address this subject do not utilize data mining algorithms. Secondly, this paper provides a significant overview of how basic statistical methods compare to more complex approaches.
This study presents the application of statistical methods in data mining, and the results obtained are equally significant. The results highlight factors that potentially affect an enterprise's intention to implement and apply ICTs.
Based on the literature review and the goal of the paper, the following hypotheses are established as guidelines for data mining:
• H5: Market assessment positively affects intentions to implement and apply ICTs within an enterprise.
The purpose of these hypotheses is to provide a research framework and guide the application of statistical methods on the acquired dataset.

Research Framework and Dataset
The framework and the methodology of this study are in accordance with the best practices in this type of research [67]. The research methodology includes the following:

• Identifying the primary goal of the research (which is to present the application of data mining algorithms and attempt to predict enterprises' intentions of implementing and applying ICTs with multiple potential predictors);
• Literature review (obtaining and concisely presenting a theoretical background in the domain of modern business environments);
• Data acquisition (developing a structured survey, survey distribution, data collection, and creating an integrated dataset for data mining via various statistical methods);
• Statistical methods (including descriptive statistics, linear regression, binary logistic regression, multicollinearity test, heteroscedasticity test, QUEST classification tree algorithm, and support vector machine (SVM));
• Information extraction and discussion (presenting and determining whether it is possible to predict intentions of ICT implementation and application with different statistical methods).
The study was conducted through a structured survey in accordance with best practices in this domain. The dataset included survey data from 214 enterprises (n = 214), with details presented in Table 1. The majority of respondents hold a BSc degree (159 participants).

Enterprise
The majority of enterprises are micro-sized (36.45%) and small-sized (49.07%). This is expected, as the majority of enterprises in the Republic of Serbia are micro- and small-sized enterprises. Industries in which the enterprises conduct their business: manufacturing (58); textile (28); agriculture (4); mining and quarrying (2); information and communication (41); wholesale and retail (9); construction (9); finance and insurance (26); healthcare and social work (9); water, electricity, gas supply (12); education (10); other (6). Note: Additional details are presented in Appendix A.
The survey items and additional details are presented in Table A1 (Appendix A). Since the majority of enterprises in Serbia are SMEs, it was expected that the survey data would primarily consist of data from SMEs, with a few large enterprises included. The data collection was conducted in 2021, and the dataset included information about the respondent (gender, age, education), enterprise size, business performance, awareness of ICT importance, intellectual capital management, and market assessment. Future research may consider other potential predictors. The dataset is robust and suitable for applying data mining algorithms.
The research was conducted in three main phases. The first phase involved developing a structured survey, analyzing existing studies in this domain, and setting up the data collection process. The literature sources were analyzed, and the theoretical framework was developed. The anonymous surveys were distributed, and participants were given one month to complete them. An integrated dataset was created for statistical analysis.
The second phase focused on data analysis/data mining, utilizing statistical methods ranging from simple to complex: descriptive statistics, linear regression, binary logistic regression, a multicollinearity test, a heteroscedasticity test, the QUEST classification tree algorithm, a support vector machine (SVM), and a feed-forward neural network. The methods were conducted using WPS Spreadsheets and MPlus 7.11 software.
In the third phase, the obtained results were analyzed and evaluated to highlight the study's significance, drawbacks, and advantages. The developed decision tree is a clear example of how statistical methods in classification algorithms can be applied to data mining. The SVM results provide additional insight into how data mining algorithms can extract valuable information, and the feed-forward neural network provided a contrasting perspective to the previous methods.

QUEST Algorithm for Data Mining
The applied QUEST algorithm was selected over other classification algorithms for several reasons: It can be used with univariate splits as well as with linear combination splits [66].
The QUEST algorithm first selects the primary variable and then determines its split point, which makes it unbiased towards categorical variables. A quadratic discriminant analysis (QDA) is used to merge multiple variable classes into two super-classes, and then a binary split is performed [70][71][72]. If there are two binary split points, the QUEST algorithm chooses the one closer to the sample mean. The algorithm is constructed by selecting a split independent variable, and after selecting a split point for the selected independent variable, it stops.
Pseudocode is presented in Algorithm 1 to provide a clear overview of the QUEST algorithm. The pseudocode is presented in the same form as in similar studies in this domain.
The application of the QUEST algorithm to the dataset is explained in more detail in the following steps:
Step 1: Set the level of significance α, where α ∈ (0,1). Here, α = 0.05.
Step 2: Determine the number of independent variables, M. In the survey data, we have 8 independent variables, which are Manager Age, Manager Gender, Manager Education, Enterprise Size, Business Performance, Awareness of ICT Importance, Intellectual Capital, and Market Assessment. We also have 1 dependent variable, which is Intentions to Implement ICTs. Among the 8 independent variables, 5 are categorical (Manager Gender, Manager Education, Enterprise Size, Awareness of ICT Importance, and Market Assessment), and 3 are continuous (Manager Age, Business Performance, and Intellectual Capital).
Step 3: For each continuous independent variable X, compute the p-value of the ANOVA F-test, which tests whether X has the same mean across all categories of the dependent variable, and compute the p-value of Pearson's chi-squared (χ²) statistic, which tests whether all categories of X have the same distribution across the dependent variable. For each categorical independent variable X, perform Pearson's χ² test for independence between Y and X, and find the p-value using the χ² statistic. The scipy.stats module in Python can be used to perform these statistical tests and compute the p-values.
Step 4: Identify the independent variable with the smallest p-value and denote it as X*. If the smallest p-value is less than α/M, where M is the total number of independent variables, then SELECT X* as the predictor for splitting the node. Otherwise, go to step 5.
Step 5: If all independent variables have been tested and none have a p-value less than α/M, stop the splitting process and form a leaf node with the majority class label of the samples in the node. Otherwise, repeat steps 3 to 5 for each child node.
Algorithm 1. Pseudocode of the QUEST algorithm.

# Set M as the total number of independent variables.
# Let M1 be the number of continuous and ordinal variables.
M = total_num_independent_vars
M1 = num_continuous_ordinal_vars
# For each continuous or ordinal independent variable X, compute the smallest
# p-value using the ANOVA F-test to test if all categories of the dependent
# variable have the same mean as X and compute the smallest p-value using
# Pearson's chi-squared (χ²) statistic.
For X in continuous_ordinal_vars:
    ...
# For each categorical independent variable, perform Pearson's χ² test for
# independence between Y and X, and find the p-value using the χ² statistic.
...
# Identify the independent variable with the smallest p-value and denote it as X*.
smallest_p_value = min(p_value_f_test, p_value_chi_squared)
X* = independent_var_with_smallest_p_value
# If the smallest p-value is less than α/M, where M is the total number of
# independent variables, then SELECT X* as the predictor for splitting the node.
# Otherwise, go to step 4.
...
# Calculate Levene's F statistic based on the absolute deviation of X from its
# class mean to test if the variances of X for different classes of Y are the
# same for each continuous independent variable X and find the p-value for the test.
...
# If the smallest p-value is less than α/(M + M1), where M1 is the number of
# continuous independent variables, then SELECT X** as the split independent
# variable for the node. Otherwise, do not split this node.
...
Else:
    do_not_split_node = True

Step 6: Calculate Levene's F statistic based on the absolute deviation of X from its class mean to test if the variances of X for different classes of Y are the same for each continuous independent variable X and find the p-value for the test. Identify the independent variable with the smallest p-value and denote it as X**. If the smallest p-value is less than α/(M + M1), where M1 is the number of continuous independent variables, then SELECT X** as the split independent variable for the node. Otherwise, do not split this node.
Step 7: Repeat steps 3 to 6 until all nodes are either leaf nodes or cannot be split further based on the significance level α.
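The variable-selection part of the steps above (Steps 3 to 6) can be sketched in Python with scipy.stats. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `select_split_variable`, the dictionary-based inputs, and the synthetic data are hypothetical, and the sketch covers only the choice of the split variable, not the split point or the recursive tree construction.

```python
import numpy as np
from scipy import stats


def select_split_variable(X_cont, X_cat, y, alpha=0.05):
    """Return (name, p_value) of the chosen split variable, or None for a leaf.

    X_cont: dict mapping name -> 1-D numeric array (continuous predictors)
    X_cat:  dict mapping name -> 1-D array of category labels
    y:      1-D array of class labels (the dependent variable)
    """
    y = np.asarray(y)
    classes = np.unique(y)
    M = len(X_cont) + len(X_cat)  # total number of independent variables
    M1 = len(X_cont)              # number of continuous independent variables
    p_values = {}

    # Step 3 (continuous X): ANOVA F-test across the classes of y.
    for name, x in X_cont.items():
        x = np.asarray(x, dtype=float)
        groups = [x[y == c] for c in classes]
        p_values[name] = stats.f_oneway(*groups).pvalue

    # Step 3 (categorical X): Pearson chi-squared test of independence.
    for name, x in X_cat.items():
        x = np.asarray(x)
        table = np.array([[np.sum((x == lv) & (y == c)) for c in classes]
                          for lv in np.unique(x)])
        p_values[name] = stats.chi2_contingency(table, correction=False)[1]

    # Step 4: compare the smallest p-value with the threshold alpha / M.
    best = min(p_values, key=p_values.get)
    if p_values[best] < alpha / M:
        return best, p_values[best]

    # Step 6: Levene's test (deviations from class means) for continuous X.
    levene_p = {}
    for name, x in X_cont.items():
        x = np.asarray(x, dtype=float)
        groups = [x[y == c] for c in classes]
        levene_p[name] = stats.levene(*groups, center="mean").pvalue
    if levene_p:
        best = min(levene_p, key=levene_p.get)
        if levene_p[best] < alpha / (M + M1):
            return best, levene_p[best]
    return None  # do not split this node


# Synthetic illustration (not the survey data): one informative categorical
# predictor and one uninformative continuous predictor.
rng = np.random.default_rng(0)
n = 214
y = rng.integers(0, 2, size=n)
agrees = rng.random(n) < 0.9
market_assessment = np.where(np.where(agrees, y, 1 - y) == 1, "high", "low")
manager_age = rng.normal(45, 10, size=n)

result = select_split_variable({"manager_age": manager_age},
                               {"market_assessment": market_assessment}, y)
print(result)
```

On this synthetic input, the categorical predictor is selected because its χ² p-value falls far below α/M, which mirrors the decision made in Step 4.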
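Further, the established equations for Pearson's chi-squared statistic and the ANOVA F statistic are presented.
The equation for Pearson's chi-squared statistic:

$$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{i,j} - E_{i,j})^2}{E_{i,j}}$$

where $O_{i,j}$ is the observed value and $E_{i,j}$ is the expected value for the ith row and jth column of the data cell.
The equation for the ANOVA F statistic:

$$F = \frac{\sum_{j=1}^{k} n_j \left(\bar{X}_j - \bar{X}\right)^2 / (k - 1)}{\sum_{j=1}^{k}\sum_{i=1}^{n_j} \left(X_{i,j} - \bar{X}_j\right)^2 / (N - k)}$$

where $n_j$ is the sample size in the jth group, $\bar{X}_j$ is the sample mean of the jth group, $\bar{X}$ is the overall mean, and $X_{i,j}$ is the ith observation in the jth group. N is the total number of observations, and k is the number of independent groups. The methodology summary is presented in Table 2.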
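Pearson's chi-squared statistic and the ANOVA F statistic described above can be computed directly from their definitions and checked against scipy.stats. The sketch below uses made-up numbers purely for illustration; the helper names `pearson_chi2` and `anova_f` are hypothetical.

```python
import numpy as np
from scipy import stats


def pearson_chi2(observed):
    """Sum of (O - E)^2 / E over all cells of a contingency table."""
    observed = np.asarray(observed, dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    expected = row_totals @ col_totals / observed.sum()
    return float(((observed - expected) ** 2 / expected).sum())


def anova_f(groups):
    """Between-group mean square divided by within-group mean square."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    N = sum(len(g) for g in groups)  # total number of observations
    k = len(groups)                  # number of independent groups
    grand_mean = np.concatenate(groups).mean()
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return float((ss_between / (k - 1)) / (ss_within / (N - k)))


# Made-up 2 x 2 table of counts and three made-up groups of scores.
table = [[30, 10], [20, 40]]
groups = [[3.1, 2.9, 3.4, 3.0], [3.8, 4.1, 3.9, 4.2], [2.5, 2.7, 2.4, 2.8]]

chi2_manual = pearson_chi2(table)
f_manual = anova_f(groups)

# Cross-check against the library implementations.
chi2_scipy = stats.chi2_contingency(table, correction=False)[0]
f_scipy = stats.f_oneway(*groups).statistic
print(chi2_manual, chi2_scipy)
print(f_manual, f_scipy)
```

The continuity correction is switched off in `chi2_contingency` so that the library value matches the plain formula for a 2 × 2 table.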

SVM Algorithm for Data Mining
Machine learning represents a significant advancement in the field of algorithms. It is a megatrend that encompasses clustering (unsupervised learning), classification (supervised learning), and dynamic programming (reinforcement learning) [73]. Classification is an important topic in machine learning and is commonly used for predictive modeling [73]. Decision trees (such as the aforementioned QUEST algorithm) are one of several types of machine learning classification methods. Another well-known classification algorithm is the support vector machine (SVM), which is widely applicable in various fields [74].
SVMs are trained to separate training data into two classes by determining the hyperplane that acts as the separator. The position of the hyperplane is defined by a subset of vectors called support vectors. For non-linear problems, the data are mapped into higher-dimensional spaces using kernel functions. Figure 1 illustrates the main principles of SVM for a linear problem, and Figure 2 presents the main principle for a non-linear problem, which involves mapping into higher-dimensional spaces via kernel functions. The main mathematical principle of SVM [76] starts from the input, which is a series of data points (x_1, y_1), (x_2, y_2), ..., (x_i, y_i), where x_i is the normalized input parameter and y_i is the normalized intention to implement an ICT under the input sample i. From here, SVM approximates the function in the following form: where φ(x) represents the higher-dimensional space into which the input is non-linearly mapped. In order to find ω and b, the regularized risk function is used: where ‖ω‖² is the regularization term; minimizing it keeps the function as flat as possible, while the remaining term is the empirical error. The error is measured by the ε-insensitive loss function. This function has the following form: In this way, the range of ε values is determined such that, if the prediction falls within the range, the loss is zero. If the predicted value is outside the range, the loss is the amount by which the absolute error exceeds ε.
Further, in order to estimate ω and b, Equation (4) is transformed into the original objective function that has the following form: Equation (6) is hard to solve; thus, four Lagrange multipliers are introduced, which results in the following equation: Further, the kernel function K(x_i, x_j) for the higher-dimensional space is introduced into Equation (7): K(x_i, x_j) is equal to the inner product of vector x_i and vector x_j within the φ(x_i) and φ(x_j) feature spaces. The use of kernels makes it possible to conduct all calculations in the input space.
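For reference, the expressions discussed in this section can be written in the standard ε-insensitive support vector formulation. The block below is a sketch under that assumption rather than a verbatim restatement: the tags follow the equation numbers cited in the text, while the constant C, the sample count n, the slack variables ξ_i and ξ_i*, and the multipliers α_i, α_i*, η_i, η_i* are notation assumed here.

```latex
\begin{align}
f(x) &= \omega \cdot \varphi(x) + b \tag{3}\\[4pt]
R(C) &= \frac{C}{n}\sum_{i=1}^{n} L_{\varepsilon}\bigl(y_i, f(x_i)\bigr)
        + \frac{1}{2}\lVert \omega \rVert^{2} \tag{4}\\[4pt]
L_{\varepsilon}\bigl(y, f(x)\bigr) &=
  \begin{cases}
    0, & \lvert y - f(x) \rvert \le \varepsilon\\
    \lvert y - f(x) \rvert - \varepsilon, & \text{otherwise}
  \end{cases} \tag{5}\\[4pt]
\min_{\omega,\, b,\, \xi,\, \xi^{*}} \quad
  & \frac{1}{2}\lVert \omega \rVert^{2}
    + \frac{C}{n}\sum_{i=1}^{n} \bigl(\xi_i + \xi_i^{*}\bigr) \tag{6}\\
\text{subject to} \quad
  & y_i - \omega \cdot \varphi(x_i) - b \le \varepsilon + \xi_i, \notag\\
  & \omega \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \notag\\
  & \xi_i,\ \xi_i^{*} \ge 0 \notag\\[4pt]
L &= \frac{1}{2}\lVert \omega \rVert^{2}
     + \frac{C}{n}\sum_{i=1}^{n} \bigl(\xi_i + \xi_i^{*}\bigr)
     - \sum_{i=1}^{n} \alpha_i \bigl(\varepsilon + \xi_i - y_i
       + \omega \cdot \varphi(x_i) + b\bigr) \notag\\
  &\quad - \sum_{i=1}^{n} \alpha_i^{*} \bigl(\varepsilon + \xi_i^{*} + y_i
       - \omega \cdot \varphi(x_i) - b\bigr)
     - \sum_{i=1}^{n} \bigl(\eta_i \xi_i + \eta_i^{*} \xi_i^{*}\bigr) \tag{7}\\[4pt]
f(x) &= \sum_{i=1}^{n} \bigl(\alpha_i - \alpha_i^{*}\bigr) K(x_i, x) + b,
\qquad K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j) \tag{8}
\end{align}
```

In this form, the four multipliers in (7) attach to the two ε-tube constraints and the two non-negativity constraints of (6), and (8) depends on the data only through the kernel, which is why all calculations can be carried out in the input space.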
The pseudocode for the SVM is presented in Algorithm 2. The response variable is impact, which refers to the intentions to implement an ICT solution within the enterprise. For validation, 50 observations were taken randomly from the sample, and a linear kernel function was used for preprocessing.
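The setup described above (a binary response named impact, a random hold-out of 50 observations, and a linear kernel) can be sketched with scikit-learn. This is an illustrative sketch only: the use of scikit-learn, the synthetic predictors, the standardization step, and the random seeds are assumptions, and the data below are not the survey data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 214 x 8 predictor matrix (assumption).
rng = np.random.default_rng(42)
n = 214
X = rng.normal(size=(n, 8))
# Synthetic binary response loosely driven by two of the predictors.
impact = (X[:, 0] + 0.5 * X[:, 7]
          + rng.normal(scale=0.8, size=n) > -0.7).astype(int)

# Hold out 50 randomly chosen observations for validation.
X_train, X_val, y_train, y_val = train_test_split(
    X, impact, test_size=50, random_state=0, stratify=impact)

# Standardize using training statistics only, then fit a linear-kernel SVM.
scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="linear").fit(scaler.transform(X_train), y_train)

# Confusion matrix and AUC on the hold-out set.
X_val_scaled = scaler.transform(X_val)
cm = confusion_matrix(y_val, clf.predict(X_val_scaled), labels=[0, 1])
auc = roc_auc_score(y_val, clf.decision_function(X_val_scaled))
print(cm)
print(round(auc, 3))
```

The AUC is computed from the signed distances returned by `decision_function` rather than from hard class labels, so it reflects how well the classifier ranks the hold-out observations.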

Feed-Forward Neural Network
In addition to the aforementioned algorithms and statistical approaches, a feed-forward neural network was applied to the dataset. The neural network was trained on the dataset that included the dependent and independent variables. To test the hypotheses, the trained network can be used with different combinations of values for the independent variables to analyze the relationship between each independent variable and the dependent variable. Python and TensorFlow were used for the feed-forward neural network. The Python code for the neural network training is presented in Algorithm 3.
As part of Algorithm 3, the data are split into training and testing sets.
The dependent variable in this study was the intention to implement ICTs. As this is a binary classification, binary cross-entropy was used as the loss function, and the Adam optimizer was used for compiling the model. Categorical variables were encoded using one-hot encoding, and feature values were standardized with StandardScaler. Figure 3 presents the architecture schematics of the neural network used in this study.
The feed-forward neural network has two hidden layers, with 8 and 16 neurons, and a sigmoid activation function for the output layer. The model was trained with a batch size of 32 for 100 epochs. The accuracy of the model is 85.8%; precision is 91.5%; recall is 82.1%; and F1-score is 86.8%.
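The configuration described above can be approximated in a few lines. Note that this sketch swaps in scikit-learn's MLPClassifier instead of TensorFlow, as a compact stand-in: for a binary target it uses a logistic (sigmoid) output with log-loss, which corresponds to binary cross-entropy, together with the Adam solver. The synthetic data, the ReLU hidden activation (the library default), the 80/20 split, and the seeds are assumptions.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in: 3 continuous and 5 categorical predictors (assumption).
rng = np.random.default_rng(0)
n = 214
X_num = rng.normal(size=(n, 3))
X_cat = rng.integers(0, 3, size=(n, 5))
signal = 1.2 + 1.5 * X_num[:, 1] + 1.0 * (X_cat[:, 4] == 2)
y = (signal + rng.normal(size=n) > 0).astype(int)
X = np.hstack([X_num, X_cat])

# Split the data into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Standardize continuous columns and one-hot encode categorical columns.
preprocess = ColumnTransformer(
    [("num", StandardScaler(), [0, 1, 2]),
     ("cat", OneHotEncoder(handle_unknown="ignore"), [3, 4, 5, 6, 7])],
    sparse_threshold=0.0)

# Two hidden layers (8 and 16 neurons), Adam, batch size 32, 100 epochs.
model = make_pipeline(
    preprocess,
    MLPClassifier(hidden_layer_sizes=(8, 16), solver="adam",
                  batch_size=32, max_iter=100, random_state=0))
model.fit(X_train, y_train)

pred = model.predict(X_test)
accuracy = accuracy_score(y_test, pred)
precision = precision_score(y_test, pred)
recall = recall_score(y_test, pred)
f1 = f1_score(y_test, pred)
print(accuracy, precision, recall, f1)
```

With only 100 epochs the solver may emit a convergence warning, which is expected for this configuration and does not prevent the model from being evaluated.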

Results
Due to the large dataset and extensive structured survey, the results of the descriptive statistics are presented in Table A2 (Appendix B). The results of the linear regression analysis are presented in Table 3. In the linear regression, business performance, awareness of ICT importance, intellectual capital, and market assessment were observed as independent variables (predictors), while ICT implementation intentions were the dependent variable. The linear regression results showed that awareness of ICT importance and intellectual capital did not have a significant influence on ICT implementation intentions. Interestingly, business performance had a low to medium negative influence, indicating that higher business performance would reduce ICT implementation intentions. Market assessment had a low to moderate positive influence on ICT implementation intentions.
Further, a binary logistic regression was conducted using stepwise regression to select the significant predictor variables automatically. The results of the binary logistic regression are presented in Table 4.
The results indicate that classification success is 82%, sensitivity is 85%, and specificity is 88%. In addition, multicollinearity is tested, and the results are presented in Table 5.
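Classification success (accuracy), sensitivity, and specificity are all derived from the four cells of a confusion matrix. The short sketch below shows the arithmetic; the counts are made up for illustration and are not taken from Table 4, and the function name is hypothetical.

```python
def classification_rates(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),   # share of actual positives detected
        "specificity": tn / (tn + fp),   # share of actual negatives detected
    }


# Made-up counts that sum to 214 observations.
rates = classification_rates(tp=140, fn=26, tn=36, fp=12)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always lies between the two when all three are computed from the same confusion matrix.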
The variance inflation factors (VIFs) range from below to slightly above 2.5. Heteroscedasticity is evaluated via the Breusch-Pagan and White tests. The results are presented in Table 6. The p-values of the Breusch-Pagan test and the White test are not statistically significant, which indicates that the distribution of variances is homoscedastic and that there is no heteroscedasticity present in this model.
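A variance inflation factor is obtained by regressing each predictor on the remaining predictors and computing 1 / (1 − R²). The following NumPy sketch shows that calculation on synthetic data; the function name and the data are illustrative assumptions, not the study's variables.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X, computed as 1 / (1 - R^2)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus all other columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        residuals = target - others @ coef
        r_squared = 1.0 - residuals.var() / target.var()
        vifs.append(1.0 / (1.0 - r_squared))
    return np.array(vifs)


# Synthetic predictors: x2 is correlated with x1, x3 is independent.
rng = np.random.default_rng(1)
n = 214
x1 = rng.normal(size=n)
x2 = x1 + 0.5 * rng.normal(size=n)
x3 = rng.normal(size=n)

vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(np.round(vifs, 2))
```

In this example the two correlated columns receive clearly elevated VIFs, while the independent column stays close to 1, the value that indicates no collinearity.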
The QUEST classification decision tree was used to attempt to predict ICT implementation intentions in enterprises. The tree had a yes/no structure and analyzed several influencing features. Cross-validation was also included. In this case, the following features were found to have an effect on ICT implementation intentions:
In the first step, the model was able to predict 80.7% of the time if the participant thinks that modern ICTs are important for economic growth. In the next step, the model was able to predict ICT application 86.67% of the time.
On the other side of the first split, managers/directors/owners of enterprises who do not think that ICTs are important for economic growth but who would implement an ICT solution are affected by the industry in which their enterprise conducts business and by their opinions on the most competitive industries. The predictive precision value of this model (risk value) is 0.224 (0.005), which means that a misclassification occurs in 22.4% of the cases. Next, the summary of the optimized SVM classifier is presented in Table 7.
The confusion matrix of the SVM analysis is presented in Table 8. Following Table 8, the receiver operating characteristic (ROC) curve is presented in Figure 5. The ROC curve graphically presents the specificity and sensitivity for different thresholds. The area under the curve (AUC) value is 0.822, which means there is an 82.2% probability that the model assigns a higher score to a randomly chosen enterprise whose owner/manager intends to implement and apply an ICT solution than to a randomly chosen enterprise without that intention. The AUC equation for the binary SVM classifier is as follows:

$$\mathrm{AUC} = \frac{\sum_{i}\sum_{j} [y_i = 1,\ y_j = 0,\ f_i > f_j]}{\sum_{i}\sum_{j} [y_i = 1,\ y_j = 0]}$$

The equation is a sum of pairwise comparisons between positive and negative samples, based on the scores f predicted by the SVM classifier. The term [y_i = 1, y_j = 0, f_i > f_j] is computed for each positive sample i and each negative sample j. It equals 1 if the SVM classifier predicted a higher score for the positive sample i than for the negative sample j; otherwise, it is 0. The term is then summed over all pairs of positive and negative samples to obtain the numerator of the AUC equation. The denominator is simply the total number of pairwise comparisons between positive and negative samples, regardless of the SVM classifier's prediction: [y_i = 1, y_j = 0]. The numerator is divided by the denominator to obtain the AUC score.
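The pairwise definition of the AUC described above translates almost literally into code. The sketch below counts, over all positive-negative pairs, how often the positive sample receives the strictly higher score; the labels and scores are made up, and the function name is hypothetical.

```python
def pairwise_auc(y_true, scores):
    """AUC as the share of positive-negative pairs ranked correctly.

    A pair counts as 1 only when the positive sample has a strictly higher
    score than the negative sample, following the indicator term in the text.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    correct_pairs = sum(1 for f_i in positives for f_j in negatives if f_i > f_j)
    return correct_pairs / (len(positives) * len(negatives))


# Made-up labels and classifier scores.
y_true = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.5, 0.2, 0.7]

auc = pairwise_auc(y_true, scores)
print(auc)  # 5 of the 6 positive-negative pairs are ranked correctly
```

Library implementations usually credit tied scores with half a pair; the strict comparison used here follows the indicator term as written in the text, so the two agree whenever no scores are tied.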

Assessing the Results of the Linear Regression Analysis
This study had two parallel, mutually non-exclusive goals. The first goal was to present multiple statistical methods, ranging from simple to complex, for data analysis and data mining. The second goal aimed to predict ICT implementation intentions in enterprises.
The results of the descriptive statistics do not provide a deeper understanding of the observed factors. No significant observations or causations can be derived. However, some insights can still be obtained. The majority of respondents (77.6%) stated that they would implement an ICT solution within their enterprise.
Next, based on the results of the linear regression analysis, the proposed hypotheses can be assessed as follows: Compared to descriptive statistics, linear regression provides a deeper analysis of the relationship between the independent variables and dependent variables.

Assessing the Results of the Binary Logistic Regression Analysis
Additional insight is provided by conducting a binary logistic regression analysis. From here, the proposed hypotheses are evaluated as follows:
• H1a: Manager/director age positively affects intentions to implement and apply ICTs within an enterprise. Did not gain support.

Assessing the Results of the QUEST Algorithm
Further, based on the QUEST decision tree results, the proposed hypotheses are assessed as follows:
• H1a: Manager/director age positively affects intentions to implement and apply ICTs within an enterprise. Did not gain support.
• H4: Intellectual capital positively affects intentions to implement and apply ICTs within an enterprise. Did not gain support.
• H5: Market assessment positively affects intentions to implement and apply ICTs within an enterprise. Failed to be rejected.

Assessing the Results of the SVM Classifier
Next, based on the SVM classifier, all of the proposed hypotheses can be noted as failed to be rejected. The SVM classifier included all the predictors, and the results indicate moderately high accuracy (82.2%). Therefore, the hypotheses are evaluated as follows:
• H1a: Manager/director age positively affects intentions to implement and apply ICTs within an enterprise. Failed to be rejected.
It is interesting to note that various statistical methods produced different results with varying levels of detail. It is important for data mining processes to avoid bias and to set up parameters properly. In this study, the QUEST algorithm was chosen over other classifiers, such as CHAID (chi-squared automatic interaction detection) and CART (classification and regression tree), as it was deemed most appropriate for a dataset that includes ordinal, nominal, and quantitative data. Additionally, the support vector machine (SVM) provided significant insights into the use of machine learning in data mining. The feed-forward neural network produced similar results regarding the evaluation of the hypotheses.

Other Studies
Feed-forward neural networks have gained significant attention for their flexibility and ability to process large datasets, often achieving training accuracy between 78% and 95%, depending on the dataset [77]. In this study, the accuracy of the trained model falls within the noted range. Neural networks can be applied in various fields, including mechanics, physics, chemistry, genetics, optimization, transport, material testing, and more [78,79]. The role of neural networks in this study was to provide a contrast to the main approach, which was SVM. Compared to neural networks, SVM is powerful even with limited information and modest dataset sizes, making it more appropriate for this study's dataset of 214 enterprises [80]. SVMs have been successfully used for face detection, text categorization, data classification, image classification, and other applications across various industries [81]. Therefore, the approach taken in this study is appropriate, given the flexibility and wide range of applications of SVM. Many studies have applied SVMs to datasets from various sources. This study presents a basic-to-complex approach in data mining and a contrasting example of applying neural networks.

Limitations, Future Research, and Implications
The study has several limitations that should be taken into consideration in future research. The limitations of this study include the following:

• Mainly English literature in the theoretical background section;
• Unexplained variance in the results, which indicates that there may be other factors that affect the dependent variable (this is expected with social studies);
• Pie charts and histograms were not developed;
• Analysis was conducted only on enterprises within Serbia. However, transitional economies tend to have the same issues when it comes to competitiveness;
• Extensive equations were not included (the methods derive from well-established equations).
Considering the noted limitations, the study still contributes to the existing body of the literature in the domain of statistical methods in data analysis/data mining.
For future research, the following is recommended:
• Consider the limitations noted above;
• Use a structured approach to future research;
• Introduce additional predictor factors such as human resource management, employee productivity, and marketing management;
• Analyze datasets consisting of enterprises from other countries and compare them to this paper's findings;
• Conduct a meta-analysis when similar studies are published;
• Apply CART, CHAID, and other classifiers in future studies.
Practical implications of the results presented in this manuscript include:
• Governments (increasing strategic incentive programs for ICT adoption across industries would improve national competitiveness).
• Schools and universities can address this paper by presenting the necessity and imperative role of ICTs in the modern global economy.
• Fellow scholars can address this study for their own research. This paper provides an adequate basis and guidelines for future analysis in this domain.
Overall, the study expands the existing body of literature in two significant respects. Firstly, it applies statistical methods to a robust dataset. Secondly, it addresses trending aspects of the modern business environment. To the authors' knowledge, no similar studies have been conducted, and the findings can be used in future research.

Conclusions
In this paper, various statistical methods were applied to analyze potential predictor variables in order to understand their influence on ICT implementation intentions in enterprises. The results showed that different statistical methods provided different outcomes, highlighting the importance of selecting appropriate methods in data mining. Descriptive statistics provided an overview of the analyzed dataset, while regression analyses (linear and binary logistic) provided additional insights. The QUEST algorithm indicates that opinions on ICT's importance for economic growth and opinions on ICT's importance for the enterprise affect ICT implementation intentions. The SVM classifier demonstrated the potential of machine learning for data mining. Finally, a feed-forward neural network was trained with the dataset in contrast to the previously conducted methods. Future studies are recommended to analyze multiple classifiers across different datasets to further broaden the application of statistical methods.
Overall, this research provides valuable insights into the potential factors influencing ICT implementation intentions and establishes a foundation for future studies in the field.

Institutional Review Board Statement:
The study did not require a Review Board Statement.

Informed Consent Statement:
The survey was anonymous, and no personal data were collected. No experiments were conducted on humans.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Survey items, codes, attributes, available answers, and variables are presented in Table A1.

Appendix B
The results of the descriptive statistics are presented in Table A2.

Awareness of ICT importance
Do you think that information-communication technologies (ICTs) are important for the enterprise?