Theory and Practice of Integrating Machine Learning and Conventional Statistics in Medical Data Analysis

The practice of medical decision making is changing rapidly with the development of innovative computing technologies. The growing interest in data analysis, together with improvements in big data processing, raises the question of whether machine learning can be integrated with conventional statistics in health research. To help address this knowledge gap, this paper presents a review of the conceptual integration between conventional statistics and machine learning, focusing on health research. The similarities and differences between the two are compared using mathematical concepts and algorithms. The comparison between conventional statistics and machine learning indicates that conventional statistics are the fundamental basis of machine learning, whose black box algorithms are derived from basic mathematics but are more advanced in automating analysis, handling big data and providing interactive visualizations. While the nature of the two approaches differs, they are conceptually similar. Based on our review, we conclude that conventional statistics and machine learning are best integrated to develop automated data analysis tools. We also strongly believe that machine learning could be explored by health researchers to enhance conventional statistics in decision making, adding reliable validation measures.


Introduction
Recently, machine learning has been fueling active discussions among clinicians and health researchers, particularly for decision making in e-diagnosis, disease detection and medical image analysis [1][2][3][4][5]. A few common questions are "can machine learning replace conventional statistics?", "are they the same?" and "how can statistics be integrated with machine learning?". This review focuses on the concepts of conventional statistics and machine learning in health research, and the explanations, comparisons and examples it provides may answer the aforementioned questions.
Various studies show that conventional statistics have dominated health research [6][7][8][9][10][11][12][13][14][15]; machine learning, however, has been widely used by data scientists in various fields since its inception [16][17][18][19][20][21][22][23][24][25][26][27]. Examples of common conventional statistical analyses are hypothesis testing (t-test, ANOVA), probability distributions (regression) and sample size calculation (hazard ratio), whereas in machine learning the common concepts are model evaluation, variable importance, decision trees, classification and prediction analysis. While many industries are capturing the vast potential of machine learning, healthcare is still slow in attaining the optimum level needed to make sense of newer technologies and computational methods. This could be due to uncertain reliability and limited trust in machines to analyze big data and make timely decisions on patients' health. The production of large amounts of healthcare data [28,29], such as administrative data, patients' medical history, clinical data, lifestyle data and other personal data, makes their analysis unmanageable with basic statistical software tools, which subsequently leads to the need for technologically advanced applications with cost-efficient, high-end computational power. More importantly, these applications must meet their target, i.e., be designed with user-friendly interfaces and data protection utilities to aid the end users: biostatisticians, researchers and clinicians. These experts need automated systems to optimize diagnoses and treatments, enhance prognostication, predict health risks and develop long-term care plans. In line with this, we need to answer the important question of whether machine learning is completely different from conventional statistics or whether the two are correlated with each other.
To begin with, conventional statistics have a history spanning several centuries, beginning in the 17th and 18th centuries, when mathematical theories were introduced by various scientists. In the 18th century, the importance of advanced statistics in medicine was a prominent topic, and more theories were integrated to invent inferential statistical models. Later, the use of computational power in statistical analysis was given priority, hence advanced software tools were developed. Machine learning was introduced in 1952, and it has recently advanced into deep learning and serves as the basis of artificial intelligence [30][31][32]. Figure 1 describes the evolution of conventional statistics and machine learning in health research.

Past Reviews, Rationale for the Review and Intended Audience
There are recent reviews comparing conventional statistics and machine learning [33][34][35][36][37]. These reviews have presented definitions of the two terms and their advantages and disadvantages over each other. However, to the best of our knowledge, the conceptual integration between these two fields, illustrated with examples in health research, has not been discussed previously. Moreover, the similarities between the two fields have not been investigated thoroughly. This review may resolve the confusion among clinicians as to whether machine learning can be integrated with conventional statistics in health research. Clarifying this integration may convince health researchers to explore the approach in the future. The intended audience of this review is not only healthcare researchers, but statisticians and data scientists as well.

Review Content
This review contains five sections: (i) concepts in conventional statistics and machine learning, (ii) advantages and disadvantages of conventional statistics and machine learning, (iii) a case study of breast cancer survival analysis using a few techniques comparing conventional statistics and machine learning, (iv) simplified machine learning algorithms and their relationship with conventional statistics and (v) a discussion explaining the integration of conventional statistics with machine learning and the significance of machine learning, derived from fundamental conventional statistics. Section (iii) is explained using a proven breast cancer prediction model [38], which has attracted a broad range of readers both from the medical domain and computer science. The terms conventional statistics (CS) and machine learning (ML) are used throughout the review.

Survey Methodology
The review was conducted using published works related to: (i) the history of conventional statistics and machine learning in medicine; (ii) the comparison between conventional statistics and machine learning; (iii) the use of machine learning in various fields; (iv) the analysis of medical data using conventional statistics; and (v) the use of machine learning and artificial intelligence in medical analysis.
The digital libraries and search engines used to extract the literature are Google Scholar, Web of Science and PubMed. The literature search was followed by selecting relevant literature using inclusion and exclusion criteria as listed below.

Inclusion Criteria
(i) all papers published between 2015 and 2022; (ii) all open access papers that are freely available; (iii) all papers matching the search keywords: conventional statistics, machine learning, medical data, comparison and health research. The entries retrieved using these keywords came from various medical domains, machine learning analyses and statistics in healthcare research, not focusing on only one type of disease.

Exclusion Criteria
(i) all papers not relevant to our topic; (ii) all papers that are not freely accessible; (iii) all papers published before 2015. The initial total number of papers was 511. The selection process is explained in Figure 2. The final number of selected papers was 102.

Hypothesis Testing and Statistical Inference for Classification
Hypothesis testing is the interpretation of results by making assumptions (hypotheses) based on experimental data. Statistical tests (e.g., t-test, ANOVA) are used to interpret the results based on measures such as the p-value (significant difference). Biostatisticians and medical scientists perform statistical analysis using conventional software tools [39][40][41][42][43][44][45][46][47][48][49][50][51][52][53][54][55][56] because healthcare providers' main objective is analysis based on hypothesis testing in the context of patient care, checking whether treatments or drugs yield positive outcomes or how to control certain risk factors for a particular disease. Therefore, they barely explore or pay attention to advanced computer science applications and automated predictive tools such as Predict, CancerMath and Adjuvant [57]. The basic concepts of hypothesis testing are explained in Table 1.

In conventional statistics, the approach used is the conclusion or "inference", expressed in the form of mathematical equations and measures, to make predictions. For instance, inferential work assessing unknown evidence from observed data can be carried out via a hypothesis-testing framework. The aim of hypothesis testing is to reject the null hypothesis if the evidence found is true and clinically significant. For example, in deciding on a surgical treatment, "does breast-conserving therapy or mastectomy promote better survival among breast cancer patients?" is an inferential question whose answer is unobservable. In this scenario, patients are the observations, treatment type is the independent variable and survival is the dependent variable, which together decide the inference [34]. The results of the analysis classify the outcome based on the patterns of the independent variables.

Regression
Regression analysis is a set of statistical methods for estimating the relationship between a dependent variable and a set of independent variables. Regression has been widely used in healthcare research to analyze and make predictions on various diseases. The choice of a particular type of regression depends on the type of dependent variable, i.e., continuous or categorical. Linear regression determines the relationship between a continuous dependent variable and a set of independent variables; it estimates the model by minimizing the sum of squared errors (SSE). Nonlinear regression also requires a continuous dependent variable, but is considered more advanced, as it uses an iterative algorithm rather than the direct matrix equations of the linear approach [58].
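The linear case can be made concrete with a short sketch. This is our illustration of SSE minimization, not code from the cited works; the function name `fit_linear` and the toy data are hypothetical, and the closed-form estimates apply to the one-predictor case.

```python
# Minimal sketch of simple linear regression fitted by minimizing the
# sum of squared errors (SSE); names and data are illustrative only.

def fit_linear(x, y):
    """Return (intercept, slope) minimizing SSE for the model y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form least-squares estimates: no iteration is needed in the linear case
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Toy example: a continuous outcome that happens to be exactly 2*x
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
intercept, slope = fit_linear(x, y)
```

The iterative algorithms mentioned for the nonlinear case replace this closed form with repeated refinement of the parameter estimates.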
A categorical dependent variable is analyzed using logistic regression. This analysis models a dependent variable whose values fall into distinct groups based on specific categories and uses Maximum Likelihood Estimation to estimate the parameters. Logistic regression is further divided into binary, ordinal and nominal forms. A binary variable has only two values, such as survival status (alive or dead); an ordinal variable has at least three ordered values, such as cancer stage (stage 1, stage 2, stage 3); a nominal variable has at least three values with no inherent order, such as treatment (chemotherapy, radiotherapy, surgery) [38].
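To make the Maximum Likelihood Estimation step concrete, the sketch below fits a one-predictor binary logistic regression by plain gradient ascent on the log-likelihood. It is a minimal illustration under our own naming (`fit_logistic`, toy survival labels), not the procedure used in any cited study.

```python
import math

def sigmoid(z):
    # Clamp to avoid overflow in math.exp for extreme linear predictors
    z = max(min(z, 35.0), -35.0)
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, lr=0.1, steps=2000):
    """One-predictor logistic regression fitted by gradient ascent on the
    log-likelihood (a minimal stand-in for Maximum Likelihood Estimation)."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            g0 += yi - p          # d(log-likelihood)/d(intercept)
            g1 += (yi - p) * xi   # d(log-likelihood)/d(slope)
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Toy binary outcome (e.g., survival status 0/1) against one predictor
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0, 0, 0, 1, 1, 1]
b0, b1 = fit_logistic(x, y)
pred = [1 if sigmoid(b0 + b1 * xi) >= 0.5 else 0 for xi in x]
```

Statistical packages use more refined optimizers (e.g., iteratively reweighted least squares), but they maximize the same likelihood.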

Predictive Analytics
Prediction works with the concept of modeling using machine learning algorithms. This prediction model requires a reliable relationship between the observations (patients) and variables (independent variables). Prediction models generate accuracy measures to determine the quality of data and predict the final outcome using the observations (patients), input data (independent variables) and output data (dependent variable). The basic concepts of predictive analysis are explained in Table 2. Table 2. Concepts in predictive analysis (classification).

Approach: Classification.
Concept: Research question: Is label 1 considered the target outcome? Answer: Yes, label 1 is the target outcome. Decision rule: a trained classifier analyzes an unlabeled observation's variables and values, which results in a predicted label (1).
Procedure: Step 1: Split the dataset into training and testing datasets. Step 2: Train the data using a specific algorithm. Step 3: Test the remaining dataset using the trained algorithm to predict the results accurately.
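The three steps above can be sketched in a few lines. The 1-nearest-neighbour classifier and the toy clusters below are our illustrative choices, not the algorithms used in the case study.

```python
import random

def train_test_split(rows, labels, test_frac=0.3, seed=42):
    """Step 1: split the dataset into training (70%) and testing (30%) subsets."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(rows) * test_frac)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([rows[i] for i in train_idx], [labels[i] for i in train_idx],
            [rows[i] for i in test_idx], [labels[i] for i in test_idx])

def predict_1nn(train_x, train_y, query):
    """Steps 2-3: a deliberately simple 'trained classifier' (1-nearest neighbour)."""
    nearest = min(range(len(train_x)),
                  key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)))
    return train_y[nearest]

# Two well-separated toy clusters standing in for labelled patient records
rows = [(i, i) for i in range(10)] + [(i + 20, i + 20) for i in range(10)]
labels = [0] * 10 + [1] * 10
tr_x, tr_y, te_x, te_y = train_test_split(rows, labels)
accuracy = sum(predict_1nn(tr_x, tr_y, q) == y
               for q, y in zip(te_x, te_y)) / len(te_x)
```

The accuracy on the held-out test set is the model-evaluation measure discussed later in this review.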

Representation Learning
Representation learning is the process of training machine learning algorithms to discover representations which are interpretable. Different representations can entangle various explanatory factors in a specific dataset. The outcome of representation learning should ease the subsequent task in the decision-making process. For example, representation learning handles and groups very large amounts of unlabeled training data in unsupervised or semi-supervised learning. The grouping of the unlabeled data is used for the corresponding task, such as feature selection and decision tree, to predict outcomes. The challenging factor of representation learning is that it has to preserve as much information as the input data contains in order to attain accurate predictions. Healthcare research utilizes representation learning mostly in image recognitions, such as biomedical imaging-based predictions [59].

Reinforcement Learning
Reinforcement learning (RL) trains machine learning models to make a sequence of decisions, unlike supervised learning, which relies on a one-shot or single dependent factor. Its main objective is to endow an agent with the ability to make predictions through experience with the environment around it and to develop evaluative feedback. This unique feature of reinforcement learning helps provide prevailing solutions in various healthcare diagnosis and treatment regimens, which are usually characterized by prolonged and sequential procedures. Reinforcement learning follows a few techniques for sequential decision making, namely efficiency techniques (experience-level, model-level and task-level) and representational techniques (representations for the value function, reward function and task or models). Applications of RL in different healthcare domains, such as chronic diseases and critical care, especially sepsis and anesthesia, are explained in detail in [60,61].

Causal Inference/Generative Models
Causal inference plays a vital role in understanding the mechanisms of variables in order to find a generative model and predict the outcomes to which the variables are subjected. The characteristic of causal inference is to answer questions about the mechanisms by which the variables come to take on their values. For example, epidemiologists gather dietary data and identify the factors affecting life expectancy to predict the effects of guiding people to change their diet. Examples of causal models are models with free parameters (fixed structure and free parameters) and models with fixed parameters (free parameters given values). A large number of variables, small sample sizes and missing values are considered serious impediments to proper data analysis and accurate decision making in the medical domain. Causal inference is used in healthcare research mainly for clinical risk prediction and improving the accuracy of medical diagnosis, despite these issues with data [62].

Data Management
Conventional statistics are more suitable for simpler datasets, whereas machine learning can manage complex datasets. In general, conventional statistical analyses are performed when the research has prior literature on the topic of interest, the number of variables involved in the study is relatively small and the number of observations (samples) is larger than the number of variables. This helps the scientist understand the topic conveniently by selecting the important variables from prior knowledge, as well as by applying appropriate analytical models to check the association between variables (independent) and outcomes (dependent). The conventional statistical approach gives more priority to the type of dataset, for example, a cohort study, which follows a specific hypothesis [35]. On the other hand, prediction analysis based on machine learning algorithms learns from data without relying on rules-based programming: it makes no prior assumptions, but works from the original data provided. Machine learning algorithms can handle multi-dimensional big data, whereas conventional statistics can handle only one specific format of data at a time. Furthermore, machine learning algorithms can handle data from different sources such as external databases or online data repositories.

Computational Power, Interpretation/Explainability and Visualization of Results
The analytical strategy of biostatisticians depends on their pathophysiological knowledge and experience; data-driven prediction analysis challenges this paradigm of thought, and increasing computational power may unmask associations not realized by the human mind.
In conventional statistical analysis, scientists use basic software tools, which lack the capability to handle big data and visualize results. Machine learning black box algorithms have the ability to uncover subtle hidden patterns in multi-modal data [63]. However, interpretability is domain-specific, hence visualization techniques play a vital role in explaining the results to higher stakeholders. Conventional statistical software tools produce basic visualizations, whereas advanced data analytics tools produce domain-specific, customized, inherently interpretable models and results [63]. Machine learning is often very complex and difficult for clinicians to interpret because it uses computational programming rather than a user-friendly tool such as SPSS. Conventional statistics are easily interpretable and have lower capacity, and thus present a smaller risk of fitting non-causal effects that fail to generalize.
Conventional statistics are considered more computationally efficient and are more readily accepted in the medical domain. In contrast, the results and visualizations produced by ML algorithms differ from those of CS methods, and no proper guideline is available on how to explain the graphs when interpreting final results. Machine learning requires high computational power in terms of processing and storage. Moreover, ML algorithms are regularly updated to newer versions, which requires updates to the code. Furthermore, ML models have the ability to over-predict (overfitting), where the predicted model is too closely tied to the provided dataset. This can constrain the possibility of generalizing the model to different datasets with good accuracy. Therefore, validation is required in both cases to finalize decisions. ML algorithms are able to provide the required results and decisions automatically from precise training data, based on the built-in functions of the programming tools. Nevertheless, when dealing with large amounts of data, more hybrid models can be designed to resolve the issues arising in data science for knowledge extraction, especially in healthcare. There is a myriad of algorithms and software for ML techniques to build disease prediction models. In medical informatics, R, Matlab, the Waikato Environment for Knowledge Analysis (WEKA) toolkit and Python [64][65][66][67][68][69][70] are a few of the widely used programming languages and software for conducting prediction analysis. When viable decisions and interpretations are needed, healthcare providers and researchers can consider leveraging explainable ML models instead of focusing only on the results. If researchers follow a domain-specific notion of interpretability, the level of trust in black boxes among clinicians could improve.

Dimensionality Reduction
Dimensionality reduction is the reduction of the dimension of the observation vectors (input variables) into smaller representations [71]. Technically, dimensionality reduction transforms an original dataset A of dimensionality N into a new dataset B of dimensionality n [72]. Machine learning models follow various dimensionality reduction techniques based on the types of data in a specific research analysis. The larger the number of input variables, the greater the complexity of the predictive models; thus, dimensionality reduction helps to select the best input variables for the models. A few reduction methods are Principal Component Analysis (PCA), Kernel PCA (KPCA), t-distributed Stochastic Neighbor Embedding (t-SNE) and UMAP. Dimensionality reduction techniques remove irrelevant input variables from the dataset, which can increase the accuracy of machine learning models. They also help to eliminate multi-collinearity, which improves the interpretability of the variables. In line with this, a dataset restricted to the relevant input variables saves storage space, and less computing power is needed to analyze the data [73].
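As a minimal, self-contained illustration of the idea behind PCA (not the implementation in any toolkit mentioned here), the first principal component of two-dimensional data can be obtained in closed form from the 2 × 2 covariance matrix; the function name and toy data are ours.

```python
import math

def pca_first_component(points):
    """Project 2-D points onto their first principal component (2 -> 1 dimension)
    using the closed-form eigendecomposition of the 2x2 covariance matrix."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance entries
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Largest eigenvalue of [[sxx, sxy], [sxy, syy]]
    tr, det = sxx + syy, sxx * syy - sxy * sxy
    lam = tr / 2 + math.sqrt(tr * tr / 4 - det)
    # Corresponding unit eigenvector
    if abs(sxy) > 1e-12:
        vx, vy = lam - syy, sxy
    else:
        vx, vy = (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm
    # Projection of each centred point onto the component
    return [(p[0] - mx) * vx + (p[1] - my) * vy for p in points]

# Toy data lying exactly on the line y = x: one dimension captures all variance
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
z = pca_first_component(points)
```

Library implementations generalize this to any number of dimensions via numerical eigendecomposition or singular value decomposition.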

Frequently Used Models or Methods for Data Assessment
The most frequently used models for association studies in conventional statistics are logistic regression or Cox regression models for binary outcomes, linear regression for continuous outcomes and more extensive models such as generalized linear models, chosen based on the distribution of the data. This scenario is typically popular in studies addressing public health significance, especially when the analysis involves a population study [74][75][76][77]. Statisticians believe that, in order to draw a firm conclusion or inference, the number of observations in an association study plays an important role [34]. This is a direct approach to hypothesis testing.
Machine learning models are able to capture high-capacity relationships and are amenable to more operational tasks rather than direct research questions; thus, more research gaps can be addressed through one-stop analysis [38]. Various medical data analyses have used a machine learning approach to make decisions [78][79][80][81][82][83][84][85][86][87][88]. Biostatisticians are in need of an updated methodology that uses a machine learning approach to conduct analysis on a variety of medical data [89]. In this case, the similar concepts in CS and ML need to be emphasized. Machine learning algorithms serve as alternatives to conventional statistics for common analyses, such as determining effect sizes, significant factors, survival analysis and imputations. While conceptually similar, they are distinct in terms of methods. The core differences between CS and ML concepts are described in Table 3.

Case Study to Compare Conventional Statistics and Machine Learning
A breast cancer dataset from the University Malaya Medical Centre (UMMC), n = 8066, diagnosed between 1993 and 2017, was used to perform prediction analysis using both conventional statistics and machine learning. Written informed consent was obtained from the participants included in this study. This dataset was extracted from the cancer registry within the electronic medical record system of UMMC called iPesakit. A total of 23 independent variables and survival status (dependent variable) were used to determine the most important prognostic factors of breast cancer survival. The data description is provided in Table 3. The methods and results from three different types of analysis are compared. SPSS was used to perform conventional statistics and R was used to perform machine learning. The R codes used for machine learning analysis stated in the case study of this paper are deposited on GitHub [90].

Imputation and Data Pre-Processing
Imputation applies to both conventional statistics and machine learning during data cleaning. Single or multiple imputations can be performed using conventional statistical software and programming tools such as R. In this case study, imputation was performed to fill the missing values only for the conventional statistical analysis, because the machine learning algorithms are able to handle data with missing values. For machine learning, the dataset was split into testing (30%) and training (70%) subsets.
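Single mean imputation, the simplest of the options above, can be sketched in a few lines. The column and values here are hypothetical, and the case study itself performed imputation with its statistical software rather than this code.

```python
def impute_mean(column):
    """Single imputation: replace each missing value (None) with the column mean.
    A minimal sketch; real studies often prefer multiple imputation."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

# Hypothetical age column with two missing entries
ages = [50, None, 60, 70, None]
filled = impute_mean(ages)
```

Multiple imputation repeats this idea with draws from a model of the observed data, producing several completed datasets whose analyses are then pooled.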

Significant Factors (CS) and Variable Importance (ML)
The objective of this analysis was to compare conventional statistics (significance testing) and machine learning (variable importance) to determine the similarities and differences in the results on the same dataset. Table 4 shows the results of the significance tests in SPSS. The results from the chi-squared test (categorical variables) and Mann-Whitney U test (continuous variables) show that all the independent variables are statistically significant (p-value < 0.05). Figure 3 shows the variable importance plot using the random forest VSURF and randomForestExplainer packages in R [72]. The variables are ranked by mean variable importance from highest to lowest. A threshold was set at 0.01, and six variables were selected as the most important prognostic factors of breast cancer survival.
The difference between significance testing and variable importance is that variable importance determines the order of importance, whereas significance testing determines only the status of significance (statistically significant or not).

Survival Analysis
Survival analysis in machine learning follows exactly the same concept as conventional statistics, namely the Kaplan-Meier (KM) estimator. The time series data, date of diagnosis, date of death and date of last follow-up, are used to calculate the overall survival rate. The methods used differ: in machine learning, the KM estimator is encapsulated into a single package called survival in R, and programming code plots the survival curve directly from the specified variables. In conventional statistics, by contrast, it is not an algorithm but a type of data analysis in which the time series data are selected to plot survival curves with a life table and hazard ratio. Both conventional statistics and machine learning follow the same rules to estimate the survival rate. The survival curves are shown in Table 5. Survival curves were created for three variables: tumor size, cancer stage and positive lymph nodes. The survival curves from SPSS and R produced quite similar results in terms of survival rate for the various categories of each variable, but with differences in numerical values (survival percentages).
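The KM estimator that both tools wrap reduces to a short recurrence: at each event time, the survival probability is multiplied by (1 − deaths/number at risk). The sketch below is our illustration with toy follow-up times, not the UMMC analysis or the internals of the R survival package.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimates S(t) at each distinct event time.
    `times` are follow-up durations; `events` are 1 = death observed, 0 = censored."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    n_at_risk = len(times)
    s = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = at_this_time = 0
        # Group all subjects sharing this follow-up time
        while i < len(order) and times[order[i]] == t:
            at_this_time += 1
            deaths += events[order[i]]
            i += 1
        if deaths:
            s *= 1.0 - deaths / n_at_risk   # the KM product-limit step
            curve.append((t, s))
        n_at_risk -= at_this_time           # censored subjects leave the risk set
    return curve

# Toy cohort: deaths at t = 1, 3, 4; censoring at t = 2 and 5
times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
```

Plotting the (time, survival) pairs as a step function reproduces the familiar survival curve; SPSS and R differ only in presentation, not in this arithmetic.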

Simplified Machine Learning Algorithms and Their Relationship with Conventional Statistics
The mathematical equations of conventional statistics are encapsulated to form algorithms in machine learning. These algorithms are used to perform predictions in supervised and unsupervised machine learning. The integration between the mathematics behind conventional statistics and machine learning is explained using three techniques: model evaluation (supervised learning), variable importance (supervised learning) and hierarchical clustering (unsupervised learning). A proven breast cancer prediction model [38] is used to explain the concepts of the algorithms.
Model evaluation in machine learning is similar to power analysis in conventional statistics for assessing the quality of data. It is a key step in machine learning, as a model's ability to make predictions on unseen or future samples builds trust in its use on a particular dataset. The measurement for model evaluation is the accuracy as a percentage (an estimate of how well a model generalizes to prospective data). Six different supervised machine learning algorithms (decision tree, random forest, extreme gradient boosting, logistic regression, support vector machine, artificial neural networks) are simplified below. These algorithms have been widely used in medical informatics [65,67,91,92,93].

Decision Tree
The decision tree has been widely used in medical informatics [63,71], as it is the basic concept underlying other algorithms such as random forest and gradient boosting, which differ in the processes used to predict the final output.
The decision tree algorithm follows the model of a tree structure, with a root node, decision nodes and terminal nodes. The root node starts with the most important independent variable, followed by decision nodes (the other independent variables). The terminal node indicates the dependent variable, which is the final predicted output.
The processes in the decision tree are summarized in three steps: (i) choosing features, (ii) setting conditions to split and (iii) stopping the splitting process to produce a final output. The tree structure is built based on the observations falling in each region and the mean predicted value.
The splitting process continues until a user-defined stopping criterion (such as the minimum number of observations per node) is reached. With more than two variables, the regions cover all the variables across multiple axes.
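Step (ii), setting the condition to split, can be sketched by scoring candidate thresholds with the Gini impurity and keeping the best one (a hypothetical single-feature example; a real implementation evaluates every feature at every node):

```python
# Sketch of choosing a decision-tree split by Gini impurity (pure Python).
def gini(labels):
    """Gini impurity of a list of binary class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)  # proportion of class 1
    return 2 * p * (1 - p)         # binary-case Gini

def best_split(values, labels):
    """Try every threshold on one feature; return (threshold, weighted Gini)."""
    best = (None, float("inf"))
    for t in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical tumor sizes (cm) and outcomes (1 = recurrence)
sizes = [1.0, 1.5, 2.0, 3.5, 4.0, 5.0]
outcome = [0, 0, 0, 1, 1, 1]
print(best_split(sizes, outcome))  # a perfect split exists at size <= 2.0
```

A weighted impurity of zero means the split separates the classes perfectly, so the node needs no further splitting.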

Random Forest
Random forest is an ensemble learning algorithm, which is derived from decision tree. It follows the rule of decision tree, but constructs a multitude of decision trees at training time and outputs the class with the maximum vote. Random forest is the state-of-the-art algorithm in medical informatics, as it has the ability to manage multivariate data [72].
Random forest is known as an improved version of the decision tree, as it constructs more than one tree and selects the best output, whereas the decision tree constructs only one. The number of trees constructed during training is not fixed by default; users can specify it based on the number of samples, with the number of trees typically growing in proportion to the sample size.
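The voting step can be sketched as follows. For brevity, the three "trees" here are hand-written decision stumps on hypothetical features rather than trees grown from data; a real forest grows each tree on a bootstrap sample:

```python
# Sketch of random forest prediction: each tree votes, and the class
# with the maximum vote becomes the forest's output.
def forest_predict(patient, trees):
    votes = [tree(patient) for tree in trees]
    return max(set(votes), key=votes.count)  # majority vote

# Three stub trees, each splitting on a different (hypothetical) feature
trees = [
    lambda p: 1 if p["tumor_cm"] > 2.0 else 0,
    lambda p: 1 if p["stage"] >= 3 else 0,
    lambda p: 1 if p["pos_nodes"] > 0 else 0,
]
patient = {"tumor_cm": 3.5, "stage": 2, "pos_nodes": 4}
print(forest_predict(patient, trees))  # votes [1, 0, 1] -> predicts 1
```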

Extreme Gradient Boosting
Gradient boosting follows the principle of random forest but with an additional step in predicting the final output. This algorithm also constructs multiple trees, called boosted trees. A prediction score is assigned to each leaf of the boosted trees (the gradients), whereas in random forest each tree only contains a final decision value. Several studies have used the gradient boosting approach to analyze medical data [73,92].
This algorithm also considers both weak and strong prediction values during training before making the final decision, unlike decision tree and random forest, which only select the tree with the best class without considering the other classes. The scores of all the leaves in the trees are summed up to produce the gradient values, and the final prediction is made from this combined value, hence the name gradient boosting.
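The summation of leaf scores can be sketched as an additive model. The base score, stumps and learning rate below are hypothetical; real gradient boosting additionally fits each new tree to the residual errors of the previous ones:

```python
# Sketch of boosted prediction: each tree contributes the score of the
# leaf a sample falls into, scaled by a learning rate, on top of a base score.
def boosted_prediction(x, base, trees, learning_rate=0.1):
    """x: feature value; trees: stump functions returning leaf scores."""
    score = base
    for tree in trees:
        score += learning_rate * tree(x)  # add each tree's leaf score
    return score

# Hypothetical stumps correcting the base prediction for large tumors
trees = [
    lambda x: 2.0 if x > 2.0 else -1.0,
    lambda x: 1.5 if x > 3.0 else -0.5,
]
print(boosted_prediction(3.5, base=0.5, trees=trees))
```

Even a tree with a weak (small) leaf score still nudges the final value, which is the sense in which boosting "considers" weak predictions rather than discarding them.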

Logistic Regression
Most studies use regression for predictive analytics in medicine [61]. Logistic regression predicts a categorical output, for example survival status (alive or dead). Predictions are made based on the probabilities given by an S-shaped (sigmoid) curve. The curve is shifted and the likelihoods of the samples falling on that curve are recalculated, and this process is repeated across candidate curves. Finally, the likelihood of the data is calculated by multiplying all the individual likelihoods together, and the curve with the maximum likelihood is selected as the final model.
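The likelihood computation can be sketched as follows. The coefficients and data are hypothetical, and the product of likelihoods is computed as a sum of logarithms, as practical implementations do for numerical stability:

```python
# Sketch of the logistic regression likelihood: a sigmoid curve assigns each
# sample a probability, and the dataset likelihood is the product of these.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(xs, ys, intercept, slope):
    """Sum of log-probabilities; maximizing this picks the best curve."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(intercept + slope * x)         # P(alive = 1 | x)
        total += math.log(p if y == 1 else 1 - p)  # product -> sum of logs
    return total

# Hypothetical tumor sizes and survival status (1 = alive)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1, 1, 0, 0]
# A downward-sloping curve fits these data better than a flat one:
print(log_likelihood(xs, ys, 5.0, -2.0) > log_likelihood(xs, ys, 0.0, 0.0))
```

Fitting the model amounts to searching for the intercept and slope that maximize this quantity.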

Support Vector Machine
Just like the other algorithms, the support vector machine (SVM) segregates data into different classes, but it involves the discovery of hyperplanes. A hyperplane divides the data into two groups (classes), and the points closest to the decision boundary or hyperplane are called support vectors. The final prediction is made based on the values of the independent variables and the support vectors corresponding to the hyperplane. The number of hyperplanes depends on the number of independent variables. The SVM structure is difficult to visualize with more than three features, but its ability to process multiple variables with multiple hyperplanes at a time to predict the final outcome is one of the advantages of this algorithm.
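At prediction time the hyperplane rule reduces to the sign of a weighted sum. In the sketch below the hyperplane (w, b) is picked by hand for illustration; a trained SVM would learn it from the support vectors:

```python
# Sketch of the SVM decision rule: class = sign(w . x + b),
# where w and b define the separating hyperplane.
def svm_predict(x, w, b):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Hypothetical 2-feature hyperplane: x1 + x2 = 4 separates the classes
w, b = [1.0, 1.0], -4.0
print(svm_predict([3.0, 2.0], w, b), svm_predict([1.0, 1.0], w, b))  # 1 -1
```

With more features, w simply gains more components; the rule itself does not change, which is why SVMs scale to many variables even though the geometry is hard to visualize.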

Artificial Neural Networks
Neural networks are an artificial representation of the human nervous system and can be explained using the structure of neurons and how they work. The dendrites collect information from other neurons in the form of electrical impulses (input). The cell body generates inferences based on the inputs and decides the actions to be taken. The outputs are transmitted through the axon terminals as electrical impulses to other neurons.
The same concept applies in artificial neural networks (ANN). The inputs refer to the independent variables and samples provided to the algorithm. The inputs are multiplied by weights to calculate the summation function. The higher the weight of an input, the more significant that input is for predicting the final output. The activation function predicts the probabilities from the training data and generates a final outcome. This is known as a single-layer perceptron. There are three types of layers in an ANN: the input layer, hidden layers and the output layer.
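The single-layer perceptron (weighted sum followed by an activation function) can be sketched directly; the weights, bias and patient features below are hypothetical:

```python
# Sketch of a single-layer perceptron: inputs are multiplied by weights,
# summed with a bias, and passed through a sigmoid activation function.
import math

def perceptron(inputs, weights, bias):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias  # summation function
    return 1.0 / (1.0 + math.exp(-z))                       # sigmoid activation

# Hypothetical patient features: [tumor_cm, stage, pos_nodes]
inputs = [3.5, 2.0, 4.0]
weights = [0.8, 0.5, 0.3]  # larger weight = more influential input
prob = perceptron(inputs, weights, bias=-4.0)
print(round(prob, 3))
```

A multi-layer network chains many such units, feeding the outputs of one layer in as the inputs of the next.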
Model evaluation is followed by variable importance in machine learning. Variable importance (the importance score) is the machine learning counterpart of identifying significant factors in conventional statistics through p-values, confidence intervals and hypothesis testing.
After performing model evaluation, the elements (variables) of the input data need to be explored further with regard to how they contribute to the accuracy measure. Hence, machine learning algorithms come with a built-in technique called variable importance (or feature importance) to analyze the variables or features in the input data. The distribution of these variables contributes to the prediction of the final outcome by machine learning models.
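One common way to compute such an importance score is permutation importance: shuffle a single variable and measure how much the model's accuracy drops. A minimal sketch with a stub model follows (the model, data and column names are hypothetical):

```python
# Sketch of permutation importance: the accuracy drop after shuffling one
# variable is taken as that variable's importance score.
import random

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(model, rows, labels, col, seed=0):
    base = accuracy(model, rows, labels)
    shuffled = [r[:] for r in rows]            # copy rows before mutating
    column = [r[col] for r in shuffled]
    random.Random(seed).shuffle(column)        # destroy one variable's signal
    for r, v in zip(shuffled, column):
        r[col] = v
    return base - accuracy(model, shuffled, labels)  # drop = importance

# rows: [tumor_cm, stage]; the stub model ignores stage entirely
rows = [[1.0, 1], [2.0, 3], [3.5, 2], [4.0, 3]]
labels = [0, 0, 1, 1]
model = lambda r: 1 if r[0] > 2.0 else 0
print(permutation_importance(model, rows, labels, col=1))  # stage: drop is 0.0
```

A variable the model never uses yields a drop of zero, mirroring a non-significant factor in conventional statistics.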

Integration of Conventional Statistics with Machine Learning
Statistics is a branch of mathematics that consists of a combination of mathematical techniques to analyze and visualize data [90]. On the other hand, machine learning is a branch of artificial intelligence composed of algorithms performing supervised and unsupervised learning. The comparison and integration of conventional statistics and machine learning have gained momentum over the last few years [94,95]. It is plausible that data integrity with protection is the most challenging task in healthcare analytics [96][97][98]. Hence, from this review, it is found that the integration of these two fields could unlock and outline the key challenges in healthcare research, especially in handling the valuable asset called data. Individuals should not be subject to a final decision based solely on automated processing or machine learning algorithms; the integration of statistics and human decision making is essential in equal measure. The integration between statistics and machine learning is shown in Figure 4.

Significance of Machine Learning to Healthcare, Education and Society
The review on the integration between conventional statistics and machine learning is the key factor to convince clinicians and researchers that machine learning algorithms are based on core conventional statistical ideas; thus, they could be used to supplement data analysis using conventional statistics. From this review, we believe that machine learning, which follows the fundamentals of conventional statistics, has a positive impact on healthcare. The significance of machine learning to healthcare is explained ( Figure 5).

Prior to the emergence of the data deluge, healthcare providers made clinical decisions based on formal education and their experience over time in practice. Decision analysis in healthcare has been criticized because the experience and knowledge of the decision makers (clinicians) regarding patient characteristics are not the same or standardized. The linear decision-making model involves four steps: data gathering, hypothesis generation, data interpretation and hypothesis evaluation. All four steps require data from different departments, clinicians with different expertise and various data analytical methods to make the final decision. Experienced clinicians may not deliberately go through each step of the process and may use intuition to make decisions, instead of facing the obstacles of handling several hypotheses with different personnel. While this applies to experienced clinicians, a novice clinician would have to understand and rely on the analytical principles and theory behind a decision analysis process in a particular situation when handling a patient. In this case, the healthcare sector is in need of clinical decision support tools to enhance and standardize clinical decision making.
Machine learning algorithms are widely used to develop clinical decision support tools. These algorithms compile the four steps (data gathering, hypothesis generation, data interpretation and hypothesis evaluation) of traditional decision making into one. The advantages of machine learning algorithms in medical informatics depend on the objectives of the research and the types of data used. ML algorithms such as decision tree, random forest, gradient boosting, regression, support vector machine and artificial neural networks are suitable for medical informatics, as they are able to handle big data, a combination of numerical and categorical data and missing values. Moreover, these algorithms generate visualizations, which could be transformed automatically (integrated into tools) to be used by the clinicians as guidelines for patients.
In any machine learning analysis, domain experts are still required to enhance the reliability of the machine and make sense of the results. In medical informatics specifically, the decision of clinicians on a particular patient's health condition plays an important role in giving suggestions to the patient. The automated decision support tools may help clinicians in decision making to save time and costs, and to follow a standard procedure to prevent conflict in final decisions.
The field of medicine relies heavily on knowledge discovery and understanding of diseases associated with the growth in information (data). Diagnosis, prognosis and drug development are the challenging key principles in medicine, especially in complex diseases such as cancer [91]. Based on the principle of evidence-based medicine, decision making based on data and validation should be more agile and flexible to better translate the basic knowledge of complexities into growing advances. The integration of conventional statistics and machine learning into clinical applications should be carefully adopted, with a collaborative effort that includes all major stakeholders, to ensure the positive influence of machine learning in medicine [91].
Comparison between the basic workflow of conventional statistics and machine learning is explained in Figure 6.

Automation of Machine Learning in Healthcare Research
The machine learning approach could be transformed into an updated guideline for academicians and researchers. The medical academic sector may use the methodologies for teaching and learning programs to educate medical students on the importance of machine learning. Moreover, researchers in the same field can follow the techniques and machine learning models to conduct research and cohort studies in any healthcare domain.
Biostatisticians may consider using machine learning techniques and automated tools [92,93] together with conventional statistics in order to improve the performance of analytics and reliability of results. The integration between statistics and machine learning may assist biostatisticians to provide novel research outcomes.
The automation of machine learning in healthcare analysis has been applied in a recent study by our research group [99].
A guideline to transform statistics and machine learning (derived from the fundamental mathematics of conventional statistics) into an automated decision support tool is illustrated in Figure 7.

This guideline could be a standardized pipeline for the data science community to analyze medical data and to develop artificial intelligence-enabled decision support tools for clinicians and researchers.
In a decision support tool, (i) the data gathering is replaced by the automated data capture from electronic medical records (EMR) or databases from multiple heterogeneous sources, concomitantly; (ii) hypothesis generation is the specification of input variables (independent and target) and the final outcome based on the research question or a question for clinical decision (output); (iii) data interpretation is done using algorithms such as random forest, support vector machine and neural networks, which have their specific formulas to read the data, clean the data, capture the required variables, analyze the data based on the specified requirements and perform comparative analytics automatically using different algorithms; (iv) finally, hypothesis evaluation is done by producing interactive charts to visualize the final outcomes to make decisions efficiently. All these steps are performed in a streamlined environment often referred to as automated clinical decision making, which saves the effort of engaging different experts and analytical platforms. The experience which clinicians traditionally use to make decisions is replaced by the legacy data the algorithms leverage to make decisions.
Most important of all, the data management or completeness of data for automated decision making plays an essential role when it comes to statistics and machine learning. The integration between statistics and machine learning can be used to train automated models with imputed missing values in the data, which would improve the generalizability and robustness of models [100].
The integration between statistics and machine learning contributes not only to data augmentation but also to medical diagnostics using multi-modal data. In the future, this approach, together with deep learning methods, could be used in bioinformatics analyses of genomic data, or of combined genomic and clinical data, to enhance the automated decision-making process. Deep learning, being one of the unprecedented technical advances in healthcare research, assists clinicians in understanding the role of artificial intelligence in clinical decision making. Hence, deep learning could serve as a vehicle for the translation of modern biomedical data, including electronic health records, imaging, omics, sensor data and text, which are complex, heterogeneous, poorly annotated and generally unstructured, to bridge clinical research and human interpretability [101,102].

Conclusions
Conventional statistics are the fundamentals of machine learning: the mathematical concepts are encapsulated into simplified algorithms executed by computer programs to make decisions. Machine learning has the added benefit of automated analysis, which can be translated into decision support tools providing user-friendly interfaces, interactive visualizations and customization of data values. Such tools could assist clinicians in looking at data from different perspectives, which could help them make better decisions. Despite the debate between conventional statistics and machine learning, their integration accelerates decision making, enables automation and enhances explainability. This review suggests that clinicians could consider integrating machine learning with conventional statistics for added benefits. Machine learning and conventional statistics are best integrated to build powerful automated decision-making tools, not limited to clinical data but also for bioinformatics analyses.