Supervised Machine Learning Algorithms for Bioelectromagnetics: Prediction Models and Feature Selection Techniques Using Data from Weak Radiofrequency Radiation Effects on Human and Animal Cells

The emergence of new technologies to incorporate and analyze data with high-performance computing has expanded our capability to accurately predict outcomes. Supervised machine learning (ML) can be utilized for fast and consistent prediction, and to better capture the underlying pattern of the data. For the first time, we develop a prediction strategy using supervised ML to observe the possible impact of weak radiofrequency electromagnetic fields (RF-EMF) on human and animal cells without performing in-vitro laboratory experiments. We extracted laboratory experimental data from 300 peer-reviewed scientific publications (1990–2015) describing 1127 experimental case studies of human and animal cell responses to RF-EMF. We used domain knowledge, Principal Component Analysis (PCA), and Chi-squared feature selection techniques to select six optimal features for computational and cost efficiency. We then developed grouping or clustering strategies to allocate these selected features into five different laboratory experiment scenarios. The dataset was tested with ten different classifiers, and the outputs were estimated using the k-fold cross-validation method. The assessment of a classifier's prediction performance is critical for judging its suitability. Hence, a detailed comparison of the percentage of model accuracy (PCC), Root Mean Squared Error (RMSE), precision, sensitivity (recall), 1 − specificity, Area under the ROC Curve (AUC), and precision-recall (PRC Area) was performed for each classification method. Our findings suggest that the Random Forest algorithm outperforms the others in all groups in terms of all performance measures, showing AUC = 0.903 for k-fold = 60. A robust correlation was observed between the specific absorption rate (SAR) and frequency, and between the cumulative effect (exposure time) and SAR×time (the impact of accumulated SAR within the exposure time) of RF-EMF. In contrast, the relationship between frequency and exposure time was not significant.
In the future, with more experimental data, the sample size can be increased, leading to more accurate predictions.


Introduction
Advancing technologies that depend on wireless communication systems compel users to face increased levels of exposure to radiofrequency electromagnetic fields (RF-EMF). Throughout the past decade, mobile phone use has dramatically expanded; hence, the RF-EMF exposure level in the environment has increased as a consequence [1]. This development has raised concerns about potential hazards to human health. More than other body cells, brain cells are vulnerable to a high specific absorption rate (SAR) because of the close proximity of the mobile phone to the user's head in conventional usage. Hence, the potential impacts of cell phone usage on human cells, including the central nervous system (CNS), should be investigated [2]. Machine learning can be used to identify patterns of this impact and offers a promising way toward faster, more effective, and more reliable data analytics. The present study intends to investigate robust prediction techniques for identifying the impact of RF-EMF on human and animal species.
For several decades, concerns have been raised about the safety of long-term mobile phone use. The CNS is the principal concern for impacts of RF-EMF [2,27,28] as, generally, mobile phones are used in close proximity to the human head [29]. The biological effects of RF-EMF exposure on human health remain vague due to inconsistent and contradictory findings across studies [2,28].
In 2011, the World Health Organization (WHO) and the International Agency for Research on Cancer (IARC) characterized radiofrequency radiation (RFR) originating from mobile phones as a "Possible Human Carcinogen" (Group 2B) [30], based on comprehensive in vitro, in vivo, and epidemiological studies. The Interphone Study [31] provides some evidence of an increased risk of glioma for heavy adult users (>1640 h), and the Hardell et al. study [32] shows an enhanced risk of malignant brain tumors for users of cellular and cordless phones. In contrast, another study [33] proposes that there is no increase in risk, with several reviewing groups advising that mobile phone use is safe for adults as well as children (SCENIHR [34], ICNIRP [35]). Moreover, ICNIRP [35] notes that many experiments showed effects that have neither been independently replicated nor reproduced.

Background
The volume of data on the planet and around our lives appears to be ever-expanding. Big data is a phrase that describes the enormous volume of data (both structured and unstructured) that we produce on a day-to-day basis. Advanced analytic methods are performed on big data sets to extract useful information [36]. Yet, it is not the quantity but our interpretation, through the analysis of data, that is powerful and matters. Large data sets can be computationally analyzed to obtain trends, patterns, and associations. These analytics assist us in discovering what has changed and how we should respond. For the first time, various organizations are beginning to adopt advanced analytics and are, therefore, puzzled about how to utilize it.
Machine learning (ML) is an application of artificial intelligence (AI) [37] that produces systems with the capability to learn and improve from experience. ML methods may operate in iterations in which they attempt to discover the hidden pattern in data [38]. Discovery analytics on big data can be facilitated by different types of analytical tools, including text analytics, data mining, statistical analysis, Structured Query Language (SQL) queries, data visualization, natural language processing, and artificial intelligence [39]. These tools have been around for quite a long time, and a considerable number of them have been developed since the 1990s. The contrast today is that, unquestionably, more user organizations are utilizing them in association with the availability of big data. It is essential to know the analytic elements associated with the problem before determining which tool type is suitable. This study aims to address new prospects for utilizing ML in the bioelectromagnetics space, allowing users to make intelligent judgments as they adopt it. In contrast to conventional analysis, ML mechanisms have been exploited to obtain patterns from big data that might not otherwise be feasible. Hence, algorithms can iteratively acquire hidden information from data [40].
Studying the occurrence of non-thermal biological effects of RF-EMF is crucial for distinguishing between the predictive nature of findings generated from experimental investigations in in-vitro (cell-based) models and whole animals, and those arising from clinical or epidemiological studies. The impacts of past exposures and conditions can be shown in clinical or epidemiological studies. In contrast, in-vivo and in-vitro studies can be used to predict, and eventually limit, impacts from arising in the future [41]. Nevertheless, it cannot be expected that humans react to RF-EMF in the same way as cell cultures or animals. Numerous investigations of weak radiofrequency electromagnetic fields and radiation have concentrated on animals, plants [42][43][44], human behavior, and cell cultures. Nevertheless, straightforward biological frameworks can contribute to our knowledge of the underlying interaction mechanisms and of which proteins in living things are vulnerable to RF-EMF. This information is essential for the advancement of the dose-response associations underpinning guidelines, as required by scientific bodies such as the International Commission on Non-Ionizing Radiation Protection (ICNIRP) [45], IEEE, the International Agency for Research on Cancer (IARC), and the World Health Organization (WHO) [1].
The production of reactive oxygen species (ROS), which is mediated by radiofrequency radiation (RFR), is considered one of the essential bioeffect mechanisms [46]. Mitochondria in stria marginal cells (MCs) are susceptible to ROS attack and are known to be very sensitive to oxidative damage [47]. A recent research finding by Yang et al. (2020) [48] on short-term exposure of MCs in vivo to mobile phone RFR indicates no DNA damage in marginal cells. However, ROS production in the 4 W/kg exposure group was higher than that in the control group (p < 0.05). Various investigations [49][50][51][52] have revealed that RF-EMF exposure of animals enhances blood-brain barrier (BBB) permeability, debilitates intracellular calcium homeostasis, alters neurotransmitters, and increases neuronal loss and damage in brain tissue.
Our recent meta-analysis [41] cross-examined published experiments that considered the non-thermal RF-EMF exposure effects (cytogenetic, gene, and protein expression analysis) on cell types with various doubling times, including lymphocytes, epithelial, endothelial, and spermatozoa from rat, mouse, and humans. Our investigation revealed that 45.3% of experiments concluded that cells exposed to RF radiation showed an effect, while 54.7% concluded that no such effects were observed (p = 0.001).
There is extensive clinical and epidemiological evidence [41] to suggest that even low degrees of radiofrequency may cause harmful consequences for the functioning of cells. Two such significant epidemiological investigation designs are: population-based cohorts followed over a long time, and case-control investigations analyzing precise cases of disease and matched controls that do not have the condition [41].
ML additionally improves the utilization of prediction tools to aid further health examinations (in-vitro, in-vivo, and occupational and environmental epidemiology) and allows researchers to see how environmental properties may influence an ultimate decision. Figure 1 demonstrates the potential features (variables or attributes) of bioelectromagnetic experiments (in-vitro, in-vivo, and epidemiological studies) that can be utilized by ML algorithms to predict the behavior.

Motivation
The advancement of emerging technology is perceived as a means to enhance and strengthen society. Advancing technologies that depend on wireless communication have begun producing higher degrees of radiofrequency electromagnetic field (RF-EMF) exposure. This has heightened interest in the area of bioelectromagnetics, which examines the impact of RF-EMF on living organisms. In the current technological era, the maturation of technology guides humans to understand the world more deeply. Insight into the critical factors that determine the impact of weak RF-EMF on living organisms helps, in a broader way, to capture the underlying pattern of the data better.
The use of reliable prediction techniques to identify the effect of weak RF-EMF on organisms is turning out to be increasingly essential. An essential factor affecting the choice of algorithm is the model complexity. In classification frameworks, a model is trained and utilized to obtain predictions of an event of interest. Our previous studies used ML algorithms to predict the impact of weak RF-EMF on plant species (Table 1). This study aims to present the merit of utilizing ML algorithms (supervised learning, i.e., prediction) to develop higher accuracy classifiers for predicting the potential impact of weak RF-EMF on human and animal cells in in-vitro studies without performing in-vitro laboratory experiments. We intend to ascertain the possibility of a significant impact of the features or variables (such as frequency of weak RF-EMF, specific absorption rate (SAR), and exposure time) of weak RF-EMF exposure on human and animal cells.
The main contributions of this paper include the following:

1. Extract data from 300 peer-reviewed scientific publications (1990–2015) describing 1127 experimental investigations in cell-based in-vitro models (human and animal species).
2. Identify the most suitable features or attributes to be utilized in prediction models, providing insight into key factors that determine the possible impact of RF-EMF in in-vitro studies, using domain knowledge, Principal Component Analysis (PCA), and Chi-squared feature selection techniques.
3. Develop grouping or clustering strategies to allocate these selected features into five different laboratory experiment scenarios, producing five different feature groups or distributions for each laboratory experiment.
4. Develop a prediction model to observe the possible impact without performing in-vitro laboratory experiments. This is the first time that a supervised machine learning approach has been used for the characterization of weak RF-EMF exposure scenarios on human and animal cells.
5. Compare each classifier's prediction performance using seven measures to decide its suitability: the percentage of model accuracy (PCC), Root Mean Squared Error (RMSE), precision, sensitivity (recall), 1 − specificity, Area under the ROC Curve (AUC), and precision-recall (PRC Area) for each classification method.
6. Identify a robust correlation between exposure time and SAR×time (the impact of accumulated SAR within the exposure period), and between SAR and the frequency of weak RF-EMF, on human and animal species. In contrast, the relationship between frequency and exposure time was not significant.
The rest of the paper is organized, as follows: Section 2 introduces the dataset, including its features, and how the data is collected and pre-processed, feature selection techniques, prediction models (supervised ML algorithms), features grouping strategy and evaluation measures of binary classifiers used. Subsequently, the classifier performance results are presented in Section 3 with the analysis of the prediction model and feature selection techniques carried out. Section 4 provides a related discussion. Section 5 explains potential future improvements in the area, and, finally, the paper concludes in Section 6.

Materials and Methods
In this study, ten principal classification algorithms or classifiers have been utilized to produce accurate prediction models and observe trends of human and animal cell responsiveness to non-thermal weak RF-EMF using previously published experimental data. This study follows several steps: data collection and preparation, optimal feature selection (attribute selection), classifier (algorithm) selection, parameter and model selection, training the selected classifier, and evaluation. The ten supervised ML algorithms used for this analysis (Table A1 in Appendix A) are: Random Forest, Bagging, J48, Decision Table, BayesNet, k-Nearest Neighbour (kNN), JRip, Support Vector Machine (SVM), Naive Bayes, and Logistic Regression, applied to six different features (species, frequency of RF-EMF, SAR, exposure time, SAR×exposure time, and cellular response (presence or absence)). By applying dimensionality reduction techniques or feature selection methods, these six major features were chosen out of all collected features. We removed two features or attributes using (i) domain knowledge, (ii) Principal Component Analysis (PCA), and (iii) the Chi-squared feature selection method. Using these techniques, we aim to gain deeper insight into the features (such as year, species, frequency of weak RF-EMF, SAR, exposure time, SAR×exposure time, and cellular response (presence or absence)) of weak RF-EMF exposure scenarios on human and animal cells. The outputs are estimated using the k-fold cross-validation method for each classifier. The most efficient classifiers were chosen by considering prediction accuracy and computation time.

Feature Selection Methods for Classification
The act of recognizing the most significant features or variables that provide the best predictive capability in modelling data is called feature selection. This is one of the key ideas in ML, and it tremendously impacts model or classifier performance. In practice, the feature selection process may mean adding features to the model, or eliminating features that do not enhance its performance. Features may be selected automatically or manually to best support the output, or prediction feature, in which we are interested. However, choosing which features to use to build a predictive model is a challenging problem that may require in-depth knowledge of the problem domain.

Principal Component Analysis (PCA)
Principal Component Analysis (PCA) is an unsupervised, non-parametric statistical strategy that is predominantly utilized to reduce the dimension of a data set consisting of many features (variables or attributes) that are correlated with each other. PCA is not a classifier; it reduces the number of features to help achieve computational and cost efficiency. PCA does not adequately reduce data if there is only a weak association between features or variables: it should be utilized only when features in a dataset are highly correlated. In contrast, using PCA is not worthwhile if the majority of the correlation coefficients are smaller than 0.3 [55].
It is essential to normalize data before performing PCA. When the data are normalized, all of the variables have a similar standard deviation. Consequently, all of the variables carry a similar weight, and PCA determines the essential ones. PCA is an approach to manage highly correlated variables, so there is no compelling reason to remove them: if N variables are highly correlated, they will all load on the same principal component (eigenvector) [36]. PCA is not appropriate for some classification scenarios. Assume that there are two classes of data, but the within-class variance is high compared with the between-class variance; here, PCA may discard the important information that separates the two classes. Consequently, if the data are noisy, and the noise variance is greater than the variance between the means of the two classes, PCA will keep the noise components and discard the discriminative component (this is expected, since PCA is unsupervised) [55]. In this study, we use PCA for feature selection before applying ML (supervised learning) algorithms.
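As an illustration of this idea (and not the Weka/MATLAB pipeline used in the study), PCA can be computed from the eigendecomposition of the covariance matrix of the normalized data. The toy data below are hypothetical stand-ins for the experimental features, with two deliberately correlated columns:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for the experiment features: two highly correlated
# columns (think SAR and SAR x time) plus one independent column.
sar = rng.normal(size=200)
X = np.column_stack([sar,
                     sar * 2 + rng.normal(scale=0.1, size=200),
                     rng.normal(size=200)])

# Normalize each feature so all variables carry a similar weight.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

# Eigendecomposition of the covariance matrix gives the principal components.
cov = np.cov(Xs, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]              # largest variance first
explained = eigvals[order] / eigvals.sum()     # explained variance ratio

print(explained)  # first component captures most of the correlated pair
```

Because the two correlated columns load on the same eigenvector, the first component alone explains roughly two-thirds of the total variance, which is exactly the dimensionality reduction PCA offers when correlations are strong.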

Chi-Squared Feature Selection (χ 2 )
This is another filter-based strategy. In this method, the Chi-squared metric between the target and each variable is calculated, and the features with the maximum Chi-squared scores (χ²) are chosen. If the number of observed instances in a class is close to the expected number of instances in that class, the two variables are independent; hence, the Chi-squared value is small. This is given by χ² = Σᵢ (Oᵢ − Eᵢ)²/Eᵢ, where Oᵢ is the number of observations in class i and Eᵢ is the number of expected observations in class i [36]. The Chi-squared Ranking Filter technique is employed to determine the features that are essential for the prediction. In our analysis, we used domain knowledge, Principal Component Analysis (PCA), and Chi-squared techniques for the feature selection process.
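For illustration, the Chi-squared score can be computed from a contingency table of observed versus expected counts. The following sketch uses made-up categorical data, not the study's dataset, and mirrors the formula above:

```python
import numpy as np

def chi_squared_score(feature, target):
    """Chi-squared statistic between a categorical feature and a class label.

    Builds the observed contingency table O and the expected table E under
    independence, then sums (O - E)^2 / E over all cells.
    """
    f_vals, f_idx = np.unique(feature, return_inverse=True)
    t_vals, t_idx = np.unique(target, return_inverse=True)
    observed = np.zeros((len(f_vals), len(t_vals)))
    np.add.at(observed, (f_idx, t_idx), 1)
    expected = (observed.sum(axis=1, keepdims=True)
                * observed.sum(axis=0) / observed.sum())
    return ((observed - expected) ** 2 / expected).sum()

# A feature aligned with the class label scores high; a constant feature scores 0.
label    = np.array([0, 0, 0, 0, 1, 1, 1, 1])
aligned  = np.array(["low", "low", "low", "low", "high", "high", "high", "high"])
constant = np.array(["x"] * 8)
print(chi_squared_score(aligned, label))   # high score -> keep the feature
print(chi_squared_score(constant, label))  # 0.0 -> uninformative, drop it
```

Ranking features by this score and keeping the top ones is exactly what a Chi-squared ranking filter does.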

Supervised Machine Learning
Machine Learning algorithms can be classified into two significant categories: supervised ML and unsupervised ML. Classification and regression methods are known as supervised ML, while clustering and association methods are known as unsupervised learning. An approach that lies between supervised and unsupervised ML is called semi-supervised learning.
Most practical applications utilize supervised ML algorithms (classification algorithms) for prediction. Supervised ML takes a known set of input variables, X (the training set), the known responses to the data or output variable, Y, and an algorithm that learns the mapping function, or trains a model, from the input to the output variables, Y = f(X). In this method, all of the data are labelled, and the algorithms attempt to figure out how to predict the output from the input data. Thus, the mapping function can be approximated adequately. With this, a classifier (ML algorithm) can predict the output variable (Y) for new input data (x). Since we know the outcome of the training data, we call this a supervised ML technique. The algorithm iteratively makes predictions on the training data, and learning ends when the algorithm delivers a satisfactory level of performance [36].
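As a minimal illustration of learning the mapping Y = f(X), the sketch below implements a tiny k-nearest-neighbour classifier (one of the algorithm families used in this study) on hypothetical one-dimensional data; it is a didactic sketch, not the Weka kNN used in the experiments:

```python
import numpy as np

def knn_predict(X_train, y_train, X_new, k=3):
    """Minimal k-nearest-neighbour classifier: the learned mapping Y = f(X)
    is simply a majority vote among the k closest labelled training points."""
    preds = []
    for x in X_new:
        dist = np.linalg.norm(X_train - x, axis=1)   # distance to every training point
        nearest = y_train[np.argsort(dist)[:k]]      # labels of the k nearest
        preds.append(np.bincount(nearest).argmax())  # majority vote
    return np.array(preds)

# A labelled training set (inputs X with known responses Y) ...
X_train = np.array([[0.0], [0.2], [0.4], [2.0], [2.2], [2.4]])
y_train = np.array([0, 0, 0, 1, 1, 1])
# ... lets the classifier predict Y for unseen input data.
print(knn_predict(X_train, y_train, np.array([[0.1], [2.1]])))  # [0 1]
```

The two query points fall near the two labelled clusters, so the vote among their neighbours recovers the correct classes.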

Data Collection
We extracted data from 300 peer-reviewed scientific articles published between 1990 and 2015, covering 1127 distinct laboratory experiments, to predict the potential responsiveness of human and animal cells to RF-EMF. We eliminated laboratory experiments that reported outcomes when (i) no complete dosimetry was disclosed, (ii) SAR values were greater than 50 W/kg, (iii) exposure durations were greater than seven days, or (iv) the publication did not appear in a peer-reviewed scientific journal. Subsequently, the cellular response (presence or absence) was observed from 1127 human, rat/mouse, and other species' cells. Seventy different tissue/cell types have been used to evaluate the effect of weak RF radiation from mobile phones. All of the extracted data are from peer-reviewed publications indexed in the PubMed or IEEE databases.
The data employed in this analysis are shown in our recent study (Tables 11-42, Halgamuge et al., 2020 [41]), which extracted high-level understanding from raw data using different classification algorithms and performance evaluation methods. The collected dataset comprises five attributes of RF-EMF and 1127 experimental case studies or instances: species (human and animal cells/tissue), frequency of weak RF-EMF, SAR, exposure duration, and cellular response (presence or absence).

Data Pre-Processing and Inclusion Criteria
Data pre-processing was performed prior to training the supervised ML algorithms or classifiers. A portion of the data, from 300 peer-reviewed scientific publications (1990–2015) comprising 1127 distinct experimental case studies, was held out as the testing part, and the remaining data were used to build the classification models (training).
Data inclusion and pre-processing criteria are as shown in our previous study [53]. We initially used six features or attributes (Table 2) for the analysis. Feature selection is the process of choosing features in a dataset to model the problem to be answered and to understand the underlying relationships of the data. Although we had a very high instance-to-feature ratio (1127:6), which makes overfitting on the training data less likely, we performed feature selection using (i) domain knowledge or expert knowledge, (ii) the Principal Component Analysis (PCA) technique, and (iii) the Chi-squared feature selection method to select the optimal features for the classifier.

Data Analysis
In this work, we utilize the binary classification method that classifies the data into two groups, e.g., whether or not the non-thermal low-power RF-EMF's impact on the cellular response was observable (presence or absence). Independent variables, such as the frequency of weak RF-EMF, specific absorption rate (SAR), exposure time, and species, impact sensitive human and animal cells. A principal assumption of ML is that the training data are representative of the distribution from which test data (future data) will be drawn. The data are assumed to be independent and identically distributed [36], which remains an assumption of this study. The analysis was performed using MATLAB (MathWorks Inc., Natick, MA, USA) R2019b and the Weka tool (Waikato Environment for Knowledge Analysis, Version 3.9, University of Waikato, New Zealand) on a computer running macOS High Sierra (Version 10.13.6, Apple, Cupertino, CA, USA) with a 1.7 GHz Intel Core i7 CPU and 4 GB 1600 MHz DDR3 RAM.
The optimal feature selection protocol is useful for identifying critical parameters that should be applied in in-vitro laboratory experiments. We used domain knowledge to select key features or attributes in our previous study [53]. However, in this study, we used domain knowledge, Principal Component Analysis (PCA) technique, and the Chi-squared feature selection method to select six optimal features for the classifier.
Cross-validation is a resampling methodology used to assess machine learning algorithms on a limited dataset [56]. In this work, we use the k-fold cross-validation method with k = 10: the data are split into ten equal parts, nine parts are used for training, and the final fold is held out for testing; this is repeated so that each fold serves once as the test set. Cross-validation averages the measures of fitness in prediction across folds to determine a precise estimate of model performance.
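The splitting scheme described above can be sketched as follows (illustrative only; the study relies on Weka's built-in cross-validation rather than hand-rolled folds):

```python
import numpy as np

def kfold_indices(n, k=10, seed=0):
    """Split n sample indices into k roughly equal folds; each fold serves
    once as the test set while the remaining k-1 folds form the training set."""
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

# With n = 1127 case studies and k = 10, every instance appears in the
# test fold exactly once across the 10 iterations.
n, k = 1127, 10
counts = np.zeros(n)
for train, test in kfold_indices(n, k):
    counts[test] += 1
print(counts.min(), counts.max())  # 1.0 1.0
```

A performance measure (e.g. accuracy or RMSE) evaluated on each held-out fold and then averaged gives the cross-validated estimate reported for each classifier.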

Evaluation Measures of Binary Classifiers
We analyze the RF-EMF sensitivity of human and animal cells using classification algorithms. After performing the feature selection procedure, test cases were chosen to demonstrate certain aspects of the proposed method. Consequently, the k-fold cross-validation method was employed for each classifier. Ten classification algorithms were used to make the best predictions for the given dataset (please see Appendix A for why each algorithm works differently). We then analyzed the correctly classified percentage of each classification algorithm.
A confusion matrix, also known as an error matrix, is a table that is frequently used to illustrate the performance of a classifier or classification algorithm on a set of test data for which the true values are known. It provides the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). We obtained the confusion matrix for each classifier and estimated the rate at which each classifier predicts the actual human and animal cell sensitivity, and whether this varies on test data. The root mean squared error (RMSE), mean absolute error (MAE), and weighted averages of precision, recall, and F-measure were estimated using the k-fold cross-validation approach. Furthermore, correctly classified instances comprise TP and TN, while incorrectly classified instances comprise FP and FN. Performance evaluation measurements were used to avoid accuracy inconsistency. The confusion matrix provides a deeper analysis than the simple proportion of correct classifications (accuracy).
Binary classifiers are statistical and computational models that separate a dataset into two groups: positives and negatives [57]. The assessment of a classifier's prediction performance is critical for deciding its suitability. To date, numerous approaches have been developed and introduced to measure prediction performance. Usually, we utilize accuracy, error rate, and computation time for measuring classifier performance in terms of model development. When considering the real performance of a classifier, accuracy is not a stable metric: if the dataset is imbalanced, accuracy will produce misleading results. Various additional measures are valuable for the assessment of the final model. Class imbalance, a disparity in the numbers of positive and negative instances, is common in scientific areas, including the life sciences [58]. The classification of imbalanced datasets is a relatively new hurdle in the field of machine learning [59]. Binary classifiers are routinely assessed using different performance measures, for example, sensitivity and specificity, and performance is represented using Area under the Receiver Operating Characteristics (ROC) curve (AUC) plots. ROC plots are visually attractive and give a summary of a classifier's performance over a wide range of specificities [59]. ROC plots can be deceiving when applied to imbalanced classification situations, although in our case we have a balanced binary classification problem, where 45.3% of case studies indicated cell changes and 54.7% indicated no changes. The visual interpretability of ROC plots on imbalanced datasets can lead to misjudged decisions about the reliability of classification performance, through a wrong understanding of specificity. Precision-Recall (PRC) plots, on the other hand, can provide a more precise prediction of future classification performance because they assess the proportion of true positives among positive predictions [59].
Hence, in this study, we analyzed: (i) accuracy (PCC, Percent Correct Classification), (ii) error rate (RMSE), (iii) precision, p, the percentage of predicted positives that are correct, p = TP/(TP + FP), (iv) sensitivity or recall (true positive rate), TP/(TP + FN), (v) 1 − specificity (false positive rate), FP/(FP + TN), (vi) Area under the ROC Curve, and (vii) precision-recall (PRC Area).
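Measures (i) and (iii)-(v) follow directly from the four confusion-matrix counts. A minimal sketch in Python with illustrative toy labels (not the study's data):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Evaluation measures of a binary classifier from its confusion matrix."""
    tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
    tn = np.sum((y_true == 0) & (y_pred == 0))  # true negatives
    fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
    fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives
    return {
        "PCC":       (tp + tn) / (tp + tn + fp + fn),  # accuracy
        "precision": tp / (tp + fp),                    # TP / (TP + FP)
        "recall":    tp / (tp + fn),                    # sensitivity, TP / (TP + FN)
        "1-spec":    fp / (fp + tn),                    # false positive rate
    }

# Toy ground truth vs. predictions: 3 of 4 positives and 3 of 4 negatives correct.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 0, 1])
print(binary_metrics(y_true, y_pred))
```

Averaging these per-fold values over the cross-validation folds yields the weighted-average figures reported in the results tables.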

Results
Obtaining an understanding of the data is one of the goals of developing ML models. In order to predict the possible impact of RF-EMF on human and animal cells in in-vitro studies, this study used feature selection techniques and several classifier evaluation measures, such as model accuracy (PCC), Root Mean Squared Error (RMSE), precision, sensitivity (recall), 1 − specificity, Area under the ROC Curve (AUC), and precision-recall (PRC Area), with the k-fold cross-validation method. This yielded insight into the key components of the analysis that determine the effect of weak RF-EMF on living organisms, helping to grasp the underlying pattern of the data better.
Tables 3 and 4 give an overview of the utilized laboratory experiments that reported a positive association (cellular response presence) between weak RF-EMF and human cells (Table 3) and animal cells (Table 4).

Feature Selection Methods for Classification
Irrelevant or less essential features can severely affect model performance. We developed a feature selection protocol using essential domain knowledge of the impact of RF-EMF on living organisms (using five different groups, as shown in Table 5). We also applied two other approaches (the Principal Component Analysis (PCA) technique and the Chi-squared feature selection method) when performing feature selection before use in the prediction models.
SAR×exposure time represents the impact of accumulated SAR within the exposure period, so we used that feature for this analysis. Finally, our analysis selects six key features (species, frequency of weak RF-EMF, SAR, exposure time, SAR×exposure time, and cellular response (presence or absence)) for our dataset. Some features were removed in this analysis, for example, exposure system (GTEM cell, TEM cell, waveguide, etc.), modulation techniques of mobile communication (AM, FM, GSM, etc.), and cell line (human blood lymphocytes, breast cancer cell line, human spermatozoa, etc.). Table 5 shows the grouping or clustering strategies used to allocate these selected features into five different laboratory experiment scenarios, producing five different feature groups or distributions for each laboratory experiment.

Prediction Using Supervised Machine Learning
Various additional measures are useful for the evaluation of the final model. Receiver Operating Characteristics (ROC) curves can be utilized to choose the most appropriate prediction model. Hence, in this study, we utilized accuracy, error rate (RMSE), precision, sensitivity or recall (true positive rate), 1 − specificity (false positive rate), Area under the ROC Curve, and precision-recall (PRC Area). Table 5 shows the grouping or clustering strategies for allocating selected features into five groups for different laboratory experiment scenarios. First, we analyzed the accuracy of all classification algorithms for each group separately. The k-fold cross-validation was employed for each classifier. The Random Forest algorithm outperformed the others (83.56%, 0.3 s) in terms of high prediction accuracy and low computation time. Accuracy values greater than 75% are shown in Table 6 (PCC > 75%). We observed that the computation time was very low (less than a minute) for all algorithms and all combinations of features; hence, the computation time for each classification algorithm was not analyzed further.
Moreover, the RMSE of the best performing algorithms (RMSE < 0.42) is plotted in Figure 2.
Subsequently, we analyzed the Area under the ROC Curve. ROC curves are generally used to determine, graphically, the trade-offs between sensitivity and specificity for every possible combination of tests. The Area under the ROC Curve can be categorized based on its value: an area of 1 indicates a perfect test, and an area of 0.5 or lower indicates a worthless test. A rough guide for classifying the accuracy of a diagnostic test, the traditional academic point system, is shown in Figure 3: excellent (0.9–1), good (0.8–0.9), fair (0.7–0.8), poor (0.6–0.7), and fail (0.5–0.6). This clearly demonstrates that seven algorithms (Random Forest, Bagging, J48, Decision Table, BayesNet, kNN, and JRip) perform better; on the other hand, the SVM, Naive Bayes, and Logistic Regression algorithms are shown to be worthless tests, as their Area under the ROC Curve was less than 0.5 (Table 7). Hence, for the rest of the analysis, we used only these seven classification algorithms. A possible explanation for this result is that each algorithm works somewhat differently and has a different computational complexity (please see Table A1 in Appendix A). Moreover, some algorithms work better on all-numeric data than on mixed data. We selected the top seven classification algorithms, in terms of Area under the ROC Curve and accuracy (Figure 4), out of the ten algorithms used in this study.
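The grading scale quoted above can be sketched as follows; the labels and scores are hypothetical, and the `grade_auc` helper is our own illustrative naming, not a library function.

```python
# Sketch: computing AUC and mapping it onto the traditional academic
# point system described in the text; data are hypothetical.
from sklearn.metrics import roc_auc_score

def grade_auc(auc):
    """Map an AUC value onto the academic point system (illustrative)."""
    if auc >= 0.9:
        return "excellent"
    if auc >= 0.8:
        return "good"
    if auc >= 0.7:
        return "fair"
    if auc >= 0.6:
        return "poor"
    if auc >= 0.5:
        return "fail"
    return "worthless"

y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2, 0.7, 0.3]
auc = roc_auc_score(y_true, y_score)
print(auc, grade_auc(auc))  # → 0.9375 excellent
```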
Subsequently, we estimated the classification model performance using the group details shown in Table 5. This study shows negligible fluctuation in the Area under the ROC Curve (0.93–0.8) of the top seven classification algorithms (Figure 5 and Table 8), except for Group E, indicating that the outcomes are robust. Hence, this result demonstrates that the frequency of the weak RF-EMF (Hz) feature is critically important for prediction, and for better obtaining the underlying pattern of the data. Although these results reveal the general performance of the seven classifiers, it is still interesting to know how each algorithm meets each assessment of prediction performance. Hence, more importantly, the performance evaluation measures of the binary classifiers were further computed using the confusion matrix with k-fold = 60. Table 6 demonstrates the confusion matrix (weighted average) for classification model performance. A detailed comparison of the percentage of model accuracy (PCC), Root Mean Squared Error (RMSE), precision, sensitivity (recall), 1 − specificity, Area under the ROC Curve, and precision-recall (PRC Area) for each classification method is shown here.
Precision explains how many of the positively classified instances were suitable for all algorithms or classifiers. Sensitivity (recall) shows how suitable an analysis is for detecting the positives, while specificity demonstrates how beneficial a test is at avoiding false alarms. Hence, all of these measures are valuable. Considering all measures, seven algorithms (Random Forest, Bagging, J48, Decision Table, BayesNet, kNN, and JRip) show high prediction performance; on the other hand, three algorithms (SVM, Naive Bayes, Logistic Regression) are unsuitable for this dataset. Computational time (CPU time) appears to be low for all classifiers due to the smaller sample size. Figure 6 demonstrates the correlations among features for RF-EMF on human and animal cells (maroon indicating strong correlation and blue signaling weak correlation). The features selected for this analysis were frequency, SAR, exposure time, and SAR×exposure time. A robust correlation was seen between exposure time and SAR×time, and between SAR and the frequency of weak RF-EMF. In contrast, the relationship between frequency and exposure time was not notable. Using ML techniques, this study provided more profound insights into the features of weak RF-EMF exposure scenarios on human and animal cells. Beyond the complexity of the selected algorithm, Figure 7 clearly demonstrates that computation time depends on the processor speed (CPU) and memory capacity (RAM size) of the computer used to run the ML algorithms. Computers with a higher processor speed and larger RAM provide a lower prediction computation time. This is essential when we use a bigger dataset with more features. Figure 7. Influence of computer processor speed (CPU) and memory capacity (random-access memory (RAM) size) on prediction accuracy and computation time for Study 1, Study 2, and Study 3 (this study) shown in Table 1.
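The feature-correlation analysis behind Figure 6 can be sketched as follows. The data here are synthetic stand-ins (so, unlike the real dataset, no SAR-frequency correlation will appear), and the strong correlation between exposure time and SAR×time is expected by construction, since the latter feature is derived from the former.

```python
# Sketch of a Pearson correlation matrix among the four selected
# features; values, units, and ranges are hypothetical stand-ins.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "frequency": rng.uniform(800, 2500, 200),    # MHz (illustrative range)
    "SAR": rng.uniform(0.01, 2.0, 200),          # W/kg (illustrative range)
    "exposure_time": rng.uniform(1, 1440, 200),  # minutes
})
# SAR×time is a derived feature, so it correlates with its factors by design.
df["SARxTime"] = df["SAR"] * df["exposure_time"]

corr = df.corr()  # Pearson correlations, as visualized in a heatmap
print(corr.round(2))
```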

Discussion
We developed a prediction strategy to examine the possible impact of RF-EMFs on human and animal cells without performing in-vitro laboratory experiments. This is the first time that a supervised machine learning approach has been utilized for the characterization of weak RF-EMF exposure scenarios. In our study, we used ten different classifiers, and the outputs were estimated using the k-fold cross-validation method. The results of our study indicate that seven algorithms (Random Forest, Bagging, J48, Decision Table, BayesNet, kNN, and JRip) perform better, while the SVM, Naive Bayes, and Logistic Regression algorithms are shown to be worthless tests, as their Area under the ROC Curve was less than 0.5. Our findings suggest that the Random Forest algorithm excels in all groups in terms of all performance measures and shows AUC = 0.903, where k-fold = 60. There are a few potential explanations for this result. The data do not need to be re-scaled or transformed for the Random Forest method. Primarily, Random Forest tackles outliers by binning them. It also handles unbalanced data: it can balance the error in class populations for unbalanced datasets. Principally, each decision tree has a high variance, though low bias; however, since a random forest averages all of its trees, it also averages the variance. Hence, the Random Forest classification method yields a low-bias, averaged-variance model. Another possible explanation is that Random Forest attempts to limit the total error rate: in an unbalanced dataset, the larger class obtains a low error rate while the smaller class obtains a large one. This finding also supports our previous research [53] into a prediction model, which showed that the Random Forest classification algorithm outperforms, with the highest classification accuracy of 95.26%.
The execution efficiency of the Random Forest algorithm increases with the number of trees. A large number of trees diminishes the danger of overfitting and the variance of the model. Beyond some point, however, an excess of trees makes model training inefficient by increasing the computation time [60], which results in substantial execution costs. This study does not cover memory usage for the chosen dataset. Nevertheless, a generous number of trees consumes a larger RAM space [60] when the Random Forest strategy is utilized.
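This trees-versus-cost trade-off can be sketched as follows; the data are synthetic and the tree counts are illustrative assumptions, so the exact numbers will differ from our experiments, but the pattern (accuracy plateaus while training time keeps growing) is the point.

```python
# Sketch of the n_estimators trade-off: more trees raise cost long after
# accuracy has plateaued. Synthetic data; tree counts are illustrative.
import time
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 6))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

results = {}
for n_trees in (10, 100, 500):
    clf = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    t0 = time.perf_counter()
    acc = cross_val_score(clf, X, y, cv=5).mean()
    results[n_trees] = (acc, time.perf_counter() - t0)

for n_trees, (acc, secs) in results.items():
    print(f"{n_trees:4d} trees: accuracy={acc:.3f}, fit+score time={secs:.2f}s")
```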
We extracted data from 300 peer-reviewed scientific publications (1990-2015) describing 1127 experimental investigations in cell-based in-vitro models (human and animal species). A small sample was chosen because of the limited number of in-vitro experiments published during the chosen period. Hidden information can be gained if we have sufficient data. ML helps to understand and verify the structure of data by mining information from it. The mechanics of learning should be automatic, as there is too much data for individuals to process themselves. Related applications (such as medicine, irrigation, and natural disasters) will come not from computer programs, ML specialists, or the data itself, but from the individuals who work with the data [36]. The utilization of data, especially data regarding individuals, has substantial ethical implications, and data mining specialists must be mindful of the ethical issues [36]. Nevertheless, even when sensitive data are discarded, there is a chance that models will be built that depend on factors that can be shown to stand in for racial or sexual attributes.
We identified the most appropriate features or attributes to be used in the prediction models, to give an understanding of the crucial factors that determine the possible impact of RF-EMF in in-vitro studies, utilizing domain knowledge, Principal Component Analysis (PCA), and Chi-squared feature selection techniques. Picking a classifier depends upon the requirements of the application. The features or attributes of the classified datasets directly impact the classifier performance or the prediction rate. This is essential when using large datasets with a high number of features. We observe a very high data-size-to-feature ratio (1127:6), which is unlikely to lead to overfitting on the training data. In contrast to our study, however, one study [38] reported a very low data-size-to-feature ratio when predicting corn yield with an ML approach.
It is becoming increasingly difficult to ignore the impact of small sample sizes on prediction accuracy. Recent research by Vabalas et al. [61] has argued that K-fold Cross-Validation (CV) exhibits heavily biased performance estimates with small sample sizes. Besides sample size, other components that impact bias include data dimensionality, hyper-parameter space, the number of cross-validation folds, and data discriminability. For the most part, the higher the ratio of features to sample size, the higher the likelihood that a machine learning model will fit the noise in the data as opposed to the unknown underlying pattern. Additionally, the higher the quantity of adjustable parameters, the more probable it is that the machine learning model will overfit the data [62]. No single algorithm dominates when picking a machine learning model. Some work better with larger datasets, and some work better with high-dimensional datasets. It is therefore critical to examine model viability on a specific dataset.
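One remedy discussed in that literature is nested cross-validation, which keeps hyper-parameter tuning inside an inner loop so the outer performance estimate is computed on folds never seen by the tuning step. A minimal sketch with synthetic data follows; the parameter grid and fold counts are illustrative assumptions, not values from this study.

```python
# Hedged sketch of nested cross-validation as a small-sample remedy:
# hyper-parameters are tuned on inner folds, performance is estimated
# on outer folds the tuner never sees. All settings are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

rng = np.random.default_rng(4)
X = rng.normal(size=(90, 6))   # deliberately small sample
y = rng.integers(0, 2, 90)     # random labels: honest estimate should be ~0.5

inner = KFold(n_splits=3, shuffle=True, random_state=0)
outer = KFold(n_splits=5, shuffle=True, random_state=0)

tuner = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    cv=inner,
)
# Outer-loop scores give a less optimistically biased performance estimate.
scores = cross_val_score(tuner, X, y, cv=outer)
print(f"nested-CV accuracy: {scores.mean():.3f} ± {scores.std():.3f}")
```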
We compared each classifier's prediction performance using seven measures to decide on its suitability: the percentage of model accuracy (PCC), Root Mean Squared Error (RMSE), precision, sensitivity (recall), 1 − specificity, Area under the ROC Curve (AUC), and precision-recall (PRC Area) for each classification method. The assessment of a classifier's prediction performance is essential to reach a decision on its acceptability. Even though ROC requires exceptional care with imbalanced datasets, it is a standard and robust measure for evaluating the performance of binary classifiers [59]. Similar to our work, previous evidence [59] suggests that precision-recall (PRC) plots can generate precise predictions of future classification performance, because they assess the proportion of true positives among positive predictions.
Various comparisons have been made of different classifiers executed over various datasets to find a sensible classifier for a given application. Even with high-performance computers dealing with complex issues, the most fitting classification algorithms are required to decrease the wastage of time and computation resources [63]. Machine learning is an exceptional tool, since it discovers otherwise unexplained correlations among different features in applications [53,63,64]. Nevertheless, the data type (text, numeric, images, audio, and video) [63], the feature dimensions, and the complexity of the algorithms can impact the performance. We built up grouping or clustering strategies to assign the chosen features into five diverse laboratory experiment scenarios. This delivers five different feature groups or distributions for every laboratory experiment. Tognola et al. [65] found cluster analysis (unsupervised learning) to be a reasonable way to find the features that are best at identifying the exposure situations. Supervised learning is better tailored to discovering such features in occupational and environmental epidemiology and public health studies [54].
More research in this space is crucial to learn whether and how some RF-EMF features (e.g., frequency of weak RF-EMF, SAR, exposure time) influence the prediction of reactions in living organisms [53]. Our previous studies used supervised ML algorithms to observe RF-EMF exposure on plant species: (i) Bayes Net, NaiveBayes, Decision Table, JRip, OneR, J48, Random Tree, and Random Forest [53] and (ii) Random Forest and kNN [54]; this study, in contrast, observed performance contrasts on human and animal species. The previously developed [54] optimization technique characterized the trade-off between prediction accuracy and computation time based on the classification algorithm used (the Best Accuracy-Computation-time pair (BAP)). This is vital, as in many medical applications prediction accuracy often holds precedence over processing or computation time. In contrast, computation time is more significant in time-sensitive fields, such as natural disaster prediction.
Long-term RF-EMF exposure studies are, in general, limited in both plant and animal research. Usually, long-term animal investigations are carried out using rats and mice (both male and female) exposed for two years to RF-EMR varying between 10 and 2000 MHz, and this gives a sensible substitute for human exposure. Despite the success of short-term studies, no pathological or carcinogenic effects have been found in long-term RF-EMR studies at non-thermal levels. This includes histopathology in lifespan and hematology studies at 800 MHz, 835/847 MHz, and 2450 MHz (1.3 W/kg [66] and 0.3 W/kg [67]). Nonetheless, a few pathological impacts have been published at thermal levels [68,69]. Besides, a previous study [70] observed an increased tumor occurrence with long-term RF-EMR exposure at non-thermal levels in animals. Researchers might apply ML algorithms (supervised and unsupervised) to long-term laboratory studies utilizing whole organisms (in-vivo), and to epidemiology studies, to improve the accuracy of prediction. Figure 1 shows potential features, attributes, or variables of bioelectromagnetic experiments (in-vitro, in-vivo, and epidemiological studies) that could be utilized in ML algorithms.
Similar to animal studies, to date there have been limited investigations exploring the long-term impacts of RF-EMF exposure on plants, and no viable conclusion on whether there is a considerable impact or not [71]. Nevertheless, a considerable number of short-term exposure studies demonstrate that plants encounter physiological or morphological changes under RF-EMR (up to 13 weeks) and show statistically significant changes [71]. Conversely, the outcomes from long-term exposure investigations demonstrate no physiological consequences for plants exposed to RF-EMR due to mobile phone radiation. This comparison of animal and plant studies adds a crucial point to the discussion: the apparent absence of long-term effects could be interpreted, perhaps, as an adaptation to RF-EMR.
The biological effects of RF-EMR from mobile phones may depend on the frequency, mean power level, and modulation of the EM signal. Numerous studies have examined the health effects of mobile phone use. These findings come from epidemiological studies, studies of living organisms (in vivo), and studies of tissues in a petri dish or test tube (in vitro). A smaller number of studies have investigated the impacts of RF-EMF radiation on plants.
In-vitro findings are necessary to investigate natural and induced events; yet, the energies (SAR) and effects induced by confounding elements are challenging to avoid. For example, background electromagnetic fields are non-homogeneous, and temperatures inside laboratory incubators have been shown to skew results [72]. This fundamental criticism can be applied to various studies that do or do not exhibit biological effects. Nonetheless, organisms have in-built systems to repair damage and maintain homeostasis [73]. A limitation of this study is the generally low sample size (1127 reported experimental case studies), which limits the robustness of the outcomes.
A few epidemiologic studies [74][75][76][77] have associated exposure from mobile phones with neurological and cognitive dysfunctions. More repeated laboratory experiments and field studies are required [78][79][80] in future work to further examine the critical physical parameters that impact the biological effects of RF-EMF. Nevertheless, the cumulative effect of mobile phone radiation is yet to be confirmed.
This study further contributes knowledge to the potential benefit of ML in the bioelectromagnetics space. With time, a bigger sample size can be collected; hence, further evaluations in this space are yet to be performed. We recognize a strong correlation between exposure time and SAR×time (the effect of aggregated SAR within the exposure time frame), and between SAR and the frequency of weak RF-EMF, on human and animal species. Interestingly, the connection between frequency and exposure time was not notable. Varying responses (either cellular response presence or absence) made it harder to identify [81] and measure the complex effects of weak RF-EMF. We are now in an era where the progression of technology shapes how people perceive everything. Future applications in public health and occupational and environmental epidemiology should utilize ML algorithms. Additionally, the cumulative impact of weak RF-EMF demands inquiry. However, none of these findings can be directly associated with humans.

Future Directions
The potential adaptability of ML algorithms in the field of bioelectromagnetics research for human and animal cells has been explored in this study. Decision making employing predictive techniques could be the best approach. Yet, there are many factors to be investigated with regard to computation and cost-efficiency. This work can be further extended by utilizing these techniques on other topics, such as in-vivo and epidemiological studies using living beings (cells, animal, plant, and human populations), as mentioned in our previous study [54]. Thorough knowledge of the correlation factors between features in these studies is also essential.

Data, Data Size, Data Quality, Parallel, and Distributed Computing Challenges
Predicting future events using ML can be limited by poor data quality and data governance challenges. Training a classifier with poor data presents the genuine risk of producing a framework with inherent bias and unreliable or unsatisfactory results. Data researchers need to take care that the data they use to train their models are as reliable and as unbiased as possible.

Feature Selection Strategy
Feature selection is one of the critical factors in ML, and it hugely impacts model or classifier performance. Which features should be employed to build a predictive model is a challenging question that may require in-depth knowledge of the problem domain. Answering it could mean either adding features or variables to the model or removing features that do not improve model performance. Features can be chosen automatically or manually to deliver the best prediction accuracy or the outputs that we prioritize. This is something to be further investigated, as predictions with more comprehensive input features are essential. In our dataset, we had a very high data-size-to-feature ratio (1127:6), which is unlikely to lead to overfitting on the training data. However, many possible future applications, such as occupational and environmental epidemiology studies, inherently provide more features in their datasets, with low data-size-to-feature ratios. Hence, feature selection is an essential requirement; otherwise, the built models may not generalize well enough to extract potentially hidden observations.
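One automatic approach (not used in this study, and shown purely as an illustration) is recursive feature elimination, which iteratively drops the features that contribute least to the model. A minimal sketch with synthetic data, where three informative columns are mixed with seven noise columns:

```python
# Sketch of automatic feature selection via recursive feature
# elimination (RFE); data and column counts are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

rng = np.random.default_rng(5)
n = 200
informative = rng.normal(size=(n, 3))
noise = rng.normal(size=(n, 7))          # irrelevant features
X = np.hstack([informative, noise])
y = (informative.sum(axis=1) > 0).astype(int)

# RFE repeatedly fits the model and removes the least important feature
# until only n_features_to_select remain.
rfe = RFE(RandomForestClassifier(random_state=0), n_features_to_select=3)
rfe.fit(X, y)
print(rfe.support_)  # True marks the retained features
```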

Machine Learning, Deep Learning, and Artificial Intelligence for Future Bioelectromagnetics
Deep learning also has great potential for use in the medical field. It is "deep" since it processes data through a wide range of layers. Hence, with a more substantial amount of data, it usually requires a high-performance computing (HPC) facility with many graphics processing units (GPUs), which are essential for the calculations necessary for deep learning. Broadly, artificial intelligence (AI) involves instructing computers to think in ways that a human might think. This is one of the emerging technologies of the modern era, and many are rushing to integrate AI into their systems. Hence, adopting AI into the bioelectromagnetics space exists as an exciting avenue to explore. The inherent adaptability of ML in the bioelectromagnetics field for human and animal cells (in-vitro) has been demonstrated, and this increases the likelihood that ML could be applied to other topics such as in-vivo and occupational and environmental studies, using animal, plant, and human populations. The still uncertain cumulative impact of weak RF-EMF demands inquiry, in terms of laboratory experiments, in both occupational and environmental epidemiology. ML is a viable strategy for discovering the features that best characterize RF-EMF exposure scenarios; hence, it might be beneficial for better tailoring occupational and environmental epidemiology and public health studies accordingly, as indicated in our previous research [54].

Conclusions
The progress of emerging technology and digital transformation is expected to increase and intensify in the coming years. Modernized technologies that rely on wireless communication may cause increased levels of radiofrequency electromagnetic field (RF-EMF) exposure. This has resulted in research interest in the space of bioelectromagnetics, which aims to investigate the consequent effect of RF-EMF on living organisms. Hence, using robust predictive methods to identify the impact has become increasingly critical. Strong correlations were observed between the SAR and exposure time of weak RF-EMF, while an insignificant relationship was observed between frequency and exposure time. As reported in our previous study (ML algorithms to predict the effect of weak RF-EMF on plants), this study (ML algorithms to predict the effect of RF-EMF on human and animal cells) also supports the finding that the Random Forest algorithm outperforms most traditional learning algorithms in the bioelectromagnetics space. The results show that good predictive accuracy can be achieved when using feature selection methods. This study further confirmed that supervised ML is a viable strategy for discovering the features that best characterize RF-EMF exposure scenarios. Technologies change with time; therefore, utilizing and recognizing the time of the study as a feature is significant. In spite of the low sample size of the study (1127 reported experimental case studies of human and animal cells in in-vitro studies), which restricted its statistical power, this analysis demonstrates that ML algorithms can be utilized to effectively predict the impact of weak RF-EMF on human and animal cells. Feature selection is an essential strategy when employing ML in bioelectromagnetics research, especially in occupational and environmental studies using animal, plant, and human populations.
This is the first time that the supervised ML approach has been employed for the characterization of weak RF-EMF exposure scenarios on human and animal cells. Machine learning techniques (supervised, semi-supervised, and unsupervised algorithms) contribute to innovative and practical RF-EMF exposure prediction tools. The inherent adaptability of ML in the bioelectromagnetics field for human and animal cells (in-vitro) has been demonstrated. It increases the likelihood that ML could be implemented in other areas, such as in-vivo and occupational and environmental studies, using animal, plant, and human populations. This investigation further contributes to knowledge of the potential advantage of ML in bioelectromagnetics. This analysis may potentially improve our understanding of which features (data variables) should be gathered in the future to explain the causes of high or low weak RF-EMF exposures. In the future, with more experimental data, the sample size can be increased, leading to more accurate work.
Funding: This research received no external funding.

Conflicts of Interest:
The author declares no conflict of interest.

Lazy — kNN: The kNN algorithm (k = number of neighbours) uses a nearest-neighbour search. Using cross-validation, it chooses the best k value between 1 and the value given as the kNN parameter; the appropriate value of k, based on cross-validation, can thus be selected. Attribute types: numeric, nominal, binary, date, unary; handles missing values. Aha (1991) [82].
Trees — Random Forest: The Random Forest algorithm builds a forest of random trees. It considers a mixture of tree predictors (where each tree depends on the values of an independently sampled random vector) and employs the same distribution for all trees in the forest. As the number of trees in the forest becomes large, the generalization error for forests converges to a limit. The error of the forest of tree classifiers depends upon the strength of the individual trees and the correlation between them. In this method, the data do not need to be re-scaled or transformed; primarily, Random Forest tackles outliers by binning them.