In Vivo Studies on Radiofrequency (100 kHz–300 GHz) Electromagnetic Field Exposure and Cancer: A Systematic Review

The increasing exposure of the human population to radiofrequency electromagnetic fields has raised concern about possible health effects. The aim of this systematic review is to provide an update of the state of the research on this topic and, through a quantitative analysis, to assess the increased risk of tumor incidence in laboratory animals (rodents) without limitations of species, strain, sex or genotype. The review was conducted according to the PRISMA guideline, and individual studies were assessed by referring to the OHAT Risk of Bias Rating Tool for Human and Animal Studies. A total of 27 studies were considered eligible for the evaluation of tumor incidence; a meta-analysis was carried out on 23 studies to assess the possible increased risk of the onset of both malignant and benign tumors at the systemic level or in different organs/tissues. The meta-analysis does not show a significant association between exposure to RF and an increased or decreased risk of cancer in most of the considered tissues. A significant increased/decreased risk can be numerically observed only in heart, CNS/brain, and intestine for malignant tumors. Nevertheless, the assessment of the body of evidence attributes low or inadequate evidence for an association between RF exposure and the onset of neoplasms in all tissues.


Introduction
Over the past decades, exposure to radiofrequency (100 kHz-300 GHz) electromagnetic fields (RF-EMF) has steadily increased, causing growing concerns for human health. Consequently, extensive research on the effects of EMF exposure on different biological targets (reproductive system, immune system, nervous system, etc.), through observational studies and experimental studies on different models, has been carried out by laboratories all around the world.
The use of electromagnetic fields has focused on the creation of capillary networks for telecommunications and "wireless" connections: one of the major concerns regards the possible carcinogenic effects of chronic EMF exposure at intensities too low to induce acute and/or perceptible effects [1].
In 2011, the International Agency for Research on Cancer (IARC) [2] classified RF-EMF as "possibly carcinogenic to humans", allocating them to Group 2B of its classification system. The possible carcinogenic effects of RF-EMF have been investigated since the early 1980s on various animal (in vivo studies) and cellular (in vitro studies) models, evaluating both the direct onset of tumors and the alterations of tumor-related parameters. In this framework, in vivo studies have an important role in supporting the evidence derived from epidemiological studies aimed at evaluating the possible carcinogenic effects of RF exposure on the human population. The data of these studies, often contradictory, highlighted the need to carry out overall evaluations of the results, through international panels and reviews in order to perform a health risk assessment to support decision-makers and inform the general public.
To our knowledge, this paper is the first attempt at a systematic review (with meta-analysis) of RF carcinogenic effects in in vivo studies.
RF-EMF animal studies on carcinogenesis cover a wide range of experimental situations, in terms of study design, exposure modality and biological endpoints. This peculiarity has an ambivalent effect: on the one hand, it is difficult to classify the studies unambiguously and, consequently, to compare the results for a comprehensive analysis; on the other hand, the diversity of the studies covers a wide range of experiments, providing a reasonably good insight into the effects of RF-EMF exposure on carcinogenesis in laboratory animals.
Different exposure scenarios were employed in terms of frequency, dose of treatment, exposure modalities (i.e., full-body vs. localized exposure and restrained vs. free animals moving within large cages), duration and daily timing. Different frequencies were used, from a few hundred MHz up to 3.7 GHz (for carcinogenesis studies), with different modulation schemes (continuous wave (CW), pulsed, mobile signals). The most used frequencies were those of mobile communications and 2.45 GHz, employed for Wireless Fidelity (Wi-Fi) systems and microwave ovens.
One of the main critical issues in the RF-EMF experimental in vivo studies is the assessment of the effective dose induced in the EMF exposed object/subject provided in terms of Specific Absorption Rate (SAR, W/kg) [3]. Moreover, the effect of RF-EMF exposure was studied both using RF-EMF alone and in synergy with other well-known carcinogens.
The aim of this review was to evaluate the effects of the RF-EMF in vivo exposure on tumor incidence at the systemic level or in different body organs/tissues.

Materials and Methods
The protocol of this systematic review is based on the guidance provided by the Cochrane Collaboration [4], the National Toxicology Program-Office of Health Assessment and Translation (NTP-OHAT) [5], and the guidelines "Preferred Reporting Items for Systematic Reviews and Meta-Analyses" (PRISMA-P) [6]. This protocol is registered on PROSPERO [CRD42020191105] and published in a peer-reviewed journal [7].
The protocol considers both carcinogenesis and co-carcinogenesis in in vivo studies; in this review, only the carcinogenesis analysis is discussed and presented, whereas the co-carcinogenesis analysis will be the subject of a subsequent paper.

Eligibility Criteria
The review question was defined in terms of PECO (population, exposure, comparison, outcome):
• Population: rodents of both sexes, of all ages and species and of all genetic backgrounds (wild type, transgenic and tumor-prone animal models);
• Exposure: exposure to the electromagnetic field in the frequency range from 100 kHz to 300 GHz (all the modulations included), accurately characterized through dose assessment [8,9];
• Comparison: the "sham" sample, i.e., animals treated under conditions similar to those of exposed ones except for RF-EMF exposure, with particular reference to restraint conditions and stressing manipulations; papers describing experiments with cage control only or using the group exposed at the lowest dose level as a comparison were excluded;
• Outcome: the onset of neoplasms in laboratory animals exposed to RF, in terms of incidence of primary tumors; tumor incidence and survival were the main endpoints (outcome measures) on which this systematic review was focused.
Articles reporting exclusively tumor-related parameters (i.e., genotoxicity, oxidative stress, etc.) were excluded from the analysis. Papers that were not written in English, not peer-reviewed or not original (reviews, letters and comments) were also excluded.

Search Strategy
The search strategy of the primary works involved PubMed and EMF Portal as database sources, integrated with:
• the list of references of descriptive reviews on the same subject, published over the years or carried out by international panels of experts [2,10-21];
• the list of references of the selected papers.
No limits were set on the year of publication. The query used for the PubMed search and the criteria adopted on EMF Portal are attached to the protocol as a supplement (suppl_1) [7].
Records identified from all the mentioned sources were imported into the EndNote X9 bibliographic management software. Its specific functions were used both for removing duplicates and for classifying the works by relevance and keywords.
The search strategy was peer-reviewed as part of the publication process of the protocol.

Selection Process
All potentially relevant articles were screened for eligibility in two stages: a first stage in which the articles were selected, on the basis of title and abstract, by three authors, and a second stage, in which the full texts of the remaining papers were independently reviewed by two groups of investigators, with each composed of one biologist and one expert in EMF dosimetry. Disagreements and technical uncertainties were discussed and resolved among review authors.
In order to define the dosimetry evaluation criteria and to address the possible problem of "publication bias" related to the historical period, an assessment of the years of publication was made before proceeding to the data extraction from the included articles.

Data extraction and Data Extraction Format
The data extraction form (Excel file) was defined and agreed upon before starting the analysis.
The extracted data included:
• Study design (number of experimental groups, control group(s), number of animals per group, randomization and blinding);
• Animal model: species, strain, sex and genotype of animals (wild type (WT)/transgenic);
• Exposure duration (LTE: long-term exposure, MTE: medium-term exposure, STE: short-term exposure);
• Timing of treatment (i.e., hours per day, days per week and total period);
• Exposure details (i.e., frequency, modulation, dose, exposure modalities in terms of whole body vs. localized exposure and restrained vs. freely moving animals, type of exposure system).
We also extracted data on potential conflict of interest in all included studies. The main purpose of this first data extraction scheme was to organize the information to carry out the RoB evaluations of the individual papers and to prepare a summary table (database for meta-analysis). In this table, each article was reported as many times as the number of treatment groups.

Classification of Tumors
A specific classification of the tumors was required because some of them can be both malignant and benign, and this is not always specified by the authors. The taxonomy adopted and the reasons behind it are reported in Table 1.

Tumors (Malignant or Benign) Classified as Malignant by Precautionary Approach, Unless the Authors Have Indicated Otherwise
Hemangioma, Granular Cell Tumor, Renal Mesenchymal Tumor, Hepatoblastoma, Nephroblastoma.

Tumors (Rare with Low Percentage of Cases of Malignancy) Classified as Benign, Unless the Authors Have Indicated Otherwise
Pheochromocytoma, Interstitial Cell Tumor.

Risk of Bias (RoB) Evaluation
In order to evaluate the possible methodological limits and sources of error that could influence the reliability of the summary result, a critical reading of all papers was carried out independently by two groups of reviewers to assess RoB following the criteria provided by the OHAT manual "Risk of Bias Rating Tool for Human and Animal Studies" [22]; for each paper, the following elements were evaluated:
1. Adequate randomization of administered dose or level of exposure, evaluating whether each animal had an equal chance of being assigned to a control or a treatment group;
2. Allocation of animals to treatment groups unknown to operators;
3. Evaluation of the experimental protocol or analysis of possible confounding variables not adequately identified and characterized;
4. Blinded treatment and analysis of groups of animals (blind or double-blind);
5. Evaluation of the exposure conditions, which had to be well defined and documented;
6. Use of standardized methods for determining the results (effects): specific and reliable tests and adequate statistical methodology;
7. Reporting of all expected outcomes;
8. Calculation of animal losses (attrition bias), due to death, during the experimental period, for reasons other than those foreseen by the experimental protocol;
9. Possible conflict of interest: considering the relevance of the topic and knowing that many studies were funded by companies with significant commercial interests in the mobile telecommunications sector, it was decided to include the possible Conflict of Interest in the Reporting Domain of the RoB as item 9.
Each of these 9 elements was evaluated according to the following scheme: "++ definitely low risk of bias" when there is evidence to exclude methodological errors in the study; "+ probably low risk of bias" when the evidence suggests that, even if methodological errors are present, their extent is such as not to influence the results of the study; "− probably high risk of bias" when there is evidence of possible errors or gaps in the definition of the element such as not to guarantee the quality of the results; "−− definitely high risk of bias" when there is evidence of methodological errors or serious gaps in the definition of the element that could have affected the results.
Three quality categories were defined (1: high quality, 2: intermediate quality, 3: low quality) where the papers were allocated on the basis of the evaluation of the 9 elements defined above; to define the category greater weight was given to items 3 (adequacy of the experimental protocol), 5 (adequate dosimetry) and 6 (reliability of the methods used to evaluate the outcome).
With regard to point 3 ("Evaluation of the experimental protocol or analysis of possible confounding variables not adequately identified and characterized"), it was decided to assign the judgment "− probably high risk of bias" or "−− definitely high risk of bias" to articles where the sham control group was shared by more than 3 treatment groups. In the case of rare events, in fact, sharing one control group among several treated groups can be risky, because the presence of a "zero" or a "one" can be completely random and may "force" the overall result in one direction (increased risk) or another (reduced risk). In general, a control group larger than the treated groups (or case groups) is recommended.
Furthermore, it was decided to assign a "−−" to studies directly financed by companies (item 9).

Meta-Analysis: Strategy
All the data are reported as the number of events and non-events in the two groups of exposed and sham animals (2 × 2 table), so the meta-analysis was carried out by computing the Risk Difference (RD), the Odds Ratio (OR) and the Risk Ratio (RR) as effect size measures [23].
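As an illustration of the three effect size measures named above, the following minimal sketch computes RD, OR and RR from a single exposed-vs-sham 2 × 2 table. The function name `effect_sizes` and the 0.5 continuity correction applied when a cell is zero are assumptions made for this example; they are not taken from the review or from the software it used.

```python
def effect_sizes(events_exp, n_exp, events_sham, n_sham, correction=0.5):
    """Return (RD, OR, RR) for one exposed-vs-sham 2 x 2 table."""
    a, b = events_exp, n_exp - events_exp        # exposed: events, non-events
    c, d = events_sham, n_sham - events_sham     # sham: events, non-events

    # The risk difference is defined even when a cell is zero.
    rd = a / (a + b) - c / (c + d)

    # Assumption: add `correction` to every cell if any cell is zero,
    # so that the two ratio measures stay finite.
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))

    odds_ratio = (a * d) / (b * c)
    risk_ratio = (a / (a + b)) / (c / (c + d))
    return rd, odds_ratio, risk_ratio


# Hypothetical counts: 10/100 tumors in exposed animals, 5/100 in sham animals.
rd, odds_ratio, risk_ratio = effect_sizes(10, 100, 5, 100)
print(rd, odds_ratio, risk_ratio)
```

With rare tumors a zero cell is common, which is why the handling of zeros can visibly change OR and RR while leaving RD untouched.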
The meta-analysis performed is an Individual Participant Data (IPD) meta-analysis. The Random Effects Model was chosen to calculate the absolute and relative weights [24-28]. The parameters adopted to define homogeneity/heterogeneity and significance were I², tau, z-value and p-value (significance at p < 0.01).
From the summary table, a table for each organ/tumor was created with the most relevant information, such as:
• the SAR value (without uncertainty);
• the type of exposure duration (LTE: longer than 52 weeks; MTE: longer than 9 weeks; STE);
• the number of exposed animals with tumor (treated incidence), the total number of animals in the exposed sample, the number of sham animals with tumor (sham incidence), the total number of the animals in the sham sample; in the papers employing both sexes, the incidence data in males and females were added together, so analysis by sex was not performed;
• the animal type and the genetic background (WT/prone).
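The quantities listed above (weights, I², tau, z-value, p-value) can be illustrated with a minimal random-effects sketch. It assumes the DerSimonian-Laird estimator of the between-study variance and a normal approximation for the p-value; the review does not state which estimator was used, so this is an illustrative choice and not necessarily the computation performed by the authors' tool. Inputs are log risk ratios and their variances, and the function name is hypothetical.

```python
import math


def random_effects_pool(log_rr, variances):
    """Pool log risk ratios with a DerSimonian-Laird random-effects model."""
    k = len(log_rr)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))  # Cochran's Q
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0   # I^2 in percent

    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    sum_ws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_rr)) / sum_ws
    se = math.sqrt(1.0 / sum_ws)
    z = pooled / se
    p = math.erfc(abs(z) / math.sqrt(2))                  # two-sided normal p-value
    return {
        "pooled": pooled, "se": se, "tau2": tau2, "i2": i2,
        "z": z, "p": p, "weights": [wi / sum_ws for wi in w_star],
    }


# Hypothetical inputs: three comparisons, pooled RR obtained by exponentiation.
result = random_effects_pool([0.10, 0.35, -0.05], [0.04, 0.09, 0.06])
print(math.exp(result["pooled"]), result["i2"], result["p"])
```

The relative weights returned under `"weights"` sum to one; as tau² grows, they become more equal across comparisons, which is the main practical difference from a fixed-effect analysis.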
Animal species and genetic background were used for the subgroup analysis: this assessment was performed in order to validate the hypothesis of homogeneity of the included studies; dose and "exposure time" were used for the regression analysis. The "exposure time" data is expressed in terms of total hours of exposure, and it was obtained as the simple product of the number of exposure hours per day by the number of actual exposure days.
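The two exposure descriptors defined above can be written down as a short sketch: the duration class (LTE longer than 52 weeks, MTE longer than 9 weeks, STE otherwise) and the "exposure time" as the product of hours per day and actual exposure days. The function names and the treatment of the boundary values (exactly 52 or 9 weeks fall in the lower class) are assumptions made for this example.

```python
def duration_class(weeks):
    """Classify exposure duration as LTE, MTE or STE from the total weeks."""
    if weeks > 52:
        return "LTE"   # long-term exposure: longer than 52 weeks
    if weeks > 9:
        return "MTE"   # medium-term exposure: longer than 9 weeks
    return "STE"       # short-term exposure


def total_exposure_hours(hours_per_day, exposure_days):
    """Exposure time: hours per day times the number of actual exposure days."""
    return hours_per_day * exposure_days


# Hypothetical schedule: 2 h/day, 5 days/week, for 104 weeks.
weeks = 104
print(duration_class(weeks), total_exposure_hours(2, 5 * weeks))
```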
The Meta-Essentials tool (version 1.5) [29] was used to carry out the meta-analysis. The tool consists of a set of Excel workbooks (one for each type of independent variable), prepared by a team from the Rotterdam School of Management, Erasmus University, The Netherlands, under an ERIM Support Program and licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International [29,30]. The tool (specifically the part of interest here, for binary data) is complete and fully transparent in its operations and algorithms, which makes it possible to modify and validate them.
The results are shown reporting the summary effect size RR, with the relative variability limits, the significance, the forest plot and the funnel plot for publication bias.
If the funnel plot, upon visual inspection, showed that more imprecise studies with non-harmful effects were missing, this was considered an indication of publication bias.

Quality Assessment (Confidence Ratings and Evidence of Health Effects)
To evaluate the quality of evidence, that is, the confidence in the estimates of observed effect, we primarily applied the guidance from NTP-OHAT [5]. The assessment was performed for the entire body of evidence by each outcome; possible disagreements and uncertainties were discussed among review authors and agreement was reached by consensus. We started from a "high quality" grade, a general feature for randomized in vivo studies [5], and six items were considered to downgrade this quality of evidence: (i) experimental design, (ii) Risk of Bias, (iii) inconsistency, (iv) indirectness of evidence, (v) imprecision and (vi) publication bias. Within each of the relevant domains, concern for quality of evidence was assessed using the ratings: "none", which leads to no lowering of the rating; "serious", which results in a lowering of the quality by one level; and "very serious", which results in a lowering of the quality by two levels. Two items (consistency between species and presence of a dose response) were considered to upgrade the quality of evidence. The quality was classified according to the OHAT categories as high, moderate, low or very low. Finally, the Evidence of Health Effects was evaluated according to the same tool.
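The rating logic described above can be sketched as a small scoring function. The downgrade steps ("none" = 0, "serious" = 1, "very serious" = 2 levels) follow the text; the assumption that each upgrade item raises the rating by exactly one level, and the clamping to the range from very low to high, are illustrative choices that the text does not specify. All names are hypothetical.

```python
LEVELS = ["very low", "low", "moderate", "high"]
PENALTY = {"none": 0, "serious": 1, "very serious": 2}


def confidence_rating(downgrades, upgrades=0, start="high"):
    """Combine domain concerns into one quality-of-evidence category.

    downgrades: dict mapping a domain name to "none", "serious" or "very serious".
    upgrades:   number of upgrade items met (assumed to add one level each).
    """
    score = LEVELS.index(start)
    score -= sum(PENALTY[rating] for rating in downgrades.values())
    score += upgrades
    score = min(max(score, 0), len(LEVELS) - 1)   # clamp to the four categories
    return LEVELS[score]


# Hypothetical assessment of one outcome.
concerns = {
    "experimental design": "none",
    "risk of bias": "serious",
    "inconsistency": "none",
    "indirectness": "none",
    "imprecision": "serious",
    "publication bias": "none",
}
print(confidence_rating(concerns))
```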

General Description of the Selected Carcinogenicity studies
A total of 294 primary articles (114 articles obtained from EMF Portal, 112 obtained from PubMed and 166 obtained from other sources) were selected and uploaded to EndNote after removing duplicate records. The databases were last consulted in April 2022.
The first screening by title and abstract was performed according to the defined exclusion criteria.
The technical reports of the National Toxicology Program (NTP) [31,32] on carcinogenesis effects of RF exposure on rats and mice, respectively, were included, even if not published yet; these reports, released in 2018, although controversial, are considered among the most complete studies currently available on the impact of RF exposure on carcinogenesis [21].
After the first screening, a total of 237 papers were excluded, and the remaining 57 were examined using full-text analysis. A further 11 papers were excluded for the following reasons:

• Missing or incomplete EMF dosimetry (n = 3; [33-35]);
• Absence of the sham control group (n = 3; [36-38]);
• Studies based on animals in which tumor cells were implanted ("implanted tumor") before exposure to RF in order to evaluate the effects on the development of neoplasms (n = 3; [39-41]); the aim of this review is to investigate the genesis of each type of tumor, not its development;
• Absence of specific data (n = 2; [42,43]). In particular, [42] examines the onset and growth of neoplasms exclusively by palpation, and the data are only given in terms of cumulative tumor appearance without specifying the organ of onset; [43] provides data on various histological parameters not strictly related to the onset of neoplasms, where only the lack of onset of leucosis (a tumor process affecting the progenitor cells of leukocytes) is observed, and tumor incidence is not reported.
After this last selection, a total of 46 articles were eligible: 23 papers were carcinogenesis studies, 19 were co-carcinogenesis studies and 4 analyzed both carcinogenesis and co-carcinogenesis.
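The selection counts reported in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are hypothetical. The last two lines derive how many papers fall into the carcinogenesis and co-carcinogenesis groups when the papers covering both are counted in each.

```python
records_after_deduplication = 294
excluded_on_title_abstract = 237
full_text_assessed = records_after_deduplication - excluded_on_title_abstract

# Full-text exclusions, by reason, as listed above.
excluded_full_text = {
    "missing or incomplete dosimetry": 3,
    "no sham control group": 3,
    "implanted tumor models": 3,
    "no specific incidence data": 2,
}
eligible = full_text_assessed - sum(excluded_full_text.values())

carcinogenesis_only, cocarcinogenesis_only, both = 23, 19, 4
assert eligible == carcinogenesis_only + cocarcinogenesis_only + both

# Papers covering both designs are counted in each group.
carcinogenesis_group = carcinogenesis_only + both
cocarcinogenesis_group = cocarcinogenesis_only + both
print(full_text_assessed, eligible, carcinogenesis_group, cocarcinogenesis_group)
```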
Most of these articles (35) have been published within the decade 2000-2010, when the European Community decided to fund many projects on this topic in the Framework Programs; this opportunity favored the standardization of the exposure protocols (and the exposure systems) and therefore the quality and the homogeneity of the studies.
Given that the data extraction proceeded by separating the carcinogenesis articles from those of co-carcinogenesis, the papers dealing with both treatments were included in both groups, composed of 27 and 23 articles, respectively. The flow chart with the results of the bibliography acquisition process is shown in Figure 1.
Regarding the type of animals (POPULATION) employed in the selected papers, a total of 12 papers (30 treatment groups) described experiments performed on rats, and the remaining 15 papers (36 treatment groups) used mice (Figure 2a).
All studies using rats were carried out on 'wild type' strains (Fischer, Wistar or Sprague Dawley), whereas, with regard to the experiments on mice, 10 papers (18 treatment groups) reported experiments performed on prone mice, 3 papers (18 treatment groups) showed experiments on 'wild type' mice and 2 papers (10 treatment groups) reported experiments on both 'wild type' and prone animals (Figure 2b).
A total of 14 papers (42 treatment groups) described experiments on animals of both sexes, 11 papers (22 treatment groups) only on female animals, whereas two papers (two treatment groups) only on males.
[63] performed experiments at 2450 MHz (pulses of 10 μs, 800 pps), de Seze et al. [64] carried out experiments at 3700 MHz (pulses of 2.5 ns, 100 pps), Jauchem et al. [67] reported exposures to an Ultra-Wide Band signal (pulses of 2.5 ns, 1 kHz) and Toler et al. [74] presented exposures at 435 MHz (1 μs, 1 kHz) (Figure 3a). Moreover, only five papers (eight treatment groups) presented experiments with localized exposures of the animals' head (all with SAR values lower than 2 W/kg); the remaining papers (22 articles and 58 treatment groups) concerned experiments with whole body exposures.
Regarding the dose, SAR values ≤ 0.1 W/kg were used in 3 papers (8 treatment groups), SAR values in the interval 0.1 < SAR ≤ 2 W/kg were used in 19 papers (36 treatment groups), SAR values in the interval 2 < SAR < 6 W/kg were used in 9 papers (16 treatment groups) and, finally, SAR values greater than 6 W/kg were used in 4 papers (6 treatment groups) (Figure 3b).
Regarding the duration of exposure, 20 papers (57 treatment groups) reported LTE experiments, 5 papers (6 treatment groups) reported MTE experiments and only 2 papers (3 treatment groups) reported very short exposures (Figure 3c).
Moreover, 15 papers (38 treatment groups) reported experiments with daily exposures of less than 4 h, 11 papers (26 treatment groups) reported experiments with daily exposures greater than 12 h and only 1 paper (2 treatment groups) reported experiments with daily 6-h exposures.
Regarding the type of assessed OUTCOME measures, all papers reported the incidence data provided in terms of the number of animals developing cancer; 22 papers (50 treatment groups) also reported the survival data.

RoB of the Selected Papers
The results of the overall assessment of the RoB and the quality category of the carcinogenesis studies included in the analysis are reported in Table 2.

Incidence Analyses
A table for each organ/tumor was created from the summary table and, in agreement with most authors, some organs have been grouped according to the anatomical system: eye, harderian gland, ear, nose and mouth have been inserted into the sensorial system; prostate, testicles, glans and epididymis have been inserted into the male uro-genital system; uterus, ovaries and clitoris have been inserted into the female uro-genital system; brain and cranial nerves have been inserted into the central nervous system (CNS). These tables, containing "raw" data, are shown in S1.
Furthermore, considering the importance of the CNS and the brain, the latter was also analyzed separately; in addition, a detailed analysis of brain tumor type was carried out using data from the studies that detailed their typing.
After the definition of the groups for the meta-analysis on the basis of the organ/tumor, three more papers were excluded because their treatments differed substantially from those of the other papers:
• de Seze et al. [64]: 3.7 GHz pulsed signal administered for two 8-min intervals per day, 5 times per week for a total of 8 weeks;
• Jauchem et al. [67]: UWB signal administered for 12 min/week for a total of 12 weeks;
• Saran et al. [77]: 900 MHz GSM-modulated signal, administered for two 30-min periods per day for 5 days.
Furthermore, the article by Jin et al. [68] was also excluded from the meta-analysis, as it only reports inflammatory phenomena and does not detect the onset of tumors. A qualitative descriptive analysis of these papers is reported separately.
In S2 (one figure for each organ/tumor), all the data of the meta-analysis are shown: the "raw" incidence data, the effect size measure RR of each treated-sham comparison, with the relative variability limits, and the relative significance; the forest plot and the funnel plot for publication bias.
All the summary information on the possible increase in the risk of the onset of malignant and benign tumors, consequent to RF exposure, is reported in Tables 3 and 4 in terms of RR and RD, evaluated organ by organ. In addition to the results, the following data are reported for each sample (organ/tissue): the number of treated-sham comparisons (number of elements in the sample, column 2 of Tables 3 and 4), the number of papers from which the studies were extracted (column 3) and the ratio between the total number of exposed animals and the total number of sham animals (column 5). It was decided to insert in column 4 the information, supplementary to column 5, regarding the number of papers where more than two treatment groups share the same sham; moreover, the maximum number of treatment groups with the same sham was reported. For example, in the first line of Table 3 (Adrenals), out of 24 elements (treated-sham comparisons) extracted from 8 papers, four papers present different treatment groups (up to six) compared with a single sham: for 24 treatment groups (for a total of 3538 animals) there are only eight sham groups (for a total of 1166 animals). As can be seen in Tables 3 and 4, the number of sham animals was always much lower than the number of exposed animals.
(*1) Number of papers with single sham shared with more than two studies/Max number of treated groups sharing the same sham. The number of independent sham groups is generally equal to the number of papers; only 3 papers use 2 sham groups vs more than six exposed groups.
There was substantial agreement between the results obtained with the RR and RD summary effects. Almost all results were non-significant, with the exception of CNS, brain, heart and intestine for malignant tumors, and CNS, brain, male uro-genital system and kidney for benign tumors.
An apparent discrepancy should also be noted in the bone marrow results: although both RR and RD agree on a risk decrease, only RD is statistically significant. Another apparent discrepancy concerns the leukemia sample, where RR and RD disagree and only RD is significant (this disagreement is due to the algorithm for the calculation of RD when a "zero" value is present in the comparison).
The data relating to malignant tumors of heart and brain, which showed significance in the results of the meta-analysis, are analyzed in detail in Section 3.3.1.
A definitive synthesis of the incidence of (all) tumors in relation to RF exposure, providing a "sum" figure for malignant and a "sum" figure for benign tumors, was not possible. Despite a fair homogeneity in the exposure conditions (duration longer than one year, total-body exposure and SAR levels below 4 W/kg), most studies considered only a few organs, which resulted in a lack of homogeneity in the endpoint evaluation. The "sum" data obtained only from the studies presenting data for all organs would be more conclusive but affected by a high "attrition bias", so the incomplete input of the data would make them incorrect.

Heart and CNS/Brain Analysis
The results of the meta-analysis on tumor incidence in the heart and brain samples (CNS results are similar and are shown in Figure S2.5 in File S2), complete with forest plot and funnel plot, are shown in Figures 4 and 5, respectively.
Considering the importance of the brain as a target organ and the significance of the increased risk, it was decided to perform the analysis by tumor type according to the classification provided by most of the authors. Two papers [72,85] did not provide information on the type of tumors; the other ones (7 out of 9 papers, for a total of 17 comparisons out of a total of 26) provided classification criteria that allowed the grouping of tumors (both malignant and benign) into tumors of the glia and meninges. These samples were analyzed by considering separately the incidences of malignant glia and meninges tumors (including the granular cell tumor), and benign meninges tumors (this latter sample coincides with all the benign tumors, Figure S2.27 in File S2). The results of this detailed analysis are shown in Table 5.
As a further study, an analysis of only malignant tumors of the spinal cord (no benign tumors were found in this tissue) was also performed. The results are shown in Figure 6. In this tissue the combined RR is 1.441 with no significance (p = 0.162).

Subgroup Analysis
The subgroup analysis was performed for the covariates species and genetic background, because the samples had a fair degree of homogeneity in all the other elements. The results are presented in Tables 6 and 7, for malignant and benign tumors, respectively. For each sample, the "combined" RR with the relative p-between is reported. The subgroup analysis was carried out only for samples in which the treated-sham comparisons were derived from more than two articles.
There are no significant differences in the genetic background comparison for either malignant or benign tumors, whereas in the comparison between species, breast and spleen malignant tumors and skin benign tumors show significant differences.
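To make the notion of a between-subgroup p-value concrete, the sketch below compares two subgroup summary effects (for example rats vs. mice, or wild type vs. prone) on the log RR scale. It assumes that each subgroup has already been pooled into an estimate with a standard error, and that the between-subgroup heterogeneity statistic is referred to a chi-square distribution with one degree of freedom; this is a common textbook approach and is not claimed to be the exact procedure used in the review. The function name is hypothetical.

```python
import math


def p_between_two_subgroups(est_a, se_a, est_b, se_b):
    """Return (Q_between, p) for two pooled subgroup estimates (log RR scale)."""
    w_a, w_b = 1.0 / se_a ** 2, 1.0 / se_b ** 2          # inverse-variance weights
    overall = (w_a * est_a + w_b * est_b) / (w_a + w_b)  # weighted mean of subgroups
    q_between = w_a * (est_a - overall) ** 2 + w_b * (est_b - overall) ** 2
    # Chi-square survival function with 1 degree of freedom.
    p = math.erfc(math.sqrt(q_between / 2.0))
    return q_between, p


# Hypothetical subgroup summaries: log RR and standard error for each species.
q, p = p_between_two_subgroups(math.log(1.30), 0.20, math.log(0.95), 0.25)
print(q, p)
```

With only two subgroups, the statistic reduces to the squared difference of the two estimates divided by the sum of their squared standard errors, so wide subgroup confidence intervals make a significant p-between unlikely.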

Regression Analysis
In Table 8, the results of the regression analysis for the covariates dose and exposure time are reported for malignant tumors, whereas the results for benign tumors are shown in Table 9. The results are related to the RR variable and include the coefficient of the regression line, the related p-value and the R² (%). Samples with significant regression are in bold (Leukemia and Mammary). The regression analysis results do not provide useful elements to define a dose-effect or duration-effect relationship in any of the analyzed samples. The exposure-time significances reported in Tables 8 and 9 (see malignant breast cancer, leukemia and benign adrenal glands tumors (adrenals and thyroid)) are attributable to the wide range of the variable "duration of exposure" (546-25,000 h). Within this range, however, the occurrences are concentrated in a few repeated values. Indeed, these results do not show any correlation with the Summary Effect Size data (and the related significance).
To clarify this concept, the regression line of the Leukemia sample is shown (Figure 7): four elements have a maximum duration of a little more than 2000 h, the other 13 elements have durations greater than 13,700 h, and there are no intermediate "duration values". These data confirm the high discontinuity of exposure times and significantly reduce confidence in the obtained result.
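The effect of such a discontinuity can be sketched with a weighted regression of log RR on exposure duration. The example below is a simplified illustration, not the procedure used for Tables 8 and 9: the durations are rounded to two hypothetical values (2000 h and 14,000 h), and the log RRs and weights are invented. When only two distinct durations are present, the fitted line simply joins the weighted means of the two clusters, so the slope carries no information on the shape of a duration-effect relationship.

```python
def weighted_regression(x, y, w):
    """Weighted least-squares fit y = a + b*x; returns (a, b, r_squared)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    syy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return intercept, slope, r_squared

# Hypothetical data (assumption): 4 short-duration and 13 long-duration comparisons.
short_x = [2000.0] * 4
long_x = [14000.0] * 13
short_y = [0.30, 0.10, 0.25, 0.15]          # invented log RRs
long_y = [0.05, 0.00, 0.10, -0.05, 0.02, 0.08, -0.02,
          0.04, 0.06, 0.01, 0.03, -0.01, 0.07]
short_w = [10.0, 12.0, 8.0, 11.0]           # invented inverse-variance weights
long_w = [20.0] * 13

a, b, r2 = weighted_regression(short_x + long_x, short_y + long_y, short_w + long_w)
print(f"intercept = {a:.4f}, slope per hour = {b:.3e}, R^2 = {100 * r2:.1f}%")
```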

Survival Analysis
It was not possible to carry out the survival analysis by periods, or a cumulative survival analysis, due to the lack of time intervals common to all studies. The number of live animals at the end of the exposure period was taken as the survival variable; this variable refers to different periods because of the different exposure durations (from a few weeks up to 2 years). Studies that observed animals until death without reporting the survival data at the end of exposure were excluded from the analysis. An exception was [64], whose experimental protocol provided for an exposure time of a few minutes/day for 8 weeks (to ultra-broadband signals of high intensity), followed by an observation period of 2 years; in this case, the survival variable refers to the end of the observation period.
The overall meta-analysis was performed on 39 treated-sham comparisons and the RR was obtained with the same procedure used for the incidence analysis. The table of the meta-analysis results and the related forest plot are shown in Figure 8. In this case, the overall RR value is 1.08 (1.03-1.14).

Qualitative Summary of the Excluded Works from the Meta-Analysis
Saran et al. [77] exposed X-ray-sensitive transgenic mice (Patched1 heterozygous knockout mice) to 900 MHz at 0.4 W/kg. Previous results from the same group [86] demonstrated that these animals, when exposed to X-rays in the first days of life, have a significant incidence of medulloblastoma and rhabdomyosarcoma. Although exposure to RF occurred during the aforementioned sensitivity time window, the authors found no effects on carcinogenesis and survival.
Jauchem et al. [67] exposed C3H/HeJ female mice, a susceptible strain developing mammary tumors, to an ultra-broadband signal at a SAR of 0.0098 W/kg. The authors found no effects on the onset of breast tumors and on survival.
De Seze et al. [64] exposed male SD rats to ultra-wideband signals to 3.7 GHz at a SAR of 0.83 W/kg. The protocol included two exposures of 8 min/day, 5 days/week, for a total of 8 weeks; animals were observed up to 2 years of life. The authors observed a reduction in the survival of exposed animals (4 months over 2 years) and a significantly higher incidence of subcutaneous tumors in exposed animals compared to sham. This effect could be related to the high peak SAR value (>3 MW/kg) administered via a continuous pulse train of 2.5 ns for 2 intervals per day of 8 min each.

Quality Assessment (Confidence Ratings and Evidence of Health Effects)
According to the protocol [7], the evaluation of the quality of evidence started from a "high quality" grade and considered the eight items defined in the Methods paragraph; Tables 10 and 11 were obtained for malignant and benign tumors, respectively.

Design: Serious: most information is from ++ and +, but there are a few − (because in each sample there is a high number of shared sham groups); Very Serious: the shared sham group (from the NTP study) introduces an anomalous datum repeated as many times as the number of treated groups.
RoB: Some Concern: some studies show "−" in some relevant item; the Conflict of interest item is not considered.
Inconsistency: No if I² < 50%; Serious (−1) if I² > 50% (up to 75%).
Indirectness of Evidence: No (most information is from wild type rodents).
Imprecision: No, because of the high number of animals and because the boundaries of the CI of the pooled effect size are on the same side of the null value or the ratio between the CI interval and the null value (RR) is less than 110%.
Publication Bias: Yes: the authors declare the publication of incomplete data due to the fact that their data support the original data of the NTP study; no other study publishes data on the heart despite having analyzed it.

Discussion
In this systematic review, we summarized the current knowledge on carcinogenesis in laboratory animals exposed to electromagnetic fields, in the frequency range 100 kHz-300 GHz. For this purpose, we carried out a qualitative descriptive analysis of the 27 articles that were considered eligible on the basis of the exclusion criteria defined in the protocol [7]. It was feasible to carry out a meta-analysis of the possible increase in the risk of the onset of tumors on 23 of the eligible papers.
The in-depth reading of the papers with more than one treatment group highlighted that the number of sham animals is always lower than the number of exposed animals; therefore, in our analysis, each sham control was shared among multiple treatment groups. As already pointed out, this practice, although very common in in vivo studies, leads to an overestimation of events/non-events in sham controls, which can produce unreliable results in a meta-analysis aimed at assessing the risk of rare events, as in this review. In any case, a substantial agreement between the results obtained with the RR and RD variables was found, with the exception of leukemia for malignant tumors. Almost all results were non-significant, with the exception of CNS, brain, heart and intestine for malignant tumors, and CNS, brain, male uro-genital system and kidney for benign tumors.
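The over-counting caused by a shared sham group can be illustrated with a small sketch. One commonly described remedy, which was not applied in this review and is shown here only as an assumption, is to divide the shared sham group evenly among the comparisons that reuse it: the RR of each comparison is unchanged, but its variance increases, so the repeated sham no longer receives undue weight. All counts below are hypothetical.

```python
def split_shared_sham(sham_events, sham_total, n_comparisons):
    """Divide one sham group evenly across the comparisons that reuse it."""
    return sham_events / n_comparisons, sham_total / n_comparisons

def relative_risk(e_t, n_t, e_c, n_c):
    """Relative risk of treated vs. sham."""
    return (e_t / n_t) / (e_c / n_c)

def log_rr_variance(e_t, n_t, e_c, n_c):
    """Large-sample variance of log RR for one treated-sham comparison."""
    return 1.0 / e_t - 1.0 / n_t + 1.0 / e_c - 1.0 / n_c

# Hypothetical counts (assumption): one sham group reused by three treated groups.
sham_events, sham_total = 6, 90
treated = [(9, 90), (12, 90), (7, 90)]

split_events, split_total = split_shared_sham(sham_events, sham_total, len(treated))

for e_t, n_t in treated:
    rr_full = relative_risk(e_t, n_t, sham_events, sham_total)
    rr_split = relative_risk(e_t, n_t, split_events, split_total)
    var_full = log_rr_variance(e_t, n_t, sham_events, sham_total)
    var_split = log_rr_variance(e_t, n_t, split_events, split_total)
    print(f"RR {rr_full:.3f} -> {rr_split:.3f}; "
          f"var(log RR) {var_full:.3f} -> {var_split:.3f}")
```

Splitting does not solve the problem of zero events in the sham group, which requires separate handling.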
Based on these considerations, the samples that showed significant results in the meta-analysis deserve a detailed investigation. The significant results for benign brain tumors derive from nine treated-sham comparisons extracted from only two papers ([31,80]), with each showing two brain benign tumors in the sham groups out of 817 animals and 180 animals, respectively. In the treated-sham comparisons, these values are repeated three and six times, respectively, against the incidence values in the treated groups that ranged from three to eight in [80] and from one to five in [31] (Table S1.27 in File S1).
Regarding the significance found in the onset of benign male uro-genital system and kidney tumors, the presence of "zeros" in some exposed groups [31,32] is compared with the presence of three and four tumors in the sham groups for the male uro-genital system and with the presence of two and six tumors in the sham groups for the kidney (Tables S1.29 and S1.32 in File S1), leading to artifactual decreases in risk. The same consideration can be applied to the reduction in the risk of developing malignant tumors in the intestine, where the combined RR is 0.585 (0.399-0.857) with p < 0.01 (Table S1.11 in File S1 and Figure S2.11 in File S2).
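The consistency between a reported confidence interval and its p-value can be checked with a short calculation. The sketch below takes the intestine values quoted above (RR = 0.585, CI 0.399-0.857) and assumes that the interval is a 95% interval, symmetric on the logarithmic scale, and based on a normal approximation; these assumptions and the function name are ours and are not stated in the analysis.

```python
import math

def p_from_ci(rr, lower, upper, z_crit=1.959964):
    """Recover the SE of log RR, the z statistic and the two-sided p-value
    from an RR and its confidence interval (normal approximation)."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(rr) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return se, z, p

se, z, p = p_from_ci(0.585, 0.399, 0.857)
print(f"SE(log RR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the recovered p-value is below 0.01, in line with the value reported for the intestine sample.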
The increased risk of malignant heart tumors (Figure 4) was an expected result due to the data on heart schwannoma in [31,32,80], but it was not reported by any of the selected papers. In this meta-analysis, each organ/tumor sample included only the papers reporting at least one tumor in at least one of the comparison groups: many papers examined the heart histologically without finding any primary tumor ([63,65-67,74-76,79,81,83-85]) and their results were not included in the heart sample. As a consequence, the significance of the results of the heart sample can be attributed only to the data of [31] (six treated-sham comparisons with a single sham) and of [80] (three treated-sham comparisons with a single, larger sham group), whereas the incidences found in [32] (values of zero and one in all groups) can be attributed to randomness. The summary effect size measure is strongly affected by the presence of the "zero" in the two sham groups of [31,32] (repeated 12 times), although the authors consider it in line with the incidences found in the Historical Controls (0-2%).
Moreover, [80] shows a relevant incidence (12 vs. 4) only in animals exposed to the lowest SAR level (0.001 W/kg) among all the considered exposure doses. In addition, it should be considered that the data of this sample derive from only two studies and, therefore, the hypothesis of independence of the elements is much weaker than in the other samples (organs/tumors).
It should also be highlighted that:
• The studies of [31,32] are not peer-reviewed, and [80], by the authors' own admission, published only the data relating to the heart to support the results of [31], stating that the complete data (on the other organs) would be published at a later time. For these reasons the heart sample results are affected by publication bias;
• In this sample, a dose-effect response is not demonstrated, despite a very wide range of variability in SAR levels (from 0.001 W/kg to 6 W/kg) (see Table 8);
• The authors of [31,80] report no statistical significance of their results in the overall assessment (collecting the data of both sexes, i.e., excluding the differences between the sexes);
• The authors of [31], regarding the increase of malignant tumor onset in the heart of male rats, state: "In many cases isolated non-neoplastic or neoplastic lesion increases occurred in single or lower exposure groups, lacked a clear exposure response, or incidences were similar to incidences seen in control groups in past NTP studies. This reduced the confidence that these lesion increases were attributable to the cell phone RFR exposure."
All these considerations contribute to considerably reducing the suspicion of a direct correlation between exposure to RF and the increased risk of developing cardiac neoplasms.
The CNS and brain samples represent the target of greatest interest in all the carcinogenesis papers of this review (as many as 20 articles out of 27 concern the effects of mobile telephony). The slightly increased risk of malignant tumors in the CNS (RR = 1.405) and in the brain alone (RR = 1.392) was an unexpected result, as no in vivo carcinogenicity study has ever found statistically significant incidence data for brain and CNS tumors.
The significance of the data of this sample (9 papers, 26 treated-sham comparisons) is due to the weak positivity of most of the comparisons (Figure 5): 18 comparisons present a relative risk in a positive direction (RR > 1, corresponding to an increase in risk), whereas only eight have an RR ≤ 1.
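As a purely illustrative check of the direction of the comparisons, a two-sided sign test can be applied to the counts quoted above (18 comparisons with RR > 1 out of 26). This test is not part of the analysis of this review; it assumes independent comparisons, which is doubtful given the shared sham groups, and it treats all eight comparisons with RR ≤ 1 as negative (any comparison with RR exactly equal to 1 would normally be dropped).

```python
from math import comb

def sign_test_two_sided(n_positive, n_total):
    """Exact two-sided sign test against a 50/50 split of directions."""
    k = max(n_positive, n_total - n_positive)
    tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2.0 * tail)

p_sign = sign_test_two_sided(18, 26)
print(f"two-sided sign test p = {p_sign:.4f}")
```

The sign test considers only the direction of each comparison, not its magnitude or precision, so it is not a substitute for the pooled estimate.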
Moreover, in this case, the presence of the "zero" in the sham group of the [31] study (repeated six times) strongly influences the overall result so that, by removing the six treated-sham comparisons of [31] from the samples (both brain and CNS), the value of the summary RR decreases and loses the statistical significance. In this new condition, the brain sample, for example, consists of 20 comparisons, coming from eight papers, and has an RR = 1.267 [1.003-1.603] with p = 0.034.
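The kind of sensitivity check described above, in which all the comparisons of one study are removed and the summary RR is recomputed, can be sketched as follows. The sketch assumes fixed-effect inverse-variance pooling of log RRs, which may differ from the model used in this review; the study labels, log RRs and variances are hypothetical.

```python
import math

def pool_fixed(comparisons):
    """Inverse-variance pooled RR with 95% CI from (study, log_rr, variance) tuples."""
    weights = [1.0 / var for _, _, var in comparisons]
    total = sum(weights)
    mean = sum(w * y for w, (_, y, _) in zip(weights, comparisons)) / total
    se = math.sqrt(1.0 / total)
    return math.exp(mean), math.exp(mean - 1.96 * se), math.exp(mean + 1.96 * se)

def leave_study_out(comparisons, study):
    """Pooled RR after removing every comparison that belongs to one study."""
    kept = [c for c in comparisons if c[0] != study]
    return pool_fixed(kept)

# Hypothetical comparisons (assumption): study "A" has large, imprecise log RRs,
# as can happen when the sham group has zero events.
comparisons = [
    ("A", 0.90, 0.20), ("A", 0.80, 0.25),
    ("B", 0.10, 0.05), ("B", 0.05, 0.06),
    ("C", 0.15, 0.04),
]

rr_all, lo_all, hi_all = pool_fixed(comparisons)
rr_wo, lo_wo, hi_wo = leave_study_out(comparisons, "A")
print(f"all studies: RR = {rr_all:.3f} ({lo_all:.3f}-{hi_all:.3f})")
print(f"without A:   RR = {rr_wo:.3f} ({lo_wo:.3f}-{hi_wo:.3f})")
```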

Conclusions
This systematic review analyzed the experimental data extracted from 27 eligible articles regarding the onset of neoplasms in laboratory rodents exposed to EMF-RF; a quantitative analysis (meta-analysis) was conducted on 23 papers. Each study was examined for possible methodological limits and the RoB was evaluated.
A total of 25 organs/tumors were analyzed for malignant tumors and 16 for benign tumors to assess the confidence in the body of evidence of the carcinogenic effects. Starting from a "high quality" grade, a general feature for randomized in vivo studies [5], all items underwent a quality downgrade due to "serious" or "very serious" limitations in the experimental design, mainly caused by a low number of animals in sham groups. A further downgrade was determined by the classification of all studies as "some concerns" for bias, even without taking into account the conflict of interest.
The results obtained after the subgroup analysis by species (rats vs. mice) allowed an upgrade of the certainty of the evidence for many types of malignant and benign tumors. The lack of a dose-response relationship in all the analyzed samples did not allow for further upgrades.
Overall, these evaluations have determined a confidence rating from very low (heart sample for malignant tumors and CNS sample for benign ones) to moderate, resulting in inadequate or insufficient health evidence for a definitive assessment of the association between EMF-RF exposure and carcinogenesis in vivo.
This lack of certainty in the conclusions mainly derives from a very cautious GRADE approach, which does not appear entirely justified in this case given that the considered articles present a good homogeneity, both in the methods and in the results, providing adequate answers for the aims of this study. In this regard, it should be considered that, although in recent years the use of systematic reviews has been extended to experimental laboratory studies, the main guidelines [4,5] were developed with clinical trials in mind. The different approach of clinical and laboratory works has highlighted some methodological difficulties in the application of GRADE procedures, which could be analyzed further in order to improve the guidelines for future systematic reviews on animal studies.
Furthermore, it should be considered that the inclusion of only English-language papers may have represented a limitation of this systematic review.
In conclusion, the inadequate/insufficient health evidence found does not allow this systematic review to give additional information for the integration of present regulatory frameworks. Nevertheless, this review updates the state of the art of research on in vivo RF-EMF experiments related to carcinogenesis and, for future research in this field, it emphasizes the need for an appropriate experimental design that takes into account the animal number and the sample number used for the sham control groups.
Future work will be the update of this review as required in [4]; in fact, the question of this review is of continuing importance to decision makers and the availability of new data or new methods would have a meaningful impact on the review findings. Moreover, a review update provides an opportunity for the scope, eligibility criteria and methods used in the review to be revised.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijerph20032071/s1. File S1: Raw data in terms of malignant and benign tumor incidence for each considered organ/tumor; File S2: Incidence data and results of meta-analysis. Malignant tumors from Figure

Acknowledgments:
Authors thank Paola Giardullo for the support during protocol definition phase, paper selection and data extraction.

Conflicts of Interest:
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. C.M., R.P., L.A. and P.V. are salaried staff members of ENEA. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.