Genome-Wide DNA Methylation in Policemen Working in Cities Differing by Major Sources of Air Pollution

DNA methylation is the most studied epigenetic mechanism that regulates gene expression, and it can serve as a useful biomarker of prior environmental exposure and future health outcomes. This study focused on DNA methylation profiles in a human cohort comprising 125 nonsmoking city policemen (sampled twice) who lived and worked in three localities of the Czech Republic (Prague, Ostrava and Ceske Budejovice) and spent the majority of their working time outdoors. The localities, which differ by major sources of air pollution, were characterized by stationary air pollution monitoring of PM2.5, B[a]P and NO2. DNA methylation was analyzed by a genome-wide microarray method. No season-specific DNA methylation pattern was discovered; however, we identified 13,643 differentially methylated CpG loci (DML) for a comparison between the Prague and Ostrava groups. The most significant findings were the DML cg10123377 (log2FC = −1.92, p = 8.30 × 10⁻⁴) and the loci annotated to RPTOR (20 CpG loci in total). We also found two hypomethylated loci annotated to the DNA repair gene XRCC5. Groups of DML annotated to the same gene were linked to diabetes mellitus (KCNQ1), respiratory diseases (PTPRN2), and the dopaminergic system of the brain and neurodegenerative diseases (NR4A2). The most significant possibly affected pathway was Axon guidance, with 86 potentially deregulated genes near DML. The cluster of gene sets that could be affected by DNA methylation in the Ostrava groups mainly includes neuronal functions and the biological processes of cell junction and adhesion assembly. The study demonstrates that differences in the type of air pollution between localities can induce unique changes in DNA methylation profiles across the human genome.


Introduction
Long-term exposure to air pollutants has an adverse health impact and affects the genome and epigenome [1]. DNA methylation is one of the most studied epigenetic mechanisms that regulates gene expression and affects genome stability [2]. DNA methylation can serve as a useful biomarker of exposure, and an analysis of DNA methylation gives a better understanding of the effects of environmental exposure, as well as the role of epigenetic mechanisms, on human health [3,4]. CpG loci, as significant biomarkers of exposure, were summarized in a recent review [5]; they were further identified as a biomarker of tobacco smoke exposure in adults [6] and in smoking pregnant women [7,8], and were linked with alcohol consumption [9]. CpG loci were also found as environmental biomarkers for cumulative exposure to lead [10].
There is a wide range of methodological approaches for measuring the level of DNA methylation. A commonly used method is quantitative global DNA methylation, which represents an effective, fast and cheap solution [11,12]. However, this method cannot distinguish between important hypo- and hypermethylated sites. Other methods are used in epigenome-wide association studies (EWAS) [13]; these are currently mostly based on gene-specific genome-wide DNA methylation (microarray approaches).
The effect of air pollution has been widely studied in children, who are more sensitive than adults, and in newborns, to clarify the effect of prenatal exposure [14,15]. In these studies, the authors found an association between prenatal exposure to NO2 and particulate matter (PM) and significant CpG loci that regulate the activity of genes associated with asthma and other pulmonary diseases. DNA methylation alterations induced by benzo(a)pyrene (B[a]P) contributed partially to abnormal DNA methylation in air pollution-related lung cancer samples. These changes may affect the development and progression of lung cancer [16].
Czech studies focused on the impact of air pollution on the human genome have a long tradition [17]; a contributory factor is that one of the air pollution hot-spots in Europe is situated in the Moravian-Silesian district, in particular in the Ostrava region. The highest concentrations of B[a]P and fine PM < 2.5 µm (PM2.5) recorded in the Czech Republic are consistently detected in Ostrava [18]. Thus, inhabitants of this locality have often been studied in molecular epidemiological research [19]. DNA methylation profiles were examined in asthmatic and healthy children [20]. Whole-genome gene expression was assessed in newborns from districts with different levels of air pollution [21], and in city policemen [22,23] who worked in cities with different concentrations of air pollutants. The carcinogenic risk from ambient air inhalation was also assessed in inhabitants of industrial and non-industrial localities [24]. In the city of Ostrava, depending on the locality, the prevalence of respiratory diseases, such as asthma or recurrent airway inflammation, is higher than the national average [25–27].
This study is a continuation of a long-term research project on air pollution's impact on the human genome. In this part, using microarray EPIC BeadChips (Illumina), we evaluated the genome-wide gene-specific DNA methylation in city policemen from three localities in the Czech Republic: Ostrava, Prague and Ceske Budejovice (CB). The study can be regarded as both an occupational and environmental exposure investigation because the city policemen were living in the study localities and, simultaneously, they spent most of their working time outdoors, where they were mostly exposed to engine emissions from car traffic, and to pollutants from local industry. Here, we aimed to find specific DNA methylation patterns based on the quantitative data of the annotated 856,865 CpG sites, which could be characterized by work and life in environments affected by different types of air pollution.

Characteristics of Study Subjects
The data on 125 city policemen (250 samples) were obtained from detailed questionnaires describing personal information, exposure history and lifestyle factors. All subjects were nonsmoking males, who had been residents in the study cities for at least 3 years. Table 1 shows the most important data, characterizing the potential differences between the groups. The age spectrum of policemen covered the entire productive age, ranging from 21 to 63 years. No differences (ANOVA, p = 0.61) in age between the city groups were observed (40.4 vs. 39.4 vs. 38.0, respectively). The BMI of subjects covered normal weight to obese with similar distributions in all localities. Mean values were almost identical for all study localities (28.6 vs. 28.4 vs. 28.2, respectively) and without statistical significance (ANOVA, p = 0.85). In the last row of Table 1, the occupation duration is shown. Most of the subjects had a long occupation history as city policemen, with no significant differences between localities (ANOVA, p = 0.13; mean values are 13.94 vs. 11.78 vs. 10.22, respectively). Only six subjects in Prague, four subjects in Ostrava and four subjects in CB, have worked as policemen for less than 2 years and they have all completed at least secondary education. The health data in the questionnaires showed that all policemen were of normal health without specific dietary issues or excessive alcohol consumption.

Air Pollution Monitoring
The concentrations of air pollutants (Table 2A), used for the characterization of the air pollution in the study cities in the three-month period before sample collection, were obtained from the Czech Hydrometeorological Institute (CHMI; Annual tabular overview). Further details regarding measuring frequencies and stations are described in Section 4.2.
In Ostrava, the concentrations of PM2.5 were the highest during both seasons compared to other localities (p < 0.05 for Ostrava vs. others). Interestingly, the concentrations of PM2.5 in Prague and CB were comparable (p = 0.42 and p = 0.15 for winter and summer, respectively).
The concentrations of B[a]P did not significantly differ between Ostrava and Prague in winter (p = 0.29). However, it was not the mean but rather the median in the Ostrava and Prague winter that showed a greater difference (2.6 vs. 1.6). We found a significant difference between Ostrava and Prague in summer (p < 0.05), and also between Ostrava and CB (p < 0.05 for both seasons).
There were no differences in the NO2 concentrations in winter between Ostrava and other localities (p = 0.38 and p = 0.05 for Prague and CB, respectively). However, during the summer season, significant differences were found between Ostrava and both other localities (p < 0.05 for both Ostrava vs. Prague and Ostrava vs. CB). The highest NO2 concentration was observed in Prague in summer.
For a broader comparison of long-term exposure, we summarized the mean annual concentrations of these air pollutants in the last four years (Table 2B).
The concentrations of pollutants were relatively stable over the years. In 2019, the lowest levels of air pollution were noted in all cities compared to previous years. The differences between the localities correspond with the data shown in Table 2A. The highest concentrations of NO2 were detected in Prague when compared with Ostrava.

General DNA Methylation Profiling
First, we analyzed the proportion of blood cell types in samples grouped to localities to exclude the possible effects of sample collection. The results are shown in Figure A1 (Appendix A). The proportion of B cells, Tc cytotoxic cells (CD8T), neutrophils and natural killer cells had a stable, comparable profile in all groups. The results for the proportion of monocytes were on the border of significance. We found a significant variance (p < 0.05) in Th helper cells (CD4T) between Ostrava and CB. Due to the low number of samples from CB, we decided not to correct this discrepancy.
Before comparing DNA methylation profiles between the three localities (Ostrava, Prague and CB), we verified the effect of season (regardless of locality, N = 125 for each season) on the differences in DNA methylation. We did not observe any season-specific clustering of DNA methylation profiles, nor did we detect any differentially methylated CpG loci (DML) (Figure 1).
We used all 250 samples from both seasons for the comparison of DNA methylation profiles between localities. Based on the results of air pollution monitoring, Ostrava was considered as a polluted locality, while Prague and CB were used as control regions, although the localities have specific air quality profiles given by various sources of air pollution (Table 2A). The results of the DNA methylation pattern clustering, analyzed by PCA, and the cluster heatmap applied on significant DML for all compared groups, are shown in Figure 2. A total of 13,643 CpG loci (FDR, p < 0.01) were significantly differentially methylated (8859 hypermethylated and 4784 hypomethylated) when samples of (a) Ostrava-Prague were compared. A total of 31 CpG loci (FDR, p < 0.01) differed in samples of (b) Ostrava-CB (22 hypermethylated and 9 hypomethylated), while only 3 DML (FDR, p < 0.01) were found in samples of (c) Prague-CB (1 hypermethylated and 2 hypomethylated).
The CpG loci shown in Figure 2 were summarized in a basic overview, consisting of their relation to CpG islands (open sea, island, N/S shore and N/S shelf), their chromosomal localization, a reference gene, and a level of significance (presented in Table 3). From these loci, two (cg12088417 and cg27210166) for group (a) were hypomethylated, potentially affecting RPTOR, a gene encoding the regulatory-associated protein of MTOR complex 1. A strongly hypermethylated CpG locus (cg18843803, log2FC = 1.95), which can affect the protein-coding gene TSHZ3 (teashirt zinc finger homeobox 3), was identified in group (b). Only three differentially methylated loci were found in group (c). Strong hypomethylation of cg17265515 (log2FC = −1.59) can regulate the gene ERICH1 (glutamate-rich 1). All DML are listed in Table S1.
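The sign convention used in this section (positive log2FC for hypermethylation, negative for hypomethylation) together with the FDR cut-off of 0.01 can be sketched as a simple filter. This is only an illustration, not the authors' pipeline: the function name `classify_dml`, the record layout, the adjusted p-values and the locus `cg_hypothetical` are assumptions; only the three real locus IDs and their log2FC values are taken from the text, and they come from different comparisons.

```python
from typing import Iterable, List, Tuple

# (CpG ID, log2 fold change, FDR-adjusted p-value); layout is an assumption
Record = Tuple[str, float, float]


def classify_dml(
    records: Iterable[Record], fdr_threshold: float = 0.01
) -> Tuple[List[str], List[str]]:
    """Split significant loci into hypermethylated and hypomethylated lists."""
    hyper: List[str] = []
    hypo: List[str] = []
    for cpg, log2fc, adj_p in records:
        if adj_p >= fdr_threshold:
            continue  # not significant at the chosen FDR level
        if log2fc > 0:
            hyper.append(cpg)  # higher methylation in the first group
        elif log2fc < 0:
            hypo.append(cpg)  # lower methylation in the first group
    return hyper, hypo


# log2FC values are from the text; adjusted p-values are hypothetical.
example = [
    ("cg18843803", 1.95, 0.001),
    ("cg17265515", -1.59, 0.001),
    ("cg10123377", -1.92, 0.001),
    ("cg_hypothetical", 0.80, 0.20),  # invented locus that fails the cut-off
]
hyper, hypo = classify_dml(example)
```

Counting the lengths of the two returned lists gives the kind of hypermethylated/hypomethylated tallies reported for each comparison.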
We mostly found a different proportion of DML in CpG islands in group (b) compared to all evaluated CpG. In group (a), there were more Open Sea loci than in all evaluated CpG.
In group (a), in which we could annotate methylation regions, the location of DML was characterized in relation to putative promoter regions, 3′ UTR, exons, introns, etc. We detected a three-times-higher frequency of DNA methylation in the promoter regions than in all annotated genomic regions (Figure 3C). All proportions are shown in Figure 3A-C.

Differentially Methylated Groups of Loci and Geneset Enrichment
For the Ostrava-Prague comparison, we detected differentially methylated groups (DMG) with numerous DML. DMG with more than seven CpG are listed in Table S2. Some of those located in Island/Shore regions with high relevance to health impacts are shown in Table 4A and discussed in Section 3. For further analysis, we selected groups with more than 15 CpG sites (Table 4B). A total of 20 CpG sites were annotated to the RPTOR gene, which is also shown in Table 3, with the most differentially hypomethylated loci (cg12088417, cg27210166). COL23A1 encodes a transmembrane nonfibrillar collagen. It is potentially regulated by 17 DML, of which 15 are hypermethylated. We also found 16 CpG loci annotated to the gene that plays a key role in cardiac action potential (KCNQ1). PTPRN2, annotated to 16 mostly hypomethylated DML, encodes a protein tyrosine phosphatase, a major islet autoantigen in type-1 diabetes.
For 13,643 CpG loci in the Ostrava-Prague comparison, 5881 genes were annotated (based on ENTREZID). These genes were used for gene ontology and gene set enrichment (Figure 4A,B). We obtained significant Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways with more than 50 potentially regulated genes in pathways (Figure 4A), among which signaling cascades prevail. The significant potentially affected pathways (adj. p-value < 0.01) with the highest number of genes were the PI3K-Akt signaling pathway (hsa04151, 126 genes) and the MAPK signaling pathway (hsa04010, 110 genes). The most significant pathway (adj. p-value = 1.45 × 10⁻⁶) was Axon guidance (hsa04360, 86 genes). The genes usually formed an overlap of several biological pathways, creating higher functional units (biological processes). Functional clusters (Figure 4B) based on measuring distances between the genes were formed mainly for adherens junction assembly and neuronal functions and development, which covered more than 100 regulated genes in each individual process. Other significant pathways (adj. p-value < 0.05) are summarized in Table S3.

Discussion
We report a unique study of gene-specific genome-wide DNA methylation profiles in 125 policemen working in cities in the Czech Republic with various types of air pollution. The assessment of the effect of air pollution is not straightforward; therefore, we tried to reduce the impact of lifestyle factors as much as possible by selecting similar cohorts (Table 1). The study is part of a complex project, "Healthy Aging in an Industrial Environment", the main aim of which is to find specific biochemical, genetic and epigenetic biomarkers of effect, mainly caused by air pollution exposure in different localities and in various population groups. In policemen, we sampled the same cohort in two rounds (in spring and in autumn, at the end of the winter and summer seasons, respectively). The experimental design, based on the analysis of different seasons in Ostrava policemen, was used in a recent study of semen quality and sperm DNA integrity [30]. The authors showed that sperm chromatin damage and the percentage of immature sperm were highly sensitive to air pollution. We also assessed the effect of seasons on DNA methylation patterns, but as no effect and no differentially methylated loci were detected (Figure 1), we included all samples of the policemen (N = 250) in the analysis. We hypothesize that the short-term effects of different levels of air pollution in individual seasons do not influence the DNA methylation pattern. This assumption is based on the fact that there are periods in human life when DNA methylation profiles are susceptible to change (prenatal, early childhood and older age) [31], while DNA methylation is generally relatively stable. On the other hand, another study described the effects of a 24-h short-term exposure to PM2.5 on genome-wide DNA methylation, using the same platform (Illumina Infinium Human Methylation EPIC BeadChip) that we applied in our study [4]. However, the authors used controlled indoor exposure for a short time, and they detected only a small number of DML.
For the characterization of localities and impact of air pollution on DNA methylation patterns, we compared three pollutants in the three-month period before both sampling points, as in a previous study [20]. Moreover, we used annual mean concentrations for four consecutive years. In Ostrava, the highest levels of PM2.5 and B[a]P were detected, while PM2.5 concentrations were comparable in Prague and CB. No differences in NO2 concentrations in winter were observed between Ostrava and the other two localities. The highest long-term annual mean concentrations of NO2 were identified in Prague, where the traffic intensity and NO2 levels along/near roads showed positive correlation [32]. The authors of this study used passive samplers at 65 locations in Prague, and found that 32% of locations exceeded the EU annual limit of 40 µg/m³.
We concluded that the localities were mostly characterized by diverse types of air pollution, rather than by high differences in the concentration of air pollutants. The main cause of air pollution in Prague is road traffic, where intensities exceed all other cities in the Czech Republic. Ostrava is characterized by common winter inversions; the city is located close to the Polish border with heavy industry and coal mines producing high levels of air pollutants, along with local industrial emissions. However, air quality in the city has shown an improving trend over the last decades [18].
By comparing the proportion of blood cell types, we found a significant variance in CD4T between the Ostrava and CB samples (Appendix A). We hypothesize that the discrepancies between the groups may be related to low sample numbers in CB. On the other hand, the differences in cell proportion can also be an important regulatory mechanism in response to the environment [33,34].
After merging the sampling periods and through the analysis of 250 samples, we found 13,643 DML for the Ostrava-Prague comparison. Only 31 DML were detected for the Ostrava and CB comparison; we further observed three DML when the Prague and CB samples were compared. We detected almost twice as many hypermethylated DML as hypomethylated ones, and a three-times-higher frequency of promoter methylation than in all the annotated genomic regions. The loci in promoters are more biologically significant with the potential to affect gene expression [35].
Our previous study was focused on the whole-genome gene expression response to long-term exposure to air pollution in men from Ostrava and Prague during several seasons [22]. Although no clear relationship between concentrations of air pollutants and gene expression profiles was found, reduced gene expression was observed in the Ostrava men compared to the controls. In the Ostrava cohort, an increased expression of XRCC5 was found [23]. XRCC5 encodes the protein Ku80, a key player in non-homologous end joining (NHEJ), a DNA repair mechanism in which it binds DNA breaks [36]. In this study, we found two hypomethylated loci (cg23433242 and cg01633232; Table S1) annotated to XRCC5. Regardless of whether we found differential DNA methylation in other NHEJ genes as in [23], we can hypothesize that epigenetic modification of a single gene might be sufficient to affect DNA repair functions [37].
Comparing Ostrava and Prague, the most hypomethylated DML was found to be CpG locus cg10123377. This CpG locus was also reported in CD8+ T cells in a study of patients with systemic lupus erythematosus [38]. Two of the hypermethylated loci (cg11478607 and cg26946806), located in Island and S Shore, respectively, are annotated to GSTT1, which encodes a xenobiotic-metabolizing enzyme that plays an important role in human carcinogenesis [39] and impacts the markers of genotoxicity [40]. Single-nucleotide polymorphisms (SNPs) in GSTT1 were widely studied in relation to the interaction effects of long-term air pollution exposure on the risk of acute myocardial infarction and hypertension, with high susceptibility to air pollution being a promoter of coronary vulnerability [41].
Comparing Ostrava and CB, we found the strongly hypermethylated CpG locus (cg18843803), which can affect the protein-coding gene TSHZ3. This gene controls the development of diverse components of the circuitry required for breathing. The protein Teashirt 3 regulates the development of neurons involved in both the respiratory rhythm and airflow control [42]. The reduced expression of this gene and the consequent caspase upregulation may be correlated with the progression of Alzheimer's disease [43].
In more detail, we focused on groups of differentially methylated loci annotated to the same gene in Ostrava samples, compared to the Prague samples (above seven CpG in Table S2; and a selection of genes with the highest proportion of groups of DML in Table 4A,B).
The highest number of DML in a single group was annotated to the RPTOR gene (N = 20). RPTOR is a key component in the mTOR pathway, a cell-signaling pathway commonly deregulated in human cancer, which plays roles in mRNA translation, autophagy, cell growth and immune responses (it restricts proinflammatory and promotes anti-inflammatory responses) [44,45]. RPTOR responds to nutrient and insulin levels to regulate cell growth [46]. The deregulation of this gene and pathway is associated with various human diseases, including cancer and diabetes [47,48]. The mTOR pathway (hsa04150) was also found among the significantly affected KEGG pathways (Table S3); it contains 59 genes that are potentially regulated by DML.
The DMG with the second highest number of DML is annotated to COL23A1. Collagen XXIII is considered a biomarker for the detection and recurrence of non-small-cell lung carcinoma and the reappearance of prostate cancer [49,50]. A total of 16 CpG loci, mostly hypermethylated, are annotated to the KCNQ1 gene, which encodes the voltage-gated potassium channel required for the repolarization phase of the cardiac action potential [51]. Hypermethylation of KCNQ1 is associated with poor semen parameters or male infertility [52]. Another study showed, by identifying temporal differences in the imprinting status and methylation effects, that the intronic KCNQ1 locus mediates susceptibility to type-2 diabetes [53]. A major autoantigen associated with insulin-dependent diabetes mellitus [54], PTPRN2 (protein tyrosine phosphatase), was identified for the 16 mostly hypomethylated CpG loci. Methylation in PTPRN2 is associated with childhood asthma and chronic obstructive pulmonary disease in adulthood [55], and it is also related to residential proximity to major roadways in the placenta samples of pregnant women [56].
Nine hypermethylated CpG loci in the Shore/Island region regulate NR4A2, a member of the nuclear receptor family of intracellular transcription factors (adopted from GeneCards). NR4A2 plays a key role in the maintenance of the dopaminergic system of the brain [57], and it is also highly expressed in peripheral blood leukocytes [58]. NR4A2 is involved in autoimmune and neurodegenerative diseases, especially in Parkinson's disease, where the reduced expression of this gene in peripheral blood can be considered as a potential biomarker for diagnosis, and a promising approach to therapy [59–62]. Another nine hypermethylated CpG sites can reduce the expression of CDK2AP1, which encodes cyclin-dependent kinase 2-associated protein 1, a protein that is important in reducing cell proliferation, contributes to cell cycle termination, and is a known tumor suppressor [63].
The most significantly affected pathway, the Axon guidance (hsa04360), contains 86 potentially regulated genes. Changes in the expression or function of these proteins might induce pathological changes in neural circuits that predispose to, or cause, neurological diseases [64]. Several of the genes in this pathway, which were found close to demethylated loci, were identified in patients with early-onset Alzheimer's disease [65]. The largest cluster presented in Figure 4B gathered the most significant biological processes, with considerable potential gene regulation in neuronal functions. The second firm cluster includes the functions of cell junctions and adhesion assembly. These biological functions are potentially affected by DNA methylation in the Ostrava sample groups compared to Prague.
This study is the first to involve a large set of policemen from different air-polluted cities, in whom genome-wide gene-specific DNA methylation was evaluated. It is well known that DNA methylation is a useful biomarker of prior environmental exposure and future health outcomes. However, it can also be affected by many lifestyle factors and by lifetime exposure. Although we could not determine all these factors and separate a specific role of air pollution, we identified potential biomarkers that will be further studied. For a follow-up study, we have extensive questionnaire data on the complete exposure history of every individual, which will be linked with DNA methylation results to obtain more comprehensive outcomes in this molecular-epidemiological study.

Study Subjects
The study subjects were 125 male city policemen working in three cities in the Czech Republic (Prague, N = 55; Ostrava, N = 54; CB, N = 16), all involved in the project in two rounds in 2019: spring (March/April) and autumn (September/October). The cohorts of policemen were chosen as groups that spend most of their working time outdoors. Different cities are characterized, among other factors, by different levels and types of air pollution. Prague is the capital of the Czech Republic and its most densely populated area. Road traffic most significantly contributes to the level of air pollution in Prague. Ostrava is the third largest city, with a long history of coal mining and heavy industry. To date, the Ostrava region is known as one of the European hot spots of air pollution due to its geographical location close to the industrial region of Poland, frequent inversions in winter, heavy traffic and local industrial emissions. The third city, CB, was usually selected as a control locality in our previous studies [21,66,67], due to its close proximity to the Sumava National Park and the large agricultural area in the district (more details given in Table 1).

Air Pollution Monitoring
Daily concentrations of selected air pollutants (PM2.5, B[a]P and NO2) during both three-month periods before sampling (winter and summer), as well as the annual averages from 2016 to 2019, were obtained from the Annual tabular overview, CHMI (http://portal.chmi.cz/?l=en (accessed on 27 January 2022)). Data were acquired by automatic air pollution monitoring in each city. The stations were located representatively for patrol activities of city policemen. In the city of Ostrava, we used the main CHMI station in Ostrava-Poruba located in a residential area close to a gas station. In Prague 5, we selected two stations: for B[a]P and NO2 monitoring, a station in Prague-Reporyje situated in a school garden; for PM2.5, a station in Prague-Stodulky placed in a housing estate. In CB, the stations were located in urban and residential areas. The measurement frequency was daily in the case of PM2.5 and NO2. For B[a]P, the data were obtained twice a week.

DNA Methylation Analysis
A total of 250 samples of the venous blood of 125 policemen (every policeman provided samples in two sampling periods) were collected into vacuettes containing ethylenediaminetetraacetic acid (EDTA), and frozen at −20 °C for later use. Genomic DNA (gDNA) was extracted using Miller's salting out method [68]. gDNA (1000 ng) was treated overnight with sodium bisulfite using the EZ DNA Methylation Kit (Zymo Research, Irvine, CA, USA) for the conversion of unmethylated cytosines to uracils, while methylated cytosines remained unchanged. The bisulfite-converted DNA (BCD) samples were stored at −20 °C until use. BCD was processed using the Infinium Methylation EPIC Kit (Illumina, San Diego, CA, USA) according to the manufacturer's protocol (Infinium HD Methylation Assay Protocol) including enzymatic fragmentation, precipitation and hybridization, followed by BeadChip washing and staining. Each BeadChip held 8 samples. The chips allowed the detection of over 850,000 methylation sites per sample across the genome at single-nucleotide resolution. The methylation status at each CpG site, scanned by the iScan System (Illumina, San Diego, CA, USA), was estimated by measuring the intensity of the pair of methylated and unmethylated probes.

Statistical Analysis
The descriptive statistics of epidemiological data from questionnaires (age, BMI, exposure history), air pollution (differences in concentrations) and cell type proportion were calculated using R-stats. Depending on the distribution of the data, the t-test or the nonparametric Mann-Whitney U-test was used for the comparison of individual groups, and ANOVA was used for factor analysis. All advanced statistical analyses related to methylation were processed using scripting in an R environment.
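The ANOVA comparisons of age, BMI and occupation duration between the three city groups rest on the one-way F statistic, the ratio of between-group to within-group variance. The sketch below is a minimal standard-library illustration of that statistic only; it is not the R code used in the study, the function name `one_way_anova_f` is hypothetical, and converting F to a p-value would additionally require the F distribution (for example from a statistics library).

```python
from statistics import fmean
from typing import List, Sequence


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """Return the one-way ANOVA F statistic for two or more groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")

    all_values: List[float] = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    group_means = [fmean(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Toy data, not study data: three groups with shifted means.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

A large F indicates that the group means differ more than expected from the spread within groups; values near 1 are consistent with no difference, as reported for age and BMI.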
Raw microarray data were downloaded as idat files, imported to the R environment and processed with the minfi package [69]. Data were normalized using the quantile method. A series of filtering steps was performed. Probes with SNPs at CpG sites and the cross-reactive probes were also excluded, resulting in 794,441 probes [70]. We estimated associations between principal components and slide factors, and used the ComBat function (sva package) for batch correction [71].
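The idea behind quantile normalization can be illustrated in a few lines: every sample is forced onto the same reference distribution, defined as the mean of the sorted values across samples. The sketch below shows only this generic technique; the quantile routine in minfi is more elaborate, and the function name `quantile_normalize` and the toy intensities are assumptions made for illustration.

```python
from typing import List, Sequence


def quantile_normalize(columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Quantile-normalize samples given as equal-length columns of values."""
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all samples must have the same number of probes")

    # Reference distribution: mean of the r-th smallest value across samples
    sorted_cols = [sorted(col) for col in columns]
    reference = [
        sum(sc[r] for sc in sorted_cols) / len(columns) for r in range(n)
    ]

    normalized: List[List[float]] = []
    for col in columns:
        # Rank the probes within the sample (ties keep their original order)
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # replace value by the reference quantile
        normalized.append(out)
    return normalized


# Two toy samples with three probes each (hypothetical intensities).
result = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization both samples contain exactly the same set of values, only in a different probe order, which removes sample-wide intensity shifts before the methylation levels are compared.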
Beta values for the determination of the level of methylation as the ratio of the fluorescent signals from the methylated vs. unmethylated sites were also calculated using the minfi package. Preprocessing analyses were performed to study the distribution of beta values and the variation of methylation across all samples.
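As a worked illustration of the beta value, the sketch below computes the methylated fraction of the total signal, M / (M + U), and the related M-value, log2(beta / (1 − beta)), which is often used for statistical testing. The function names and the optional `offset` parameter are assumptions for illustration (some pipelines add a small constant to the denominator to stabilize low-intensity probes); the calculation in the study was done with minfi.

```python
import math


def beta_value(meth: float, unmeth: float, offset: float = 0.0) -> float:
    """Methylation level as the methylated fraction of the total signal."""
    total = meth + unmeth + offset
    if total <= 0:
        raise ValueError("total signal intensity must be positive")
    return meth / total


def m_value(beta: float) -> float:
    """Logit (base 2) transform of a beta value strictly between 0 and 1."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


# Hypothetical probe intensities: 3000 methylated, 1000 unmethylated.
b = beta_value(3000.0, 1000.0)
m = m_value(b)
```

A beta value of 0 corresponds to a fully unmethylated site and 1 to a fully methylated one, so the example probe would be read as 75% methylated.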
PCA was performed on the covariance matrix to identify the variance, and to detect the potential effects of season and locality. We identified differentially methylated loci using the topTable function (limma package) [72]. To control the false discovery rate (FDR) under multiple testing, the p-values for the contrast of interest were adjusted, and loci with an adjusted p-value < 0.01 were considered significant; this threshold is regarded as the most appropriate for microarray analysis [73].
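The FDR adjustment can be sketched with the Benjamini-Hochberg procedure. The text does not name the adjustment method, so Benjamini-Hochberg is an assumption here (it is the usual default for limma's topTable); the function name `bh_adjust` and the example p-values are hypothetical.

```python
from typing import List, Sequence


def bh_adjust(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four loci.
raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
significant = [p < 0.01 for p in adj]
```

With these toy values the adjusted p-values are 0.02, 0.04, 0.04 and 0.02, so none of the four loci would pass the 0.01 threshold even though one raw p-value is below it.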
The proportions of genomic regions to gene positions were analyzed using the annotatr package [74]. An annotation of the CpG site to ENTREZID, plots of the KEGG pathways and an enrichment map were obtained using clusterProfiler package v4.0 [75]. The proportions of blood cell types presented in Appendix A were calculated using the ENmix package [76].
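Pathway over-representation of the kind reported for the KEGG analysis is commonly assessed with a one-sided hypergeometric test: given a gene universe, a pathway of known size and a list of selected genes, it gives the probability of seeing at least the observed overlap by chance. The sketch below assumes this test (the text does not state which one was used), and the function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(
    universe: int, pathway_size: int, selected: int, overlap: int
) -> float:
    """P(X >= overlap) when `selected` genes are drawn from `universe`
    genes, of which `pathway_size` belong to the pathway."""
    if not (0 <= pathway_size <= universe and 0 <= selected <= universe):
        raise ValueError("pathway_size and selected must not exceed universe")
    upper = min(pathway_size, selected)
    total = comb(universe, selected)
    # Sum the exact probabilities of every overlap at least as large as observed
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 genes in total, 4 in the pathway, 3 selected, 2 overlapping.
p = hypergeom_enrichment_p(universe=10, pathway_size=4, selected=3, overlap=2)
```

In a real analysis one such p-value is computed per pathway and the resulting set is then adjusted for multiple testing, which is what the adjusted p-values quoted for the KEGG pathways reflect.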

Conclusions
This study focused on the comparison of DNA methylation profiles in city policemen working and living in localities that differ in terms of major sources of air pollution. The sampling was conducted repeatedly in two seasons (spring and autumn in the same year). The obtained results clearly demonstrated that there was no effect of season, which corresponds with relatively slow changes in DNA methylation settings in adults. On the other hand, the effects of the different localities with various exposure profiles were clearly visible. For genetic toxicology, the results indicating differences in xenobiotic metabolism, DNA repair and neurodevelopment pathways can be considered the most significant. However, the associations of DNA methylation changes with various diseases, particularly diabetes mellitus or neurodegenerative or respiratory diseases, are often reported. Our observations support the hypothesis that epigenome modification is an effective process for the optimal management of the genome function in response to various types of environments, but it is also a risk factor for future disease development.
We also appreciate the technical and laboratory staff (Vera Brezinova, Jolana Vankova, Zuzana Novakova, Vlasta Svecova and Olga Stverakova) for their support working on recruitment, questionnaires and the sampling of biological materials.

Conflicts of Interest:
The authors declare no conflict of interest.