Distribution of CRISPR in Escherichia coli Isolated from Bulk Tank Milk and Its Potential Relationship with Virulence

Simple Summary
On dairy farms in many countries, E. coli is one of the most common causes of mastitis. Such strains are defined as mammary pathogenic E. coli and are known to cause opportunistic infections by possessing diverse virulence factors. Therefore, the purpose of this study was to investigate the virulence potential of E. coli isolates from bulk tank milk in Korea and to examine its association with clustered regularly interspaced short palindromic repeat (CRISPR) arrays. The results showed that out of 183 isolates, 164 (89.6%) possessed one or more of 18 virulence genes, and these belonged to phylogenetic groups B1 (64.0%), A (20.1%), D (8.5%), and C (7.3%). CRISPR arrays of E. coli are classified as either CRISPR I-E (CRISPR 1 and 2) or CRISPR I-F (CRISPR 3 and 4). In this study, only CRISPR 1 (95.7%) and 2 (74.4%) were detected. Among the eight protospacers matching plasmids and phages, three were associated with gene regulation, and one was associated with virulence. Moreover, the different virulence genes showed significantly different patterns of CRISPR distribution and CRISPR sequence types. This result implies that CRISPR loci may be associated with gene regulation and pathogenicity in E. coli, and that the CRISPR sequence-typing approach can help to clarify and trace virulence potential, even though the E. coli isolates were from normal bulk tank milk.

Abstract
Escherichia coli is one of the most common causes of mastitis on dairy farms around the world, but its clinical severity is determined by a combination of virulence factors. Recently, clustered regularly interspaced short palindromic repeat (CRISPR) arrays have been reported as a novel typing method because of their usefulness in discriminating pathogenic bacterial isolates. Therefore, this study aimed to investigate the virulence potential of E. coli isolated from bulk tank milk, not from mastitis, and to characterize its pathogenicity using the CRISPR typing method.
In total, 164 (89.6%) out of 183 E. coli isolated from the bulk tank milk of 290 farms carried one or more of eighteen virulence genes. The most prevalent virulence gene was fimH (80.9%), followed by iss (38.3%), traT (26.8%), ompT (25.7%), afa/draBC (24.0%), and univcnf (21.9%). Moreover, the phylogenetic group with the highest prevalence was B1 (64.0%), followed by A (20.1%), D (8.5%), and C (7.3%) (p < 0.05). Among the four CRISPR loci, only two, CRISPR 1 and CRISPR 2, were found. Interestingly, the prevalence of CRISPR 1 was significantly higher than that of CRISPR 2 in groups A and B1 (p < 0.05), but there were no significant differences in groups C and D. The prevalence of CRISPR 1 by virulence gene ranged from 91.8% to 100%, whereas that of CRISPR 2 ranged from 57.5% to 93.9%. The prevalence of CRISPR 1 was significantly higher than that of CRISPR 2 in isolates carrying fimH, ompT, afa/draBC, and univcnf (p < 0.05). Among 26 E. coli sequence types (ESTs), the most prevalent was EST 22 (45.1%), followed by EST 4 (23.2%), EST 16 (20.1%), EST 25 (19.5%), and EST 24 (18.3%). Interestingly, four genes, fimH, ompT, afa/draBC, and univcnf, had a significantly higher prevalence in both EST 4 and EST 22 (p < 0.05). Among the seven protospacers derived from CRISPR 1, protospacer 163 had the highest prevalence (20.4%), and it was found only in EST 4 and EST 22. This study suggests that the CRISPR sequence-typing approach can help to clarify and trace virulence potential, although the E. coli isolates were from normal bulk tank milk and not from mastitis.


Introduction
Escherichia coli is one of the most common Gram-negative bacteria residing in the intestines of animals as a facultative anaerobe [1]; however, it is also one of the most common causes of mastitis on dairy farms [2]. Generally, E. coli mastitis results in a subclinical pathology caused by an environmental opportunistic infection. However, the presence of diverse virulence properties associated with extraintestinal pathogenesis, such as adhesins, toxins, capsule synthesis, siderophores, invasins, and serum survival, is reported to be crucial for the colonization of the mammary glands via increased bacterial survival and tissue damage [3,4]. In particular, Aslam et al. (2021) [5] reported that the presence of various virulence genes in extraintestinal pathogenic E. coli (ExPEC) contributed to the rise in mammary pathogenic E. coli (MPEC).
Moreover, the virulence potential necessary to cause an infection of the mammary glands is determined by a combination of factors, not the presence of a single factor [6]. Hence, phylogenetic analysis is important because it improves the understanding of classification and helps to determine the virulence of pathogenic E. coli [7,8]. E. coli is divided into phylogenetic groups A, B1, B2, C, D, E, and F [9], and the majority of strains responsible for ExPEC, such as uropathogenic E. coli (UPEC), newborn meningitis-associated E. coli, and avian pathogenic E. coli, belong to groups B2 and D [10,11]. However, even though MPEC can cause infections outside of the gastrointestinal system, both MPEC and bovine commensals belong to phylogroups A and B1, because MPEC may be recruited from the normal intestinal commensal microbiota [12–14].
Clustered regularly interspaced short palindromic repeat (CRISPR) arrays are part of a bacterial adaptive immune system that neutralizes invading phages and plasmids by cutting foreign DNA at specific locations. An array consists of repeats separated by various spacers, which are short sequences between each repeat [15–17]. A protospacer, which is a short foreign sequence at a specific location, is inserted as a spacer in the CRISPR loci of bacteria during an infection [18]. Recently, CRISPR arrays have been applied as a novel typing method for isolates because they are useful in discriminating the pathogenicity of Salmonella [19], E. coli [20–25], and Pseudomonas aeruginosa [26]. This study aimed to investigate the virulence potential of E. coli isolated from bulk tank milk, not from mastitis, and to characterize the pathogenicity of E. coli using the CRISPR typing method.
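The repeat and spacer layout described above can be illustrated with a minimal sketch. It assumes perfectly conserved repeats and uses a made-up 8-nt repeat and made-up spacers; the function name `extract_spacers` and all sequences are hypothetical and are not taken from the isolates in this study. Real arrays often carry degenerate repeats, so dedicated CRISPR-finding tools are normally needed.

```python
def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive copies of `repeat`."""
    parts = array_seq.upper().split(repeat.upper())
    # parts[0] is the leader before the first repeat and parts[-1] is the
    # trailer after the last repeat; only the inner pieces are spacers.
    return [p for p in parts[1:-1] if p]


# Toy data (hypothetical): one short repeat and three invented spacers.
REPEAT = "GTTCCCCG"
toy_array = (
    "AAAT" + REPEAT + "ACGTACGTAC" + REPEAT + "TTGACCAGTA"
    + REPEAT + "GGATCCATGC" + REPEAT + "TTTA"
)

spacers = extract_spacers(toy_array, REPEAT)
print(spacers)
```

The ordered list of spacers recovered this way is the raw material for the typing approach used later in the paper, since isolates can then be compared by their spacer content.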

Bacterial Strains
A 50 mL sample of bulk tank milk was aseptically collected from each of 290 farms operated by three dairy companies, and tested in accordance with the standard microbiological protocols published by the Ministry of Food and Drug Safety (2018) [27]. Approximately 1 mL of each bulk tank milk sample was inoculated into 9 mL of modified Escherichia coli broth (mEC; Merck, Darmstadt, Germany) and incubated at 37 °C for 24 h. A loopful of enriched mEC was streaked onto MacConkey agar (BD Bioscience) and incubated at 37 °C for 24 h. Three typical colonies selected from each sample were confirmed by PCR, as described previously [28]. If isolates of the same origin showed the same antimicrobial susceptibility patterns, only one isolate was randomly chosen, and a total of 183 E. coli isolates were included in this study.

Phylogenetic Groups
Phylogenetic grouping was performed by a multiplex PCR-based method using the chuA, yjaA, TspE4.C2, arpA, and trpA genes, and the isolates were assigned to groups A, B1, B2, C, D, E, F, or clade I, as described previously [9].
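The assignment step can be sketched as a lookup on the presence or absence of the four quadruplex PCR targets, followed by group-specific confirmation PCRs for ambiguous profiles. The sketch below is a simplified reading of the lookup table commonly given for the scheme cited as [9] and should be checked against that reference before use. The function name and the parameters `c_specific` and `e_specific` (results of the follow-up C-specific and E-specific PCRs) are hypothetical, and band-size-dependent calls for cryptic clades III to V are not modelled.

```python
def assign_phylogroup(arpa, chua, yjaa, tspe4c2, c_specific=None, e_specific=None):
    """Assign a phylogroup from presence (True) or absence (False) of PCR products."""
    profile = (arpa, chua, yjaa, tspe4c2)
    direct = {
        (True, False, False, False): "A",
        (True, False, False, True): "B1",
        (False, True, False, False): "F",
        (False, True, True, False): "B2",
        (False, True, True, True): "B2",
        (False, True, False, True): "B2",
    }
    if profile in direct:
        return direct[profile]
    if profile == (True, False, True, False):
        # Ambiguous between A and C until the C-specific PCR is run.
        if c_specific is None:
            return "A or C"
        return "C" if c_specific else "A"
    if profile in {(True, True, False, False), (True, True, False, True)}:
        # Ambiguous between D and E until the E-specific PCR is run.
        if e_specific is None:
            return "D or E"
        return "E" if e_specific else "D"
    if profile == (True, True, True, False):
        if e_specific is None:
            return "E or clade I"
        return "E" if e_specific else "clade I"
    if profile == (False, False, True, False):
        return "clade I or II"
    return "unknown"


print(assign_phylogroup(True, False, False, True))
print(assign_phylogroup(True, False, True, False, c_specific=True))
```

Keeping the unambiguous profiles in a dictionary and the ambiguous ones in explicit branches makes it easy to see which calls depend on a second PCR.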

Statistical Analysis
Statistical analysis was performed using the Statistical Package for the Social Sciences version 25 (IBM SPSS Statistics for Windows, Armonk, NY, USA). Pearson's chi-square test and Fisher's exact test with Bonferroni correction were conducted to analyze the differences associated with the distribution of virulence genes, phylogenetic groups, and ESTs. A p-value < 0.05 was considered statistically significant.
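For readers without SPSS, the kind of comparison described here can be approximated with a short, dependency-free sketch of a two-sided Fisher's exact test plus Bonferroni adjustment. This is not the authors' SPSS procedure. The counts 157/164 and 122/164 are back-calculated from the reported CRISPR 1 (95.7%) and CRISPR 2 (74.4%) prevalences and are therefore assumptions, and treating the two loci as independent groups is a simplification. The function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as unlikely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Assumed counts: CRISPR 1 present in 157 of 164, CRISPR 2 present in 122 of 164.
p_crispr = fisher_exact_two_sided(157, 164 - 157, 122, 164 - 122)
print(p_crispr < 0.05)
# Illustrative adjustment across three hypothetical comparisons.
print(bonferroni([p_crispr, 0.03, 0.5]))
```

Bonferroni correction is conservative: a raw p-value of 0.03 across three comparisons is adjusted to 0.09 and would no longer meet the 0.05 threshold.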

Distribution of Virulence Genes
The distribution of 33 virulence genes associated with ExPEC in E. coli from bulk tank milk is presented in Table 1. A total of 164 (89.6%) out of 183 E. coli isolated from bulk tank milk carried one or more of eighteen virulence genes. The most prevalent virulence gene was fimH (80.9%), followed by iss (38.3%), traT (26.8%), ompT (25.7%), afa/draBC (24.0%), and univcnf (21.9%). Interestingly, both iss and traT had a significantly higher prevalence in E. coli from company A than in those from the other companies, and fimH had a significantly higher prevalence in E. coli from companies A and B (p < 0.05). Although kpsMTK5 had a low prevalence (7.1%) in E. coli, this gene also showed a significant difference among the companies (p < 0.05).

Distribution of Phylogenetic Groups and CRISPR Loci
The distribution of phylogenetic groups and CRISPR loci in the 164 E. coli isolates carrying at least one virulence gene is presented in Table 2. All isolates were assigned to four phylogenetic groups: A, B1, C, and D. The phylogenetic group with the significantly highest prevalence was B1 (64.0%), followed by A (20.1%), D (8.5%), and C (7.3%). Although the distribution of groups B1 and C was not significantly different by company, that of groups A and D showed significant differences between companies (p < 0.05). Among the four CRISPR loci examined, only two, CRISPR 1 and CRISPR 2, were found. The prevalence of CRISPR 1 (95.7%) was significantly higher than that of CRISPR 2 (74.4%) (p < 0.05), but no significant differences in CRISPR prevalence were observed between the companies.
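As a quick arithmetic check on the percentages above, the isolate counts can be back-calculated from the reported values and the denominator of 164. The counts produced below are derived, not reported in the text, and the helper name `count_from_percent` is hypothetical.

```python
def count_from_percent(percent: float, total: int) -> int:
    """Nearest whole number of isolates matching a reported percentage."""
    return round(percent * total / 100)


TOTAL = 164  # isolates carrying at least one virulence gene
reported = {"B1": 64.0, "A": 20.1, "D": 8.5, "C": 7.3}

counts = {group: count_from_percent(pct, TOTAL) for group, pct in reported.items()}
print(counts)
print(sum(counts.values()))

# Round-trip: the derived counts should reproduce the reported percentages.
recomputed = {group: round(100 * n / TOTAL, 1) for group, n in counts.items()}
print(recomputed == reported)
```

The derived counts sum to the full set of 164 isolates, which is consistent with every isolate having been assigned to exactly one of the four groups.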
coli isolated from each company. Values with different subscript letters represent significant differences among companies, while superscript letters represent significant differences in total (p < 0.05).

Distribution of CRISPR 1 and CRISPR 2 by Phylogenetic Group
The association of CRISPR 1 and CRISPR 2 with phylogenetic groups of E. coli is shown in Figure 1. The prevalence of CRISPR 1 by phylogenetic groups ranged from 75.0% to 98.1%, whereas that of CRISPR 2 ranged from 60.6% to 92.9%. Interestingly, the distribution of CRISPR 1 and CRISPR 2 in the phylogenetic groups showed significant differences. The distribution of CRISPR 1 was significantly higher in groups A and B1 than that of CRISPR 2 (p < 0.05). The prevalence of CRISPR 1 and CRISPR 2 showed no significant differences in groups C and D.

Distribution of CRISPR 1 and CRISPR 2 by Virulence Genes
The association of CRISPR 1 and CRISPR 2 with six common virulence genes of E. coli is shown in Figure 2. The prevalence of CRISPR 1 by virulence gene ranged from 91.8% to 100%, whereas that of CRISPR 2 ranged from 57.5% to 93.9%. The distribution of CRISPR loci also showed differences by virulence gene. The distribution of CRISPR 1 was significantly higher in fimH, ompT, afa/draBC, and univcnf genes than that of CRISPR 2 (p < 0.05). Moreover, iss and traT genes showed equally high distributions, and no significant differences between CRISPR 1 and CRISPR 2.

CRISPR-Based Typing of Virulence Gene-Carrying Isolates
The distribution of ESTs by six common virulence genes of E. coli is presented in Table 3. A total of 26 ESTs were assigned based on the distribution of spacers from CRISPR 1 and CRISPR 2. The most prevalent EST was EST 22 (45.1%), followed by EST 4 (23.2%), EST 16 (20.1%), EST 25 (19.5%), and EST 24 (18.3%). Interestingly, four genes, fimH, ompT, afa/draBC, and univcnf, had a significantly higher prevalence in both EST 4 and EST 22, while iss and traT genes had a significantly lower prevalence than the other four genes in both EST 4 and EST 22 (p < 0.05).

Table 3. CRISPR-based typing by virulence gene in 164 E. coli possessing virulence genes, isolated from bulk tank milk.
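One generic way to derive sequence types from spacer content is to give every distinct combination of CRISPR 1 and CRISPR 2 spacers its own type number. The sketch below shows only that exact-profile grouping. Whether the ESTs in this study were defined in exactly this way is not stated in this section, so it should be read as one possible interpretation. The isolate names and spacer labels are invented.

```python
def assign_sequence_types(isolates):
    """Map each isolate to a type number shared by isolates with identical spacer content."""
    type_of_profile = {}
    result = {}
    for name, (crispr1, crispr2) in isolates.items():
        # The ordered spacers of both loci together form the typing profile.
        profile = (tuple(crispr1), tuple(crispr2))
        if profile not in type_of_profile:
            type_of_profile[profile] = len(type_of_profile) + 1
        result[name] = type_of_profile[profile]
    return result


# Hypothetical isolates: (CRISPR 1 spacers, CRISPR 2 spacers).
toy_isolates = {
    "isolate_1": (["s1", "s2", "s3"], ["t1"]),
    "isolate_2": (["s1", "s2", "s3"], ["t1"]),
    "isolate_3": (["s1", "s4"], []),
}

types = assign_sequence_types(toy_isolates)
print(types)
```

Type numbers here simply follow the order in which profiles are first seen, so they carry no biological meaning and would not match the EST numbering in Table 3.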

Discussion
Mastitis caused by E. coli is one of the most frequent diseases in dairy cattle resulting from environmental infection, and it is usually characterized by changes in milk composition and quality [1,2]. Although the relationship between the virulence factors of E. coli causing bovine mastitis and clinical severity has not been fully elucidated, many studies have reported the influence of virulence factors on the establishment of clinically severe infections [6,31]. In this study, 18 out of 33 virulence genes associated with ExPEC were detected, and 89.6% of isolates from normal bulk tank milk carried one or more virulence genes. In particular, fimH, which encodes an adhesin, was detected the most often (80.9%). Guerra et al. (2020) [32] and Zhang et al. (2018) [33] also reported a 100% and 89.9% prevalence of fimH, respectively, indicating its ubiquity among mastitis-causing E. coli isolates. The adhesin encoded by fimH helps E. coli bind to host cells and their receptors, and plays a crucial role in causing bovine mastitis by enabling colonization of the mammary glands, resembling the pathogenesis of urinary tract infections [31,34,35]. Other adhesin genes, such as afa/draBC, sfaS, bmaE, and papG allele 3, were also detected in this study. The prevalence rates of these genes varied from 0.5% to 24.0%, but the presence of these adhesins also implies the facilitated attachment of bacteria onto host cells, helping the colonization of the region and increasing the possibility of mastitis [6,31].
Toxins encoded by virulence genes, such as univcnf, cnf1, cdt, hlyA, and cdtB, are considered essential in the pathogenesis of mastitis following colonization via adhesins. In this study, the most prevalent toxin gene was univcnf (21.9%), while the prevalence of other toxin genes ranged from 1.1% to 2.7%. Lehtolainen et al. (2003) [36] reported that cytotoxic necrotizing factors (CNFs), encoded by univcnf, cnf1, and cnf2, are significantly associated with the persistence of mastitis. Moreover, the potential of CNFs to cause tissue damage or mediate bacteremia can lead to acute mastitis with severe systemic symptoms [37]. Therefore, a higher prevalence of these genes might be found if the isolates were derived from milk of cows with clinical mastitis rather than from normal bulk tank milk.
Although the prevalence of the genes iss and traT, which are related to serum survival, was 38.3% and 26.8%, respectively, in this study, several reports have described the absence of a correlation between the presence of these genes and the pathogenicity of mastitis [31,38,39]. Therefore, it is difficult to predict whether the presence of iss and traT genes in E. coli from bulk tank milk may increase the risk of mastitis.
Phylogenetic analysis is increasingly being used as a modern method of determining virulence potential [40]. In this study, the phylogenetic group B1 (64.0%) was the most prevalent, followed by A (20.1%). On the other hand, phylogroups B2 and D, which were reported as highly virulent phylogroups regarding ExPEC [40,41], were detected at 0.0% and 8.5% prevalence, respectively. This result is in accordance with previous studies reporting that phylogroups B1 and A were the most common groups in normal and mastitis milk samples, while phylogroups D and B2 were rarely detected [42]. According to previous studies, mastitis-causing E. coli isolates may be related to commensal isolates that acquire virulence genes and cause infection in hosts with compromised immune systems [43,44].
E. coli contains four CRISPR loci: CRISPR 1, 2, 3, and 4; these are classified as either Type I-E (CRISPR 1 and 2) or Type I-F (CRISPR 3 and 4), depending on the presence of the associated cas genes [45]. In this study, 161 (98.2%) of 164 isolates possessing virulence genes were found to possess CRISPR 1 and/or CRISPR 2, which comprise highly conserved direct repeat sequences with variable spacer sequences [22]. Meanwhile, CRISPR 3 and 4 loci, which possess a lower spacer distribution, were not detected. It was reported that CRISPR 1 and/or 2 loci have been preserved and stationary within E. coli over a long period [46], whereas CRISPR 3 and 4 loci are a recent creation [22]. Moreover, the hypervariability of CRISPR loci can be applied in phylogenetic analysis, as in previous reports [21,47]. In particular, Touchon et al. (2011) [22] reported that only the phylogenetic group B2 possessed CRISPR 4, implying that the absence of CRISPR 3 and 4 in this study could be linked to the absence of the phylogenetic group B2. Moreover, the absence of a significant difference between the distributions of CRISPR 1 and 2 in the phylogenetic groups C and D also suggests a link between CRISPR loci and phylogeny.
Because both virulence genes and spacers of CRISPR are acquired by horizontal gene transfer via plasmids and phages [48,49], isolates with different virulence genes can have different distributions in CRISPR content, resulting in different ESTs. In this study, the distributions of EST 4 and EST 22 were significantly higher in isolates harboring fimH, ompT, afa/draBC, and univcnf, which play a crucial role in MPEC, compared to isolates carrying the genes iss and traT, for which no correlation with the pathogenicity of mastitis has been described. Therefore, these results suggest that the distribution of spacers may reflect the presence of virulence genes, as previously reported [24,50].
The CRISPR system of E. coli also functions as a regulatory mechanism and immune system of bacteria [22,51–54]. In this study, three (DNA-cytosine methyltransferase, DNA-binding protein, and helicase) of eight protospacers were associated with gene regulation. Bozic et al. (2019) [55] reported that CRISPR I-E (CRISPR 1 and 2) targets bacterial chromosomes, suggesting its major role in the regulation of endogenous genes. Moreover, protospacer 163, which is linked with bacteriophage tail-associated protein, is homologous to the Type VI secretion apparatus [56], which is associated with the increased virulence of many pathogens [57]. Interestingly, in this study, protospacer 163 was only detected in EST 4 and EST 22, which are ESTs with a significantly higher prevalence in isolates carrying four virulence genes (fimH, ompT, afa/draBC, and univcnf).
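Matching a spacer to its protospacer amounts to locating the spacer sequence, on either strand, within a phage, plasmid, or chromosomal sequence. The following is a minimal exact-match sketch with invented sequences. Searches against sequence databases are normally done with alignment tools that tolerate mismatches, and the function names here are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_protospacer(spacer: str, target: str) -> list[tuple[str, int]]:
    """List (strand, 0-based start) for every exact spacer match in `target`."""
    spacer, target = spacer.upper(), target.upper()
    hits = []
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        start = target.find(query)
        while start != -1:
            hits.append((strand, start))
            start = target.find(query, start + 1)
    return hits


# Toy data (hypothetical): the spacer occurs once on each strand of the target.
toy_spacer = "ACGGTCA"
toy_target = "TTTTACGGTCACCCCTGACCGTGG"
print(find_protospacer(toy_spacer, toy_target))
```

Once a hit is located, the annotation of the surrounding gene in the matched sequence is what allows a protospacer to be linked to a function such as gene regulation or a phage tail-associated protein.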

Conclusions
In conclusion, the results of protospacer distribution suggest that CRISPR I-E is linked with gene regulation and pathogenicity in E. coli. Moreover, the CRISPR sequence-typing approach helped to clarify and trace virulence potential, by showing significant differences in prevalence based on different virulence genes.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ani12040503/s1, Figure S1: Grouping of 164 isolates into Escherichia sequence types (ESTs) according to spacer contents; Table S1: Names and sequences of all spacers in each CRISPR locus identified in this study.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of the data; in the writing of the manuscript; or in the decision to publish the results.