Small-Scale Comparative Genomic Analysis of Listeria monocytogenes Isolated from Environments of Salmon Processing Plants and Human Cases in Norway

Listeria monocytogenes is a food-borne bacterium that gives rise to the potentially life-threatening disease listeriosis. Listeriosis has been mandatorily notifiable in Norway since 1991. All clinical L. monocytogenes isolates are sent to the Norwegian Institute of Public Health (NIPH) for typing. Since 2005, Multi-Locus Variable number tandem repeats Analysis (MLVA) has been used for typing, but it was recently replaced by whole genome sequencing using core genome Multi-Locus Sequence Typing (cgMLST). In the present study, L. monocytogenes isolates collected at salmon processing plants in Norway in 2007 (n = 12) and 2015 (n = 14) were first subjected to MLVA. Twelve clinical L. monocytogenes isolates with matching MLVA profile and sampling time were selected from the strain collection at NIPH. Twenty-one isolates from the salmon processing plants and all clinical isolates (n = 12) were whole genome sequenced and compared using cgMLST and in silico detection of virulence genes. cgMLST revealed four pairs of environmental–human isolates with ≤ 10 allelic differences over 1708 genes, indicating that they may be assigned as clonal, with the implication that they are descended from the same recent ancestor. No relevant difference in carriage of virulence genes was found between environmental and human isolates. The present study shows that L. monocytogenes strains that genetically resemble contemporary isolates from human listeriosis circulate in Norwegian salmon slaughterhouses and carry the same virulence genes.


Introduction
Listeria monocytogenes is a food-borne pathogen that can cause listeriosis, a severe invasive infection in humans with particularly high lethality (25%-30%) and hospitalization (>92%) rates [1]. It has been estimated that listeriosis resulted in 23,150 illnesses, 5463 deaths and 172,823 disability-adjusted life-years globally in 2010 [2]. A global trend of increasing sporadic listeriosis cases was observed at the beginning of the new millennium, especially in the age group >60 [3][4][5][6], but over the five-year period 2015-2019 the trend was flat after a long period of increase [7]. In addition to the elderly, individuals with impaired immune systems, pregnant women, and new-borns are predisposed to listeriosis [8,9]. In the EU, notifications for L. monocytogenes in food products have increased in number since 2009, and L. monocytogenes is, next to Salmonella, by far the most frequently reported pathogen [10,11]. An extensive European baseline study was conducted in 2010 and 2011, including a total of 10,053 samples of certain ready-to-eat (RTE) foods from 3632 retail outlets in 27 countries [12]. L. monocytogenes was detected in 10.3% of 3052 samples of fresh hot- or cold-smoked and gravad fish at the end of shelf-life. Using whole genome sequencing (WGS), a retrospective study based on the 2010-2011 baseline study indicated that, although all RTE categories were associated with sporadic human listeriosis cases, smoked fish was the dominating source of sporadic human food-borne listeriosis [13]. L. monocytogenes in cold-smoked fish products is still an important cause of food-borne outbreaks [11].
L. monocytogenes is frequently isolated from slaughterhouses processing farmed salmon both in Norway [14][15][16], and abroad [17][18][19]. L. monocytogenes has been found in smoked salmon in several studies [20][21][22]. In Norway, the frequency was 9% in 1991 [22] and 4.9% in 2003 [23]. In a review of 26 publications published between 2000 and 2015, mostly from Europe but also including Japan and North America, it was found that the prevalence in retail cold smoked salmon (CSS) varied between 0 and 61%, with an average of 9.8%, but CSS was never implicated in any confirmed listeriosis outbreaks worldwide up until 2015 [21]. However, after 2015, CSS has been implicated in listeriosis outbreaks in Denmark and France [24,25] and two European multi-country outbreaks [26,27].
The majority of listeriosis cases are sporadic, and the responsible food item is not identified. In Norway, eight foodborne outbreaks have been registered since 2005 [28]. These outbreaks included 47 verified listeriosis cases. Nevertheless, both the number of outbreaks and the number of cases are minimum numbers, because not all outbreaks and cases are discovered and notified. Consequently, as sporadic cases increase, epidemiological surveillance becomes increasingly important for providing critical information about reservoirs and vehicles of the pathogen and for identifying listeriosis food sources and food safety gaps. Until recently, methods used for typing L. monocytogenes along the food chain and in clinical infections included Pulsed-Field Gel-Electrophoresis (PFGE) and Multi-Locus Variable number tandem repeats Analysis (MLVA). Since the introduction of PFGE in the 1990s and later MLVA, these molecular typing techniques have proven critical for solving outbreaks and identifying clusters that warrant further investigation [29,30]. The discriminatory power of the two methods is, however, not always sufficient to detect and investigate outbreaks, limit them, and identify their source. It was shown that PFGE was not able to discriminate between pathogenic isolates and pairs of contemporary isolates of human or food origin without any causal relationship [31], thus yielding results of limited value. At present, knowledge of the impact of environmental and food strains on human health is limited. For example, the assumption that all L. monocytogenes strains have a similar virulence has led to an important lack of data in the dose-response model, as illustrated by Lindqvist and Westöö [32] in the case of RTE salmon and rainbow trout quantitative risk assessment.
WGS has proven to be an invaluable tool for genomic typing of various food-borne pathogens including L. monocytogenes, by providing optimal resolution and enabling comparisons across different epidemiological sectors [13,33]. Coupled to genome-wide gene-by-gene comparisons designated as core genome Multi-Locus Sequence Typing (cgMLST), WGS has become the preferred technology for L. monocytogenes typing, offering an enhanced level of resolution compared to conventional methods. cgMLST has recently been implemented into the routine surveillance of human L. monocytogenes isolates in Norway.
The main objective of the present study was to compare L. monocytogenes isolates collected from Norwegian salmon slaughterhouses, with L. monocytogenes isolates from human cases with the use of WGS, in order to explore the existing L. monocytogenes strains in slaughterhouse environments and their possible association with disease in humans.

Bacterial Isolates
Four salmon slaughterhouses designated A, B, C and D, located along the west coast of Norway, were examined for the presence of L. monocytogenes. Plants A-C were visited once during August to September 2015, and plant D was visited in March and September 2015. Sampling was performed as described previously [16]. No L. monocytogenes was found in plant A, but a total of 7 isolates of L. monocytogenes were found at plants B-D, and an additional 7 isolates from plant D, isolated during a three-week period in October/November 2015, were acquired from an external laboratory. These 14 isolates were genotyped by MLVA. MLVA genotyping was performed as previously described [14] with primer sequences adopted from Lindstedt et al. [34]. All seven isolates from plant D acquired from the external laboratory were of the same MLVA genotype. Therefore, only two of these were subjected to further analysis, giving nine environmental isolates from 2015. Twelve isolates collected from three other plants in 2007 and previously MLVA typed [14] were re-typed by MLVA and added to the analysis, so that the total number of environmental isolates subjected to further analysis was 21.
Medical microbiology laboratories throughout Norway mandatorily forward presumptive L. monocytogenes isolated from clinical specimens to the National Reference Laboratory (NRL) at NIPH for typing. Since 2005 typing has been performed by MLVA [14]. Clinical isolates were selected from the national strain collection at NRL at NIPH. Twelve isolates from 2005 to 2016 corresponding to the different MLVA-profiles identified in the environmental collection were selected for further WGS analysis.

Library Preparation and Sequencing
DNA was extracted from overnight cultures using the Wizard Genomic DNA purification Kit (Promega, Madison, WI, USA) before Nextera XT DNA Library preparation and paired-end (2 × 300) sequencing on the Illumina MiSeq platform aiming for >50× coverage (Illumina, San Diego, CA, USA). RTA v1.18.54 and bcl2fastq v2.17.1.14 were used for base calling, demultiplexing and converting the data to fastq format. Prior to downstream analyses, sequencing adapters and low-quality reads were trimmed using Trimmomatic v0.33 [35] following the recommendations of the developers. Reads originating from the PhiX genome, which was used as a spike-in during Illumina sequencing, were removed by aligning the data using bbmap v34.56 [36].
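As a rough illustration of what a >50× coverage target implies for 2 × 300 paired-end reads, expected depth can be estimated as total sequenced bases divided by genome size. The sketch below is not part of the study's pipeline: the function names are hypothetical, and the genome size of about 2.94 Mb for L. monocytogenes is an assumed round figure used only for the example. Losses from trimming and PhiX removal are ignored.

```python
import math

def expected_coverage(read_pairs: int, read_length: int, genome_size: int) -> float:
    """Mean depth = total sequenced bases / genome size (no trimming losses)."""
    total_bases = read_pairs * 2 * read_length  # two reads per pair
    return total_bases / genome_size

def read_pairs_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of read pairs that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size / (2 * read_length))

# Assumed values for illustration only (not taken from the study).
GENOME_SIZE = 2_940_000  # approximate L. monocytogenes genome size in bp
READ_LENGTH = 300        # 2 x 300 paired-end, as described in the text

pairs = read_pairs_needed(50, READ_LENGTH, GENOME_SIZE)
print(f"Read pairs needed for 50x: {pairs}")
print(f"Depth at that yield: {expected_coverage(pairs, READ_LENGTH, GENOME_SIZE):.1f}x")
```

In practice the raw yield would need to exceed this estimate, since adapter and quality trimming shorten or discard reads.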

Core Genome Analysis
All analyses were performed within the Ridom SeqSphere+ version 5.1.0 software (Ridom GmbH, Münster, Germany) using the cgMLST scheme developed by Ruppitsch et al. [37]. Briefly, sequenced reads were trimmed until an average base quality of 30 was reached in a window of 20 bases. De novo assembly was performed using Velvet 1.1.04 with default settings. Allele numbers for each gene were assigned automatically in SeqSphere+, and the combination of all alleles in each strain composed the allelic profile. The allelic profiles of the L. monocytogenes strains were based on 1708 genes (1701 cgMLST genes and seven MLST genes) and were visualised as a neighbour joining tree and a minimum spanning tree using the parameter "pairwise ignoring missing values". Serogroups were predicted automatically in SeqSphere+ as described by Hyden et al. [38].
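The idea behind comparing allelic profiles while "pairwise ignoring missing values" can be sketched as follows: for each pair of isolates, only loci called in both profiles are compared, and the distance is the number of those loci carrying different allele numbers. This is a minimal illustration under stated assumptions, not the SeqSphere+ implementation; the function names, the toy six-locus profiles, and the isolate labels are hypothetical, and the cut-off of 10 is the ≤10 allelic differences criterion used in this study.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple

# One allele number per locus; None marks a locus with no allele call.
Profile = List[Optional[int]]

def allelic_distance(a: Profile, b: Profile) -> int:
    """Count loci with different alleles, skipping loci missing in either profile."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    return sum(
        1 for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )

def pairwise_distances(profiles: Dict[str, Profile]) -> Dict[Tuple[str, str], int]:
    """Distance for every unordered pair of isolates, keyed by sorted names."""
    return {
        (p, q): allelic_distance(profiles[p], profiles[q])
        for p, q in combinations(sorted(profiles), 2)
    }

def is_clonal(distance: int, cutoff: int = 10) -> bool:
    """Apply an allelic-difference cut-off (10 over 1708 genes in this study)."""
    return distance <= cutoff

# Hypothetical toy profiles over six loci (real profiles span 1708 genes).
toy_profiles = {
    "env_1": [1, 4, 2, 7, None, 3],
    "clin_1": [1, 4, 2, 8, 5, 3],
    "clin_2": [2, 5, None, 9, 6, 1],
}

for pair, dist in pairwise_distances(toy_profiles).items():
    print(pair, dist)
```

Because missing loci are skipped per pair, two isolates with many uncalled loci can appear closer than they are, which is one reason assembly quality matters for this kind of comparison.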

Selection of Reference Genomes
L. monocytogenes EGD-e (NC_003210.1) was used as reference for cgMLST and virulence gene analyses of all isolates.

Results
cgMLST analysis showed that only four of eleven clinical/environmental pairs fell into the same CT (Table 1), and can thus be attributed to the same clone using a cut-off of ≤10 allelic differences over 1708 genes [37]. These pairs were 2HF33-14EP001215, isolated seven years apart with an allelic difference of four; 3BS29-1108-0593, isolated one year apart with an allelic difference of five; 3BS90-1106-3928, also isolated one year apart with an allelic difference of five; and finally 3BS44-1110-0057, isolated three years apart with an allelic difference of ten (Figure 1B).
In order to investigate potential associations of presence/absence of virulence genes with STs or origin of the isolate (environmental versus clinical), virulence genes were predicted in silico using VirulenceFinder in the genome sequences of all environmental and clinical isolates, as well as in the EGD-e reference genome (GenBank accession no. NC_003210.1). The EGD-e reference genome carried 82 virulence genes, which was the same number found in ST18, ST20 and ST398 isolates. ST7, ST8, and ST101 isolates harboured 80 or 81 virulence genes, whereas isolates of the remaining STs carried between 74 and 79 virulence genes (Supplementary Table S1). Out of the 82 virulence genes, 72 were observed in all isolates and with similar frequency in environmental and clinical isolates. There was no statistical difference in the number of virulence genes found between clinical and environmental isolates (Student's t-test, p = 0.30), and with only very few exceptions, the same virulence gene(s) was missing in environmental and in clinical isolates inside each ST (Supplementary Table S1).
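For readers unfamiliar with the comparison of virulence gene counts between two groups, the following sketch computes the two-sample Student's t statistic with pooled variance. The counts are invented for illustration and do not reproduce the study's data or its reported p-value; the function name is hypothetical. The p-value itself requires the t distribution, which is not in the Python standard library (a statistics package such as SciPy, via `scipy.stats.ttest_ind`, could be used for that step).

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple

def student_t(sample_a: Sequence[float], sample_b: Sequence[float]) -> Tuple[float, int]:
    """Two-sample Student's t statistic (equal variances assumed) and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("both samples are constant; t is undefined")
    return (mean(sample_a) - mean(sample_b)) / std_err, df

# Hypothetical virulence gene counts per isolate (not the study's data).
environmental_counts = [82, 80, 79, 76, 81]
clinical_counts = [82, 81, 78, 74, 80]

t_stat, dof = student_t(environmental_counts, clinical_counts)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

A small absolute t value relative to the critical value for the given degrees of freedom corresponds to a non-significant difference between the groups.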

Discussion
It is often found that isolates from the seafood and processing environment are identical to clinical isolates based on different subtyping strategies [14][40][41][42]. Rocourt et al. [43] reviewed the data accumulated from phenotypic and molecular typing methods and concluded that strains responsible for most major outbreaks between 1981 and 2000 belonged to a small number of clones consisting of closely related strains as evidenced by ribotyping, isoenzyme pattern and DNA profile analysis. In the Danish fish processing industry, strains belonging to one particular persistent RAPD (random amplification of polymorphic DNA) subtype of L. monocytogenes have been isolated over several years in different processing plants [44]. WGS of two such isolates isolated six years apart, from two different plants with no intertrade relationship, revealed that they were almost identical, as their predicted proteomes differed by only two proteins [45]. However, WGS has also revealed that L. monocytogenes strains that are identical by MLVA and MLST and have indistinguishable PFGE and fAFLP (fluorescent amplified fragment length polymorphism) patterns may not be clonal strains [31,46]. Our data corroborate these earlier findings, demonstrating the superior potential of WGS/cgMLST compared to MLVA in genetic discrimination between isolates.
Some salmon slaughterhouses and processing plants are colonized by L. monocytogenes, while others are free of the bacterium [14,16,20]. Although L. monocytogenes is likely to be consistently reintroduced into processing plants from a variety of sources, including fish raw material, water and personnel, there are indications that L. monocytogenes at environmental sites and L. monocytogenes in the raw material represent different bacterial populations [47]. This has led to the hypothesis that persistent L. monocytogenes strains represent the predominant source of environmental contamination in processing plants [47][48][49][50]. Earlier studies have demonstrated that salmon processing plants often harbour their own specific populations of L. monocytogenes subclones (e.g., RAPD types) [44,48]. These persistent strains may be specially adapted to the processing plant environment and be extremely difficult to remove by standard hygiene procedures [51][52][53]. Persistence of L. monocytogenes in food processing plants has therefore been hypothesized to be an important factor and the root cause of a number of listeriosis outbreaks [54,55].
No relevant difference in carriage of virulence genes was found between isolates of environmental and clinical origin in the present study. Using VirulenceFinder for in silico analysis of the virulence potential of ST14, ST121, and ST224 isolates from a rabbit meat processing plant, Palma et al. [56] found that the two genes lmo2026 and InlF were absent or truncated in all ST14 and ST121 isolates. This is consistent with the current analysis, in which lmo2026 was not detected in any ST14 or ST121 isolates, although it was also absent in ST2, ST6, ST8, and ST31 isolates. InlF was absent in ST14 and ST121 isolates, and also in ST2 and ST6 isolates. Among other genes not detected by the in silico analysis, inlK, gtcA, Aut and Ami (all absent in ST2 and ST6 isolates), ActA (absent in ST31 and ST121 isolates), inlJ (absent in ST14 and ST121 isolates), and vip (absent in ST2, ST6, ST7, ST8, and ST31 isolates) were the most prominent. However, this seems more linked to ST than to origin. The four clones CC1, CC2, CC3, and CC9 represent 50% of food isolates and 68% of clinical isolates globally [57]. Two factors should be considered: first, that the high frequency in clinical manifestations of apparently less virulent strains (when based only on the presence/absence of virulence genes) is favoured by, and correlated with, their high prevalence in food sources; and second, that there may be a relatively high degree of redundancy in L. monocytogenes virulence genes. The small dataset analysed in the present study does not allow us to draw general conclusions on the distribution of virulence genes in environmental compared to clinical isolates. Although based on a limited dataset, the present study gives no evidence to support the hypothesis that virulence genes are differentially distributed in clinical and environmental isolates and is thus in accordance with the large-scale analysis of Painset et al. [58].
It must be noted, however, that the VirulenceFinder analysis is based only on the number of genes present and does not account for variants or truncated genes. It has been shown that differences in virulence can be associated with lineages and CCs, and specific variants and truncated genes are pointed out as the main features associated with these differences [58,59]. Frequent loss or truncation of genes described to be vital for virulence or pathogenicity has been confirmed as a recurring pattern in L. monocytogenes [60].
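The presence/absence comparison discussed above can be sketched as simple set operations: genes found in every isolate form the shared set, and genes missing from all isolates of a given ST point to an ST-linked rather than an origin-linked pattern. The example below is a toy illustration only; the isolate labels, the four-gene panel, and the function names are hypothetical, and the table is not the study's Supplementary Table S1. As noted in the text, such an analysis says nothing about variants or truncated genes.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# isolate -> (sequence type, set of virulence genes detected in silico)
IsolateTable = Dict[str, Tuple[str, Set[str]]]

def genes_in_all(isolates: IsolateTable) -> Set[str]:
    """Genes detected in every isolate."""
    gene_sets = [genes for _, genes in isolates.values()]
    return set.intersection(*gene_sets) if gene_sets else set()

def absent_by_st(isolates: IsolateTable, panel: Set[str]) -> Dict[str, Set[str]]:
    """For each ST, the panel genes that were detected in none of its isolates."""
    detected = defaultdict(set)
    for st, genes in isolates.values():
        detected[st] |= genes
    return {st: panel - found for st, found in detected.items()}

# Hypothetical toy data: a four-gene panel and four isolates.
PANEL = {"hly", "inlF", "inlJ", "vip"}
toy_isolates: IsolateTable = {
    "env_a": ("ST121", {"hly", "vip"}),
    "clin_a": ("ST121", {"hly", "vip"}),
    "env_b": ("ST8", {"hly", "inlF", "inlJ"}),
    "clin_b": ("ST8", {"hly", "inlF", "inlJ"}),
}

print("Detected in all isolates:", sorted(genes_in_all(toy_isolates)))
for st, missing in sorted(absent_by_st(toy_isolates, PANEL).items()):
    print(st, "lacks", sorted(missing))
```

In this toy table the environmental and clinical isolate of each ST carry the same genes, so the missing genes track ST and not origin, mirroring the pattern described in the text.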
We acknowledge that the dataset presented here is limited, and that a larger study is needed to establish a possible epidemiological connection between salmon slaughterhouse-derived L. monocytogenes and listeriosis. To establish the most discriminatory epidemiological relationship between environmental and clinical populations, cgMLST analysis of the WGS data must be complemented with Single-Nucleotide Polymorphism (SNP) analysis [61,62], but it was beyond the scope of the present study to further infer any such epidemiological links. It should also be remembered that clinical isolates were selected to match environmental isolates (based on MLVA). Although the MLVA subtypes of the clinical isolates included here represent approximately 35% of clinical manifestations in Norway from 2005 to 2016, we acknowledge that this selection may bias the results.
WGS combined with cgMLST has proven to be an invaluable tool for genomic typing of various food-borne pathogens including L. monocytogenes, by providing optimal resolution and enabling comparisons across different epidemiological sectors [13,33]. In addition to molecular typing, WGS may be applied for the identification of genes contributing to persistence of L. monocytogenes in food production plants and to determine the human virulence potential of these strains [13,45,63]. The use of WGS in food, agricultural, and environmental L. monocytogenes surveillance as well will significantly enhance foodborne L. monocytogenes source identification, risk assessment, and understanding of L. monocytogenes transmission routes. It will allow for comparison between clinical and food/environmental L. monocytogenes strains, and further enable the food and agricultural industry to elucidate factors that lead to L. monocytogenes persistence and biofilm formation, as well as tolerance to chemicals, detergents and food preservatives, heat treatment, etc., and thus make it easier to improve food safety.
The present study is especially relevant considering the recent (2015-2019) outbreaks in Denmark, Germany, France, Estonia, Finland, and Sweden which were linked to Norwegian salmon smoked in Poland and Estonia, where a lack of WGS data complicated identification of the source of the contamination [26,27]. Ultimately, epidemiological evidence and multiple WGS analyses should be combined to elucidate the possible role of environmental L. monocytogenes as the subsequent source of listeriosis.

Conclusions
The present study shows that the genetic distance between environmental and clinical isolates may be very short, i.e., ≤10 allelic differences over 1708 genes, indicating that they may be assigned as clonal, with the implication that they are descended from the same recent ancestor. Out of 82 virulence genes, 72 were observed in all isolates and with similar frequency in environmental and clinical isolates, indicating that there was no relevant difference in carriage of virulence genes in relation to environmental or human origin. Even though our data do not show a conclusive epidemiological link between listeriosis patients and commercially produced salmon products in Norway, several outbreaks with salmon as the responsible food source have been documented in the international literature. It is important to stress, however, that it is still a prerequisite in listeriosis source identification and epidemiological surveillance that similar genotypes are found both in incriminated food products and in clinical manifestations and can be coupled to the patients' history of food intake, and that although identical cgMLST profiles are found both in salmon processing plants and in human isolates, it cannot be concluded that salmon from these plants have served as vehicles for human listeriosis.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: All environmental isolate sequences and assembled genomes used in this study are deposited as a BioProject at DDBJ/ENA/GenBank under the accession JAHHPW000000000. The version described in this paper is version JAHHPW010000000. The accession numbers of the whole genome sequences are CP050023 to CP050030, CP075871 to CP075878, CP050129, and CP076051. All clinical isolate sequences and assembled genomes of this study were used after agreement with NRL, so restrictions may apply to the use and reproduction of these data. Data are, however, available from the authors upon reasonable request and with permission of NRL and NIPH.