Identification of Arctic Food Fish Species for Anthropogenic Contaminant Testing Using Geography and Genetics

The identification of food fish bearing anthropogenic contaminants is one of many priorities for Indigenous peoples living in the Arctic. Mercury (Hg), arsenic (As), and persistent organic pollutants including polychlorinated biphenyls (PCBs) are of concern, and these are reported, in some cases for the first time, for fish sampled in and around King William Island, located in Nunavut, Canada. More than 500 salmonids, comprising Arctic char, lake trout, lake whitefish, and ciscoes, were assayed for contaminants. The studied species are anadromous, migrating to the ocean to feed in the summers and returning to freshwater before sea ice formation in the autumn. Assessments of muscle Hg levels in salmonids from fishing sites on King William Island showed generally higher levels than from mainland sites, with mean concentrations generally below guidelines, except for lake trout. In contrast, mainland fish showed higher means for As, including non-toxic arsenobetaine, than island fish. Lake trout were highest in As and PCB levels, with salmonid PCB congener analysis showing signatures consistent with the legacy of cold-war distant early warning stations. After DNA-profiling, only 4–32 Arctic char single nucleotide polymorphisms were needed for successful population assignment. These results support our objective to demonstrate that genomic tools could facilitate efficient and cost-effective cluster assignment for contaminant analysis during ocean residency. We further suggest that routine pollutant testing during the current period of dramatic climate change would be helpful to safeguard the wellbeing of Inuit who depend on these fish as a staple input to their diet. Moreover, this strategy should be applicable elsewhere.


Introduction
The accumulation of anthropogenic contaminants in food fish has a long and tragic history and remains a current global health concern. International conventions and risk management policies have been introduced to address individual contaminants, including those derived from burning fossil fuels or industrial infrastructure or processes that involve electrical transformers and plasticizers in the case of mercury (Hg) and persistent organic pollutants, respectively [1-3]. However, such regulations are typically not comprehensively enforced and can lag decades behind problems arising from industrial innovations, as reflected in present concerns over nanomaterial and microplastic Canada. According to ITK as well as otolith-determined age-growth plots [35] (with other data not shown), the sampled fish are anadromous, migrating up rivers prior to sea ice formation in the autumn and down again in the spring to feed in coastal waters. The focal species, Salvelinus alpinus (Arctic char), S. namaycush (lake trout), Coregonus clupeaformis (lake whitefish), and the ciscoes, including C. autumnalis (Arctic cisco) and C. sardinella (least cisco), were chosen because they are targeted for local consumption both on the island and at adjacent mainland sites, although the ciscoes are principally by-catch. We hypothesized that population genomic patterns combined with a geographical analysis would correlate with levels of anthropogenic contaminants and thus serve as a surrogate for impractical and expensive chemical analysis of hundreds of random fish samples. It is important to note that this exploratory evaluation represents an initial assessment, designed to determine the merit of such a novel strategy, rather than a definitive prescription to safeguard the health of consumers.

Indigenous Knowledge, Mapping of Fishing Locations, and Sampling Methodology
Fishing sites were identified by Gjoa Haven ITK and community fishers in an area of about 67,000 km² on or adjacent to King William Island and south of the Adelaide Peninsula in the Kitikmeot Region of Nunavut, Canada. Indeed, the project was initiated by the Hunters and Trappers Association and supported by Elders in the community, with these groups continuing to guide the process. New approaches to the gathering of ITK facilitated the identification of fishing locations as well as traditional methods and practices [36]. Fish were sampled in December-June under ice and August-September in open water using nets or occasionally with hand lines or spears as described previously [37]. Additional "non-traditional fishing sites" were used to sample Arctic char and lake whitefish to enrich the genetic analysis. Licenses to fish for scientific purposes were obtained in accordance with Section 52 of the general fishery regulations of the Fisheries Act, Department of Fisheries and Oceans Canada (DFO). These and animal care permits were issued by the Freshwater Institute Animal Care Committee of DFO (S-18/19-1045-NU and FWI-ACC AUP-2018-63).
After capture, fish were assigned a unique numeric barcode identifier [38], photographed, measured for fork length, and weighed; otoliths were dissected and subsequently dried for age analysis [35,39]. Muscle and other tissue samples including fin clips for contaminant and genomic DNA isolation were placed in barcode-tagged sterile Whirl-Paks®, empty tubes, or tubes containing 70% ethanol, respectively, and frozen at −20 °C and shipped with freezer packs to Queen's University for analysis, as previously detailed [32,37]. The remainders of the dissected fishes were returned to local community members after processing or distributed from the Gjoa Haven Hunters and Trappers Association community freezer.

Contaminant Analysis
Muscle tissue was used for all contaminant analysis. Frozen muscle samples (200 mg; ~2 cm³) from sampled salmonids were oven dried overnight at 70 °C or at 24 °C for two days for analysis of inorganic elements and Hg, respectively. Samples from a total of 531 fish were individually ground and submitted to the Queen's University Analytical Services Unit (QUASU; Kingston, ON, Canada). They were analyzed for 59 elements, including As but excluding Hg, by acid digestion followed by measurements using inductively coupled plasma mass spectrometry (ICP-MS), except for boron (B), phosphorus (P), and sulfur (S), which were measured using ICP-optical emission spectroscopy (OES). Analysis of total Hg (from a total of 540 fish) was accomplished by thermal decomposition of the solid sample, amalgamation, and atomic absorption spectroscopy in an Hg analyzer (DMA-80). Economic considerations dictated that PCBs were analyzed in only about 20% (101/531) of the fish analyzed for inorganic elements; sample selection took into account age, species, and fishing site in an effort to target 10 samples for each site and species, although this was not always achieved. Samples were processed (dried and ground) as for Hg analysis (but as separate subsamples). Samples (1-2 g) were analyzed at QUASU for lipids and for PCBs; the latter analysis was by extraction and gas chromatography-electron capture detection (GC-ECD) and gas chromatography-tandem mass spectrometry (GC-MS/MS) analysis for individual congeners using standard procedures (USEPA, 8082A). Quality assurance/quality control steps included analytical duplicates, certified reference materials (TORT-2), and appropriate controls and blanks for inorganic elements and Hg, and, for PCBs, analytical duplicates, controls, blanks, and surrogates.
Arsenic speciation analysis was carried out by ALS Global Environmental, Vancouver, British Columbia, Canada. The dried, ground fish samples were further homogenized and then extracted using a methanol/enzymatic (protease, alpha amylase and lipase in 25% methanol solution) extraction procedure at 37 °C. The extracts were analyzed using anion exchange high performance liquid chromatography-ICP-MS. It should be noted that QUASU and ALS Global are accredited by the Canadian Association for Laboratory Accreditation Inc. to the standards of ISO/IEC 17025. Methods used were listed in the scopes of accreditation at the time of analysis.
Corrections to wet weight (ww) values when required (Hg, PCBs) were made using percent moisture obtained from a subset of samples according to fish species, either Arctic char, cisco, lake whitefish, or lake trout. All results were shared with community members in meetings with the Gjoa Haven Hunters and Trappers Association as well as the community, and additional fishing sites were recommended at each discussion, making ITK, site selection, and chemical analysis an interactive process.
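The wet-weight correction described above can be illustrated with a minimal sketch. It assumes the usual relationship ww = dw × (1 − moisture/100); the function name, the moisture values, and the example concentration are hypothetical and are not taken from the study.

```python
def dry_to_wet(conc_dw, percent_moisture):
    """Convert a dry-weight concentration to a wet-weight concentration.

    Assumes ww = dw * (1 - moisture/100), with moisture given in percent.
    """
    if not 0 <= percent_moisture < 100:
        raise ValueError("percent_moisture must be in the range [0, 100)")
    return conc_dw * (1 - percent_moisture / 100)


# Hypothetical species-level moisture values (illustrative only).
MOISTURE_BY_SPECIES = {"Arctic char": 75.0, "lake trout": 77.0}

# Hypothetical Hg result of 1.2 mg/kg dw for an Arctic char sample.
hg_ww = dry_to_wet(1.2, MOISTURE_BY_SPECIES["Arctic char"])
print(f"Hg = {hg_ww:.2f} mg/kg ww")
```

Applying one moisture value per species, as in the lookup table here, mirrors the text's description of using a subset of samples per species rather than measuring moisture for every fish.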

Statistical Methods for Contaminants
Hg (as ww concentrations) and As (as dry weight; dw) concentrations, according to conventions in the contaminant literature, were analyzed for differences between means using non-parametric Kruskal-Wallis statistics, because analysis of variance (ANOVA) on log-transformed data did not yield normally distributed residuals. Spearman rank correlations (a non-parametric measure of correlation) were calculated between age, Hg, As, selenium (Se), and nutritional elements including calcium (Ca), chromium (Cr), cobalt (Co), copper (Cu), iron (Fe), magnesium (Mg), manganese (Mn), P, potassium (K), sodium (Na), S, vanadium (V), and zinc (Zn). All statistical tests for Hg and As were performed in XLSTAT 2020.4.1; As concentrations below the detection limits were imputed using log-normal regression on order statistics (ROS) methods in ProUCL 5.1.
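For readers unfamiliar with the Kruskal-Wallis test, the following sketch computes the H statistic on ranks with the standard tie correction. It is an illustration in plain Python, not the XLSTAT implementation used in the study, and the toy data are hypothetical. In practice, H is compared against a chi-square distribution with (number of groups − 1) degrees of freedom.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.

    `groups` is a list of lists of measurements, one list per group.
    Not defined when every pooled value is identical.
    """
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)

    h = 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)
    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    return h / correction


# Hypothetical toy data: three groups with no overlap between them.
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_demo = kruskal_wallis_h(toy_groups)
print(f"H = {h_demo:.2f} with {len(toy_groups) - 1} degrees of freedom")
```

Because the statistic depends only on ranks, it is unaffected by the skewed, non-normal concentration distributions that made ANOVA unsuitable here.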
For the PCB analysis, data included the PCB congeners, inorganic elements, site, collection date, species, length, age, and % lipid. Where the reported congener datasets differed (because of different resolution of congeners on the GC column), separately reported congeners were summed to obtain the same congener dataset for each sample. Calculations were determined for total PCBs, total PCBs on a lipid weight basis (T-PCBlip), dioxin-like PCBs (DL-PCBs), non-dioxin-like PCBs, and the sum of six European Union (EU) PCBs (PCB6); details are provided (Supplementary Materials Table S1). PCB and inorganic element concentrations, including Hg, were in dw unless compared to published guidelines, as previously indicated, where ww was used. The data used for statistical analyses of PCB concentrations, performed with XLSTAT unless noted otherwise, included substitutions for non-detectable values obtained using log-normal ROS methods in ProUCL 5.1. ProUCL 5.1 was also used for two sample testing, as it allows for inclusion of non-detectable values and non-parametric approaches. Spearman rank correlations were calculated between age, % lipid, total PCBs, T-PCBlip, DL-PCBs, and inorganic elements with > 5 detectable values. Principal components analysis (PCA) was conducted with PCB congeners only (proportions, Spearman correlations) and with PCB total parameters and other parameters including inorganic elements, age, and % lipid (log-transformed, Pearson correlation). Non-parametric testing of means of PCB concentrations was conducted using the Kruskal-Wallis test on ranks, and ANOVA was conducted on log-transformed data to test for differences between geographic location groups and fish species. ANOVA results indicated that, for most of the PCB data, residuals were not normally distributed (even after log transformation), and therefore non-parametric Kruskal-Wallis testing was used.
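Two of the bookkeeping steps above, summing separately reported congeners to a common label and expressing total PCBs on a lipid weight basis, can be sketched as follows. The congener labels, the merge map, the concentrations, and the normalization formula (total PCBs divided by the lipid fraction) are assumptions for illustration; the definitions actually used are given in Supplementary Materials Table S1.

```python
def harmonize_congeners(sample, merge_map):
    """Sum congeners that one dataset reports separately into a shared label.

    `sample` maps congener label -> concentration; `merge_map` maps a
    separately reported label -> the combined label used across datasets.
    """
    harmonized = {}
    for congener, concentration in sample.items():
        label = merge_map.get(congener, congener)
        harmonized[label] = harmonized.get(label, 0.0) + concentration
    return harmonized


def lipid_normalized(total_pcb, percent_lipid):
    """Express total PCBs per unit lipid (assumed: total / lipid fraction)."""
    if percent_lipid <= 0:
        raise ValueError("percent_lipid must be positive")
    return total_pcb / (percent_lipid / 100)


# Hypothetical example: one run resolves two congeners that another run
# reports as a single co-eluting peak.
MERGE_MAP = {"PCB138": "PCB138+163", "PCB163": "PCB138+163"}
sample_a = {"PCB138": 1.5, "PCB163": 0.5, "PCB153": 2.0}

harmonized_a = harmonize_congeners(sample_a, MERGE_MAP)
total_pcb_a = sum(harmonized_a.values())
t_pcb_lip_a = lipid_normalized(total_pcb_a, percent_lipid=5.0)
print(harmonized_a, total_pcb_a, round(t_pcb_lip_a, 1))
```

Summing to the coarser label loses some resolution but keeps every sample on the same congener set, which is what the PCA on congener proportions requires.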

Genetic and Bioinformatic Analysis
DNA was extracted from fin clips and muscle samples from 429 Arctic char samples using either a Qiagen DNeasy Blood and Tissue kit (Qiagen, Venlo, The Netherlands) following the manufacturer's protocol or a salt extraction method [40]. DNA sample purity and concentration were assessed with a spectrometer and fluorometer as well as by electrophoresis as described [32]. Double-digest restriction associated DNA sequencing (ddRAD-Seq) libraries were constructed at the Institute for Integrative Systems Biology (Laval University, Quebec City, QC, Canada) using the restriction enzymes, SbfI and MspI. In the last step of library preparation, samples with unique barcodes (unique short sections of DNA to distinguish among individual samples) were pooled and then sized with a BluePippin ® DNA size selection system (Sage Science Inc., Beverly, MA, USA). SNP fragments of the appropriate size were then sequenced as single-end, 100 bp reads on a HiSeq2000 platform (Illumina, San Diego, CA, USA).
Filtering and SNP calling have been described in detail [32] with sequences submitted to the National Center for Biotechnology Information Sequence Read Archive (NCBI SRA) database under BioProject accession # PRJNA680999. Briefly, the libraries were demultiplexed and aligned to a reference genome (Arctic char GenBank accession: GCF_002910315.2; [41]), with variant calling and genotyping performed using SAMtools (v1.9) and BCFtools (v1.9; [42]) to obtain a SNP dataset, which then was filtered using VCFtools (v0.1.14; [43]). As reported [32], the final data set included 413 samples and 3055 SNPs. These were analyzed using a suite of population genetic approaches and showed population structure in the Lower Northwest Passage with genetic division between King William Island and the southerly mainland sites (and thus designated "Northern" and "Southern" populations). Here, population assignments were performed, and the power of the SNP panel to assign an individual back to its genetic cluster or, alternatively, its site of capture, was tested. In the first test, the 413 Arctic char DNA samples were tested to determine if samples taken at random could be reliably assigned to one of the two populations ("Northern" vs. "Southern"). In the second test, the assignment of random samples to 17 individual fishing sites was undertaken. For both tests, missing data were also imputed in the SNP datasets based on allele frequencies per group using the function RandomForestRC [44] as implemented in the R package grur [45] with 100 random trees and 10 iterations. There were thus two SNP datasets for each test, one with original data including missing values, and the other with imputed data. The program gsi_sim [46,47] from the R package AssigneR [48] was used to perform the assignments on both the original and the imputed datasets.
Assignment tests to the two populations required seven steps: (1) a subset of samples (N = 174) was randomly chosen from each of the two groups to eliminate any bias due to uneven sample size; (2) 50% of the samples were randomly picked to form a training dataset, and the remaining 50% comprised the holdout dataset; (3) markers were ranked based on FST values computed from the training dataset, and the top 2, 4, 8, 16, 32, 64, 128, 200, 500, 1000, 2000, and 3055 markers were used as panels of loci to evaluate the impact of the number of markers used for assignment; (4) the holdout samples were then probabilistically assigned to either "Northern" or "Southern" populations; (5) steps 2-4 were repeated 30 times; (6) steps 1-5 were repeated three times (thus for each data set, there were three replicates each with 30 iterations); and (7) the average performance was then assessed to represent assignment accuracy. Theoretical assignment to 1 of 17 fishing sites did not use steps 1 and 6 due to small sample sizes at some sites, but the rest of the steps were followed for both original and imputed datasets.
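Steps 2-5 of the procedure above can be sketched in a few lines. This is a simplified stand-in and not the gsi_sim/AssigneR implementation: it assumes biallelic genotypes coded as 0, 1, or 2 copies of one allele, no missing data, a basic heterozygosity-based FST estimate, and assignment by Hardy-Weinberg genotype likelihood. All function names and the toy data are hypothetical.

```python
import math
import random


def allele_freq(genotypes, marker):
    """Frequency of the counted allele at one marker (genotypes coded 0/1/2)."""
    return sum(g[marker] for g in genotypes) / (2 * len(genotypes))


def fst(p1, p2):
    """Simplified two-population FST from allele frequencies: (HT - HS) / HT."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return 0.0 if h_total == 0 else (h_total - h_within) / h_total


def log_likelihood(individual, freqs, markers, eps=1e-3):
    """Log-likelihood of a genotype under Hardy-Weinberg proportions."""
    total = 0.0
    for m in markers:
        p = min(max(freqs[m], eps), 1 - eps)  # avoid log(0) at fixed loci
        probs = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
        total += math.log(probs[individual[m]])
    return total


def assignment_accuracy(north, south, n_markers, rng):
    """One iteration: 50/50 split, rank markers by FST, assign the holdout."""
    north, south = north[:], south[:]
    rng.shuffle(north)
    rng.shuffle(south)
    half_n, half_s = len(north) // 2, len(south) // 2
    train = {"N": north[:half_n], "S": south[:half_s]}
    holdout = [("N", g) for g in north[half_n:]] + [("S", g) for g in south[half_s:]]

    n_loci = len(north[0])
    freqs = {pop: [allele_freq(gs, m) for m in range(n_loci)]
             for pop, gs in train.items()}
    ranked = sorted(range(n_loci),
                    key=lambda m: fst(freqs["N"][m], freqs["S"][m]),
                    reverse=True)
    panel = ranked[:n_markers]  # top-ranked markers from the training set only

    correct = 0
    for truth, genotype in holdout:
        guess = max(("N", "S"),
                    key=lambda pop: log_likelihood(genotype, freqs[pop], panel))
        correct += guess == truth
    return correct / len(holdout)


# Hypothetical toy data: marker 0 is fixed for different alleles in the two
# populations, marker 1 carries no signal. Thirty iterations, as in step 5.
rng = random.Random(0)
toy_north = [[0, 1] for _ in range(4)]
toy_south = [[2, 1] for _ in range(4)]
accuracies = [assignment_accuracy(toy_north, toy_south, 1, rng) for _ in range(30)]
mean_accuracy = sum(accuracies) / len(accuracies)
print(f"mean assignment accuracy: {mean_accuracy:.2f}")
```

Ranking markers on the training half only, and scoring on the holdout half, is what keeps the reported accuracy from being inflated by markers selected on the same fish they are used to classify.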

Mapping Subsistence Fishing Locations and Sampling
ITK proved crucial for mapping subsistence fishing locations, providing local context on routes, equipment (hand lines, spears, and nets), and optimal times for western sampling protocols, and resulted in robust knowledge exchange [36,49]. The mapping tool and the interactive online atlases facilitated community validation by Elders and harvesters and resulted in sampling at more than 10 traditional subsistence fishing sites, excluding sampling done at some closely adjacent sites. Contaminant analysis was conducted on 540 salmonids (~200 Arctic char, as well as ~100 each of lake trout, lake whitefish, and ciscoes, with a few fish not yielding data for particular contaminants), and the genetic analysis used an additional ~200 Arctic char, including those from non-traditional fishing sites, to augment the SNP analysis. Harvest information results were made available to the community and can be visualized in the Gjoa Haven Nattilik Heritage Centre using a large touch screen, thus enabling the use of this knowledge and extension with an initial Traditional Land Use and Ecological Knowledge Atlas followed by a Commercial Quotas and Opportunities Atlas. These were further annotated by community members and harvesters, where additional sites of subsistence fish were identified. Relevant aspects of these atlases are presented as a single map showing the geographical groupings of the fishing sites, which has also been made available for public access (Figure 1; also https://tsfn.gcrc.carleton.ca/index.html?module=module.tsfnatlas.quotas). Additional details of each fishing site including site number, location, equipment used, water type, and if fish from a particular site were sampled for contaminants and/or genetics are shown in Table 1. There were differences in average age, weight, and growth rates depending on the salmonid species.
Arctic char sampled ranged in age from 5-29 years with a mean of 14.2 ± 0.6 years (95% confidence limits), with an average weight of 3086 ± 247 g and a linear growth rate of 47.1 ± 1.9 mm·year−1. The lake trout samples representing the other Salvelinus species were generally older and slower growing, ranging from 8-62 years (mean = 25.4 ± 1.6), and weighed 3308 ± 186 g with a growth rate of 27.8 ± 1.4 mm·year−1. Of the Coregonus salmonids sampled, the ciscoes were similar in age to Arctic char, ranging from 2-30 years (mean = 14.8 ± 1.4) but much smaller with an average weight of 504 ± 66 g and a growth rate comparable to lake trout at 28.5 ± 2.9 mm·year−1. The other Coregonus, lake whitefish, ranged from 4-47 years (mean = 21.3 ± 2.2), with a weight of 935 ± 79 g and the slowest growth rate, perhaps reflecting their existence at the northern edge of their range at 23.
Table 1. Fishing sites sampled in the Kitikmeot region of Nunavut, Canada, showing fishing site numbers and names or designations, global positioning system coordinates, geographic group assignments, as well as water type (fresh or saline), fishing gear used, and if samples from this location were analyzed for contaminants (C) and/or genetics (G).

Variation of Hg, As, and PCB Concentrations by Fish Species
Contaminant levels varied depending on the age of the sampled salmonids. As the fish aged, there was some accumulation of inorganic elements including Hg, independent of fish type.
Not all the salmonids accumulated contaminants to the same levels. The mean Hg in Arctic char in the sampled waters was 0.07 mg·kg−1 ww, with similar concentrations for lake whitefish (0.11 mg·kg−1 ww) and ciscoes (0.09 mg·kg−1 ww), although the average concentrations in the older salmonids were higher (Figure 2A). Lake trout Hg levels were also statistically higher in older fish (2 sample t test, DF = 140, p < 0.0001) and across all age classes had a higher mean level of 0.36 mg·kg−1 ww, statistically higher than the mean Hg concentration for all other fish (Kruskal-Wallis, n = 540, DF = 3, p < 0.0001). Differences were seen between the overall means of other fish species as well (p < 0.0001), except for ciscoes and lake whitefish.
Overall, As levels in the sampled fish varied over a broad range of concentrations (<0.5-270 mg·kg−1 dw), and a wide range was also seen when individual species were examined (Figure 2B). A positive association of fish age with As levels was seen for all the salmonids except for the ciscoes (2 sample t test, DF = 201, p < 0.0001 for char, DF = 135, p < 0.0001 for lake trout, and DF = 102, p = 0.04 for lake whitefish). As levels varied depending on the species, with the highest average values (25 mg·kg−1 for young fish and 59 mg·kg−1 for older fish) found in lake trout, with the overall mean significantly higher in these fish than in the other species (Kruskal-Wallis, n = 537, DF = 3, p < 0.0001). The lowest values were seen in lake whitefish (3.9 mg·kg−1 in young and 5.1 mg·kg−1 in older fish), with the overall mean significantly different from the other fish species (p < 0.0001 to p = 0.008); means for Arctic char and cisco were not significantly different from each other.
Overall, the mean PCB concentration was 15 µg·kg−1 ww. However, since only 20% of the fish processed were assayed for PCBs due to costs, comparisons of contaminant concentrations with age were somewhat limited (Figure 2C). Nevertheless, similar to the overall As results, the values varied over a wide range (0.04-367 µg·kg−1 ww), with Arctic char showing the broadest absolute range of all the salmonids (0.15-367 µg·kg−1, means of 5.6 in young and 27 µg·kg−1 in older fish). Lake trout had mean values of 7.1 in young and 26 µg·kg−1 in older fish, and, curiously, PCB levels in lake whitefish showed a reverse of the general trend for contaminant levels in all fish species in that they had higher mean values in young fish compared to older fish (20 vs. 7 µg·kg−1, respectively; Figure 2C). It should be noted that, since the PCB sample sizes were necessarily small, these differences were not statistically significant.

Geographical Analysis of Contaminant Levels
To understand site-specific contaminant levels and to keep sample numbers sufficiently high, all of the fishing sites were divided into four groups based on their geographic locations (Figure 1). When fish were grouped irrespective of species, group 2 fish, located on the island close to Gjoa Haven, had significantly higher Hg levels than group 1 and mainland group 4 fish (Kruskal-Wallis, n = 540, DF = 3, p = 0.022 and 0.027, respectively), with no significant differences between group 3 fish and any other groups or between group 1 and 4 fish. Species-specific comparisons showed that Arctic char obtained from nine fishing sites and grouped into four geographic regions showed low Hg levels independent of location, ranging from 0.05-0.13 µg·g−1 ww (Figure 3A). Lake trout fished from sites placed in three groupings showed the highest levels ranging from 0.14-1.36 µg·g−1 ww. Coregonus species generally showed low levels at all locales (0.02-0.13 µg·g−1 ww), but ciscoes from King William Island sites had significantly (p < 0.01) higher mean Hg levels than those obtained from mainland sites (groups 1 and 2 vs. 3 and 4) at 0.16 and 0.08 µg·g−1 ww, respectively. It should be noted that community members were interested in obtaining contaminant information from all individual sites, but this was not possible. For life history or logistical reasons, individual species were not obtained or were taken in insufficient numbers at all sites, particularly since this region represents the northernmost distribution of lake whitefish [50].
In contrast to the generally species-specific results for Hg levels, clear differences were apparent when As levels from all salmonids obtained from mainland fishing sites were compared to King William Island sites (groups 3 and 4 vs. groups 1 and 2), and these were statistically significant (Kruskal-Wallis, n = 537, DF = 3, p = 0.861 for groups 1 and 2, p = 0.997 for groups 3 and 4, p < 0.0001 between the island sites and the mainland sites).
This geographic trend was also seen for Arctic char, lake trout, and cisco (Kruskal-Wallis, n = 203, DF = 3 for char; n = 137, DF = 2 for lake trout; n = 93, DF = 3 for cisco, and p ≤ 0.001), but only the group 1 lake whitefish mean, and not group 2, was significantly different from the means of lake whitefish groups 3 and 4 (n = 104, DF = 3, p ≤ 0.002). Additionally, the mean As in Arctic char from group 3 was significantly lower than that of group 4 (n = 203, DF = 3, p = 0.048), but no differences were seen between group 3 and 4 means for other fish (Figure 3B).
When PCB levels for all the fish species were combined, no statistically significant differences were seen overall between the different geographical groups. Arctic char samples analyzed at all four geographic location groups similarly showed no statistical differences for total PCBs (Figure 3C). However, DL-PCB content from Arctic char caught from sites within the island group 1 was significantly lower than mainland group 4 (Kruskal-Wallis, DF = 3, p = 0.03) as well as mainland groups (3 and 4) in T-PCBlip (Kruskal-Wallis, DF = 3, p = 0.002; Figure 3C). PCB concentrations from ciscoes in group 1 and 4 were not significantly different for total PCBs and T-PCBlip and for DL-PCBs, but lake whitefish from group 1 were significantly higher than those from group 4 (2-sample t test, DF = 21, p = 0.01 for total PCBs, 0.003 for T-PCBlip, and 0.01 for DL-PCBs; Mann-Whitney test gave similar results). PCB content differences between species were not apparent at location group 4, the only location group for which all fish species were analyzed. On the other hand, Arctic char PCB concentrations in group 1 were significantly lower than group 1 lake whitefish (Kruskal-Wallis, DF = 3, p = 0.03 for total PCBs, T-PCBlip and DL-PCBs) and ciscoes (no difference, p = 0.086 for total PCBs, 0.013 for T-PCBlip, and 0.029 for DL-PCBs). Only Arctic char were analyzed from location groups 2 and 3 (Figure 3C).
Exploratory PCA analysis indicated different fingerprints on the basis of PCBs, lipid, age, and inorganic elements with Arctic char samples from group 1 and 2, cisco and lake whitefish from group 1, and lake whitefish from group 4 plotting somewhat separately from each other but with a general overlap for all Arctic char and lake trout (Supplementary Materials Figure S3B). The PCA was influenced by correlations between PCBs, Hg, As, age, and some nutritional elements, and, interestingly, a lack of correlation between PCBs and lipid as well as a negative correlation between age and lipid (older fish are leaner but contain higher concentrations of PCBs). PCB congener fingerprints again using PCA analysis indicated that Arctic char from island fishing sites (groups 1 and 2) plotted separately from each other and generally from the rest of the fish (Supplementary Materials Figure S4B). Three lake whitefish samples from group 1 were clustered together but plotted slightly away from other lake whitefish samples (group 4), which plotted in the center of the plot with the other fish from group 4. Different locations on the PCA plot suggest differences in congener profiles, albeit with profiles in groups 1 and 2 for Arctic char and group 1 for lake whitefish showing the same predominant congeners, characterized by five or six chlorines (Supplementary Materials Figures S5 and S6). The differences were seen in Arctic char having higher levels of lower chlorinated congeners (2-4 chlorines) as well as higher chlorinated congeners (8-9 chlorines) than lake whitefish from group 1. Strikingly, the lower and the higher chlorinated congeners were almost absent in lake whitefish. PCB congeners 110, 153, 118, and 138 represented the largest peaks, and three or more of these peaks were amongst the highest five peaks in 84% of all samples (Supplementary Materials Figures S5 and S6).

Arctic Char Geographical and Population Assignments as Revealed by Genetics
Population assignments were undertaken for Arctic char using 3055 SNP markers, with one population that included fish caught on or near King William Island and designated as "Northern" residents and a mainland or "Southern" population [32]. Random samples of 413 Arctic char DNA samples were successfully assigned to one of the two populations using as few as 32 SNPs, producing > 90% correct assignments overall. However, even as few as four SNPs generated > 85% correct assignment (Figure 4A). Reassuringly, when the analysis was repeated three independent times, the results were unchanged and remained so even when missing data were imputed (Supplementary Materials Figure S7A). Fewer than 13% of individuals sampled from the "Southern" population (21/174 in the dataset with no data imputation and 22/174 in the dataset with missing data imputed) were mis-assigned as "Northern" Arctic char, while all 174 DNA samples from "Northern" fish were successfully assigned back to this population (Figure 5A; Supplementary Materials Figure S8A).
Figure 5. The circles on the red dashed diagonal represent successful assignments, and other circles display mis-assignments in which fish were not successfully assigned. Circle sizes are proportional to the number of fish. (A). Regional level assignment tests with Arctic char taken at random and designated either to "Northern" or "Southern" populations. (B). Fishing site assignment tests, with each individual fishing location indicated by a number, corresponding to the sites listed in Table 1 (with spring- and winter-netted Arctic char shown as 1_S and 1_W, respectively). Similar figures analyzed with any missing data imputed are found in Supplementary Materials Figure S8.
In contrast, the designation of individual Arctic char to particular fishing sites was much less successful with either non-imputed or imputed datasets (Figure 4B, Figure 5B, Supplementary Materials Figures S7B and S8B). Depending on the number of markers, the overall correct assignment for the fishing site-level analysis varied from 10% (two SNPs) to 50% (500 SNPs) (not shown). Three fishing sites showed reasonable assignment success (>75%) with the full SNP panel (Figure 4A). Alternatively, rather than using the full panel, if only 200 SNPs with the highest computed FST values were used, assignment to 35% of the fishing sites achieved a success rate of >75%. Several sites showed close to 100% mis-assignment no matter the number of markers used (Figures 4B and 5B).

Distribution of Contaminants by Species and Geography
Fish are a dietary staple for Indigenous communities living along the Arctic coasts. According to a wildlife harvest study, Gjoa Haven residents consumed 9279 Arctic char, 2427 lake trout, and 4080 whitefish and cisco annually [51]. Therefore, we calculate that more than two servings of these locally harvested species are consumed per person per week, given an estimated total edible weight of approximately 22,500 kg and an average population of 920 [52][53][54]. Thus, the identification of Arctic food fish that bear high contaminant loads is crucially important for Inuit who depend on these resources for subsistence. However, costs of these analyses need to be balanced against funding demands to address numerous twenty-first century challenges including social transformation, increases in population, climate change-mediated ice and permafrost thaw, in addition to anthropogenic pollutants. This is an important consideration since, as previously mentioned, only a single contaminant had been tested in "a few" Arctic char from the community under study 16 years ago. We considered that geographical and genomic analyses would offer the prospect of directed contaminant testing, if targeted to anadromous food fish in the Arctic Ocean, and, if successful, this strategy would be applicable to similar challenging environments elsewhere.
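The per-person consumption estimate above can be reproduced with simple arithmetic. The edible weight and population are the figures quoted in the text; the serving size of 0.2 kg is an assumption introduced here for illustration only, since the text does not state the serving size used.

```python
# Figures quoted in the text.
EDIBLE_WEIGHT_KG = 22_500  # estimated total edible weight harvested per year
POPULATION = 920           # average community population
WEEKS_PER_YEAR = 52

# Assumption for illustration only: one serving is taken as 0.2 kg of fish.
ASSUMED_SERVING_KG = 0.2

kg_per_person_per_week = EDIBLE_WEIGHT_KG / POPULATION / WEEKS_PER_YEAR
servings_per_week = kg_per_person_per_week / ASSUMED_SERVING_KG

print(f"{kg_per_person_per_week:.2f} kg per person per week")
print(f"{servings_per_week:.1f} servings per week at the assumed serving size")
```

Under this assumed serving size the result is consistent with the statement that more than two servings are consumed per person per week; a smaller assumed serving would raise the count.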
We selected Hg, As, and PCBs for assay at the behest of the people of Gjoa Haven, an Inuit community on the shores of the lower Northwest Passage, who, as indicated, rely on anadromous salmonids as an important part of their traditional diet. Success in sampling depended upon respect for and integration of ITK and interactive mapping and atlases as well as sample tracking using a newly developed R-based barcoding system [36,38]. A large number of samples were obtained and, overall, 6% (31/540) of the fish tested for Hg exceeded Canadian guidelines for commercial sale (0.5 mg·kg−1 wet weight) [55], with preliminary analysis of a few fish showing that much of the Hg was in the neurotoxic methylated form. Notably, all but one of the exceedances were from lake trout, representing 21% of the trout tested. Since these fish are not marketed but consumed entirely within the community, a ~0.2 mg·kg−1 recommendation for subsistence consumption may be more appropriate, similar to those issued by several jurisdictions, including American states (e.g., Alaska advisory for women and children at 0.15 mg·kg−1 [56], sport fish in California at 0.2 mg·kg−1 [57], the US New England region at 0.2-0.3 mg·kg−1 [58]) and Canadian provinces (e.g., consumption limit advice for fish levels between 0.2-0.5 mg·kg−1 [59]). Although contaminant levels were generally higher in older fish, irrespective of species or contaminant type (Figure 2), for Hg, 70% of all lake trout sampled exceeded this level, even for the youngest fish harvested (8 years old). Few (6-8%) of the other salmonids exceeded this lower benchmark. Lake trout, unlike adult Arctic char, which fast during the winter, can be caught on baited lines under the ice and thus are a popular source of protein for the Gjoa Haven Hunters and Trappers Association "food bank" freezer as well as groups such as "moms and tots", "senior lunches", and "prenatal cooking classes".
The high Hg levels in lake trout and the absence of regional or territorial consumption guidelines are thus cause for concern.
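The comparison of Hg results against the two benchmarks discussed above amounts to counting exceedances. The sketch below illustrates that calculation; the threshold values come from the text, whereas the sample concentrations are hypothetical numbers invented for illustration and are not data from this study.

```python
COMMERCIAL_GUIDELINE = 0.5    # mg/kg ww, Canadian guideline for commercial sale
SUBSISTENCE_BENCHMARK = 0.2   # mg/kg ww, approximate subsistence recommendation


def exceedance_fraction(concentrations, threshold):
    """Return the fraction of samples strictly above the threshold."""
    if not concentrations:
        raise ValueError("no samples supplied")
    n_over = sum(1 for c in concentrations if c > threshold)
    return n_over / len(concentrations)


# Hypothetical muscle Hg values (mg/kg ww), NOT measurements from this study
hypothetical_hg = [0.08, 0.15, 0.22, 0.31, 0.47, 0.55, 0.12, 0.64, 0.19, 0.27]

frac_commercial = exceedance_fraction(hypothetical_hg, COMMERCIAL_GUIDELINE)
frac_subsistence = exceedance_fraction(hypothetical_hg, SUBSISTENCE_BENCHMARK)

print(f"Above commercial guideline:  {frac_commercial:.0%}")
print(f"Above subsistence benchmark: {frac_subsistence:.0%}")
```

The same function applied per species would reproduce the kind of summary reported above, in which the proportion of exceedances rises sharply when the lower benchmark is used.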
The high Hg levels in lake trout undoubtedly reflect the positive correlation between contaminant levels and fish weight [60], but, in addition, lake trout are piscivores that feed throughout the year, seasonally in fresh or marine waters, in contrast to Arctic char of similar average weight, which fast over winter. Community members were concerned about Hg pollution and wondered if it varied regionally, such that fishing sites close to Gjoa Haven or more distant might differ in concentration, which could usefully identify safer fishing sites for their popular trout derbies, for example. Unfortunately, when all salmonids were compared across the four geographic groupings, fish from the island fishing sites closest to Gjoa Haven had significantly higher Hg contamination than fish from mainland sites (group 4), and ciscoes caught on the island also had significantly higher levels than those from mainland sites (Figure 3A). Island fishing sites are convenient, especially since access does not require travel over the Arctic Ocean with its seasonal rough seas or sea ice. It is not known why island-sourced fish showed generally higher levels of Hg, but thawing permafrost and shallow lakes on the island could make fish that consume freshwater high trophic level prey especially vulnerable to the legacy of atmospheric Hg emissions.
As levels in the sampled salmonids varied over a very large range, more than two orders of magnitude (<0.5-270 mg·kg⁻¹ dw), with this much variation seen in older lake trout alone (Figure 2B). Indeed, As concentrations were highest in lake trout, followed by Arctic char and cisco, with the lowest levels found in lake whitefish. Arsenic can bioaccumulate in fish [61], and since lake trout and Arctic char are top predators with high growth rates, it is not surprising that they showed the highest concentrations of this contaminant. As concentrations in 64% of the salmonids exceeded the maximum level of 3.5 parts per million (ppm; mg·kg⁻¹) in Canada's Food and Drug Regulations [62]. However, the guideline applies to defatted fish protein from specific fish families in the orders Clupeiformes and Gadiformes (e.g., smelt and cod), and all the anadromous Arctic species analyzed here belong to the order Salmoniformes. Nonetheless, these As concentrations were higher than in other reports of the same species, including Arctic char at 0.017-13 µg·g⁻¹ dw [60,63,64], lake whitefish at 0.07-2.8 µg·g⁻¹ dw [65][66][67][68], and lake trout at 0.2-2.65 µg·g⁻¹ dw [65,67,69]. Therefore, mean levels in older lake trout of 59 mg·kg⁻¹ may be troubling and warrant further investigation. Overall, the As concentrations in these fish were higher than for other locations in the Arctic, with the highest concentrations similar to those seen in strictly marine fish [70]. Ocean fish have higher As levels, and much of this is in the form of arsenobetaine, which is considered non-toxic because it is not metabolized by humans [71,72]. Because the adult Arctic char fast in freshwater and consume only marine prey, it is likely that some of the As would have been in this form, at least in this species. A preliminary As species analysis with four Arctic char sampled from mainland sites showed that arsenobetaine accounted for 67-97% of the As present.
Therefore, it would be prudent to conduct a more comprehensive speciation analysis, especially in lake trout, to ensure food safety.
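To illustrate why speciation matters when comparing total As with a guideline, the sketch below converts a dry weight concentration to wet weight using percent moisture and then subtracts the arsenobetaine share. The 67-97% arsenobetaine range is taken from the preliminary Arctic char analysis above; the total As value and the moisture content are hypothetical, and the simple conversion ww = dw × (1 − moisture/100) is assumed here rather than taken from the authors' methods.

```python
def dry_to_wet(conc_dw, percent_moisture):
    """Convert a dry weight concentration to wet weight (same units)."""
    if not 0 <= percent_moisture < 100:
        raise ValueError("percent_moisture must be in [0, 100)")
    return conc_dw * (1 - percent_moisture / 100)


def non_arsenobetaine(total_as, arsenobetaine_fraction):
    """Arsenic remaining after removing the arsenobetaine share."""
    if not 0 <= arsenobetaine_fraction <= 1:
        raise ValueError("arsenobetaine_fraction must be in [0, 1]")
    return total_as * (1 - arsenobetaine_fraction)


# Hypothetical inputs, NOT measurements from this study
total_as_dw = 10.0        # mg/kg dw
percent_moisture = 75.0   # % water in muscle tissue

total_as_ww = dry_to_wet(total_as_dw, percent_moisture)

# Range of arsenobetaine fractions reported for four Arctic char (67-97%)
low_estimate = non_arsenobetaine(total_as_ww, 0.97)
high_estimate = non_arsenobetaine(total_as_ww, 0.67)

print(f"Total As: {total_as_ww:.2f} mg/kg ww")
print(f"Non-arsenobetaine As: {low_estimate:.3f}-{high_estimate:.3f} mg/kg ww")
```

Because the arsenobetaine fraction was measured only in a few Arctic char, applying this range to other species such as lake trout would be speculative until a fuller speciation analysis is available.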
When all the samples were grouped geographically, mainland fish (groups 3 and 4) had ~2-fold the As concentrations seen for island group 2 and ~10-fold the concentration found for island group 1. Additionally, higher As concentrations in group 3 and 4 fish were seen in all individual fish species, and mean As concentrations of group 1 and 2 fish were consistently indistinguishable for all species (Figure 3). The higher concentrations of As in mainland fish could be explained by the mostly marine or estuarine fishing locations; as mentioned, fish from such locations have higher levels of arsenic, principally arsenobetaine.
As indicated, economic considerations limited the amount of PCB testing, but when the PCB data from all species were amalgamated, total PCB values varied from 5-367 µg·kg⁻¹ ww, similar to the range of PCB concentrations in fish from other Arctic locations, including former military sites at 0.5-364 µg·kg⁻¹ ww [73][74][75][76][77][78][79]. Lake trout and Arctic char showed the highest mean levels of PCBs, which can be explained by this contaminant's known bioaccumulation and dependency on diet [80]. In this dataset, there was no correlation between fish PCB concentration and lipid content, contrasting with previous assumptions that fattier fish contain higher PCBs (e.g., [81], but also see [74]). The general observation that levels of the tested contaminants were higher in older fish was curiously reversed in lake whitefish, which live at the northern extreme of their distribution [50]. That these lake whitefish may be under stress is supported by otolith age analysis that shows that there are years where no juveniles were recruited. As well, the average condition (fish length/weight) of lake whitefish on the seasonal migration from ocean to lakes was significantly lower than the condition values at other times (not shown). Thus, we speculate that those lake whitefish with the additional burden of contaminant accumulation, including PCBs and Hg, may not live long lives, resulting in an apparent "decrease" in average PCB concentration in older fish. Indeed, sublethal levels of these chemicals have been observed to have a negative impact on another salmonid, rainbow trout, where a significant decrease in swimming performance and exercise recovery was seen following an injection of 100 µg·kg⁻¹ PCBs, in the form of congener 126 [82].
Of the total salmonids analyzed for PCBs, 12% (12/101) exceeded freshwater sportfish guidelines (see Supplementary Materials Table S2). However, guidelines for freshwater fish may not be directly applicable to anadromous fish. The total PCB concentrations are below the value (2000 µg·kg⁻¹ ww) used by the U.S. FDA [83], but these guidelines are still under national review in Canada [55]. None of the 101 fish assayed contained dioxin-like (DL)-PCBs at concentrations that gave 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD)-toxicity equivalency values exceeding those guidelines [62]. There was also no definitive geographical pattern, with the highest PCB concentrations occurring in fish from all groupings and in all fish species (Figure 3C). Although there appears to be no cause for concern, it would be prudent to continue monitoring fish from the island sites in group 2 and the mainland group 3 sites, as they are the closest to former distant early warning stations (DEW Line) where PCBs were used, including Gladman and Matheson Points on King William Island as well as Shepherd Bay and Simpson Lake on the mainland (Supplementary Materials Figure S9). A similar recommendation has been noted for former Alaskan defense sites [79]. Although the DEW Line stations were remediated up to five decades after closure, PCBs may have contaminated the waterways in the interim.
The numbers and the positions of chlorines on the biphenyl structure correspond to different PCB congeners, and these can be used as "fingerprints" for PCB mixtures. Nonetheless, PCB congeners 110, 153, 118, and 138 had the highest abundance in all the fish, and three or more were amongst the five predominant congeners in 84% of all samples (Supplementary Materials Figures S5 and S6). Strikingly, congeners 153, 118, and 138 predominate in Monsanto Industrial Chemical Company's Aroclor 1260, first manufactured in the 1950s, and these are routinely used as diagnostic congeners to identify this PCB mixture in soil samples (e.g., [57]). All three of these congeners are also found in Monsanto's Aroclor 1254, along with PCB 110. These findings suggest that the sampled fish have retained the characteristic peaks of these two Aroclor mixtures, which were used at the DEW Line stations ([84]; Supplementary Materials Figure S9). The wide variation in PCB levels in different fish of the same species, even in the same geographic groupings, may very well reflect different levels of contamination in natal lakes in the region, another reason for further monitoring.
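The "three or more amongst the five predominant congeners" criterion described above can be expressed as a short check on a sample's congener profile. The sketch below is illustrative only: the four diagnostic congeners are those named in the text, while the function names and the example profiles are hypothetical and do not represent measured samples.

```python
# Congeners named in the text as most abundant and diagnostic of Aroclor 1254/1260
DIAGNOSTIC_CONGENERS = {110, 118, 138, 153}


def top_congeners(profile, n=5):
    """Return the n congener numbers with the highest concentrations."""
    ranked = sorted(profile.items(), key=lambda item: item[1], reverse=True)
    return [congener for congener, _ in ranked[:n]]


def matches_aroclor_signature(profile, min_hits=3, n=5):
    """True if at least min_hits diagnostic congeners are among the top n."""
    hits = DIAGNOSTIC_CONGENERS.intersection(top_congeners(profile, n))
    return len(hits) >= min_hits


# Hypothetical congener concentrations (ug/kg ww), NOT data from this study
hypothetical_profile = {
    28: 0.4, 52: 0.9, 101: 2.1, 110: 3.0, 118: 4.2,
    138: 5.5, 153: 7.8, 180: 2.6, 187: 1.2,
}
hypothetical_other = {
    28: 9.0, 52: 8.0, 101: 7.0, 180: 6.0, 187: 5.0, 153: 1.0, 138: 0.5,
}

print(top_congeners(hypothetical_profile))
print(matches_aroclor_signature(hypothetical_profile))
print(matches_aroclor_signature(hypothetical_other))
```

A presence check of this kind is only a coarse screen; the proportional comparisons with Aroclor 1254 and 1260 in Supplementary Materials Figures S5 and S6 carry more information than a simple top-five count.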

Genetic Tools for Geographical or Population Assignments and the Future of Contaminant Monitoring
Geographic groupings of contaminants showed variation, but there was a generally higher level of Hg in salmonids as a whole and ciscoes in particular from island fishing sites than on the mainland, with the reverse for As contamination in all sampled fish, as well as a clustering of distinct PCB congeners from lake whitefish and Arctic char from island sites (Figures 2 and 3, and Supplementary Materials Figure S4B). Theoretically, for routine contaminant analysis, it should be more cost-effective and efficient for community members to net these anadromous species in the ocean during the salmonids' summer feeding period. Samples could be taken for DNA, and diagnostic "kits" could quickly identify those fish originating from particular regions, such as island waters, which could then be targeted for specific contaminant testing. There would be no need for expeditions to dispersed fishing sites in other seasons with more challenging weather conditions, and the accrued savings could be used to assay more fish. To our knowledge, this strategy has not been previously undertaken, although the utility of as few as 20-150 SNPs for population assignment in mixed populations of fish in the ocean has been demonstrated [85,86]. We tested this idea by taking 413 Arctic char DNA samples at random to determine if they could be correctly assigned to each of 17 fishing sites. The results were disappointing, with few correct assignments to individual sites overall. This remained true irrespective of whether thousands of SNPs or only a subset of SNPs with the highest computed F_ST values were used. In the latter case, low assignment success characterized 65% of the fishing sites, with four sites approaching a surprising 0% success (Figure 5). This was likely attributable to the low genetic differentiation in Arctic char among sampled fishing sites, exacerbated by the small sample sizes at some locations.
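Selecting a subset of SNPs with the highest F_ST values can be sketched as follows. This is a minimal illustration that assumes biallelic SNPs, equally weighted populations, and the classical heterozygosity-based definition F_ST = (H_T − H_S)/H_T; the estimator actually used for the analysis is not specified in this section, and the SNP names and allele frequencies below are hypothetical.

```python
def fst_biallelic(freqs):
    """F_ST for one biallelic SNP from per-population allele frequencies.

    Uses F_ST = (H_T - H_S) / H_T with equally weighted populations.
    """
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # monomorphic SNP carries no differentiation signal
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


def top_snps_by_fst(freq_table, k):
    """Return the names of the k SNPs with the highest F_ST."""
    ranked = sorted(freq_table,
                    key=lambda snp: fst_biallelic(freq_table[snp]),
                    reverse=True)
    return ranked[:k]


# Hypothetical allele frequencies in two populations, NOT data from this study
hypothetical_freqs = {
    "snp_a": [0.10, 0.90],
    "snp_b": [0.50, 0.50],
    "snp_c": [0.30, 0.60],
    "snp_d": [0.45, 0.55],
}

print(top_snps_by_fst(hypothetical_freqs, k=2))
```

When allele frequencies barely differ among sites, as in the hypothetical snp_b and snp_d, F_ST is close to zero and even the top-ranked markers carry little assignment power, which is consistent with the poor site-level results described above.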
In contrast, population assignments were much more successful when the 413 randomly sampled Arctic char were assigned either to a "Northern" population, comprising fish caught on or near King William Island, or to a mainland or "Southern" population. An individual Arctic char could be correctly assigned to one of these two populations more than 85% and 90% of the time with only 4 and 32 SNPs, respectively (Figure 4). Therefore, rapid genetic assignments for Arctic char caught in the open ocean are feasible and suggest that new investigations and ongoing monitoring of contaminant levels using SNP markers could help target fish of possible concern, including, for example, those that could have higher As levels and some PCB congeners. Large panels of SNP markers are available for lake whitefish in this region, with outlier analysis showing that a few of these have the potential to be linked to their natal sites [87]. If confirmed, such SNPs would be helpful diagnostic tools for the higher PCB levels in this species from some island fishing sites. As yet, there has been no attempt to characterize populations of ciscoes or lake trout in this region, but such an effort is not advised for this purpose alone. Ciscoes are not a favorite food fish, and considering that lake trout showed bioaccumulation of Hg, As, and PCB congeners, we reiterate our recommendation that lake trout consumption guidelines be developed by regional or territorial governments.
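Assignment of an individual fish to one of two populations from a handful of SNPs can be sketched with a simple likelihood comparison. The example below assumes Hardy-Weinberg proportions and independent markers and is a generic illustration, not the assignment method used for Figure 4; the population allele frequencies and genotypes are hypothetical.

```python
import math


def genotype_log_likelihood(genotype, allele_freqs):
    """Log-likelihood of a multilocus genotype in one population.

    genotype: copies (0, 1, or 2) of the reference allele at each SNP.
    allele_freqs: reference allele frequency at each SNP in the population.
    Assumes Hardy-Weinberg proportions and independent SNPs.
    """
    log_lik = 0.0
    for copies, p in zip(genotype, allele_freqs):
        p = min(max(p, 0.01), 0.99)  # keep probabilities away from zero
        q = 1 - p
        prob = {0: q * q, 1: 2 * p * q, 2: p * p}[copies]
        log_lik += math.log(prob)
    return log_lik


def assign_population(genotype, pop_freqs):
    """Return the population under which the genotype is most likely."""
    return max(pop_freqs,
               key=lambda pop: genotype_log_likelihood(genotype, pop_freqs[pop]))


# Hypothetical allele frequencies at four SNPs, NOT data from this study
hypothetical_pop_freqs = {
    "Northern": [0.9, 0.8, 0.2, 0.7],
    "Southern": [0.3, 0.4, 0.7, 0.2],
}

fish_1 = [2, 2, 0, 1]
fish_2 = [0, 0, 2, 0]

print(assign_population(fish_1, hypothetical_pop_freqs))
print(assign_population(fish_2, hypothetical_pop_freqs))
```

With markers this strongly differentiated, four SNPs are enough to separate the two hypothetical populations; in practice the achievable success rate depends on how different the real allele frequencies are, which is why success was reported as a function of marker number.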
Our original hypothesis was that levels of contaminants would vary geographically and that genomic tools would be useful to help target food fish at risk, even in mixed stocks in the open ocean, circumventing the necessity to analyze hundreds of fish at particular fishing sites. This exploratory evaluation shows that analysis with as few as 4-32 genetic markers can correctly assign random Arctic char to geographically distinct regional populations. With additional testing, SNPs for lake whitefish may also hold promise. It is our hope that this novel strategy can be pursued in the future to allow more widespread testing of contaminants from these remote Arctic regions so as to help safeguard the wellbeing of Inuit who depend on these anadromous salmonids. Further, we expect that this approach will be applicable for the monitoring of vulnerable food fish elsewhere, particularly in extreme locations.
Supplementary Materials: The following are available online at http://www.mdpi.com/2304-8158/9/12/1824/s1, Supplementary Materials: Correlations and statistical analysis, Figure S1: Correlation table of Spearman coefficient ρ values for PCBs and inorganic elements having 5 or more detectable values, Figure S2: Correlation table of Spearman's coefficient values for selected inorganic elements. Supplementary Materials: Principal components analysis, Figure S3: Factor loadings of variables used in PCA for all variables (PCBs, inorganic elements, age, and % lipid), and factor loadings of samples used in PCA for all variables, Figure S4: Factor loadings of variables used in PCA for PCB congeners and factor loadings of samples used in PCA for PCB congeners, Figure S5: Proportions of PCB congeners in groups from PCA (different fish species) and Aroclor 1254 and 1260, Figure S6: Proportions of PCB congeners as line graphs in groups from PCA (different fish species) and Aroclor 1254 and 1260, Figure S7: Arctic char single nucleotide polymorphisms used for assignments with success rates shown as a function of the number of markers used but with missing data imputed (similar to Figure 4A,B where data were not imputed), Figure S8: Assignment results with Arctic char single nucleotide polymorphism markers with missing data imputed (similar to Figure 5A,B where data were not imputed), Figure S9: Map of the Distant Early Warning stations (DEW line) in this region of the Arctic, Table S1: Details of calculated PCB parameters, Table S2: List of samples and concentrations that exceed the Ontario guideline used for setting fish advisories for freshwater sport fish and exceedances of other available guidelines, Table S3: Percent moisture for fish species showing percent moisture values from measurements performed at room temperature and used to convert dry weight Hg and PCB values to wet weight values.