MassArray Genotyping as a Selection Tool for Extending the Shelf-Life of Fresh Gilthead Sea Bream and European Seabass

Simple Summary This study focused on improving the fillet quality of European seabass and gilthead sea bream in aquaculture by exploring the genetic basis of fillet degradation after harvest. We identified specific SNPs related to enzymes affecting fillet quality and associated them with enzymatic activity using genotyping. By integrating this platform into breeding programs, we could enhance the shelf-life of fish products in a cost-effective manner. This is crucial for addressing the challenge of fresh fish perishability, ultimately reducing food waste and production costs in the aquaculture industry. Abstract In modern aquaculture, genomics-driven breeding programs have emerged as powerful tools for optimizing fish quality. This study focused on two emblematic Mediterranean fish species, the European seabass (Dicentrarchus labrax) and the gilthead sea bream (Sparus aurata), with a primary aim of exploring the genetic basis of white muscle/fillet degradation in fresh fish following harvest. We identified 57 and 44 missense SNPs in gilthead sea bream and European seabass, respectively, located within genes encoding for endogenous proteases responsible for fillet quality. These SNPs were cherry-picked based on their strategic location within the catalytic/regulatory domains of endogenous proteases that are expressed in the white muscle. Using MassArray technology, we successfully associated differentiated enzymatic activity of those endogenous proteases post-harvest as a phenotypic trait with genetic polymorphism of six SNPs in gilthead sea bream and nine in European seabass. These findings can be valuable attributes in selective breeding programs toward the extension of freshness and shelf life of these species. The integration of MassArray technology into breeding programs offers a cost-effective strategy for harnessing the potential of these genetic variants to enhance the overall quality of the final product. Recognizing that fresh fish perishability is a challenge, extending shelf-life is pivotal in reducing losses and production costs.


Introduction
One of the main pursuits in modern aquaculture is to increase the shelf life of the fresh final product, thus minimizing losses and overall production costs.Seafood is extremely perishable and typically degrades faster than other types of muscle foods.The extent to which these changes occur over time dictates the product's shelf life [1].Fish are more susceptible to textural deterioration post-mortem because of biochemical and microbiological deterioration due to their high moisture content, reactive endogenous enzymes, and enhanced nutrients [2].As a result, significant spoilage of fish occurs at various points along the production chain (post-harvest handling, processing, storage, and distribution), with considerable economic losses, product quality degradation, and customer safety concerns [3].Biochemical changes have a significant effect on the deterioration of the quality of fish fillets.These changes can be metabolic or structural (e.g., changes in the myofibrillar and changes in the extracellular matrix), all of which are triggered by endogenous proteases [4][5][6].Proteases that contribute to myotomia degradation can originate from both muscle tissue and the digestive system, provided the latter has not been removed prior to storage [7].Collagenases [8,9], which hydrolyze connective tissue collagen, as well as cathepsins [5,10] and calpains [11,12], which proteolyze muscle fibril proteins, play a critical part in this process.
These enzymes belong to multi-member gene families, with a plethora of members being expressed in the white muscle tissue of European seabass and gilthead sea bream.The genetic variability in these proteolytic enzymes can be used as a tool for genomic selection and prolongation of fillet shelf life [13].Shelf life is the period before a food product is considered unsuitable for consumption or sale.During the last several years, reliable methods have been developed to extend the shelf life of food products with formulation, processing, or packaging innovations [14][15][16][17][18][19].
European seabass and gilthead sea bream are the two emblematic fish species in Mediterranean marine aquaculture.At the European level, they rank third and fourth, respectively, in value after Atlantic salmon and trout [20].Modern fish farming has embraced the importance of genetic selection using existing genomic technologies to estimate well-characterized genetic diversity and enhance broodstock formation and selection approaches [21].Achieving the goal of genetically selecting and improving a population in the context of breeding programs often necessitates the production of genetic data for whole genomes, such as single-nucleotide polymorphisms (SNPs), from a significant number of individuals.When these polymorphisms are associated with a specific trait, this information can be utilized for targeted parental selection to ensure the prevalence of the desired traits in a population [22].Over the years, significant genomic tools for European seabass and gilthead sea bream have been developed, including the sequencing and annotation of their whole genomes [23,24].Over the last decade, genome-wide association studies (GWASs) have contributed significantly to new discoveries of genes related to various traits.Despite the array's utility for gene identification, a fundamental need remains for platforms that enable the affordable and effective genotyping of a customized SNP list for certain parts of the genome.For instance, once SNPs associated with a particular phenotype are identified in a GWAS analysis, replication of the findings in a second sample is often required.Often, only a few dozen SNPs require genotyping at this time [25].Consequently, a substantial fraction of the data generated in a GWAS is redundant, resulting in inefficient resource utilization [26].MassArray technology is an approach that is appropriate for reproducing polymorphisms in a second population.The Agena Bioscience MassARRAY ® system is a genotyping platform that enables the genotyping of tens to hundreds of user-defined SNPs in hundreds or thousands of high-performance DNA samples.Multiplex PCR design is accomplished by grouping selected SNPs (up to 40 suitable SNPs) [25,27].
To our knowledge, this is the first attempt to use genotyping to identify polymorphisms associated with this specific trait and generate data that can be utilized for parental selection in Mediterranean-farmed fish species.
The objectives of this study were (i) to identify variants in genes encoding for calpains, cathepsins, and metalloproteases responsible for muscle deterioration in gilthead sea bream (Sparus aurata) and European seabass (Dicentrarchus labrax); (ii) to genotype missense variants, as they are known to alter the genetic code affecting the function of a protein, and to select those located within crucial domains for the protein function; and (iii) to explore possible associations between the selected variants and the enzymatic activity of the aforementioned proteases.

Ethics Statement
All examined biological materials were derived from fish reared and harvested at commercial farms registered for aquaculture production in EU countries.Animal sampling followed routine procedures, and the samples were collected by a qualified staff member from standard production cycles.The legislation and measures implemented by the commercial producers complied with existing national and EU (Directive 1998/58/EC) legislation (protection of animals kept for farming).

Animal Selection for Whole Genome Sequencing
Whole genome sequencing was performed on both species using Illumina platforms.For the European seabass, DNA from five individuals was mixed equimolarly.For gilthead sea bream, 24 individuals were selected from various European aquaculture farms and were split into four sequencing pools.For the fastq files produced, the quality of the reads was evaluated using FASTQC [28], and low-quality reads (minimum PHRED score: 30), as well as adapter sequences, were discarded with Trimmomatic [29].Then, the reads were aligned to the reference genomes using the Burrows-Wheeler aligner (BWA) [30].SAM files were converted into BAM files using SAMtools [31] and finally, variant calling was performed using freeBayes [32].The variant calling file (VCF) was used to find the alternate variant in contrast with the reference genomes (Sparus aurata: GCA_900880675.1, Dicentrarchus labrax: http://public-genomes-ngs.molgen.mpg.de/cgi-bin/hgGateway?db=dicLab1, accessed on 4 March 2021).The detailed pipeline used for the analysis can be found on GitHub (https://github.com/RafaelAngelakopoulos/Bioz_lab/tree/0f040a4aee3536952a6df587f25a02ddb74fa61b/WGS, accessed on 12 December 2023).
After annotating the variants mapped in the genes responsible for proteolysis (calpains, collagenases, and cathepsins) a filtering step was performed, selecting missense variants in genes that are expressed in white muscle tissue and preferably those mapped in the catalytic/regulatory domains of the enzymes.Public RNAseq data, Sparus aurata: SRR6237499 and Dicentrarchus labrax: ERR9715622, were used to identify calpain, collagenase, and cathepsin genes expressed in white muscle.

Animal Selection for Genotyping
Fish were of commercial size (300-500 g) and were sacrificed using approved slaughtering methods.A total of 166 gilthead sea bream and 201 European seabass individuals, reared in two different Greek aquaculture farming units, were selected for DNA and enzymatic extraction.

Enzymatic Phenotyping
On harvest day, the activity of calpain, collagenase, and cathepsin was determined in the gilthead sea bream and European seabass samples.White muscle samples (200 mg) were extracted from the fish fillet and immediately snap-frozen in liquid nitrogen and kept at −80 • C until further investigation, as previously described [4,33].Briefly, calpain, collagenase, and cathepsin B and L enzymatic activity were assayed using the Barret and Kirschke method, with minor modifications.L-methionine-AMC trifluoroacetic salt in DMSO and Suc-Gly-Pro-Leu-Gly-Pro-AMC in DMSO were used as calpain and collagenase substrates, respectively.Enzyme extracts were thoroughly mixed with an appropriate substrate buffer solution containing 100 mM bis-Tris and 5 mM CaCl 2 at a pH of 6.5.Cathepsin B and L activity were determined using proper substrates, i.e., Z-arginine-arginine-7amido-4-methyl-coumarin hydrochloride and Z-phenylalanine-arginine-7-amido-4-methylcoumarin hydrochloride, respectively.The enzyme extract was mixed with the substrate solution (pH 6.5, 100 mmol/L Tris-HCl, 20 mmol/L EDTA, and 4 mmol/L DTT) [1].A spectrofluorometer (VarioskanTM LUX multimode microplate reader, Thermofisher, Waltham, MA, USA) was used to measure the fluorescence of 7-amino-4-methylcoumarin (AMC) released from each and every substrate used (excitation = 360 nm, emission = 460 nm).The protein content of the crude extracts was measured in triplicate using the Bradford method with bovine serum albumin as a reference [34].Fluorescence units (FUs) per minute and mg of protein were used to calculate enzymatic activity.The enzyme activities in each sample were assayed in duplicate.

DNA Extraction
Total DNA was extracted from the white muscle tissue of all individuals, a procedure necessary for genotyping the selected variations, and stored at −20 • C. The PureLink ™ Genomic DNA Mini kit from Invitrogen (Invitrogen, Catalog number: K182002) was used to extract the DNA from the samples according to the manufacturer's instructions.DNA quality was assessed using agarose gel electrophoresis and quantified with photometric measurement (Quawell, Q3000) at 260 nm.Samples were properly diluted to 50 ng/µL and sent to Inqaba Biotechnical Industries (Pty) Ltd. (Pretoria, South Africa) for primer synthesis (Supplementary Tables S1 and S2) and genotyping using a MassArray system.

Data Filtering and Association Analysis
The genotypic data for the loci of interest (57 SNPs for gilthead sea bream and 44 SNPs for European seabass) were converted into ped format, and a quality control procedure was performed using PLINK 1.9 [35] to generate reliable data and avoid false positive results in the downstream statistical analysis.Therefore, for quality control, we removed SNPs and individuals based on genotypic and individual missingness.Then, we discarded SNPs with a minor allele frequency of less than 5% and checked the Hardy-Weinberg equilibrium to exclude SNPs that deviated significantly from it, and a threshold of 5% was set for the individual missingness.
SNPstats, a tool for the analysis of the association of genetic polymorphisms (SNPs) with a phenotype, developed by the Institut Catala d' Oncologia (ICO), was used to process the data derived from the genotype [36].In terms of statistics, the association with the response (enzymatic activity) was modeled using linear regression models in order to evaluate the rate of variation in the response explained by the polymorphisms using multiple inheritance models [36,37].Tables with allele and genotype frequencies were generated along with tables showing the association between SNPs and the enzymatic activity per inheritance model (Supplementary Tables S3-S17).

SIFT Algorithm for Amino Acid Substitution Prediction
The Sorting Intolerant from Tolerant (SIFT) algorithm was used to estimate the effect of amino acid substitutions on protein function, and the results were integrated with other functional annotations.SIFT generates predictions by evaluating the properties of the amino acids involved in a specific substitution as well as the evolutionary conservation of the affected region in the protein.It starts by aligning the protein sequence of interest with related protein sequences from other species.This alignment is then used to pinpoint evolutionarily conserved regions that are more likely to be functionally important.SIFT then considers the amino acid properties at the specific substitution position, such as size, charge, polarity, and other chemical properties.It utilizes of this knowledge to predict the effect of the substitution on the structure and function of the protein.Finally, the algorithm calculates a SIFT score for the substitution by combining information about the properties of the substituted amino acid with the evolutionary conservation of the affected region.The SIFT score goes from 0 to 1, with lower values suggesting a higher possibility that the mutation would impair protein function [38].

Whole Genome Sequencing and Genotyping
Approximately ~80 M reads per sample and 95% of the reads of the whole genome sequencing passed the quality control criteria.
In total, 6800 and 2608 SNPs for gilthead sea bream and European seabass, respectively, were detected in the genes encoding for calpains, cathepsins, and collagenases and expressed in white muscle (Tables 1 and 2).More specifically, most variants were found in intronic regions both in European seabass and gilthead sea bream followed by synonymous and untranslated region variants (UTRs).The functional annotation of these SNPs was performed using the SnpEff tool [39] and is presented in Figure 1.
Using PLINK 1.9 and a 5% cutoff for individual missingness, five individuals from the gilthead sea bream dataset and 16 individuals from the European seabass dataset were excluded from the downstream statistical analysis.
Among the 57 and 44 SNPs selected for genotyping for gilthead sea bream and European seabass, respectively, 31 and 8 SNPs, were found to be monoallelic or to have failed genotyping.The association analysis revealed several SNPs to be statistically significantly associated with enzymatic activity.Enzymatic activity was calculated for calpain, collagenase, and cathepsins in both species from white muscle samples, as previously described [4].The allele frequencies of statistically significant variants are reported in Table 3. Table 4 summarizes the changes in enzymatic activity for each variant including the p-value for the computed linear regression.Indicative figures regarding the enzymatic activity for each genotype are provided in Figure 2 (two SNPs for each species), and the rest are provided in the Supplementary Materials (Figures S1-S11).Notably, none of the SNPs identified and genotyped are located in the active site of the enzymes, even though several are located within protein domains.After genotyping the variants, we sought to assess the tolerability of the observed amino acid changes using the Sorting Intolerant from Tolerant (SIFT) algorithm.Our analysis revealed that two mutations in Sparus aurata and one mutation in Dicentrarchus labrax were non-tolerated.Of note, the mutation in the capn5a and capn10 genes of gilthead sea bream exhibited low frequency in the population, in contrast to the alteration observed in the MMP13b gene of European seabass.While our analysis identified two mutations as non-tolerated, it is important to note that some substitutions may have been erroneously predicted to affect function due to the limitation of the SIFT algorithm that considers the diversity of the sequences used (Table 5).
Table 5. Sift algorithm results.The sift score is indicative of the amino acid substitution effect on the protein.A threshold of <0.05 exists for non-tolerated mutations.Results with underlined bold font depict the mutations that are predicted to affect protein function.Results with an asterisk (*) depict the mutations that are predicted to affect protein function but with low confidence.

Discussion
Traditional selection strategies based on phenotypic information were beneficial in boosting the profitability of livestock species in earlier decades.However, these approaches have biological constraints and limitations that are not encountered when using the information in SNPs, which are the primary source of genetic variability across individuals of the same species [40].Therefore, one of the main aims of genomics analysis is to locate SNPs that impact the functionality and activity of gene products.The identification of associated polymorphisms is critical not only for a better understanding of their genetic basis (i.e., identifying the causal genes) [41,42] but also for the design of genetic selection programs [43].
In this regard, the current study focuses on the relationship between missense SNPs in genes encoding for enzymes driving postmortem degradation of fish white muscle and the actual enzyme activity.
Proteolytic enzymes compromise fish fillet firmness and hardness [44].The activation of these proteases or their synergistic actions cause autolysis of myofibrils in fish, which results in postmortem muscular weakness [6].Enzymatic activity determines the severity of the proteolysis, i.e., how rapidly the fillet degrades.As previously noted, all the SNPs used in this investigation are missense variants that alter an amino acid sequence.These alterations are probably involved in changes in protein structure and functionality [45,46].
Calpains are intracellular endopeptidases that initiate myofibril proteolytic breakdown.Four SNPs (SA_capn10_11, SA_capn10_14, DL_capn14b_1, DL_capn5b_3) associated with differential enzymatic activity in both species are located in the CysPc domain of the calpain family (Table 2).The crystal structure of various classical calpains revealed that the core protease domain (CysPC) is composed of two sub-domains containing a catalytic triad [11].In the presence of Ca 2+ , these two sub-domains are probably reoriented to assemble a cysteine protease active site.Three SNPs (SA_capn5a_1, SA_capn5a_2, DL_capn5b_5) (Table 2) in both species are located in the C2 domain, a calcium and phospholipid binding domain of the Capn5 gene [47].This gene belongs to a variation in the non-classical calpains, the TRA-3 group, which contains one C2L domain and one C2 domain in tandem.This domain is important for binding/recognizing substrates and for calpastatin binding, which is in contact with the C2 domain [11,48].
Capn2b has an SNP in the EF-hand domain.The EF-hand is a Ca 2+ binding domain with the typical structure of EF-hands [49,50].Regularly, there are five (5) EF-hand motifs; one of them binds with the regulatory subunit, unifying the heterodimers.The result of this binding is the activation of the enzyme [50].The Capn15b gene is a member of the SOL subfamily.The main structural variations in the SOL subfamily concern the several Zn 2+ -finger motifs, that interact with the target substrate within the N-terminal domain and with a specific SOL-homology domain at the C-terminus of the core protease domain (CysPC) [51].An SNP in the Zn 2+ -finger motifs was found that can probably affect the interaction with the target substrate.
Cathepsins are lysosomal cysteine proteases that assist in intracellular protein breakdown and turnover [52].A variant in the peptidase A1 domain has been identified in the CTSDb gene.This domain is one of the two monomers composed of two asymmetric lobes ("bilobed").Each of the lobes provides a catalytic Asp residue, positioned within the hallmark motif Asp-Thr/Ser-Gly, to the active site [53].
In a fish fillet, myotomes are held together by connective tissue called myocommata, which are surrounded by collagenous fibrils [54].Collagenases are matrix metalloproteases that degrade collagenous fibrils, producing the characteristic gaps found in chilled fish fillets [5,9].Two SNPs in both MMP13 paralogs were located in the PGBD domain, which appears to affect MMP enzymatic activity and is located in the region of the gene referred to as the proteoglycan binding domain.As a proteoglycan binding-like domain of MMPs, this domain seems to bind to proteoglycan molecules [55], which are a very important component of the extracellular connective tissue plexus, thus proceeding to the degradation of proteoglycans [56], as well as indirectly participating in the regulation of the concentration of molecules such as chemokines [57].Another study concluded that the interaction between a pre-MMP and proteoglycans participates in its activation, possibly by bringing it close to some membrane activator [58].Many different proteoglycans appear to bind to collagen and to differentially regulate the formation and degradation of collagen fibrils, as discussed by [59].Based on these findings, it is plausible that the proteoglycan-binding domain regulates MMP activation or directs MMPs to approach collagen by binding to collagen-bound proteoglycans, therefore facilitating collagen proteolysis.
We examined the association between nine SNPs in European seabass and six SNPs in gilthead sea bream and their association with enzymatic activity, with the aim of identifying genetic markers for use in breeding programs.Of these SNPs, including three in European seabass and one in gilthead sea bream, the heterozygous genotypes were associated with the preferable phenotype, i.e., a lower enzymatic activity compared with both homozygous genotypes [60].This phenomenon is likely due to a decrease in enzymatic activity or protein stability in the heterozygous state, resulting in a lower response phenotype [61].Conversely, the remaining six SNPs in European seabass and two SNPs in gilthead sea bream displayed a dominant/recessive interaction, where one of the homozygous genotypes had a significantly lower response compared with the other homozygous genotype and the heterozygote [62].For two SNPs in gilthead sea bream, both homozygous genotypes were not present in the population studied, resulting in only the heterozygous genotype and one of the homozygous genotypes being observed (Tables S1-S15).
Finally, we cannot overlook that genes that perform critical functions in the cell are typically under strong evolutionary pressure to avoid accumulating deleterious mutations [63].This is especially true for enzymes that play crucial roles in regular metabolism, as missense mutations in these genes can have severe consequences for the cell and organism's survival.Therefore, the fact that some of the SNPs examined in this study displayed a heterozygote advantage may suggest a more complex evolutionary process at play [64,65].

Conclusions
Among the 57 and 44 SNPs selected, 9 and 6, respectively, for European seabass and gilthead sea bream appeared to be associated with changes in enzyme activity in the population used for the analysis, which is a very modest number compared with those initially selected.We acknowledge that the fish populations studied were of limited size and stress the importance of further investigation to validate our findings.
The 15 non-synonymous polymorphisms found to be associated with the proteolytic activity of these genes, which are actively involved in proteolysis, can be incorporated into genetic improvement programs to select parents exhibiting desired traits (lower proteolytic activity).For the first time, these findings provide the basis for extending parental selection in breeding programs to improve/extend the shelf life of the final product, indicating that low-cost genotyping techniques are of great importance for selecting a specific trait.The combination of the variants arising from the current study can be used to extend the freshness and shelf-life of these emblematic Mediterranean fish.

Figure 1 .
Figure 1.Functional annotation of all variants in the genes of interest in both species, European seabass (Dicentrarchus labrax) and gilthead sea bream (Sparus aurata).

Table 1 .
Genes expressed in the white muscle of gilthead sea bream.High: gene expressed in white muscle (logreads > 5), Low: low expression of gene in white muscle (logreads < 5), No: gene not expressed in white muscle.

Table 2 .
Genes expressed in the white muscle of European seabass.High: gene expressed in white muscle (logreads > 5), Low: low expression of gene in white muscle (logreads < 5), No: gene not expressed in white muscle.

Table 4 .
Genotypes associated with changes in enzymatic activity in both species, Sparus aurata and Dicentrarchus labrax.The association was performed using SNPstats.The 95% CI (95% confidence interval), AIC (Akaike information criterion), and BIC (Bayesian information criterion) values were calculated using SPNstats.The model of inheritance with lower AIC and BIC values was selected as the most possible model.