Novel Single Nucleotide Polymorphisms and Haplotype of MYF5 Gene Are Associated with Body Measurements and Ultrasound Traits in Grassland Short-Tailed Sheep

Myogenic factor 5 plays active roles in the regulation of myogenesis. The aim of this study is to expose the genetic variants of the MYF5 and its association with growth performance and ultrasound traits in grassland short-tailed sheep (GSTS) in China. The combination technique of sequencing and SNaPshot revealed seven SNPs in ovine MYF5 from 533 adult individuals (male 103 and female 430), four of which are novel ones located at g.6838G > A, g.6989 G > T, g.7117 C > A in the promoter region and g.9471 T > G in the second intron, respectively. Genetic diversity indexes showed the seven SNPs in low or intermediate level, but each of them conformed HWE (p > 0.05) in genotypic frequencies. Association analysis indicated that g.6838G > A, g.7117 C > A, g.8371 T > C, g.9471 T > G, and g.10044 C > T had significant effects on growth performance and ultrasound traits. The diplotypes of H1H3 and H2H3 had higher body weight and greater body size, and haplotype H3 had better performance on meat production than the others. In addition, the dual-luciferase reporter assay showed that there are two active regions in the MYF5 promoter located at −1799~−1197 bp and −514~−241 bp, respectively, but g.6838G > A and g.7117 C > A were out of the region, suggesting these two SNPs influence the phenotype by other pathway. The results suggest that the MYF5 gene might be applied as a promising candidate of functional genetic marker in GSTS breeding.


Introduction
The myogenic regulatory factors (MRFs) are essential for regulating the development of muscle fiber and then meat production capacity in mammals [1,2], which include four conserved basic helic-loop-helic (bHLH) molecules functioning as transcription factors as myogenic regulatory factor D (MyoD), myogenic regulatory factor G (MyoG), myogenic factor 5 (MYF5), and myogenic factor 6 (MYF6). All of these four factors affect the determination and maturation of muscle fibers [3]. Variants of these genes can therefore be associated with phenotype of growth, and carcass and meat traits in livestock. Among the four MRFs genes, MYF5 is the earliest one expressed during embryonic muscle development [4]. Experiments evidently lacking MYF5 delayed differentiation and modestly impaired proliferation of satellite cells in MYF5-null knock-out mice [5], and the fetuses exhibited skeletal defects and died immediately after birth in another experiment [6].
Research about the effect of variants in MYF5 on animal production traits has been explored in pigs and cattle. In pigs, variation of MYF5 has been found to be associated with meat quality including loin-eye area, tenderness, total fiber number, water moisture content, and pH value of longissimus dorsi [7][8][9][10]. While in cattle, variation in MYF5 affects growth and meat quality traits including birth, fat, carcass weight, and body length [11][12][13]. In addition, variants of MYF5 have been reported to be associated with growth and reproductive traits in chicken [14], and showed differences in carcass and meat quality traits in pigeon [15] and rabbit [16].
Ovine MYF5 maps to chromosome 3 (OAR3) [17]. Currently, there is a report [1] on variants of ovine MYF5, which are associated with carcass traits in Romney sheep. No other study exists on ovine MYF5 describing its variation in performance, especially in body measurements and ultrasound traits.
Given that the association of MYF5 polymorphism is similar to other livestock in growth and meat quality traits, we hypothesized that there are polymorphisms in MYF5 in ovine that are associated with the traits of growth and meat quality. The objective of this study was therefore to screen for variations in the promoter, 5 UTR, exons, introns, and 3 UTR regions in the MYF5 gene of grassland short-tailed sheep (GSTS) in Inner Mongolia, China, and to explore the presence of associations between variants of MYF5 and growth (body weight and body measurements) and ultrasound traits for revealing its molecular mechanism preliminarily.

Experimental Animals
A total of 533 adult GSTS (430 ewes and 103 rams) in Inner Mongolia, China were randomly selected from their breeding farm in Ewenki Autonomous Banner, Inner Mongolia Autonomous Region. These sheep were grazed in the steppe all year round, moved freely, and were in good health.

Data Collection and Sampling
The blood samples were collected from the jugular vein of sheep (5 mL/sheep) by a qualified veterinarian and stored in blood collection vessels containing EDTA. All operations followed the guideline of Ai Yi Ti Breeding farm of grassland short-tailed sheep, Inner Mongolia Autonomous Region (approval number: AYT-2020112003). Genomic DNA was extracted using the TIANamp Genomic DNA Kit (TIANGEN, Beijing, China) following the manufacturer's instruction. The DNA purity was tested by Spectrophotometer and dissolved in TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0), and stored at −20 • C.
Growth traits including body weight (BW), body length (BL), withers height (WH), chest depth (CD), chest circumference (CC), chest width (CW), cannon bone circumference (CBC), and hip width (HW) were determined as is routine method. Ultrasound traits including eye muscle area (EMA) and backfat thickness (BFT) were measured by the animal specific B ultrasonic instrument whilst the animals were alive (Dawei, Xuzhou, China). Animals to be tested were restrained in a standing position after shaving of the wool and cleaning of the skin on the right side of the region between the 12th and 13th thoracic vertebrae; the ultrasound images of the longissimus muscle were captured using the procedure described by Rozanski et al. [18]. At the end of the measuring, all images of each individual were analyzed by the same person who operated the ultrasound apparatus, using the software fixed in the instrument in order to eliminate systematical error.

Identifying SNPs and Genotyping
According to the ovine MYF5 sequence provided in Genbank in NCBI (Ovis aries. NC_040254.1), seven pairs of sequential overlapping primers were designed using Primer Premier 5, each pair of primers can amplify about 800 bp in length, and are named MYF5-1, MYF5-2, and MYF5-3, etc., in numerical order, covering the open reading frame (ORF), promoter, 5'UTR, and 3'UTR. The information of the primers is shown in Supplementary Materials Table S1.
For SNPs detection, twenty genomic DNA samples were selected randomly as a pool to identify if mutation of the gene was harbored. PCR was performed in a 25 µL reaction volume, including 2 × Taq PCR MasterMix (12.5 µL), forward and reverse primer (10 µM, 1 µL for each), DNA (1 µL), and ddH 2 O (9.5 µL). The reaction of thermal cycling included: 95 • C for 5 min, followed by 35 cycles: 95 • C for 30 s, 60 • C for 40 s, and 72 • C for 1 min; and 72 • C for 7 min for the final extension, then the samples were stored at 4 • C. Bidirectional sequencing was operated by Sangon (Shanghai, China). The Chromas software was employed to analyze and determine the location of the SNPs.
After sequencing detection, the technique of SNaPshot was used to type each specific location of SNPs found and then the other seven primers were designed according to the description of Turner et al. (2002), labeled as SNP1, SNP2, and SNP3 etc., in numerical order. The information about the primer pairs is shown in Supplementary Materials Table S2. The PCR reaction mixture was 25.0 µL, including 1.1 × T3 Super PCR Mix 22 µL (TSINGKE, Beijing, China), 10 µM primer F 1 µL, 10 µM Primer R 1 µL, and gDNA 1 µL. The thermal cycling procedure was 98 • C for 2 min, followed by 35 cycles at 98 • C for 10 s, 60 • C for 10 s, and 72 • C for 20 s; then 72 • C for 5 min, and the products were stored in 4 • C.  Table S3), and double restriction sites of XhoIand HandIII were added at each ends of the fragment as PGL4.2-Basic vector request, which were named as PGL4.2-1799, PGL4.2-1197, PGL4.2-866, PGL4.2-514, and PGL4.2-241, respectively. Each insertion of fragments with the same 25 base pairs at 5 and 3 ends corresponding to the PGL4.2-Basic was homologously recombined using Clon Express ® Ultra One Step Cloning Kit (Vazyme, Nanjing, China), and the ligated products were transformed into DH5α competent cells.

Dual-Luciferase Reporter Assay
The 293. T and SEF cells were inoculated into a 24-well plate at a density of 1 × 10 5 cells per well initially, and cultured in a 5% CO 2 incubator at 37 • C. The cells were transfected when the fusion reached approximately 80% within 24 h. A total of six recombinant plasmids (including one empty plasmid) with 1000 ng and TK-Renilla plasmid with 15 ng were co-transfected into 293T and SEF cells, and each of experiments were repeated three times. After 48 h transfection the cells were collected, and the firefly luciferase (F) and sea kidney luciferase activity (R) were determined by the Dual-Luciferase Reporter ® Assay System (Promega, Madison, WI, USA) and detected by GloMax 20/20 Luminometer (Promega, Madison, WI, USA). The firefly luciferase activity levels in each well were normalized by those for Renilla luciferase. All of the experiments were replicated three times and each assay was performed thrice.

Prediction of Transcription Factors in the Promoter Region
Alibaba 2.1 (http://gene-regulation.com/pub/programs/alibaba2/index. Html, accessed on 13 July 2021) online tool was paired with the input complete sequence of MYF5 containing SNP1, 2, and 3 from NCBI firstly, the specific parameters set as: the predicted length is 50 bp, the longest binding site in length is 10 bp, and the shortest is 4 bp, with the confidence above 80%. Then, the predicted transcription factors were further analyzed on the JASPER website (http://jaspar.genereg.net/, accessed on 13 July 2021), both with highest scores were selected.

Construction Luciferase Reporter Plasmid with Mutation
To investigate the difference of luciferase activity among different genotypes of SNP2, a 48 bp (−695 to −742 bp) wild-type sequence containing SNP2 was inserted into PGL4.2-Basic as PGL4.2-SNP2-W, then the site G in wild-type was mutated into T by PCR method to construct a mutant plasmid as PGL4.2-SNP2-M. The constructed plasmid was verified by double digestion and Sanger sequencing. A blank control and the experimental groups were tested for three times in SEF cells. The information of the base mutation primers is shown in Supplementary Materials Table S4.

Statistical Analysis
The genotype and allele frequency were calculated directly. The polymorphism information content (PIC) was used to estimate the MYF5 in population.
p i and p j are the frequencies of i and j alleles, respectively, and n is the number of multiple alleles. PIC > 0.5 indicates that the locus is highly polymorphic, when PIC < 0.25 means the locus is in low polymorphism condition, and 0.25 < PIC < 0.5 suggests that the locus is moderately polymorphic [19].
The Hardy-Weinberg equilibrium (HWE) was evaluated by the Chi-squared test in Haploview 4.2 (Whitehead Institute for Biomedical Research, Cambridge, MA, USA). The r 2 and D' linkage disequilibrium (LD) and haplotype analysis were also calculated with Haploview 4.2.
In this experiment, the general linear model (GLM) as Y ij = µ + G i + S j + e ij was used for analyzing the association between SNPs and phenotypes by SAS 9.4 (SAS Institute Inc., Cary, NC, USA), where Y ij was the phenotypic observations; µ was the averaged values; G i was the fixed effect of genotype; S j was the fixed effect of sex; and e ij was the residual effect. All values were expressed as the mean ± standard deviation. The results with p < 0.05 were considered statistically significant.

SNP Detection and Genetic Parameters
A total of 7 SNPs have been detected in this study by direct sequencing the pool PCR products from seven sequential primers for MYF5 (Figure 1), SNP1, SNP2, and SNP3 locate in the promoter, SNP4 and SNP5 in the first intron, SNP6 in the second intron and SNP7 in the 3 UTR region of the gene; the detailed information is shown in Table 1 and Figure 1. Based on the information above, 533 samples were genotyped by SNaPshot technology. Genetic parameters of genotype and allele frequency, observed heterozygosity (ObsHe), predicted heterozygosity (PredHe), χ 2 test for Hardy-Weinberg equilibrium (HWE), and the polymorphism information content (PIC) were calculated for all 7 SNPs and shown in Table 1. All SNPs were in Hardy-Weinberg equilibrium and the genetic diversity of SNP1, SNP5, SNP6, and SNP7 were in low polymorphism condition (PIC < 0.25), and SNP2, SNP3, and SNP4 were in moderate status (0.25 < PIC < 0.5).

Linkage Disequilibrium and Haplotype of the SNPs
The linkage disequilibrium coefficients of D' and r 2 values (D' = 1 indicates full linkage, r 2 > 0.33 indicates strong linkage) showed that between SNP4 and SNP5 (D' = 0.97, r 2 = 0.73), SNP6 and SNP7 (D' = 0.82, r 2 = 0.64) and SNP1 and SNP2 (D' = 0.98, r 2 = 0.22) were in strong linkages states, and the degree of linkage went from high to low. As Figure 1c displayed, these 7 SNPs formed three haplotype modules in STGS flock.

Association of the SNPs and Haplotype Combinations with Phenotype Traits
Through genotyping, three mutations were determined as SNP1, SNP2, and SNP3, which existed in the promoter region. For the SNP1, the sheep with AA genotype has significantly higher BW, WH, CC, and SC than GA and GG ones (p < 0.05). Moreover, individuals of AA genotype had significantly higher CW than that of GG genotype (p < 0.05). For SNP3, the EMA of AC genotype individuals was significantly higher than that of CC (p < 0.05), while there was no difference between CC and AA (p > 0.05). However, there was no significant association detected between the mutants and the phenotypes for SNP2 (Table 4). The other SNPs, namely SNP4, SNP5, and SNP6, are located in the intron regions of MYF5 and had complicated association with phenotype traits. SNP4 is located in the first intron, and the heterozygous genotype CT had more advantages in BW, BL, WH, CC, CBC, and EMA than CC genotypes (p < 0.05). TT genotype also had higher mean values of BW, CBC, and EMA than CC genotype. Meanwhile, CC, CT genotype had significantly higher WH than CC and TT genotypes (p < 0.05), but there was no significant difference between CC and TT genotypes (p > 0.05).
SNP5 was also located in the first intron, but there is no significant difference among genotypes (p > 0.05). For SNP6, it was located in the second intron of MYF5 gene, and the individuals with GG genotype had significantly higher values in BW, WH, and CBC than individuals with TT genotype (p < 0.05). Genotype GT was in moderate level in all phenotype traits, except that CBC was significantly lower than GG genotype.
For the 3 UTR mutation of SNP7, the sheep with TT genotype had higher BW and greater WH than CT and CC genotype (p < 0.05). Besides, TT genotype had significant higher mean values of chest circumference (CC) than CC genotype, and there was no difference (p > 0.05) between CT and CC genotype.
The result of association analysis amongst the four diplotypes and phenotypic traits showed that the mean value of BW, BL, WH, CC, CW, CBC, and HW of H1H3 and H2H3 were significantly (p < 0.05) higher than those of H1H1 and H1H2. However, there were no significant differences (p > 0.05) among the four diplotypes in two ultrasound traits (EMA and BFT) ( Table 5).

An Optimal Haplotype for Meat Production Breeding
According to the results above, it seems that individuals with haplotype H3 had a better performance than the others, so all individuals were divided into two groups, which were animals with H3 haplotype (H3+) and animals without H3 haplotype (H3−). The result of contrastive analysis showed that H3+ group had significant (p < 0.05) advantages over H3-group in BW, BL, WH, and CW (Table 6).

The Position and Binding Factor of Three SNPs Functioned in Ovine MYF5 Promoter
After successfully co-transfecting the plasmids of successive truncating fragments of ovine MYF5 promoter with Renilla plasmids into 293T or SEF cells in 24 h, the dualluciferase activity detecting results tended to be consistent in the two cell lines (only the result of SEF is shown), which indicated that all five truncation fragments are in active, and the fragment from −514 to −241 bp showed the highest activity (p < 10 −5 ) (Figure 2a); this indicated, therefore, that the functional core region of ovine MYF5 gene promoter is located between −514 to −241 bp. However, it is a little surprising that SNP1, 2, and 3 are out of the core region. Certainly, the fragment containing SNP 1, 2, and 3 (−1799 to −866 bp) also had a significant effect (p < 0.001) with the promoter activity (Figure 2a). It is logical to assume that these three SNPs play an important role in regulating the promoter activity. . Results are shown as mean ± standard deviation and the data are representative of at least three independent assays. The statistically significant difference between groups was tested by independent sample t-test. *** p < 10 −3 , ***** p < 10 −5 .
The results of prediction for the binding sites to transcription factor on the three SNPs showed that SP1 is located in the binding site of SNP2, and there are no binding sites on SNP1 and 3. Hence, the SP1 inhibitor assay conducted, and the result displayed that the luciferase activity decreased significantly (p < 0.001) when SP1 inhibitor added to the culture of SEF cells transfected with PGL4.2-SNP2-W plasmid (Figure 2b). Further, the mutant of PGL4.2-SNP2-M plasmid was transfected, and the activity was significantly (p < 0.001) lower compared to the wild-type (Figure 2c). This suggests that the binding of SP1 to SNP2 plays an important role in exerting the promoter activity of MYF5.

Discussion
There were few studies on variations in ovine MYF5, especially the variations associated with the production traits in sheep. Sousa et al. [20] had genotyped MyoD family genes including MYF5 in 192 Santa Inês sheep and also analyzed association with meat quality traits, and had not found an association with MYF5. Phua and Wood [21] firstly reported two variations in ovine MYF5 in 24 sheep samples using Southern hybridization RFLP method. Wang et al. [1] detected two SNPs in exon 3 and an indel in intron 2 from five breeds with total of 645 sheep samples using PCR-SSCP analysis, and found genotype AA from Romney sheep associated with carcass lean meat yield. More MYF5 sequence variation has been described in cattle [12,13,22] and pigs [10,23,24].
In the study, the ovine MYF5 gene was genotyped with seven primers spanning 4606 bp within an indigenous Chinese breed named GSTS and seven SNPs were determined from 533 individuals located in the promoter (SNP1, SNP2, and SNP3), intron 1 (SNP4 and SNP5), intron 2 (SNP6), and 3 UTR (SNP7) of MYF5, respectively. Among them, except for three SNPs in the intron 1 and 3 UTR, which were reported previously in Santa Inês sheep [20], the SNP1, SNP2, SNP3, and SNP6 were detected firstly in Ovis aries. Except for SNP2, this study also revealed significant association (p < 0.05) between SNPs, body measurements, and ultrasound traits in a relatively large amount of sheep samples. This might be the reason that previous studies failed to find the variations in ovine MYF5 gene and the association with measurements of phenotype. The results above further confirmed that MYF5 gene had an effect on the body measurements traits of domestic animals.
Usually, a single SNP has little effect on a phenotype, while haplotype could provide more abundant information, because haplotype composite heterozygous SNPs are located on single chromosomes, play an important role in the mining of disease genes, and search for new strategies for disease treatment [25]. In the present study, we detected eight haplotypes consisting of seven mutant sites with a frequency above 1%, and the most common haplotype is H1 (0.497) in the GSTS population. Through analyzing the association between different haplotypes with the body weight, body size, and ultrasound traits in GSTS flock, we also found that haplotype H3 had a better performance in body weight, body size, and ultrasound traits, hence the combinations of H1H3 and H2H3 showed a higher mean value than other major diplotypes, suggesting that H3 was a beneficial haplotype for body measurements and carcass traits, which laid the foundation for formulating the breeding regime. It is anticipated that strengthening the selection of H3 haplotype would increase the frequency of H3 in the population of the sheep flock, resulting in desirable body measurements traits for improving the meat production performance at an accelerated rate for GSTS.
In the study, seven mutants of SNP were detected in the flock, but no SNPs were found in exon, suggesting that the exon region of MYF5 in GSTS tends to be relatively conserved. It was believed that mutations occurring in intron could not affect biological traits. With the progressing of research, it is found that introns are involved in lots of gene regulation such as splicing in eukaryotes. The variants in introns may lead to change splicing accuracy or efficiency, resulting in changes in amino acid coding and ultimately change in the phenotype of eukaryotes [26]. For example, the variation in the third intron of the IGF2 may affect its binding to the transcription factor ZBED6, which would then improve the meat yield of Chinese Bama pigs [27]. In the study, we detected three SNPs in the intron regions of ovine MYF5, and CC homozygous genotype of the SNP4 had a negative effect on body weight, body measurement traits, and eye muscle area, but CT heterozygous had the highest mean value in all measuring traits among the three genotypes. Comparatively, the mutation of SNP6 had a positive effect on body weight, withers height, and cannon bone circumference. These results supplied clues for further investigation in the mechanisms of transcriptional regulation in gene introns.
Generally, the region of 3 UTR involved the binding site of mRNA and therefore regulated nuclear transport, polyadenylation, and subcellular targeting, as well as rates of localization, translation, and mRNA-specific degradation [28,29]. The function of miRNAs appears to be negative regulators for transcription by interacting with 3 UTR of target gene [29]. Thus, sequence alterations in the region of 3 UTR may involve an inappropriate expression of genes and has been conclusively proven in a lot of studies [29,31,32]. In the current study, we found the sheep with TT genotype at SNP7 in 3 UTR had a higher mean value of body weight, withers height, and chest circumference than CT and CC genotypes. From the routine perceiving, the mutant T at 10,044 bp of MYF5 might be a beneficial allele for muscle development in terms of appearance. Whether this mutation would be favorable to muscle growth without other side effects needs more investigation.
Among the seven SNPs found in the present study, the SNP1, SNP2, and SNP3 are located in the promoter region of ovine MYF5. The luciferase assay determined that SNP2 interacted with transcription factor SP1, SNP1 was significantly associated with the body weight and measurement traits, and SNP3 was significantly associated with eye muscle area. A promoter contains important information about the network of the gene expression regulation [33], and single nucleotide polymorphisms in the promoter regions can affect the gene's regulation activity by altering the binding efficiency of the transcript factors [34,35]. For instance, Kostek et al. reported that a polymorphism in the IGF1 promoter is significantly associated with total human fat mass [36]. From our study, the results of related SNPs in MYF5 promoter suggest that three SNPs may affect the transcriptional efficiency by changing its promoter regulating activity, and SNP2 specifically may affect the SP1 binding efficiency to regulate the promoter's activity. Of course, more precise assays such as immunoprecipitation need to be conducted to ensure the interaction between SP1 and the binding site containing SNP2.

Conclusions
This is the first report describing the variation of ovine MYF5 in its whole length, and the association between variations and body weight, and body measurements and ultrasound traits in sheep. Through the investigation, seven SNPs were identified in a Chinese indigenous sheep breed and four of them were novel in identification. In addition, g.6838G > A, g.7117 C > A, g.8371 T > C, g.9471 T > G, and g.10044 C > T and diplotypes of H1H3 and H2H3 in GSTS might influence body weight, body measurements, and longissimus dosi ultrasound traits, and haplotype of H3 would be beneficial for improving meat production. These findings laid the foundation for formulating the breeding regime for grassland small-tailed sheep and provided a deep insight into offering a functional genetic marker for improving economically valuable traits in sheep breeding.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/genes13030483/s1, Table S1: Primers and amplification conditions for identifying SNPs; Table S2: SNaPshot single base extension primers; Table S3: Homologous recombination primer designed for construction of expression vector; Table S4: The primer information of the base mutation of SNP2.  Informed Consent Statement: Not applicable.

Data Availability Statement:
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.