Selection of Parental Material to Maximize Heterosis Using SNP and SilicoDArT Markers in Maize

The chief aim of plant breeding is to improve varieties so as to increase their yield and breeding traits. One of the first stages of breeding is the selection of parental forms from the available gene pool of existing varieties. To date, costly and laborious methods based on multiple crossbreeding and phenotypic selection have been necessary to properly assess genetic resources in terms of productivity, quality parameters, and susceptibility to biotic and abiotic stressors. The often long and complicated breeding cycle can be significantly shortened through selection using DNA markers. This approach relies on tight linkage between the marker and the locus responsible for the inheritance of the functional trait. The aim of this study was to identify single nucleotide polymorphism (SNP) and SilicoDArT markers associated with yield traits and to predict the heterosis effect for yield traits in maize (Zea mays L.). The plant material used in the research consisted of 19 inbred maize lines derived from different starting materials, and 13 hybrids resulting from crossing them. A two-year field experiment with inbred lines and hybrids was established at two Polish breeding stations on 10 m² plots in a randomized block design with three replicates. The biometric measurements included cob length, cob diameter, core length, core diameter, number of rows of grain, number of grains in a row, mass of grain from the cob, weight of one thousand grains, and yield. The isolated DNA was subjected to DArTseq genotyping. Association mapping was performed using a method based on the mixed linear model with the population structure estimated by eigenanalysis (principal component analysis of all markers) and modeled by random effects. Narew, Popis, Kozak, M Glejt, and Grom were the hybrids used in the study that showed the highest significant heterosis effect in 2013 and 2014.
The similarity between parental components determined on the basis of SNP and SilicoDArT marker analysis did not exceed 33%. It was found that the genetic similarity between parental components, determined on the basis of SNP and SilicoDArT markers, reflected their degree of relationship, and correlated significantly with the effect of heterosis. As the results indicate, the parental components for heterosis crosses can be selected based on genetic similarity between parental components evaluated using SNP and SilicoDArT markers, supported with information on the origin of parental forms. Of the markers we analyzed, 76 were selected as being significantly associated with at least six traits observed in 2013 and 2014 at both the Łagiewniki and Smolice stations.


Introduction
The pressure to increase and sustain food production has been felt for a long time. Tools have thus been developed to guarantee greater accuracy in selection. The currently used methods of selection have been enhanced by the achievements of molecular biology and statistical models, enabling identification of both the markers of individual traits resulting from the action of individual genes and those conditioned by many QTLs that explain the phenotypic traits to various extents [1].
The DArT marker can be used in genomic selection (GS) [2]. GS allows for plant selection based on the total pool of DNA markers for the selected statistical model. It reduces the need for phenotyping and shortens the breeding cycle. Meuwissen first described this method; he examined the accuracy of genomic selection carried out using the DArT technique and compared this with phenotypic selection and with selection supported by molecular markers (marker-assisted selection, MAS). Genomic selection proved to be 28% more accurate than traditional marker-assisted selection, though slightly less accurate than phenotypic selection. The results of his study demonstrate that GS can be used to increase the profitability of breeding [3]. The method has been successfully used in barley [4] and oats [5], and also works well in improving the efficiency of breeding perennial species, such as eucalyptus (Eucalyptus L'Hér.) [6].
Modern methods for identifying single nucleotide polymorphisms (SNPs) make use of next generation sequencing (NGS) methods. These refer to sequencing techniques developed in the twenty-first century that provide higher performance and throughput than the Sanger [7] sequencing technique commonly used before. The most common NGS techniques are 454 pyrosequencing [8], the Solexa technique (Illumina), the SOLiD platform (Applied Biosystems), the Polonator system (Dover/Harvard), and the HeliScope single molecule sequencer (Helicos). These technologies provide inexpensive whole-genome sequence readings through methods such as chromatin immunoprecipitation, mutation mapping, detection of polymorphisms, and detection of noncoding RNA sequences [9]. Modern sequencing methods enable the identification of a large number of markers and also allow more accurate examination of many loci.
Modern genotyping technologies can also shed new light on the genetic basis of heterosis. The use of heterosis to increase and stabilize yield has become one of the major drivers of increased agricultural production over the last few decades. Despite the huge significance of heterosis and the growing tendency to use hybrid vigor even in inbred crops like bread wheat, the molecular and genetic mechanisms underlying this phenomenon have still not been fully explained [10].
Song and Messing [11] isolated a specific region of the genome of two crossed inbred maize lines, which was subsequently sequenced and mapped. They found that the size of this region and the presence of genes from a given gene family in it differed significantly between the lines. Genes that were present in one line were absent in the other, although phenotypic symptoms of their expression were visible in the other line. This is evidence that genes from the same gene family that produce similar phenotypic effects were located in different parts of the genome in each of the tested lines. According to Song and Messing, heterosis can therefore be a consequence of differences in the structure of the genome, especially in the distribution and presence of certain genes from a given gene family in crossed inbred lines. Predicting the magnitude of the heterosis effect in hybrids based on molecular marker analysis has been widely discussed. The approaches reported in the literature include regression of either hybrid performance or heterosis on molecular genetic distance and estimation of correlations between these variables [11][12][13][14], or estimation of marker effects and marker associations with hybrid performance, heterosis, or specific combining ability [12,15].
The aim of this study was to identify single nucleotide polymorphism (SNP) and SilicoDArT markers associated with yield traits and to predict the heterosis effect for yield traits in maize (Zea mays L.). This topic was selected because the decreasing cost of next-generation sequencing means that these methods are beginning to be used in applied research to identify trait markers or even to select at the whole-genome level. This publication is one of several that have recently suggested using the latest molecular techniques (such as SNP and SilicoDArT) to select parental materials for heterosis crosses.

Phenotyping
Analysis of variance indicated that the main effects of year and genotype, as well as the L×G, Y×G, and L×Y×G interactions, were significant for all the studied traits. The main location effects were not significant for LCO or MGC. The L×Y interaction was not significant for NGR or MGC.
Tables 1 and 2 show trait correlation matrices for both locations and years. All significant coefficients were positive. Most trait pairs were correlated in all four environments. Three pairs (LC-NRG, LCO-NRG, and NRG-WTG) were not significant in any of the four environments. Additionally, LC-DCO, LCO-DCO, DCO-NGR, DCO-MGC, DCO-WTG, DCO-Yield, NRG-NGR, and NRG-MGC were not significant at Łagiewniki 2012 (Table 3). NRG was not correlated with yield in either year at Łagiewniki.
Individual traits are of differing importance and represent different proportions in the joint multivariate variation. Analysis of the multivariate genotypic variation also includes identification of the most important traits in the multivariate variation of genotypes. Analysis of canonical variables is a statistical tool that makes it possible to solve the problem of multivariate relationships. The first two canonical variables jointly explain 71.11% of the total variation between genotypes (Figure 1).
Figure 1 presents trait variation in the analyzed genotypes in the system of the first two canonical variables. In the graph, the point coordinates of a given genotype constitute the values of the first and second canonical variables. Three groups containing inbred lines and hybrid forms can be distinguished in Figure 1. The first group consists of the O Glejt hybrid and all the inbred lines that exhibited inbreeding depression for all analyzed yield structure parameters, except for the S41324A-2 and S160 lines. The second group consists of hybrid forms that share paternal components: Brda, Blask, and Grom, for which the S41324A-2 line was the paternal form, and Bejm, Dragon, Narew, and Kozak, for which the S61328 line was the paternal form. The third group consists of hybrids (M Glejt, M Prosny, Budrys, and Popis) whose parental components were not related to each other or else were related only to a small percentage. The first canonical variable was significantly positively correlated with LC, DC, LCO, NGR, MGC, WTG, and yield. The second canonical variable was significantly negatively correlated with NRG.
The greatest variation in terms of all the traits considered together (measured using Mahalanobis distances) was found for S160 and Kozak (with a Mahalanobis distance of 8.01). The greatest similarity was found for the S64423-2 and M Wilga genotypes (0.91). The Mahalanobis distance values for all genotype pairs are presented in Table 4.
DArTseq NGS analysis of the tested maize lines allowed us to identify 49,911 polymorphisms (33,452 SilicoDArTs and 16,459 SNPs). In total, 8192 of these markers (including 8189 SilicoDArTs and three SNPs) were selected for GWAM using the criteria specified above. The dendrogram (UPGMA) based on these 8192 markers showed the genetic relationships of 19 inbred lines and 13 hybrids (Figure 2). The dendrogram shows three basic similarity groups. The first group includes the M Prosny hybrid with its parental line, the Popis hybrid with its maternal line, the M Glejt hybrid, and the inbred lines S78510, S63322-3, S56125A, S245, S41796, S68911, S64417, and S64423-2. The second group contains the M Wilga hybrid with its parental components. The third group consists of the hybrids Bejm, Budrys, Kozak, Brda, Narew, Blask, Grom, and Smok with their paternal lines and the O Glejt hybrid with its maternal line (Figure 2). The highest genetic similarity, calculated on the basis of both types of markers (equal to 0.94), was detected between S41324A-2 and O Glejt, whereas the lowest genetic similarity (equal to 0.17) was found for S41796 and S61328.
Figure 2 shows that the genetic similarity (as determined on the basis of the SNP and SilicoDArT markers) between the parental components of individual hybrids reflects their relative relationships, except in the case of the parental lines of the Blask hybrid (Table 1). The highest genetic similarity (83%, 80%, and 52%) was recorded between parental components of the O Glejt, M Wilga, and Grom hybrids, whose relationship between parental components was 50% (Table 3).

Association Mapping
There were 1678 markers that were significantly associated with the investigated traits at FDR < 0.05 in GWAM: 1675 SilicoDArTs and three SNPs. Table 5 shows the number of markers relevant for individual traits in the considered environments. We observed a large number of statistically significant associations with particular traits: 3003 (for LC), 3750 (for DC), 2901 (for LCO), 1641 (for DCO), 851 (for NRG), 2097 (for NGR), 3419 (for MGC), 3886 (for WTG), and 2128 (for Yield). Seventy-six of the analyzed markers were selected as being significantly associated with at least six traits observed in 2013 and 2014 at both Łagiewniki and Smolice (Table S1). The most significant marker was 4777143, which was associated with all the analyzed traits except for NGR and DCO at Łagiewniki in 2013. The following markers were significantly associated with yield at both localities in 2013 and 2014: 4777143, 100002778, 4767650, 21693206, 9625858, 16723979, 9713903, 100002999, 100000002, and 7057018. The markers most often significantly associated with the observed traits are shown in Table S1.

Prediction of the Heterosis Effect
The values of heterosis effects for the observed traits in individual environments are shown in Table 6. Narew and Popis were the hybrids that showed the highest significant heterosis effect for most of the analyzed yield structure traits at Łagiewniki in both 2013 and 2014 (Table 6). The Narew hybrid showed the most significant heterosis effect for DC (0. It is noteworthy that the parental components of these four hybrids were either not related to each other or else showed a low degree of relationship due to origin (Narew: 4% relationship between parents; Popis: 0%; Kozak: 0%; and M Glejt: 13%). The similarity between parental components, as determined on the basis of SNP and SilicoDArT markers, was also low (Narew: 18% similarity between parents; Popis: 26%; Kozak: 26%; and M Glejt: 33%; Table 3). The situation was similar for both years at Smolice. Here, Narew turned out to be the best hybrid, showing the most significant heterosis effect for the DC (0.897), NRG (2.37), and WTG traits (118.8) in 2013, and for the DC (0.847) and WTG traits (151) in 2014. The highest significant heterosis effects for two traits were also recorded in the Popis and Grom hybrids in 2013 and for four traits in the Kozak hybrid in 2014 (Table 6).
The relationships between the effects of heterosis and genetic and phenotypic distance (expressed as the Mahalanobis distance) are shown in Table 7. Statistical analysis identified the yield structure traits for which the magnitude of the heterosis effect in the hybrid forms depended on the genetic distance between the parental components estimated on the basis of SNP and SilicoDArT markers. As Table 7 shows, the scale of the heterosis effect for most of the observed traits was significantly correlated with genetic distance, regardless of the year and locality of the experiment.
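The kind of relationship summarized in Table 7 can be illustrated with a minimal sketch. It assumes that genetic distance is taken as one minus the marker-based similarity and that a Pearson coefficient is used; neither choice is stated in the text, and the function name is hypothetical. It does not reproduce the values in Table 7.

```python
import numpy as np


def heterosis_distance_correlation(similarity, heterosis):
    """Pearson correlation between genetic distance and heterosis effects.

    similarity: marker-based similarity between the two parents of each
                hybrid (0..1); distance is assumed to be 1 - similarity.
    heterosis:  heterosis effect of the same hybrids for one trait.
    """
    distance = 1.0 - np.asarray(similarity, dtype=float)
    effects = np.asarray(heterosis, dtype=float)
    # np.corrcoef returns the 2 x 2 correlation matrix; the off-diagonal
    # element is the coefficient between the two variables.
    return float(np.corrcoef(distance, effects)[0, 1])
```

A positive coefficient would correspond to the pattern reported here, in which hybrids from less similar parents show larger heterosis effects.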

Discussion
Maize is a major crop species characterized by very high yield efficiency and versatility in utilizing the whole plant. Modern breeding programs focus on hybrid cultivars with the greatest heterosis effect, thanks to which it is possible to obtain much higher yields through appropriate selection of parental components.
Heterosis, or hybrid vigor, is the phenotypic result of gene interactions arising from heterozygosity in hybrids of the F1 generation. According to the dominance hypothesis cited by Ruebenbauer, having many heterozygous genes causes an increase in hybrid performance due to the dominant alleles [16]. The more complementary the homozygous alleles in the parental forms, the greater the effect of heterosis in the hybrid forms. Hence, the greatest theoretical heterosis can be expected when there is a large allele diversity of individual genes in the parent plants. Such diversity occurs when the crossed genotypes are less related and the genetic distance is greater. Progeny of genotypes with a large genetic distance should thus show a significant heterosis effect.
We used the SilicoDArT and SNP markers to assess genetic diversity. The DArTseq NGS analysis of the tested maize lines allowed us to identify 49,911 polymorphisms (33,452 SilicoDArT and 16,459 SNP). In total, 8192 of these markers (including 8189 SilicoDArT and three SNPs) were selected for GWAM. Of all the analyzed markers, 76 were selected as being significantly associated with at least six traits observed in 2013 and 2014 at both Łagiewniki and Smolice.
In the present study, the genetic distance between parental components, as determined by the SNP and SilicoDArT markers, reflected their degree of relationship and was significantly correlated with the heterosis effect observed in the majority of the yield structure traits, as well as the yield itself. Genotypes grouped according to specific patterns in the system of the first two canonical variables (Figure 1), with the first group including the O Glejt hybrid and all inbred lines except for S41324A-2 and S160. The second group consists of hybrid forms that have the same paternal components. The third group consists of hybrids (M Glejt, M Prosny, Budrys, and Popis) whose parental components were not related to each other, or which were only related to a small percentage.
As the results indicate, the parental components for heterosis crosses can be selected on the basis of genetic distance between the parental components, as determined using SNP and SilicoDArT markers, supported with information on the origin of the parental forms.
In this study, Narew, Popis, Kozak, M Glejt, and Grom were the hybrids that showed the highest significant heterosis effect for the majority of the yield structure traits, at both sites in both years. Importantly, the parental components of these hybrids (except for the Grom hybrid) were either not related to each other or else showed only a low degree of relationship due to origin (Narew: 4% relationship between parents; Popis: 0%; Kozak: 0%; and M Glejt: 13%). The similarity between parental components determined on the basis of SNP and SilicoDArT marker analysis did not exceed 33% (Narew: 18% similarity between parents; Popis: 26%; Kozak: 26%; and M Glejt: 33%). Many researchers report that the heterosis effect depends on the genetic distance between parental forms, taking into account their degree of relationship [17][18][19][20][21]. In our own research, the genetic distance between parental components, as estimated with SNP and SilicoDArT markers, reflected their relationship and translated into the magnitude of the heterosis effect. We observed that the lower the similarity and the degree of relationship between parental components, the greater the effect of heterosis was in the hybrid forms.
In recent years, methods have been sought to allow initial selection of lines intended for heterosis crossing. The dependence of the heterosis effect on genetic distance, as determined using molecular markers, has been analyzed by many researchers in various species, including [22], pepper [23], cocoa [24], barley or sunflower [25]. Factors associated with the hybrid heterosis effect resulting from the crossing of inbred maize lines were discovered in 1992 [26]. Research conducted using molecular RFLP markers on 148 inbred lines of maize supported the use of these markers when the breeding material was clustered into heterotic groups [27]. The AFLP system was used to select parental components for maize heterosis crosses [28]. Five primer pairs generated 56 polymorphic bands, allowing the degree of similarity to be determined, which was then correlated with the effect of heterosis. Shehata et al. [29] demonstrated the usefulness of the SSR system in assessing the genetic distance of eight inbred maize lines. Berilli et al. [30] studied the genetic distance between two maize populations (CIMMYT and Piranao), which was estimated using molecular ISSR markers. Thirteen primers generated as many as 140 products, of which 84.4% were polymorphic. The genotypes tested were divided into two main groups, each of which contained mainly individuals from a single population.
DArT technology also works as an efficient diagnostic tool for analyzing genetic diversity [31]. DArT markers have been successfully used to study the genetic diversity and structure of Chinese common wheat (Triticum aestivum L.). A total of 111 cultivars and breeding lines from northern China were examined, with the results providing information that allowed further selection of parental forms and the establishment of heterozygous materials for the needs of the Chinese wheat breeding program [32]. The DArT method has found broad application in relationship analysis, such as in oats (Avena sp.), where 134 cultivars were examined and groups corresponding to winter and spring forms were identified [33]. However, research into 232 forms of the pigeon pea (Cajanus cajan) showed a low degree of material differentiation. Of 696 DArT markers, only 64 turned out to be polymorphic, with the wild forms being the most diverse [34].
Genome profiling in large hybrid populations currently offers unprecedented resolution for the dissection of loci and genes involved in heterotic expression. Huang et al. [35] recently published a study in which an extensive population of 1495 elite hybrid rice varieties, along with their inbred parental lines, were subject to detailed genome-wide sequence analysis in order to investigate genomic effects on hybrid vigor for 38 agronomic traits. The resequenced genomes of all parental lines harbored around 1.3 million polymorphic SNP markers, which were subsequently used to study population genetic parameters and perform GWAS at an unprecedented resolution. This approach revealed heterozygous chromosome regions that contributed to trait expression in the F1 hybrids. Elucidation of the corresponding genomic effects on phenotypic traits demonstrated that the pyramiding of multiple loci facilitated the accumulation of many rare superior alleles with positive effects. In other words, dominance complementation contributes most to the heterosis effect in the hybrid rice production. A combination of forward and background selection using high-throughput genome screening tools [36,37] can thus significantly increase the breeding gain potential through the efficient exploitation of hybrid vigor.
The idea of genomic hybrid breeding, in which a genome-based prediction strategy using genomic sequence data is used to estimate the performance of the F1 progeny in hybrid breeding, was introduced in rice by Xu et al. [38]. These authors used over 250,000 SNP markers generated by resequencing 210 parental inbred lines from a training set of 278 randomly selected hybrids; this study demonstrated the power of marker-directed estimation of F1 hybrid yields in rice. The top one hundred predicted hybrids, from a total of 21,945 possible combinations between the parental accessions, were estimated to exceed the overall average yield by 16%. This means there was a significant improvement in the average selection gains, compared to conventional breeding and accelerated hybrid rice production.
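The figure of 21,945 possible combinations is consistent with counting every unordered pair of the 210 parental lines once. The short check below assumes that reciprocal crosses are not counted separately and that selfing is excluded; the variable names are illustrative only.

```python
from math import comb

n_parents = 210      # parental inbred lines reported for the rice study
n_training = 278     # hybrids in the training set

# Unordered pairs of distinct parents: n * (n - 1) / 2
n_possible_hybrids = comb(n_parents, 2)
print(n_possible_hybrids)  # 21945

# Share of all possible hybrids that was actually phenotyped for training
training_fraction = n_training / n_possible_hybrids
```

Under these assumptions the training set covers roughly 1.3% of the possible crosses, which is what makes marker-based prediction of the remaining hybrids attractive.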

Plant Material
The plant material used for the research consisted of 19 inbred maize lines derived from a range of starting materials, and 13 hybrids resulting from their crossing. Maize lines and hybrids came from Hodowla Roślin Smolice (Table 3).

Genotyping and SilicoDArT and SNP Data Processing
The genotypic data for association mapping were derived from polymorphisms identified in DArT and candidate gene sequences.

DArT Sequences
Thirty-two genotypes were genotyped. The total genomic DNA extraction from the young leaves of the analyzed forms was performed using the GenElute Plant Mini Kit (Sigma-Aldrich, Darmstadt, Germany). DNA purity and concentration were determined spectrophotometrically (Thermo Scientific, Waltham, MA, USA), and the quality was determined electrophoretically in a 1% agarose gel. The concentration of all DNA samples was adjusted to 100 ng µl⁻¹. DArTseq analysis was performed at Diversity Arrays Technology, Australia. The GBS procedure involves several stages, which include preparation of DNA samples, digestion of genomic DNA with restriction enzymes, ligation of adapters, and independent creation of individual libraries and their final assembly. Next, the products are amplified, and the results are sequenced and analyzed. The detailed methodology is as follows: DNA samples were processed in digestion/ligation reactions principally as per Kilian et al. [39], but replacing a single PstI-compatible adapter with two different adapters corresponding to two different restriction enzyme (RE) overhangs, and transferring the assay onto the sequencing platform, as described by Sansaloni et al. [40]. The PstI-compatible adapter was designed to include the Illumina flowcell attachment sequence, sequencing primer sequence, and a "staggered" barcode region of varying length, similar to the sequence reported by Elshire et al. [41]. The reverse adapter contained the flowcell attachment region and the NspI-compatible overhang sequence.
Only "mixed fragments" (PstI-NspI) were effectively amplified in 30 PCR cycles under the following reaction conditions: denaturation for 1 min at 94 °C, followed by 30 cycles of 20 s at 94 °C, 30 s at 58 °C, 45 s at 72 °C, and final elongation for 7 min at 72 °C. After PCR, equimolar amounts of the amplification products from each sample of a 96-well microtiter plate were bulked and applied to c-Bot Illumina bridge PCR, before sequencing on an Illumina HiSeq 2500. Single read sequencing was run for 77 cycles.
The sequences generated from each lane were processed using proprietary DArT analytical pipelines. In the primary pipeline, the fastq files were first processed to filter away poor quality sequences, applying more stringent selection criteria to the barcode region than to the rest of the sequence. In this way, assigning sequences to the specific samples carried in the "barcode split" step is very reliable. Approximately 2,500,000 (± 7%) sequences per barcode/sample were used in marker calling. Finally, identical sequences were collapsed into "fastqcall files". These files were used in the secondary pipeline for DArT PL's proprietary SNP and SilicoDArT calling algorithms (presence/absence of restriction fragments in representation; DArTsoft14). Only DArT sequences meeting the following criteria were selected for the association analysis: one SilicoDArT and SNP within a given sequence (69 nt), minor allele frequency (MAF) > 0.25, and < 10% missing observation fractions.
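The two numeric filters listed above (MAF > 0.25 and fewer than 10% missing observations) can be sketched as follows. This is an illustration only, not the DArTsoft14 procedure: it assumes markers are coded 0/1 per inbred line with NaN for a missing call, and the function name and default arguments are hypothetical. The criterion of one SilicoDArT and SNP per 69 nt sequence is not covered.

```python
import numpy as np


def filter_markers(calls, min_maf=0.25, max_missing=0.10):
    """Return a boolean mask of markers (columns) that pass both filters.

    calls: 2-D array, rows = genotypes, columns = markers, coded 0/1 with
           np.nan for a missing call (an assumed coding for illustration).
    """
    calls = np.asarray(calls, dtype=float)
    observed = ~np.isnan(calls)
    n_observed = observed.sum(axis=0)
    missing_fraction = 1.0 - n_observed / calls.shape[0]

    # Frequency of the allele coded 1 among observed calls. Markers with no
    # observed call keep frequency 0 and are rejected by the MAF filter.
    ones = np.nansum(calls, axis=0)
    freq = np.divide(ones, n_observed,
                     out=np.zeros_like(ones), where=n_observed > 0)
    maf = np.minimum(freq, 1.0 - freq)

    return (maf > min_maf) & (missing_fraction < max_missing)
```

Applying the returned mask to the columns of the call matrix keeps only the markers that would enter the association analysis under these two thresholds.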

Statistical Analysis and Association Mapping
The Henderson method [42] was used to construct a relationship matrix using the full pedigree information. Firstly, the normality of trait distribution was tested using the Shapiro-Wilk normality test [43]. Relationships between the traits were estimated using correlation coefficients on the basis of means of genotypes for each location and year independently. The results were also examined using multivariate methods. The canonical variate analysis was applied in order to present a multitrait assessment of the similarity of the tested genotypes in a lower number of dimensions with the least possible loss of information [44]. This allows the genotype variation to be illustrated in a graphic form in terms of all observed traits. The Mahalanobis distance was suggested as a measure of "polytrait" genotype similarity [45], whose significance was verified by means of the critical Dα value referred to as the least significant distance [46]. The Mahalanobis distances were calculated for all pairs of genotypes. The coefficients of genetic similarity (S) of the investigated lines were calculated using the Nei and Li [47] formula. The lines were grouped hierarchically using the unweighted pair group method of arithmetic means (UPGMA) based on the calculated coefficients. The relationship between lines was presented in the form of a dendrogram. Association mapping was performed using a method based on a mixed linear model with the population structure estimated by eigenanalysis (principal component analysis applied to all markers) and modeled by random effects [48,49]. All analyses were conducted in Genstat 18.2. The significance of associations between traits and SilicoDArT and SNP markers was assessed on the basis of P-values corrected for multiple testing using the Benjamini-Hochberg method [50].
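Two of the calculations named above can be sketched in a few lines. The analyses in this study were run in Genstat, so the Python below is an illustration rather than the code that was used; the function names are hypothetical and markers are assumed to be coded as presence/absence (1/0). The first function is the Nei and Li coefficient, S = 2·N_ab / (N_a + N_b), where N_ab is the number of markers present in both lines and N_a, N_b are the numbers present in each line. The second returns Benjamini-Hochberg adjusted P-values.

```python
import numpy as np


def nei_li_similarity(a, b):
    """Nei and Li similarity between two presence/absence marker profiles."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    shared = np.logical_and(a, b).sum()   # N_ab: present in both lines
    total = a.sum() + b.sum()             # N_a + N_b
    return 2.0 * shared / total if total else 0.0


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # p_(i) * m / i for the i-th smallest P-value
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest P-value
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted
```

A marker would then be declared significant at FDR < 0.05 when its adjusted P-value is below 0.05. For the clustering step, average linkage applied to the distances 1 − S corresponds to UPGMA.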

Prediction of the Heterosis Effect
Heterosis effects for hybrids for each trait were estimated and tested by comparing a particular hybrid with the trait mean of both parents. Analysis was carried out using the GenStat 18 statistical package.
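The comparison described above corresponds to what is usually called mid-parent heterosis. A minimal sketch follows, with a hypothetical function name and without the significance test that was carried out in GenStat:

```python
def mid_parent_heterosis(hybrid, parent_1, parent_2):
    """Heterosis of a hybrid relative to the mean of its two parents.

    All arguments are trait means for one trait in one environment.
    Returns (absolute effect in trait units, relative effect in percent).
    """
    mid_parent = (parent_1 + parent_2) / 2.0
    absolute = hybrid - mid_parent
    relative = 100.0 * absolute / mid_parent
    return absolute, relative
```

For example, with made-up trait means of 12.0 for a hybrid and 8.0 and 10.0 for its parents, the mid-parent value is 9.0, so the absolute effect is 3.0 trait units and the relative effect is about 33%.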

Conclusions
This study has demonstrated that molecular SNP and SilicoDArT markers may be useful in predicting hybrid formulas in maize, and could find application in selecting parental components for heterosis crosses. These markers can also be used in maize to group lines by origin, including lines with incomplete origin data. In the breeding programs, it proved possible to successfully use lines S54555 and S79757, the parental components of the Popis hybrid, for heterosis crosses; lines S64417 and S61326, the parental components of the Narew hybrid, also proved useful for heterosis crosses. Narew and Popis were the hybrids that showed the highest significant heterosis effect for most of the analyzed yield structure traits at Łagiewniki in both 2013 and 2014. It is worth noting that the parental components of these two hybrids were either not related to each other or else showed only a low degree of relationship due to origin (Narew: 4% relationship between parents; Popis: 0%). The similarity between the parental components, as determined on the basis of SNP and SilicoDArT markers, was also low (Narew: 18% similarity between parents; Popis: 26%).

Funding:
The authors received no financial support for the research, authorship, and/or publication of this article.

Conflicts of Interest:
On behalf of all authors, the corresponding author states that there is no conflict of interest.