Identification and Functional Analysis of microRNAs Involved in the Anther Development in Cotton Genic Male Sterile Line Yu98-8A

Hybrid vigor contributes in a large way to the yield and quality of cotton (Gossypium hirsutum) fiber. Although microRNAs play essential regulatory roles in flower induction and development, it is still unclear if microRNAs are involved in male sterility, as the regulatory molecular mechanisms of male sterility in cotton need to be better defined. In this study, two independent small RNA libraries were constructed and sequenced from the young buds collected from the sporogenous cell formation to the meiosis stage of the male sterile line Yu98-8A and the near-isogenic line. Sequencing revealed 1588 and 1536 known microRNAs and 347 and 351 novel miRNAs from male sterile and male fertile libraries, respectively. MicroRNA expression profiles revealed that 49 conserved and 51 novel miRNAs were differentially expressed. Bioinformatic and degradome analysis indicated the regulatory complexity of microRNAs during flower induction and development. Further RT-qPCR and physiological analysis indicated that, among the different Kyoto Encyclopedia Gene and Genomes pathways, indole-3-acetic acid and gibberellic acid signaling transduction pathways may play pivotal regulatory functions in male sterility.


Introduction
Cotton is an important economic crop for its natural source of fiber, as well as high-quality protein and oil [1]. Cotton heterosis has the potential of increasing yield from 10%-20% and could therefore significantly impact breeding programs, if it could be appropriately exploited. Although genic male sterility (GMS) has been widely used in the utilization of hybrid vigor based on its unique advantages, the complex regulatory network of GMS occurrence has yet to be extensively investigated.
MicroRNAs (miRNAs) are a class of endogenous small regulatory non-coding RNAs, which negatively regulate the expression of target genes at the post-transcriptional level through degrading target mRNAs or repressing gene translation [2]. miRNAs may also regulate the expression of target genes through DNA methylation [3] and function to regulate multiple developmental processes, including seed germination [4], root development [5], leaf development and polarity [6], and floral organ identity [7,8]. In addition, miRNAs are also involved in plants' responses to diverse environmental biotic and abiotic stresses, such as drought [9], salinity [10] and pathogen invasion [11].
Different miRNAs serve regulatory functions in floral development in various plants. In Arabidopsis, miR172 causes early flowering and disrupts floral organ identity through regulation of the AP2 gene via a translational mechanism [7,12]. The SBP-box gene SPL8, which is essential for proper development of sporogenic tissues by regulating genes mediating cell division, differentiation and specification in early anther development in Arabidopsis, is regulated by miRNA 156/7 [13]. In rice, OsmiR396d controls spikelet development by regulating the expression of the OsGRF gene, which functions with OsGIF1 in floret development through targeting of JMJ706 and OsCR4 genes [14]. miR159 indirectly affects the development of aleurone cells and flowers via regulation of a GAMYB gene, which plays an essential role in gibberellin signaling, in rice [15]. In addition, miR159 is also involved in the expression of the DUO1 gene, which is required for cell division in male gametophytes [16]. During auxin signaling, miR160 directs the post-transcription regulation of the auxin response transcription factor ARF17. Changes in the expression of ARF17 lead to many developmental defects, including abnormal stamens, sterility and premature inflorescence development [17]. Similarly, two other auxin response transcription factors ARF6 and ARF8, which regulate both stamen and gynoecium maturation, are regulated by miRNA167 in Arabidopsis [18,19].
The expression pattern of miRNA in several important male sterile crops, using high-throughput sequencing or microRNA array technologies combined with quantitative real-time PCR, suggests a complex multiple regulatory network of miRNAs. Upregulation of some miRNAs was essential for pollen abortion in a maize cytoplasmic male sterile line [20]. During the rice anther development process, the differential expression patterns of a total of 24 miRNAs were also identified in the cytoplasmic male sterile line MeixiangA [21]. In addition, similar results were also reported in Brassica juncea [22]. Comparative expression profiling of miRNA during anther development of cotton revealed that the target genes of miRNAs were expressed differentially between a GMS mutant line and the wild-type lines and were specifically involved in anther development [23]. Recently, several similar results have been reported in Brassica campestris ssp. Chinensis [24,25], radish [26] and soybean [27].
In the present study, high-throughput sequencing, combined with degradome sequencing, RT-qPCR and histological and biochemical analyses, was applied to identify and characterize the key miRNAs involved in the network underlying the abortion process of a novel male sterile mutant line Yu98-8A, in which the sterility is controlled by a single recessive gene (named yms1) and also displays a strong heterosis in various combinations. In this study, two independent small RNA libraries of the young buds collected from the pollen mother cell formation to meiosis stages were constructed based on the microscopic observation data. One thousand five hundred eighty eight and 1536 known miRNAs and 347 and 351 novel miRNAs were identified and predicted, and among them, 49 known and 51 novel miRNAs were differentially expressed. Targets analysis of differentially-expressed miRNAs indicated the regulatory complexity of miRNAs during flower induction and development. Further, RT-qPCR and physiological analysis indicated that, among the different KEGG pathways, indole-3-acetic acid (IAA) and gibberellic acetic acid (GA) signaling transduction pathways may play pivotal regulatory functions in male abortion.

Morphological and Histological Identification of Male Genic Sterile Buds
Wild male fertile (MF) flowers displayed a normal floral phenotype in the days post-anthesis (DPA) (Figure 1a), while the genic male sterile mutant (GMS) showed an abnormal longer exposed stigma ( Figure 1A). No significant phenotypic differences between MS and MF buds were observed in the sporogenous cell formation (GMS: ≤1.8 mm; MF: ≤2.0 mm) or pollen mother cell formation (GMS: 1.8-2.5 mm; MF: 2.0-2.8 mm) stages. However, during meiosis, especially in the tetrad formation stage (MS: 2.5-3.0 mm; MF: 2.8-3.3 mm), the GMS line Yu98-8A showed a shriveled tetrad, and no spinescent protuberances on the pollen wall were detected ( Figure 1B,C), but MF flowers had normal tetrad and pollen morphologies (Figure 1b,c). Furthermore, abnormal and fragmented microspores were observed in Yu98-8A buds, apparently resulting in the formation of empty anther sacs due to eventual pollen degradation ( Figure 1D,E). In contrast, wild-type anther sacs contained large numbers of pollen grains (Figure 1d,e). The shriveled tetrad appeared to be the first significant observable feature of the abnormal MS phenotype, indicating that the pollen mother cell formation phase and the following meiotic cycle were early indicators of microspore abortion in Yu98-8A.

Morphological and Histological Identification of Male Genic Sterile Buds
Wild male fertile (MF) flowers displayed a normal floral phenotype in the days post-anthesis (DPA) (Figure 1a), while the genic male sterile mutant (GMS) showed an abnormal longer exposed stigma ( Figure 1A). No significant phenotypic differences between MS and MF buds were observed in the sporogenous cell formation (GMS: ≤1.8 mm; MF: ≤2.0 mm) or pollen mother cell formation (GMS: 1.8-2.5 mm; MF: 2.0-2.8 mm) stages. However, during meiosis, especially in the tetrad formation stage (MS: 2.5-3.0 mm; MF: 2.8-3.3 mm), the GMS line Yu98-8A showed a shriveled tetrad, and no spinescent protuberances on the pollen wall were detected ( Figure 1B,C), but MF flowers had normal tetrad and pollen morphologies (Figure 1b,c). Furthermore, abnormal and fragmented microspores were observed in Yu98-8A buds, apparently resulting in the formation of empty anther sacs due to eventual pollen degradation ( Figure 1D,E). In contrast, wild-type anther sacs contained large numbers of pollen grains (Figure 1d,e). The shriveled tetrad appeared to be the first significant observable feature of the abnormal MS phenotype, indicating that the pollen mother cell formation phase and the following meiotic cycle were early indicators of microspore abortion in Yu98-8A.

Analysis of the Small RNA Library and Expression Profile of Small RNAs during Anther Development
To investigate the miRNAs involved in microspore abortion in cotton GMS mutant line Yu98-8A and the corresponding MF type, two independent small RNA libraries of the young buds collected from the pollen mother cell formation to meiosis stages were analyzed based on the microscopic observation data using Illumina Solexa high-throughput sequencing technology. The MS and MF libraries generated 11,095,604 and 10,741,527 raw reads, respectively. In general, the sequence length distributions of raw data were very similar between the two libraries. For example, the size of the majority of the small RNAs ranged from 21-24 nt in size in both MS and MF young buds, and small RNAs with 24 nt were the most abundant small RNAs in all sequence reads ( Figure S1). After removing low quality reads including adapter and insert contaminants, RNAs shorter than 18 nucleotides and polyA, 11,002,282 and 10,663,659 clean reads were obtained, from the MS and MF libraries, respectively (Table S1). Further analysis of the variation of small RNA expression profiles in both MS and MF buds showed that, in addition to the common small RNAs accounting for

Analysis of the Small RNA Library and Expression Profile of Small RNAs during Anther Development
To investigate the miRNAs involved in microspore abortion in cotton GMS mutant line Yu98-8A and the corresponding MF type, two independent small RNA libraries of the young buds collected from the pollen mother cell formation to meiosis stages were analyzed based on the microscopic observation data using Illumina Solexa high-throughput sequencing technology. The MS and MF libraries generated 11,095,604 and 10,741,527 raw reads, respectively. In general, the sequence length distributions of raw data were very similar between the two libraries. For example, the size of the majority of the small RNAs ranged from 21-24 nt in size in both MS and MF young buds, and small RNAs with 24 nt were the most abundant small RNAs in all sequence reads ( Figure S1). After removing low quality reads including adapter and insert contaminants, RNAs shorter than 18 nucleotides and polyA, 11,002,282 and 10,663,659 clean reads were obtained, from the MS and MF libraries, respectively (Table S1). Further analysis of the variation of small RNA expression profiles in both MS and MF buds showed that, in addition to the common small RNAs accounting for 57.36% between the two libraries, the MS and MF libraries are comprised of 42.75% and 45.49% of the unique small RNAs, respectively ( Figure 2b). The results indicated that up to 78.43% of the unique small RNAs in MS buds were expressed specifically in those buds. 57.36% between the two libraries, the MS and MF libraries are comprised of 42.75% and 45.49% of the unique small RNAs, respectively (Figure 2b). The results indicated that up to 78.43% of the unique small RNAs in MS buds were expressed specifically in those buds. To further annotate the small RNAs, all sequences corresponding to known non-coding RNAs, such as rRNAs, tRNAs, snRNAs and snoRNAs, possible degraded fragments of mRNA, repeat sequences and some non-annotated small RNAs were filtered. Finally, a total of 15,511 and 15,476 unique small RNAs were obtained from the MS and MF libraries, respectively, and these RNAs were used for the identification and prediction of conserved and novel miRNAs ( Table 1).

Identification and Prediction of Known miRNAs and Novel miRNAs
To identify the miRNAs in young buds from the pollen mother cell formation phase and the following meiotic cycle, a total of 15,511 and 15,476 filtered unique small RNAs were identified by deep sequencing and compared with the currently-known mature plant miRNAs in miRBase. As a result, 1588 and 1536 known miRNAs, 206 and 205 miRNA-5ps, 165 and 167 miRNA-5ps and 1987 and 1842 hairpins belonging to 19 families were identified from the MS and MF libraries, respectively (Table S2).
For further identifying potentially novel miRNAs, the cotton transcript assembly database (Available online: http://occams.dfci.harvard.edu/pub/bio/tgi/data/Gossypium) was chosen to map unique small RNA sequences, and the flanking genome sequences of unique small RNAs were used To further annotate the small RNAs, all sequences corresponding to known non-coding RNAs, such as rRNAs, tRNAs, snRNAs and snoRNAs, possible degraded fragments of mRNA, repeat sequences and some non-annotated small RNAs were filtered. Finally, a total of 15,511 and 15,476 unique small RNAs were obtained from the MS and MF libraries, respectively, and these RNAs were used for the identification and prediction of conserved and novel miRNAs (Table 1).

Identification and Prediction of Known miRNAs and Novel miRNAs
To identify the miRNAs in young buds from the pollen mother cell formation phase and the following meiotic cycle, a total of 15,511 and 15,476 filtered unique small RNAs were identified by deep sequencing and compared with the currently-known mature plant miRNAs in miRBase. As a result, 1588 and 1536 known miRNAs, 206 and 205 miRNA-5ps, 165 and 167 miRNA-5ps and 1987 and 1842 hairpins belonging to 19 families were identified from the MS and MF libraries, respectively (Table S2).
For further identifying potentially novel miRNAs, the cotton transcript assembly database (Available online: http://occams.dfci.harvard.edu/pub/bio/tgi/data/Gossypium) was chosen to map unique small RNA sequences, and the flanking genome sequences of unique small RNAs were used to predict the secondary hairpin structures and the minimum free energy by using MIREAP, a tool used to identify both known and novel microRNAs from small RNA libraries deeply sequenced by Solexa/454/Solid technology (Available online: http://sourceforge.net/projects/mireap/). Any small RNAs that met the certain criteria described in the Materials and Methods perfectly and exactly mapped to the genome sequence, but not the known plant miRNAs, were classified as novel miRNAs. As a result, a total of 347 and 351 novel miRNAs and their corresponding precursors were predicted from MS and MF young buds, respectively (Tables S3 and S4). As one example, one novel mature miRNA sequence miRNA and the secondary hairpin of the precursor are presented here ( Figure 3). to predict the secondary hairpin structures and the minimum free energy by using MIREAP, a tool used to identify both known and novel microRNAs from small RNA libraries deeply sequenced by Solexa/454/Solid technology (Available online: http://sourceforge.net/projects/mireap/). Any small RNAs that met the certain criteria described in the Materials and Methods perfectly and exactly mapped to the genome sequence, but not the known plant miRNAs, were classified as novel miRNAs. As a result, a total of 347 and 351 novel miRNAs and their corresponding precursors were predicted from MS and MF young buds, respectively (Tables S3 and S4). As one example, one novel mature miRNA sequence miRNA and the secondary hairpin of the precursor are presented here ( Figure 3).

Differentially-Expressed Analysis and Target Prediction of Known and Novel miRNAs
To identify the relative differential expression levels of conserved and novel miRNAs, the fold change and p-value were calculated to determine the significance of differential expression. The miRNAs with a fold change >2 (p-value < 0.05) and a fold change <1.2 (p-value < 0.05) were regarded as differentially-expressed miRNA. As a result, 49 known miRNAs, including 22 specific to MS, 20 specific to MF and 7 common miRNAs, were identified. For the novel miRNAs, 51, 39 and 4 novel miRNAs were expressed specifically in MS, MF and both samples, respectively ( Figure 4 and Table S5). The conserved and predicted novel miRNAs, respectively. Red dots represent miRNAs with a fold change >2; blue and green dots represent miRNAs with 1.2 ≤ fold change <2 and with a fold change <1.2 (p-value < 0.05), respectively. The dots on the X and Y axes represent the specific expression of miRNAs in MF and MS young buds, respectively.
In order to elucidate the functions of the differentially-expressed miRNAs identified in the young buds from the mother pollen formation to meiotic stages, target genes were predicted initially by

Differentially-Expressed Analysis and Target Prediction of Known and Novel miRNAs
To identify the relative differential expression levels of conserved and novel miRNAs, the fold change and p-value were calculated to determine the significance of differential expression. The miRNAs with a fold change >2 (p-value < 0.05) and a fold change <1.2 (p-value < 0.05) were regarded as differentially-expressed miRNA. As a result, 49 known miRNAs, including 22 specific to MS, 20 specific to MF and 7 common miRNAs, were identified. For the novel miRNAs, 51, 39 and 4 novel miRNAs were expressed specifically in MS, MF and both samples, respectively ( Figure 4 and Table S5). to predict the secondary hairpin structures and the minimum free energy by using MIREAP, a tool used to identify both known and novel microRNAs from small RNA libraries deeply sequenced by Solexa/454/Solid technology (Available online: http://sourceforge.net/projects/mireap/). Any small RNAs that met the certain criteria described in the Materials and Methods perfectly and exactly mapped to the genome sequence, but not the known plant miRNAs, were classified as novel miRNAs. As a result, a total of 347 and 351 novel miRNAs and their corresponding precursors were predicted from MS and MF young buds, respectively (Tables S3 and S4). As one example, one novel mature miRNA sequence miRNA and the secondary hairpin of the precursor are presented here (Figure 3).

Differentially-Expressed Analysis and Target Prediction of Known and Novel miRNAs
To identify the relative differential expression levels of conserved and novel miRNAs, the fold change and p-value were calculated to determine the significance of differential expression. The miRNAs with a fold change >2 (p-value < 0.05) and a fold change <1.2 (p-value < 0.05) were regarded as differentially-expressed miRNA. As a result, 49 known miRNAs, including 22 specific to MS, 20 specific to MF and 7 common miRNAs, were identified. For the novel miRNAs, 51, 39 and 4 novel miRNAs were expressed specifically in MS, MF and both samples, respectively ( Figure 4 and Table S5). The conserved and predicted novel miRNAs, respectively. Red dots represent miRNAs with a fold change >2; blue and green dots represent miRNAs with 1.2 ≤ fold change <2 and with a fold change <1.2 (p-value < 0.05), respectively. The dots on the X and Y axes represent the specific expression of miRNAs in MF and MS young buds, respectively.
In order to elucidate the functions of the differentially-expressed miRNAs identified in the young buds from the mother pollen formation to meiotic stages, target genes were predicted initially by The conserved and predicted novel miRNAs, respectively. Red dots represent miRNAs with a fold change >2; blue and green dots represent miRNAs with 1.2 ≤ fold change <2 and with a fold change <1.2 (p-value < 0.05), respectively. The dots on the X and Y axes represent the specific expression of miRNAs in MF and MS young buds, respectively.
In order to elucidate the functions of the differentially-expressed miRNAs identified in the young buds from the mother pollen formation to meiotic stages, target genes were predicted initially by bioinformatics. In addition, 100 and 281 different targets were predicted for 13 of 49 differentially-expressed known miRNAs and for 30 of 51 novel miRNAs. The prediction also showed that some miRNAs had multiple targets (e.g., miR160a targets genes Cotton_D_10032280/Cotton_ D_10025306/Cotton_D_10019554/Cotton_D_10010678/Cotton_D_10008456/Cotton_D_10005922/ Cotton_D_10006397/Cotton_D_10002224/). Furthermore, a single gene may potentially be targeted by several miRNAs (e.g., the gene Cotton_D_10036714 was targeted by novel_mir_137/23/270/323), which indicated the complexity of miRNA regulation (Table S6). Further KEGG analysis of all of the annotated target genes showed that these genes were mainly involved in ascorbate and aldarate metabolism, plant hormone signal transduction, metabolic pathways and starch and sucrose metabolism.

Experimental Identification of miRNA Target Genes by Degradome Analysis
In order to provide further experimental evidence of the targets of differentially-expressed miRNAs and acquire more concrete targets, a degradome analysis was performed on the basis of bioinformatics prediction. A total of 222 annotated candidate target genes with 254 locations in the cotton genome of 70 known and novel miRNAs were detected, and most of these target genes were transcription factors, auxin response factors, TIR-NBS-LRR-like disease-resistant proteins, etc. (Table S7). As expected, the identified targets by degradome analysis were similar to those predicted by bioinformatics. For example, the identified targets of miR160a were the same as those by bioinformatics. However, some miRNAs had different targets from that predicted by bioinformatics (Table 2 and  Table S7). The cleavage sites of the identified miRNA targets were presented in the form of target plots (t-plots), and in each of these t-plots, a clear peak for the absolute number of tags is found at the predicted cleavage site ( Figure 5). bioinformatics. In addition, 100 and 281 different targets were predicted for 13 of 49 differentiallyexpressed known miRNAs and for 30 of 51 novel miRNAs. The prediction also showed that some miRNAs had multiple targets (e.g., miR160a targets genes Cotton_D_10032280/Cotton_D_10025306/ Cotton_D_10019554/Cotton_D_10010678/Cotton_D_10008456/Cotton_D_10005922/Cotton_D_10006 397/Cotton_D_10002224/). Furthermore, a single gene may potentially be targeted by several miRNAs (e.g., the gene Cotton_D_10036714 was targeted by novel_mir_137/23/270/323), which indicated the complexity of miRNA regulation (Table S6). Further KEGG analysis of all of the annotated target genes showed that these genes were mainly involved in ascorbate and aldarate metabolism, plant hormone signal transduction, metabolic pathways and starch and sucrose metabolism.

Experimental Identification of miRNA Target Genes by Degradome Analysis
In order to provide further experimental evidence of the targets of differentially-expressed miRNAs and acquire more concrete targets, a degradome analysis was performed on the basis of bioinformatics prediction. A total of 222 annotated candidate target genes with 254 locations in the cotton genome of 70 known and novel miRNAs were detected, and most of these target genes were transcription factors, auxin response factors, TIR-NBS-LRR-like disease-resistant proteins, etc. (Table S7). As expected, the identified targets by degradome analysis were similar to those predicted by bioinformatics. For example, the identified targets of miR160a were the same as those by bioinformatics. However, some miRNAs had different targets from that predicted by bioinformatics (Table 2 and Table S7). The cleavage sites of the identified miRNA targets were presented in the form of target plots (t-plots), and in each of these t-plots, a clear peak for the absolute number of tags is found at the predicted cleavage site ( Figure 5).   Further analysis of differentially-expressed miRNAs (Table S5) showed a total of 42 targets of five known miRNAs (17%, 5/30) and five targets of three novel miRNAs (23%, 3/13). Among the targets, auxin response factors (miR160a), DNA-directed RNA polymerases and NBS-LRR resistance protein-like proteins (miRNA2118a_3p), F-box family protein (miRNA394a), TCP4 (novel miRNA104) and chromatin licensing and DNA replication factor (novel miRNA20) were detected (Table 2).

Expression Pattern Analysis of miRNAs by RT-qPCR
Considering the fact that the small RNA libraries were constructed from a mix of different young buds, the differences of miRNAs' expression between the MS and MF buds of certain developmental stages were likely eliminated. RT-qPCR analysis indicated that miRNA159 and miRNA172 displayed a low expression from the sporogenous cell formation to meiotic stage. Although the expression of both miRNAs in MS buds was far below that in MF buds, the expression of the two miRNAs in MS buds was 1.33-and 1.21-times higher than that in MF buds during the sporogenous cell formation stage The expression profiles of miRNA159 and miRNA172, respectively. Relative expression was calculated by normalizing to the level of 5S rRNA in RT-qPCR. SP, PO and ME represent the sporogenous cell, pollen mother cell formation and meiotic stages, respectively. Bars represent means ± standard deviations, which were derived from three biological replicates (n = 3), and different letters above the bar indicate significant differences at p < 0.05.
In addition, although miRNA5658 has not been specifically identified (all of the p-values ≥ 0.05) by degradome analysis, it is noteworthy that it showed a very significantly higher expression in MF (Table 5). Furthermore, RT-qPCR analysis showed that during the sporogenous cell formation and meiosis stages, the expression of miRNA5658 in MF buds was about 11.7-and 16.2-fold higher than in MS buds, respectively (Figure 7). Combined with the target prediction by informatics (Table S6), it was supposed that miRNA5658 is involved in phytohormonal metabolism and signaling pathways during cotton flower development. Figure 7. Expression pattern analysis of miRNA5658 of MS and MF young buds by RT-qPCR. Relative expression was calculated by normalizing to the level of the 5S rRNA coding gene in RT-qPCR. SP, PO and ME represent the sporogenous cell, pollen mother cell formation and meiosis stages, respectively. Bars represent means + standard deviations, which were derived from three biological replicates (n = 3), and different letters above the bar indicate a significant differences at p < 0.05.

Dynamics of Indole-3-Acetic Acid Levels in Young Buds during Different Developmental Stages
Given that plant hormones are important regulators of metabolism, the concentration of indole-3-acetic acid (IAA), which plays a critical role in regulating many aspects of plant growth and development, was measured in young buds at different developmental stages (Figure 8). The IAA concentration of MS buds was consistently and significantly lower than that of MF buds from The expression profiles of miRNA159 and miRNA172, respectively. Relative expression was calculated by normalizing to the level of 5S rRNA in RT-qPCR. SP, PO and ME represent the sporogenous cell, pollen mother cell formation and meiotic stages, respectively. Bars represent means ± standard deviations, which were derived from three biological replicates (n = 3), and different letters above the bar indicate significant differences at p < 0.05.
In addition, although miRNA5658 has not been specifically identified (all of the p-values ≥ 0.05) by degradome analysis, it is noteworthy that it showed a very significantly higher expression in MF (Table S5). Furthermore, RT-qPCR analysis showed that during the sporogenous cell formation and meiosis stages, the expression of miRNA5658 in MF buds was about 11.7-and 16.2-fold higher than in MS buds, respectively (Figure 7). Combined with the target prediction by informatics (Table S6), it was supposed that miRNA5658 is involved in phytohormonal metabolism and signaling pathways during cotton flower development. The expression profiles of miRNA159 and miRNA172, respectively. Relative expression was calculated by normalizing to the level of 5S rRNA in RT-qPCR. SP, PO and ME represent the sporogenous cell, pollen mother cell formation and meiotic stages, respectively. Bars represent means ± standard deviations, which were derived from three biological replicates (n = 3), and different letters above the bar indicate significant differences at p < 0.05.
In addition, although miRNA5658 has not been specifically identified (all of the p-values ≥ 0.05) by degradome analysis, it is noteworthy that it showed a very significantly higher expression in MF (Table 5). Furthermore, RT-qPCR analysis showed that during the sporogenous cell formation and meiosis stages, the expression of miRNA5658 in MF buds was about 11.7-and 16.2-fold higher than in MS buds, respectively (Figure 7). Combined with the target prediction by informatics (Table S6), it was supposed that miRNA5658 is involved in phytohormonal metabolism and signaling pathways during cotton flower development. Figure 7. Expression pattern analysis of miRNA5658 of MS and MF young buds by RT-qPCR. Relative expression was calculated by normalizing to the level of the 5S rRNA coding gene in RT-qPCR. SP, PO and ME represent the sporogenous cell, pollen mother cell formation and meiosis stages, respectively. Bars represent means + standard deviations, which were derived from three biological replicates (n = 3), and different letters above the bar indicate a significant differences at p < 0.05.

Dynamics of Indole-3-Acetic Acid Levels in Young Buds during Different Developmental Stages
Given that plant hormones are important regulators of metabolism, the concentration of indole-3-acetic acid (IAA), which plays a critical role in regulating many aspects of plant growth and development, was measured in young buds at different developmental stages (Figure 8). The IAA concentration of MS buds was consistently and significantly lower than that of MF buds from Figure 7. Expression pattern analysis of miRNA5658 of MS and MF young buds by RT-qPCR. Relative expression was calculated by normalizing to the level of the 5S rRNA coding gene in RT-qPCR. SP, PO and ME represent the sporogenous cell, pollen mother cell formation and meiosis stages, respectively. Bars represent means + standard deviations, which were derived from three biological replicates (n = 3), and different letters above the bar indicate a significant differences at p < 0.05.

Dynamics of Indole-3-Acetic Acid Levels in Young Buds during Different Developmental Stages
Given that plant hormones are important regulators of metabolism, the concentration of indole-3-acetic acid (IAA), which plays a critical role in regulating many aspects of plant growth and development, was measured in young buds at different developmental stages (Figure 8). The IAA concentration of MS buds was consistently and significantly lower than that of MF buds from sporogenous cell formation to microspore stages, especially during the pollen mother cell formation stage. In addition, considering that several auxin response factor genes were targeted by the differentially-expressed miR160a (Table 2), it is supposed that miRNA160a might play a vital role in the regulation of the IAA signal pathway. sporogenous cell formation to microspore stages, especially during the pollen mother cell formation stage. In addition, considering that several auxin response factor genes were targeted by the differentially-expressed miR160a (Table 2), it is supposed that miRNA160a might play a vital role in the regulation of the IAA signal pathway. Bars represent means ± standard deviations, which were derived from three biological replicates (n = 3), and different letters above the bar indicate a significant differences at p < 0.05.

Discussion
In order to identify and characterize the key miRNAs involved in the regulatory network underlying the abortion process of a novel male sterile mutant line Yu98-8A, high-throughput sequencing was combined with degradome sequencing, quantitative RT-PCR and histological and biochemical analyses. Based on the these analyses, 1588 and 1536 conserved miRNAs, 206 and 205 miRNA-5ps, 165 and 167 miRNA-5ps, 1987 and 1842 hairpins and 347 and 351 novel miRNAs were identified from the MS and MF libraries, respectively. Bioinformatics prediction and analysis showed 100 and 281 different targets of the differentially-expressed miRNAs, including 13 (26.5%, 49) known and 30 (58.9%, 51) novel miRNAs mainly involved in ascorbate and aldarate metabolism, plant hormone signal transduction, metabolic pathways and starch and sucrose metabolism, which indicated the vital and complex roles of miRNAs during the process of male sterility of Yu98-8A.
Target prediction demonstrated that eight of 53 targets of miRNA5658 were predicted, and all eight targets were involved in phytohormonal metabolism and signaling pathways. For example, the auxin-responsive SAUR protein (Cotton_D_gene_10015913), as the largest family of early auxin response factors, plays a pivotal role in cell enlargement [28,29]. BAK1 (Cotton_D_gene_10003171), BKI1 (Cotton_D_gene_10027756 and Cotton_D_gene_10027686) and BRI1 (Cotton_D_gene_10012266) were also degraded by miRNA5658, and BAK1 not only regulates brassinosteroid perception, but also activates BRI1, both of which together mediate cell division, anther development and the brassinosteroid signaling pathway [30][31][32]. The other targets, including Cotton_D_gene_10033147, Cotton_D_gene_10015920 and Cotton_D_gene_10018858, are involved in the salicylic acid, jasmonic acid and abscisic acid signaling pathways. Gibberellic acid signaling promotes the growth and development of plants by degrading DELLA proteins, some of which are major GA repressors during vegetative growth and floral induction [33]. As anticipated, miR473 was not expressed in MF buds in the present study, which targets a DELLA protein-coding gene (Cotton_D_gene_10021396) (Table S6).

Discussion
In order to identify and characterize the key miRNAs involved in the regulatory network underlying the abortion process of a novel male sterile mutant line Yu98-8A, high-throughput sequencing was combined with degradome sequencing, quantitative RT-PCR and histological and biochemical analyses. Based on the these analyses, 1588 and 1536 conserved miRNAs, 206 and 205 miRNA-5ps, 165 and 167 miRNA-5ps, 1987 and 1842 hairpins and 347 and 351 novel miRNAs were identified from the MS and MF libraries, respectively. Bioinformatics prediction and analysis showed 100 and 281 different targets of the differentially-expressed miRNAs, including 13 (26.5%, 49) known and 30 (58.9%, 51) novel miRNAs mainly involved in ascorbate and aldarate metabolism, plant hormone signal transduction, metabolic pathways and starch and sucrose metabolism, which indicated the vital and complex roles of miRNAs during the process of male sterility of Yu98-8A.
Target prediction demonstrated that eight of 53 targets of miRNA5658 were predicted, and all eight targets were involved in phytohormonal metabolism and signaling pathways. For example, the auxin-responsive SAUR protein (Cotton_D_gene_10015913), as the largest family of early auxin response factors, plays a pivotal role in cell enlargement [28,29]. BAK1 (Cotton_D_gene_10003171), BKI1 (Cotton_D_gene_10027756 and Cotton_D_gene_10027686) and BRI1 (Cotton_D_gene_10012266) were also degraded by miRNA5658, and BAK1 not only regulates brassinosteroid perception, but also activates BRI1, both of which together mediate cell division, anther development and the brassinosteroid signaling pathway [30][31][32]. The other targets, including Cotton_D_gene_10033147, Cotton_D_gene_10015920 and Cotton_D_gene_10018858, are involved in the salicylic acid, jasmonic acid and abscisic acid signaling pathways. Gibberellic acid signaling promotes the growth and development of plants by degrading DELLA proteins, some of which are major GA repressors during vegetative growth and floral induction [33]. As anticipated, miR473 was not expressed in MF buds in the present study, which targets a DELLA protein-coding gene (Cotton_D_gene_10021396) (Table S6).
A total of eight auxin response factor (ARF) targets of miRNA160a were identified by degradome analysis, and IAA levels in MS buds were consistently and significantly lower than in MF buds from the sporogenous cell formation to microspore development stages (Table 2 and Figure 6), which indicated ARFs targeted by miR160 mainly play important roles in floral development [17,37]. In addition, two TCP family transcription factors (Cotton_D_gene_10039906 and Cotton_D_gene_10003308) of miRNA319a (novel miRNA104) were also identified by both bioinformatics and degradome analysis. It has been well illustrated that all of these transcription factors are involved in the regulation of bud growth, cell proliferation, flower induction and stamen development [34,35,38]. Both miRNA159 and miRNA172 regulate the floral organ development by regulating the expression of their transcription factor targets [7,12,15]. For example, upregulation of miRNA159 expression could result in delayed flowering and disordered anther development in Arabidopsis [39].
F-box proteins usually degrade DELLA proteins through a SCF complex during the GA signaling pathway [40]. In the current study, a particular miRNA394a, which targets a specific f-box family protein coding gene (Cotton_D_gene_10011419), was expressed at higher levels in MF buds than that in MS buds. According to Table S5, there was no expression of miRNA473, which targets DELLA protein (Cotton_D_gene_10021396) in MS buds. In addition, GA levels in MS young buds were significantly lower than those of young MF buds [41]. Therefore, it can be speculated that the miRNA cleaves some of the f-box protein-coding genes by miR394a and that no efficient cleavage of DELLA mRNA is present due to significantly weaker expressions of miR394a during the development of buds from the sporogenous cell formation to the meiotic stages. This might be one of the reasons for male abortion in Yu98-8A through blocking GA signal transduction pathways. miRNA2118a was specifically expressed in MF buds, and 11 of the total targets were pre-mRNA-processing factors and DNA-directed RNA polymerases I, II and III (Table 2 and Table S5). Some of the targets or their paralogs may be involved in miRNA biogenesis. For example, STA1 is an Arabidopsis pre-mRNA processing factor that is involved in miRNA biogenesis [42]. In this study, several RNA polymerase III-coding genes were targeted by miRNA2118a (Table 2). It is well known that most plant and animal miRNAs are transcribed by DNA-directed RNA polymerase II and III (Pol II and Pol III) [22,43]. Therefore, these results clearly indicate that miRNAs play several direct or indirect roles contributing to male sterility in Yu98-8A.

Plant Materials
Plants of genic MS line Yu98-8A and the wild-type of Yu98-8A were grown in the field of the Henan Academy of Agricultural Science Xinxiang in Henan province, China. In order to study the relationship between the diameter of the flower bud and pollen stages, buds (<9 mm) were harvested from 10 sterile and 10 fertile plants for cytological and microscopic analysis [41]. Based on microscopic evaluation, the buds from the pollen mother cell formation to meiosis stages were collected from 56 (MS) and 58 (MF) plants from three biological replicates of each line, respectively, snap-frozen in liquid nitrogen and stored at −80 • C for future use.

Histological Analysis
After the sterile and fertile floral buds were fixed, specimen were dehydrated in an ethanol series and embedded in paraffin. Samples were sectioned into~8-µm sections using a microtome (Leica RM2245, Microsystems, Wetzlar, Germany), mounted on glass sliders and then stained with 0.1% toluidine blue O for 30-60 seconds at room temperature for viewing (Leica DM2500 M, Microsystems) [41].

Extraction, Library Construction and Sequencing of Small RNAs
Total RNA was extracted from equally-mixed young buds of each line using the pBIOZOL Total RNA Extraction Reagent (BioFlux, Fluxion Biosciences Inc., South San Francisco, CA, USA), precipitated with ethanol, dissolved in diethylpyrocarbonate (DEPC) treated water and stored at −80 • C. All RNA samples were examined for protein contamination (A260/A280 ratios) and reagent contamination (A260/A230 ratios) using a NanoDrop ND 1000 spectrophotometer (NanoDrop, Wilmington, DE, USA). The extracted total RNA was then electrophoresized through denatured 15% polyacrylamide gels. Gel fragments with a size range of 18-30 nt were excised, and the small RNA fragments were eluted overnight with 0.5 M NaCl at 4 • C. Fragments were precipitated with 1 mL ethyl ethanol, resuspended in buffer; 5 and 3 RNA adapters were added prior to ligation with T4 RNA ligase by the Illumina TruSeq Small RNA Preparation Kit (LC Sciences, Hangzhou, China). The adapter-ligated small RNAs were subsequently transcribed into cDNA by SuperScript ® II Reverse Transcriptase (Invitrogen, Carlsbad, CA, USA) and PCR amplified, using primers that were annealed to the ends of the adapters. The amplified cDNA products were purified, recovered and sequenced by Solexa sequencing (BGI, Shenzhen, China).

Computational Analysis of Sequencing Data and Identification of Known Conserved and Novel microRNAs
After the raw sequence reads were extracted, reads of low quality, adapter and insert contaminants, RNAs shorter than 18 nucleotides and polyA were all filtered. The remaining sequences were blasted to the Rfam database (Version:10.1, https://en.wikipedia.org/wiki/Rfam), NCBI GenBank database and repeat-Repbase for ribosomal RNA (rRNA), transfer RNA (tRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA) and other non-coding RNAs, repeat sequences and mRNAs, and all matched sequences were again filtered. The sequences that mapped to TIGR Cotton Transcript Assemblies (http://plantta.jcvi.org/) were then annotated using the Short Oligonucleotide Analysis Package (http://soap.genomics.org.cn), and the unmatched sequences were filtered [23]. Finally, the remaining sequences were mapped to all known plant miRNA sequences to identify the conserved miRNAs by miRBase database (version 19.0, http://www.mirbase.org/), and the perfectly-matched sequences were considered conserved miRNAs. To identify potentially novel miRNAs, cotton transcript assemblies (http://occams.dfci.harvard.edu/pub/bio/tgi/data/Gossypium) from the Dana Farber Cancer Institute were chosen to map unique small RNA sequences and the flanking genome sequence of unique small RNAs using MicroRNA Discovery By Deep Sequencing (MIREAP) (http://sourceforge.net/projects/mireap/). Any potential miRNA must meet some criteria. The first criterion is both the candidate miRNA sequences and their corresponding reverse candidate miRNA* sequences must be detected using high-throughput sequencing. Second, the precursor of any candidate miRNAs should fold secondary hairpin structures; the miRNA and miRNA* sequences must be found on the stem of the precursor; the number of mismatched bases between them must be less than three; and the mature miRNA strand and its complementary miRNA* complex should present two-nucleotide 3' overhangs. Third, within the miRNA/miRNA* duplex, the number of asymmetric bulges must be one or fewer, and the number of bases in the asymmetric bulges must be less than two [44]. Fourth, the potential miRNA precursor must have higher negative minimal folding energy (MFE) and minimal folding energy indexes (MFEI), with the MFEI > 0.8 [45]. The last criterion is the number of mature miRNAs with predicted hairpin, which should not be less than five in the alignment result.
For miRNA differential expression analysis, p-values were calculated according to the method by Audic and Claverie, 1997 [46]. Normalized read counts were calculated according to the formula, Normalized read count = (actual miRNA count/total count of clean reads) × 1,000,000. If one miRNA had no read in a library, the normalized read count of this miRNA in the library was arbitrarily set at 0.001 for further calculations. Finally, log 2 (normalized count of miRNA in MS bud/normalized count) values was used to generate the scatter plots.

Target Prediction and Identification of miRNA by Informatics and Degradome Sequencing
For analyzing the functions of the conserved and novel miRNAs identified in the young buds from the mother pollen formation to meiotic stages, target genes were predicted and identified by bioinformatics and degradome sequencing. Bioinformatic prediction was conducted according to the method by Li et al., 2008 [23]. The main criteria were as follows: first, there were no more than four mismatches between sRNA and target, and no more than two adjacent mismatches in the miRNA/target duplex, no adjacent mismatches in 2-12 bases of the miRNA (5 end)/target duplex, no mismatches in 10-11 bases of the duplex and no more than 2.5 mismatches, G-U pair counted as 0.5 mismatches, in 1-12 bases of the miRNA (5 end)/target duplex. Furthermore, the minimum free energy (MFE) of the duplex should be ≥75% of the MFE of the miRNA bound to its perfect complement. Two degradome cDNA libraries were constructed from the young buds of male sterile (MS) and male fertile (MF) plants based on the method by Addo-Quaye et al., 2008; German et al., 2008 [47,48]. Poly (A)-enriched RNAs were ligated to 5 -RNA adapters using T4 RNA ligase (Takara, Dalian, China), and consequently, reverse transcription was performed to generate first-strand cDNAs using an oligo dT primer and SuperScript II RT (Invitrogen, Carlsbad, CA, USA), followed by PCR amplification for six cycles using Phusion Taq and enzyme Mme I digestion (NEB, Beijing, China). The digested products were later ligated with double-stranded DNA adapters using T4 DNA ligase, followed by purification by electrophoresis through a 10% PAGE-gel and a PCR amplification for 20 cycles. Finally, the PCR products were purified and used for high-throughput sequencing using Illumina HiSeq 2000 (San Diego, CA, USA). After low quality sequences and adapters were removed, the remaining unique sequence signatures were aligned to the database of cotton transcript assemblies in the Cotton Gene Index (Release 11.0) using SOAP software. Identification and classification of categories of the miRNA targets were then conducted using a public software package, CleaveLand 3.0 [49]. Finally, gene ontology (GO) (http://www.geneontology.org/) was used for the assignment of the identified target genes according to the previous described method [50].
To assign putative functions to assembled unigenes, a set of sequential BLAST was searched against sequences in NCBI [51,52], and the Kyoto Encyclopedia Gene and Genomes (KEGG) database (http://www.genome.jp/kegg/ko.html) was used to identify significantly-enriched metabolic pathways of targets of differentially-expressed miRNAs [53].

Expression Analysis of miRNAs by RT-qPCR
RT-qPCR was carried out to estimate the expression differences of miRNAs and their corresponding target genes between young MS and MF buds. All primers were designed according to corresponding gene sequences with Primer Premier 5.0 (Premier, Toronto, ON, Canada) ( Table S8). The RT-qPCR assay using 2 µL (5 ng/µL) cDNA and SYBR Green PCR Master Mix (Takara) was performed in triplicate on an ABI Prism 7000 Real-time PCR system (Foster City, CA, USA). Twenty-microliter RT-qPCR reactions were incubated in a 96-well plate at 95 • C for 10 min, followed by 40 cycles of 95 • C for 15 s and 60 • C for 60 s. The cotton endogenous 5S rRNA gene was used to normalize the amount of gene-specific RT-PCR products [54], and the relative expression of genes was quantified using the 2 −∆∆Ct method [55].

Measurement of the Indole-3 Acetic Acid Content of Young Buds
The extraction and purification of endogenous phytohormones prior to the immunoassay were carried out by the enzyme-linked immunosorbent assay (ELISA) according to the methods by Yang et al., 2001, andCui et al., 2005 [56,57], with minor modifications. Briefly, 0.5 g of lyophilized ground buds were homogenized through repeated inversion in pre-chilled 80% aqueous methanol containing 1 mM butylated hydroxytoluene. The supernatant was collected, centrifuged at 5000× g for 20 min at 4 • C, and the precipitate was re-extracted and re-centrifuged. The crude extracts were passed through a Sep-Pak C18 cartridge (Waters, Milford, MA, USA), prewashed with 10 mL 100% (w/v) and 5 mL 80% (v/v) methanol, respectively. Then, the filtrates eluted with 10 mL 100% (v/v) methanol and 10 mL ether from the columns were dried in N 2 gas, and the resultant residues were dissolved in phosphate-buffered saline (PBS, 0.01M, pH 7.4). The levels of indole-3-acetic acid (IAA) were then determined using monoclonal antibodies (Affandi, Shanghai, China) and determined as pg per gram fresh weight of sample. Absorbance of the developed color was measured at 490 nm using a microplate reader (M-SPmax250, Wako Pure Chem, Tokyo, Japan) [58].

Statistical Analysis
The analysis of variance (ANOVA) was performed on phytohormonal content, and enzymatic activity and all significant differences were examined according to Tukey's test using DPS 6.05 software at p < 0.05 [59].

Conclusions
Based on the use of microscopic observation to identify floral developmental stages, 1588 and 1536 conserved miRNAs and 347 and 351 novel miRNAs were further identified and predicted during the pollen abortion stages from MS (male sterile) and MF (male fertile) young buds of cotton plants. Two independent small RNA libraries were constructed and sequenced from the young buds collected from the sporogenous cell formation to meiosis stage of the male sterile line Yu98-8A and the near-isogenic wild-type. miRNA expression profiles revealed that 49 conserved and 51 novel miRNAs were differentially expressed. Target analysis based on bioinformatics and degradomes simultaneously indicated the regulatory complexity of miRNAs during flower induction and development. Further, RT-qPCR and physiological analysis indicated that, among the different KEGG pathways, the indole-3-acetic acid (IAA) and gibberellic acetic acid (GA) signaling transduction pathways may play pivotal regulatory functions in male abortion.