Article

Genome-Wide Identification and Characterization of Trihelix Gene Family in Asian and African Vigna Species

1
Division of Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute (IASRI), Pusa Campus, New Delhi 110012, India
2
ICAR-National Bureau Plant Genetic Resources, Pusa Campus, New Delhi 110012, India
3
CSIR-Institute of Genomics and Integrative Biology, New Delhi 110007, India
*
Authors to whom correspondence should be addressed.
Agriculture 2022, 12(12), 2172; https://doi.org/10.3390/agriculture12122172
Submission received: 16 November 2022 / Revised: 6 December 2022 / Accepted: 7 December 2022 / Published: 18 December 2022

Abstract

Trihelix transcription factors play a crucial role in varied stress responses as well as in the growth and development of plants. The role of trihelix transcription factors in the non-shattering phenotype in domesticated rice is known. The Vigna group of crops has different degrees of shattering phenotypes in different species. To understand the evolutionary conservation or divergence of the trihelix gene family in important Vigna species here, the genome-wide identification and characterization of the trihelix gene family in four Vigna species including the cowpea (Vigna unguiculata), mung bean (V. radiata), adzuki bean (V. angularis) and rice bean (V. umbellata) was performed. A total of 39, 35, 41 and 50 trihelix genes were identified in the cowpea, mung bean, adzuki bean and rice bean, respectively. The trihelix genes in each of the four Vigna species were classified into five subgroups: GT, GTγ, SH4, S1P1 and GTδ. The members of each subgroup shared similar patterns of gene structure and motif across the four species. The cross-species positional relationships of the cowpea, adzuki bean and mung bean vis-a-vis rice trihelix genes were studied. Further, the Ka/Ks ratio for the trihelix genes in the four Vigna species indicated the purifying or stabilizing selection of the family. The gene expression analysis of the trihelix gene family in the cowpea showed that most of the genes express in at least some of the seed and/or pod developmental stages, although at varying degrees. Based on detailed bioinformatic analysis, a potential target for gene editing towards a possible non-shattering phenotype in the four important Vigna crops was discussed.

1. Introduction

The domestication of plants is one of the most crucial developments in the history of agriculture. Domesticated plants differ from their wild predecessors in several characteristics, and the non-shattering of seeds, or pod indehiscence, is one of the most crucial traits [1,2]. Seed shattering or pod dehiscence is the natural detachment of the seed at maturity, which leads to inefficient harvesting and substantial yield loss [3]. In several species, the indehiscent phenotype was acquired independently through convergent evolution leading to the same functional changes [4,5]. In rice (Oryza sativa), sh4 has been identified as a major QTL on chromosome 4 responsible for the reduction of shattering in cultivated rice [6]. Further, fine mapping and validation have revealed that a single mutation at the Sh4 locus, a change of the nucleotide guanine (G) to thymine (T) that substitutes the amino acid lysine (K) with asparagine (N) in the encoded transcription factor, is responsible for the non-shattering phenotype in cultivated rice [7]. In addition, a single dominant gene, ‘Shattering 1’ (SHA1), that controls seed shattering was identified from perennial wild rice (O. rufipogon). SHA1 also harbored the same amino acid ‘K79’ as that of the wild rice O. nivara at the sh4 locus. SHA1 was found to be allelic to sh4 and encodes a trihelix family transcription factor (TF) [8].
The trihelix family is among the earliest discovered transcription factor families in plants. Initially, the trihelix transcription factor was classified as a GT factor because of its binding specificity for the light-responsive GT element. The DNA-binding domain of the GT factor constitutes a typical trihelix (helix-loop-helix-loop-helix) structure that recognizes a degenerate core sequence of 5′-G-Pu-(T/A)-A-A-(T/A)-3′ [9,10]. It was observed that the structures of the GT factor trihelix domains and Myb DNA-binding domains were moderately similar [11]; however, they have different recognition sequences. Accordingly, a few databases have even classified the trihelix proteins as Myb/SANT-LIKE domain (PF13837)-containing transcription factors [10,11]. The first subfamily of trihelix TFs to be discovered was GT-1, isolated from the pea (Pisum sativum), where it was found to regulate the light-dependent expression of the rbcS-3A gene by binding to the promoter region. The expression of GT-1 is light-inducible, and its binding site has the core sequence 5′-GGTTAA-3′, after which the transcription factor was named [12,13]. The gene sequences that encode a DNA-binding protein with specificity for the GT-1 factor were isolated from tobacco (Nicotiana tabacum), and its predicted structure indicated the presence of three α-helices separated by a turn (HHTH) [14]. Further, a gene with another DNA-binding domain showed a similar conserved core sequence in the upstream region and binding preferences similar to those of the GT-1 factor, so it was named GT-2 [15,16]. The predicted structure of the GT-2 polypeptide segment was a helix-loop-helix-loop-helix motif, so it was designated as trihelix [17].
Recently, several trihelix genes were studied in various plants such as tomato, Arabidopsis, chrysanthemum and rice and were shown to have a role in different biological processes [18,19,20,21]. These functions include the regulation of light-dependent gene expression, responses to biotic and abiotic stress, including salt and drought, and a variety of developmental processes such as morphogenesis, seed shattering/non-shattering, stomatal development, late embryogenesis and seed development [8,10,22,23,24,25].
The genus ‘Vigna’ comprises important cultivated species such as the cowpea (V. unguiculata (L.) Walp), mung bean (V. radiata (L.) Wilczek), urd bean (V. mungo (L.) Hepper), adzuki bean (V. angularis (Willd.) Ohwi & Ohashi) and rice bean (V. umbellata (Thunb.) Ohwi & Ohashi) with substantial economic and environmental importance. These species play a crucial role in nutritional security in tropical Asian and African countries [26]. Although domesticated, these cultivated species retain variable degrees of wild traits such as seed shattering and pre-harvest sprouting.
Since the role of the trihelix transcription factor in conferring the non-shattering trait in cultivated rice is established [8], it will be intriguing to know to what extent the trihelix gene family is conserved or has diverged in Vigna species compared to rice, and to what extent a parallel can be drawn from rice with respect to the shattering phenomenon. So far, there has been no detailed study of the trihelix gene family in Vigna species. The availability of genomic resources in the form of genome and transcriptome sequences in the Vigna group of crops [27,28,29,30] has opened opportunities to undertake genome-wide studies in these crops and to benefit from comparative genome analysis. In the present work, the trihelix transcription factor gene family was identified and characterized in four Vigna species (cowpea, mung bean, adzuki bean and rice bean) using bioinformatics and comparative genomic tools. Studying the gene family in these four crops simultaneously could help better understand cross-species genomic co-localization, conserved motif distribution, gene structure, sequence conservation and variation, functional attributes and evolutionary perspectives.

2. Materials and Methods

2.1. Genome-Wide Identification of Trihelix Genes in Vigna Species

The whole genome sequences, genes and protein sequences of cowpea, mung bean and adzuki bean were downloaded from either Ensembl Plants or Phytozome [31]. Rice bean transcriptome SRA data were downloaded from NCBI. The transcriptomes of Vigna umbellata were assembled using Trinity (v2.13.1; GitHub, San Francisco, CA, USA) [32]. Redundant sequences were then removed from the de novo transcriptome assembly using CD-HIT [33]. Gene prediction for Vigna umbellata was implemented using the MAKER pipeline [34], and the output was used to train the AUGUSTUS [35] model parameters to improve the accuracy of gene prediction. Using the trained model parameters for Vigna umbellata, the prediction pipeline was re-run against the repeat-masked Vigna umbellata scaffolds. To identify the candidate trihelix family protein sequences in the four Vigna crops, multiple sequence alignment was performed using T-Coffee [36]. The signature trihelix domain from different crops was downloaded from plant transcription factor databases. An HMM search (https://www.ebi.ac.uk/Tools/hmmer/search/hmmsearch, accessed on 20 December 2019) was then performed with the trihelix domain against the whole-genome protein sequences of each of the four crops. The significant hits with an E-value < 0.01 were then verified for the presence of the trihelix domain using the Conserved Domain Database (CDD) [37] and Pfam [38]. The identified rice bean trihelix genes and protein sequences are given in Appendix A.
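The E-value filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a local HMMER 3 run whose hits were written with `hmmsearch --tblout` (the study itself used the EBI web server), and the function name `filter_tblout` and the two example rows are hypothetical.

```python
from typing import Iterable, List, Tuple


def filter_tblout(lines: Iterable[str], max_evalue: float = 0.01) -> List[Tuple[str, float]]:
    """Return (target name, full-sequence E-value) for hits below the cutoff.

    Assumes the whitespace-delimited layout of `hmmsearch --tblout`:
    column 1 is the target name and column 5 is the full-sequence E-value.
    """
    hits = []
    for line in lines:
        # Skip blank lines and the '#' comment/header lines of the table.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue < max_evalue:
            hits.append((target, evalue))
    return hits


# Hypothetical rows in tblout layout; gene names and scores are made up.
example = [
    "# target name  accession  query name  accession  E-value  score  bias ...",
    "geneA - Trihelix PF13837.9 1.2e-20 75.1 0.3 2.0e-20 74.4 0.3 1.0 1 0 0 1 1 1 1 -",
    "geneB - Trihelix PF13837.9 0.5 10.2 0.0 0.8 9.6 0.0 1.1 1 0 0 1 1 1 0 -",
]
kept = filter_tblout(example)
print([name for name, _ in kept])
```

Hits that pass this filter would still need the domain check against CDD and Pfam described in the text.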

2.2. Phylogenetic Analysis

Multiple sequence alignment was carried out using ClustalW with the amino acid sequences of trihelix family proteins from the four Vigna crops and rice [21]. A phylogenetic tree was constructed from this alignment by the maximum likelihood (ML) method in MEGA X with the following parameters: Poisson correction; pairwise deletion; 1000 bootstrap replicates [39]. Orthology-based nomenclature was used for naming the trihelix genes in the four Vigna crops, with the rice trihelix genes as the basis.

2.3. Identification of Basic Features, Conserved Motifs and Gene Structure

The basic molecular features such as molecular weight, isoelectric point and coding sequence length of the trihelix proteins were obtained using the ProtParam tool [40]. The MEME-Suite web server [41] was used with default parameters to identify the conserved motifs. The trihelix gene sequences were extracted from the genome file based on annotation attributes using a custom-built Perl program. The exon/intron structure of the trihelix genes was generated using the Gene Structure Display Server [42]. Subcellular localization was predicted using CELLO2GO [43].

2.4. Chromosomal Location and Evolutionary Selection Pressure

Genomic attributes of the trihelix genes of cowpea, mung bean and adzuki bean were obtained from the respective gene annotation files. Gene positions on chromosomes were drawn using MapChart (v.2.32; Wageningen University & Research, Wageningen, The Netherlands) software [44]. Circos [45] software was used to visualize the collinearity of orthologous trihelix genes of the Vigna crops and rice. Evolutionary selection pressure was assessed by calculating Ka and Ks values for the trihelix genes of each Vigna crop with their rice orthologs using PAL2NAL [46].

2.5. Expression Analysis of Trihelix Genes in Cowpea

Cowpea transcriptome SRA data (Table S1) [47] were used to study the expression pattern of trihelix genes in different tissue types (leaf, root, stem, pod, seed and mix-tissue) and over time using RNA-Seq data analysis as per Trapnell et al. [48]. The 3-D structures of trihelix proteins with and without the single amino acid change were constructed using the I-TASSER server [49]. The minimum potential energy of both protein versions (original and mutated) was calculated with Gromacs (version 2020.1-Ubuntu-2020.1-1; University of Groningen, Groningen, The Netherlands) [50].

3. Results

3.1. Genome-Wide Identification, Phylogenetic Analysis and Nomenclature

The members of the trihelix gene family in the cowpea, mung bean, rice bean and adzuki bean were identified using HMM search and BlastP, followed by validation for the presence of the complete trihelix domain using the Pfam and InterPro databases. In the cowpea, mung bean, adzuki bean and rice bean, a total of 39, 35, 41 and 50 trihelix genes were identified, respectively. The structural and functional attributes of the trihelix family members for the cowpea, mung bean, adzuki bean and rice bean are given in Table 1, Table 2, Table 3 and Table 4, respectively. To understand the relatedness of trihelix genes among the four Vigna crops and with rice, phylogenetic analysis was performed (Figure 1). All the trihelix genes of the four Vigna crops clustered in three major groups. Following the classification of trihelix genes in rice and the orthologous relationships, the identified trihelix genes in the four Vigna crops were assigned to five subgroups: GT, GTδ, GTγ, S1P1 and SH4. GTγ was the largest subgroup in all four Vigna species, comprising 18, 16, 21 and 25 genes in the cowpea, mung bean, adzuki bean and rice bean, respectively. The second largest subgroup was S1P1, which consisted of eight members each in the cowpea and adzuki bean, while there were four and ten members in the mung bean and rice bean, respectively. GTδ was the smallest subgroup, comprising one member each in the cowpea and adzuki bean and three members each in the mung bean and rice bean. The SH4 and GT subgroups showed nine and three members in the cowpea, six members each in the mung bean, seven and four members in the adzuki bean and seven and five members in the rice bean, respectively. Most members of each subfamily from all four Vigna species and rice clustered together, albeit with a few exceptions.
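As a quick consistency check, the subgroup counts reported above can be tallied against the per-species totals. The snippet below is only a bookkeeping sketch; all numbers are taken from the text, and the dictionary layout is an illustrative choice.

```python
# Subgroup sizes per species as reported in the text.
subgroup_counts = {
    "cowpea":      {"GT": 3, "GTγ": 18, "SH4": 9, "S1P1": 8,  "GTδ": 1},
    "mung bean":   {"GT": 6, "GTγ": 16, "SH4": 6, "S1P1": 4,  "GTδ": 3},
    "adzuki bean": {"GT": 4, "GTγ": 21, "SH4": 7, "S1P1": 8,  "GTδ": 1},
    "rice bean":   {"GT": 5, "GTγ": 25, "SH4": 7, "S1P1": 10, "GTδ": 3},
}

# Total trihelix genes per species as reported in the text.
reported_totals = {"cowpea": 39, "mung bean": 35, "adzuki bean": 41, "rice bean": 50}

# Sum the five subgroups for each species and compare with the reported total.
computed_totals = {species: sum(counts.values()) for species, counts in subgroup_counts.items()}
for species, total in computed_totals.items():
    status = "matches" if total == reported_totals[species] else "differs from"
    print(f"{species}: {total} ({status} reported {reported_totals[species]})")
```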
To maintain uniform nomenclature across all four Vigna crops, a unified orthology-based nomenclature was followed, with the rice trihelix names as reference. In rice, the trihelix genes were named OsMSL01 to OsMSL41, where MSL stands for ‘Myb/SANT-LIKE’. The same naming pattern was extended to all four Vigna species. For the cowpea, the trihelix genes were given the prefix ‘VungMSL’ for ‘Vigna unguiculata Myb/SANT-LIKE’; for the mung bean, ‘VradMSL’ for ‘V. radiata Myb/SANT-LIKE’; for the adzuki bean, ‘VangMSL’ for ‘V. angularis Myb/SANT-LIKE’; and for the rice bean, ‘VumbMSL’ for ‘V. umbellata Myb/SANT-LIKE’. The crop-specific prefix was then followed by the gene number of the orthologous rice trihelix gene. For example, the orthologue of the rice trihelix gene OsMSL21 in the cowpea was named VungMSL21. Wherever multiple genes in a Vigna species were orthologous to one rice trihelix gene, these genes were considered paralogues in the respective crop. The paralogues were numbered by their distance from the respective rice ortholog: the gene with the least distance was given the suffix ‘.1’, and so on with increasing distance. For example, for OsMSL12 (Os02g33610.1), two orthologous genes were identified in the cowpea and were named, based on their distance from OsMSL12, VungMSL12.1 (Vigun03g314600) and VungMSL12.2 (Vigun09g101800), respectively (Figure 1, Table 1, Table 2, Table 3 and Table 4).
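The naming rule above can be expressed as a short routine. This is a sketch under stated assumptions: the function `name_orthologs` is hypothetical, the distance values are invented purely for illustration (the text does not report them), and only the gene identifiers and resulting names come from the example in the text.

```python
from typing import Dict, List, Tuple


def name_orthologs(prefix: str, rice_number: int,
                   candidates: List[Tuple[str, float]]) -> Dict[str, str]:
    """Assign orthology-based names to genes orthologous to one rice MSL gene.

    candidates holds (gene_id, distance_to_rice_ortholog) pairs. A single
    ortholog gets the plain name; several are suffixed .1, .2, ... in order
    of increasing distance from the rice gene.
    """
    base = f"{prefix}MSL{rice_number:02d}"
    if len(candidates) == 1:
        return {candidates[0][0]: base}
    ranked = sorted(candidates, key=lambda pair: pair[1])
    return {gene_id: f"{base}.{rank}" for rank, (gene_id, _) in enumerate(ranked, start=1)}


# Cowpea orthologs of OsMSL12; the distances 0.4 and 0.9 are hypothetical.
names = name_orthologs("Vung", 12, [("Vigun09g101800", 0.9), ("Vigun03g314600", 0.4)])
print(names)
```

With these assumed distances the routine reproduces the names given in the text, VungMSL12.1 for Vigun03g314600 and VungMSL12.2 for Vigun09g101800.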

3.2. Identification of Basic Features, Conserved Motifs, Gene Structure and Selection Pressure

The physical properties and other features, such as the length of the CDS (coding sequence) and protein (amino acids), molecular weight, isoelectric point and subcellular localization, were predicted for the identified trihelix proteins in all four Vigna crops (Table 1, Table 2, Table 3 and Table 4). In the cowpea, the length of the coding sequence varied from 2655 bp (VungMSL12.1) to 738 bp (VungMSL36.1); in the adzuki bean, the range was 3018 bp (VangMSL03) to 363 bp (VangMSL15.1). In the mung bean, the largest CDS was 2964 bp (VradMSL03.2) and the smallest was 450 bp (VradMSL34), whereas in the rice bean the length ranged from 2604 bp (VumbMSL12) to 373 bp (VumbMSL22). The predicted molecular weight of the trihelix proteins ranged from 98.29 kDa (VungMSL12.1) to 27.32 kDa (VungMSL36.1) for the cowpea, 110.71 kDa (VradMSL03.2) to 16.78 kDa (VradMSL34) for the mung bean, 112.78 kDa (VangMSL03) to 13.92 kDa (VangMSL15.1) for the adzuki bean and 96.37 kDa (VumbMSL12) to 51.27 kDa (VumbMSL40.1) for the rice bean. Their predicted isoelectric points varied from 10.47 (VungMSL36.1) to 4.62 (VungMSL06.1) in the cowpea, 10.06 (VradMSL05) to 4.8 (VradMSL06.2) in the mung bean, 9.73 (VangMSL20) to 4.69 (VangMSL06.1) in the adzuki bean and 9.73 (VumbMSL20.1) to 4.26 (VumbMSL22) in the rice bean.
Most of the trihelix proteins in the four Vigna species were predicted to localize in the nucleus, with a few exceptions. In the cowpea, VungMSL12.1 was predicted to localize in the chloroplast, whereas VungMSL12.2 and VungMSL12.3 were predicted to localize in the mitochondria. In the mung bean, all the proteins were predicted to be localized in the nucleus except VradMSL12, which was predicted to be located in the mitochondria. In the adzuki bean, VangMSL12.1 was predicted to localize in the mitochondria. In the rice bean, all the trihelix proteins except two showed nuclear localization: VumbMSL12 and VumbMSL16.3 were predicted to have chloroplastic and cytoplasmic localization, respectively.
To further analyze the evolutionary relationship, the exon–intron distribution of trihelix genes was studied in the cowpea, mung bean and adzuki bean, for which whole genome sequences with annotation are available in the public domain, along with rice (Figure 2). The gene structure, in terms of exon numbers and exon–intron distribution, was mostly conserved among the orthologous genes of rice and the Vigna species, and especially among the Vigna species. However, a large variation in exon numbers and exon–intron distribution was observed within the trihelix gene family of each Vigna species studied. The number of exons in the trihelix family ranged from 1 to 17 in the cowpea, 2 to 15 in the mung bean and 1 to 16 in the adzuki bean (Table 1, Table 2, Table 3 and Table 4, Figure 2). Each subgroup of trihelix genes showed a specific pattern of exon number and exon–intron distribution. The average number of exons was lowest in the S1P1 family, which contained about one to three exons in the cowpea, two to four exons in the mung bean and one to two exons in the adzuki bean. The highest average number of exons was found in the SH4 group, which showed 2–17 exons in the cowpea, 2–16 exons in the adzuki bean and 3–15 exons in the mung bean. The second highest number of exons was in the GTδ group, which showed five exons in the cowpea, seven exons in the adzuki bean and four to seven exons in the mung bean. The trihelix subgroups GT-1 and GTγ showed two to four and one to two exons, respectively, in the cowpea; two to seven and one to six in the mung bean; and two to five and one to three in the adzuki bean.
The non-synonymous to synonymous substitution rate ratio (dN/dS) was calculated for every trihelix gene with respect to its rice ortholog. The calculated dN/dS ratio ranged from 0.0100 to 1.1412 in the GT subgroup, 0.0074 to 0.724 in GTγ, 0.0064 to 0.1020 in S1P1, 0.0039 to 0.086 in SH4 and 0.0175 to 0.0782 in the GTδ subgroup (Tables S2–S5). The dN/dS value of the trihelix genes of all four Vigna species with their rice orthologs was less than 1, with the exception of VangMSL21, which indicates purifying or stabilizing selection during the course of evolution. For VangMSL21, a member of the GT family, the dN/dS ratio with respect to its rice ortholog OsMSL21 was 1.1412.
The conserved motifs of the trihelix family proteins of the four Vigna species, along with their rice orthologs, were identified using the MEME-Suite web server [41]. Overall, all 10 distinct motifs were present in the trihelix family of each Vigna crop; however, they were variably present in the individual trihelix proteins of the species (Figure 3). Only members of the GTγ subgroup contained all 10 motifs, whereas the GTδ family harbored only two distinct motifs, Motifs 4 and 6. Motif 6 was the only motif found in all the members of the trihelix sub-families of the four Vigna crops. The distribution of motifs was more conserved within the sub-families and particularly among the same orthologs. In the GT subgroup, eight motifs (Motifs 1, 2, 3, 4, 5, 6, 7 and 9) were found. In the S1P1 and SH4 groups, six motifs (Motifs 1, 3, 5, 6, 7 and 9) and five motifs (Motifs 1, 3, 5, 6 and 9) were observed, respectively. Motifs 8 and 10 were found only in the GTγ family, where they were located on the orthologs of three rice proteins, OsMSL13, OsMSL37 and OsMSL40, in all four Vigna species. The motif symbols and their corresponding consensus sequences are given in Figure 3.
The conserved ‘helix-loop-helix-loop-helix’ domain of the trihelix protein was searched for in all the trihelix proteins in Vigna by performing multiple sequence alignment of the Myb/SANT-LIKE domain. Well-separated conserved motifs for the three helices were observed in all the members of the five sub-families. Three individual α-helices with a conserved tryptophan (W) were identified in GT, GTδ and SH4 (Figure S1, Figure S3 and Figure 4). In the S1P1 clade, in addition to the three individual α-helices with conserved tryptophan (W), there was an additional helix with conserved tryptophan followed by a continuous stretch of sequence, x-(F/Y)-(F/Y)-x-x-(L/M)-x-x-(L/M) (Figure S2). In the GT subfamily, an additional helix with the conserved sequence (Y/F)-(Y/Y)-x-x-(L/I/M)-x-x-(L/I/M) was also found. The binding domain of the GTγ family comprised two individual α-helices with conserved tryptophan (W), whereas in the third helix the conserved tryptophan (W) was replaced by phenylalanine (F). A fourth helix with the sequence (F/Y)-(F/Y)-x-x-(L/M)-x-x-(L/I) was also found downstream of the third helix in GTγ (Figure S4). This fourth helix was not observed in SH4; instead, its members had an extended third helix with a conserved leucine (L). The same pattern was followed in all four Vigna species.
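The interpretation of these ratios follows the usual convention: dN/dS below 1 suggests purifying selection, a value near 1 neutral evolution, and above 1 positive selection. A minimal sketch of that rule is given below; the function name and tolerance are illustrative assumptions, and the example values are the subgroup range endpoints quoted in the text, not a full reanalysis of Tables S2–S5.

```python
def selection_regime(dn_ds: float, tol: float = 1e-9) -> str:
    """Classify a dN/dS ratio using the conventional thresholds around 1."""
    if abs(dn_ds - 1.0) <= tol:
        return "neutral"
    return "purifying" if dn_ds < 1.0 else "positive"


# Range endpoints reported in the text; 1.1412 belongs to VangMSL21.
reported = {
    "SH4 minimum": 0.0039,
    "S1P1 maximum": 0.1020,
    "GTγ maximum": 0.724,
    "GT maximum (VangMSL21)": 1.1412,
}
for label, ratio in reported.items():
    print(f"{label}: dN/dS = {ratio} -> {selection_regime(ratio)}")
```

A single ratio slightly above 1, as for VangMSL21, is only suggestive; a formal test would be needed before claiming positive selection.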
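Patterns written in this notation translate directly into regular expressions, which gives a simple way to scan a protein sequence for the fourth-helix signature. The sketch below is illustrative only: `pattern_to_regex` is a hypothetical helper, and the toy sequence is invented rather than taken from any Vigna protein.

```python
import re


def pattern_to_regex(pattern: str) -> str:
    """Convert a pattern such as (F/Y)-(F/Y)-x-x-(L/M) into a regex string.

    'x' stands for any residue; '(A/B)' stands for either residue A or B.
    """
    parts = []
    for token in pattern.split("-"):
        if token == "x":
            parts.append(".")
        else:
            residues = token.strip("()").split("/")
            parts.append("[" + "".join(residues) + "]")
    return "".join(parts)


# Fourth-helix pattern reported for the S1P1 clade (without the leading x).
FOURTH_HELIX = "(F/Y)-(F/Y)-x-x-(L/M)-x-x-(L/M)"
fourth_helix_re = re.compile(pattern_to_regex(FOURTH_HELIX))

# Invented toy sequence, used only to show the search.
toy_sequence = "MKWSAFYKELAEMGT"
match = fourth_helix_re.search(toy_sequence)
if match:
    print(match.start(), match.group())
```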

3.3. Chromosomal Distribution and Collinearity of Trihelix Genes among Vigna Species

The trihelix family genes of the cowpea, mung bean and adzuki bean were mapped on chromosomes according to their loci (Figure 5). In all three Vigna species, the genes were distributed on 10 of the 11 chromosomes. In the cowpea, chromosome 7 harbored the highest number of trihelix genes (nine), whereas the fewest (two genes each) were observed on chromosome 2 and chromosome 9. For the adzuki bean, the highest (nine) and lowest (one) numbers of trihelix genes were observed on chromosomes 2 and 8, respectively. In the mung bean, chromosome 8 carried the largest number of genes (nine), whereas chromosome 6 contained only one gene. No trihelix genes were located on chromosome 8 in the cowpea, chromosome 7 in the adzuki bean or chromosome 11 in the mung bean. The cross-species chromosomal positional relationships between the trihelix genes in rice and their orthologs in the three Vigna species are represented by a Circos plot (Figure 6). The plot also suggests a genomic rearrangement of the trihelix gene family in the studied Vigna species.

3.4. Expression Profile of Trihelix Genes in Different Tissue in Cowpea

RNA-Seq data from major plant tissues (leaf, root, stem, pod, seed and mix-tissue) across a developmental time series of cowpea were investigated to understand the expression profile of the trihelix genes in the cowpea. All 39 cowpea trihelix genes were expressed in at least one of the studied tissue types (Figure 7). Most of the trihelix genes showed moderate expression, and only a few genes showed a high level of expression. For most genes, the highest expression was found in seed tissue 18 days after pollination, and the highest expression of any gene (VungMSL36.1) was also observed in this tissue. Detailed information on all the RNA-Seq datasets used for the expression analysis is given in Table S1.

3.5. Analysis of Rice Shattering 1 (Sha1) Orthologs in Vigna Species

To understand the extent of variation or conservation in the orthologs of rice SHA1 in four Vigna species, respective orthologs were studied by multiple sequence alignment with cultivated (OsMSL23) and wild rice (O. rufipogon). The cowpea, adzuki bean and rice bean each have two genes orthologous to OsMSL23. The mung bean does not seem to have a true ortholog of OsMSL23, while the nearest gene, VradMSL12, showed 36 and 35% similarity to the respective orthologs in cultivated and wild rice. Two genes of the cowpea, VungMSL23.1 and VungMSL23.2, showed 73 and 58% similarity with OsMSL23 and SHA1 in cultivated and wild rice, respectively. Adzuki bean and rice bean genes also showed comparable similarity with cultivated and wild rice orthologs.
The sequence alignment of SHA1/OsMSL23 orthologs in the three Vigna species cowpea (VungMSL23.1 and VungMSL23.2), adzuki bean (VangMSL23.1 and VangMSL23.2) and rice bean (VumbMSL23.1 and VumbMSL23.2) at the nucleotide level showed the wild rice variant base guanine (G), as against thymine (T) in cultivated rice (Figure 8a). At the amino acid level, too, all the Vigna counterparts of SHA1/OsMSL23 showed the wild variant amino acid ‘K’, as against ‘N’ in cultivated rice (Figure 8b). In the mung bean, the closest gene, VradMSL12, did show the nucleotide variant (G) found in wild rice; however, at the amino acid level, ‘R’ (arginine) was present in place of the ‘K’ of wild rice. Like ‘K’, ‘R’ is a positively charged amino acid.
Since a single amino acid change (lysine to asparagine) in OsMSL23 (Os04g57530.1) from the SH4 group of rice trihelix genes was found to have a substantial effect on gaining the non-shattering trait in rice [7], the effect of such an amino acid change on the stability of the seven corresponding proteins (VungMSL23.1, VungMSL23.2, VangMSL23.1, VangMSL23.2, VumbMSL23.1, VumbMSL23.2 and VradMSL12) of the four Vigna crops was studied. The 3-D protein structures of all seven candidate proteins and their mutated variants were constructed by substituting the amino acid (K to N) at the respective site (Figure 9). The two versions of each protein were compared on the basis of minimum stable energy to determine whether the change in the amino acid had a stabilizing or destabilizing effect on the respective protein. Both cowpea proteins (VungMSL23.1 and VungMSL23.2) and one protein each from the adzuki bean (VangMSL23.1), rice bean (VumbMSL23.1) and mung bean (VradMSL12) showed a significant reduction in the minimum energy after the change in amino acid from K to N, indicating a stabilizing effect on these proteins (Table 5).
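The link between the nucleotide change (G to T) and the amino acid change (K to N), and the in silico substitution used to build the mutated variants, can be sketched as below. Several points are assumptions rather than facts from the text: the wild-type codon is taken to be AAG (the only lysine codon ending in G, so the one implied by a G to T change giving asparagine), the helper names are hypothetical, and the toy protein is invented.

```python
# Minimal slice of the standard genetic code: DNA codons for K and N only.
CODON_TO_AA = {"AAA": "K", "AAG": "K", "AAT": "N", "AAC": "N"}


def substitute_base(codon: str, index: int, new_base: str) -> str:
    """Return the codon with the base at a 0-based index replaced."""
    return codon[:index] + new_base + codon[index + 1:]


def substitute_residue(protein: str, position: int, expected: str, new: str) -> str:
    """Replace the residue at a 1-based position, checking the original residue."""
    if protein[position - 1] != expected:
        raise ValueError(f"expected {expected} at position {position}, "
                         f"found {protein[position - 1]}")
    return protein[:position - 1] + new + protein[position:]


# Assumed wild-type lysine codon; G -> T at the third base gives asparagine.
wild_codon = "AAG"
cultivated_codon = substitute_base(wild_codon, 2, "T")
print(CODON_TO_AA[wild_codon], "->", CODON_TO_AA[cultivated_codon])

# Invented toy protein with a lysine at position 4.
toy_protein = "MSGKWTE"
mutant = substitute_residue(toy_protein, 4, "K", "N")
print(toy_protein, "->", mutant)
```

The residue check matters for a case like VradMSL12, where the aligned position holds arginine rather than lysine, so the substitution there would be R to N.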

4. Discussion

The Vigna group of crops, including the cowpea, mung bean, adzuki bean and rice bean, are not only economically important but are crucial for nutritional security and alleviating malnutrition in underdeveloped and developing countries. Unlike cereals, these crops have received limited attention from researchers and genomic intervention for genetic improvement. Recently, there have been noteworthy advances in terms of genome and transcriptome sequencing in these crops [30,31,32,33], which could facilitate comparative genomics to solicit benefit from the advances in other model crops. In these perspectives, here an important trihelix transcription factor family has been identified and characterized in four Vigna crops by making use of bioinformatic and comparative genomic approaches.
The total number of trihelix genes identified in the cowpea (39), mung bean (35), adzuki bean (41) and rice bean (50) was comparable to previous studies on rice (41), arabidopsis (28), tomato (36), chrysanthemum (20), Brassica rapa (52) and tartary buckwheat (31); however, the total number was less than in soybean (71) and wheat (94) [18,19,20,21,51,52,53,54]. In the rice bean, the number of trihelix genes identified in this study may change with the availability of the whole genome sequence in the public domain.
To facilitate unified nomenclature for easy comparison, the orthology-based nomenclature was proposed for trihelix genes in four Vigna crops with the basis of rice trihelix genes. The orthology-based nomenclature was earlier followed for the MAPK, MAPKK and CDPK gene families in rice, poplar and pigeon pea [55,56,57]. Such systematic nomenclature eases the species-to-species comparisons and helps in the functional characterization of homologs that show clear orthologous relationships with candidates from the model plant species. However, the conservation of biological functions within such relationships is to be resolved [56,57].
Previously, the phylogeny-based classification of the rice trihelix transcription factor family was given with three distinctive sub-families GTα, GTβ and GTγ [58]. Further study on arabidopsis, rice, chrysanthemum, brassica and medicago further expanded the family into five sub-divisions GT-1, GT-2, GTγ, S1P1 and SH4 [19,20,21,51,59]. A new subfamily of trihelix, GTδ, was proposed in tomato and rice [18,21]. In the present study, the classification of the trihelix gene family in Vigna species was followed according to the classification in rice [21]. The predicted trihelix genes in Vigna species were classified into five subfamilies, GT, GTγ, S1P1, SH4 and GTδ, based on the evolutionary analysis with rice. In addition, all the predicted proteins of the trihelix family from Vigna species were predicted to be localized in the nucleus except OsMSL12 orthologs, which are found in the mitochondria and chloroplast (Table 1, Table 2, Table 3 and Table 4), and VumbMSL16.3, present in cytoplasm. In congruence with this, in rice OsMSL12 was reported to be localized in chloroplast [21]. Transcription factors are known to be localized in the nucleus for their role in gene regulation. Recently, NAC102, an NAC family transcription factor, has been identified for the first time to directly regulate chloroplast gene expression [60]. It will be intriguing to see if the trihelix transcription factor particularly, an ortholog of OsMSL12 in Vigna species, has any role to play in chloroplastic gene regulation.
The exon numbers in the genes and gene structure were often found consistent for orthologous groups and for a given subgroup [21]. Further, with respect to the conserved motifs in all the trihelix proteins in Vigna species, a total 10 motifs were found to be distributed differentially, although mostly with similar patterns within a subfamily. Similar observations have also been reported in other crops, including soybean and rice (21,53). Mostly, the trihelix members from the same clade and particularly the same orthologs share a similar distribution of motif as well as exon\intron structure.
A fourth amphipathic α-helix with a strongly conserved motif ((F/Y)-(F/Y)-x-x-(L/I/M)-x-x-(L/I/M) followed by another three in the trihelix protein were previously shown to be located on the trihelix DNA binding domain of the GT-1, GT-2, GTγ and S1P1 clades in arabidopsis [13]. Almost all Vigna trihelix proteins contain this fourth helix with the same general sequences downstream of the other three helices on the DNA binding domain of the GT, S1P1 and GTγ clades (Figures S1, S2 and S4). This was not present in the SH4 (Figure 4) and GTδ (Figure S3) clades; rather, the SH4 clade carries an extended third helix. This fourth helix is shown to be required for DNA binding [13]. The non-synonymous to synonymous substitution rate (dN/dS) for trihelix genes with their rice ortholog revealed that the ratio was significantly less than 1 for most of the trihelix genes (Tables S2–S5), indicating purifying or stabilizing selection in the course of evolution. Similar observations have been reported for other gene families in rice [61].
Domestication traits such as the loss or reduction of seed shattering or pod dehiscence were acquired in several crop plants independently by convergent phenotypic evolution [4]. Independent evolution for pod shattering in two closely related species, the common bean and cowpea, showed that convergent domestication might take place when mutation occurs at orthologous loci [62]. Similarly, in arabidopsis and the common bean, AtMYB26 and its orthologue, PvMYB26, have been identified for pod shattering, which shows that orthologous genes and the associated pathways have preserved the shattering function between and beyond closely related species [63]. MYB26 has also been shown to be a target for the non-shattering trait in the adzuki bean and yard-long bean [64]. The genetic basis of domestication in many of the Vigna species is still unresolved, and yield loss due to pod shattering is reported as a major concern [65,66].
In rice (Oryza sativa), the non-shattering trait was gained through a single nucleotide substitution from G to T at SHA1/Shattering 4 (Sh4), which encodes the trihelix transcription factor Os04g57530.1 (OsMSL23) [7,8].
It was observed in our study that the orthologs of OsMSL23 in three Vigna crops (cowpea, adzuki bean and rice bean) harbor the same nucleotide (G) and amino acid (K) variants as wild rice (Figure 8). In the mung bean, although a true ortholog of OsMSL23 is not present, the nearest gene, VradMSL12, also possesses the wild variant at the nucleotide level (G); the amino acid, however, is arginine (R), which belongs to the same group as lysine. Notably, Vigna crops display shattering to different degrees [65,66]. Besides the single amino acid variant, there is substantial overall sequence homology for the gene at the nucleotide and amino acid levels between Vigna crops and rice (Figure 8).
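The link between the G-to-T substitution and the K-to-N change can be checked against the standard genetic code. The short Python sketch below is an illustration only: the helper name is hypothetical, and the codon AAG is inferred from the genetic code (it is the only lysine codon containing a G) rather than read from the alignment in Figure 8.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def g_to_t_outcomes(codon: str) -> dict[str, str]:
    """Map each single G->T mutant of a codon to the amino acid it encodes."""
    codon = codon.upper()
    outcomes = {}
    for i, base in enumerate(codon):
        if base == "G":
            mutant = codon[:i] + "T" + codon[i + 1:]
            outcomes[mutant] = CODON_TABLE[mutant]
    return outcomes


# Lysine codon assumed to be AAG (inferred, see the note above).
print(CODON_TABLE["AAG"], g_to_t_outcomes("AAG"))

# Arginine codons, for comparison with the R residue noted for VradMSL12.
ARG_CODONS = [c for c, aa in CODON_TABLE.items() if aa == "R"]
print({c: g_to_t_outcomes(c) for c in ARG_CODONS})
```

Under the standard code, a single G-to-T change in a lysine codon can only give asparagine (AAG to AAT), whereas no single G-to-T change in an arginine codon yields asparagine. This follows from the codon table alone and is not a result reported in the study; it suggests that an R-to-N variant would require more than one nucleotide change.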
It will be interesting to see if the non-shattering trait can be gained in the four Vigna crops by attempting the same mutation as in cultivated rice. The expression study of the trihelix genes in the cowpea does show expression of VungMSL23.1 and VungMSL23.2 in seed tissues, albeit at a lower level (Figure 7).
In this direction, the 3-D protein structures of the OsMSL23 orthologs in the cowpea (VungMSL23.1, VungMSL23.2), adzuki bean (VangMSL23.1, VangMSL23.2) and rice bean (VumbMSL23.1, VumbMSL23.2), and of the nearest protein in the mung bean (VradMSL12), were constructed with the ‘K’ and mutated ‘N’ amino acid variants at the respective positions and assessed for stability. It was found that the mutated variants of five of the seven proteins showed lower minimum energy than the native proteins, indicating a stabilizing effect (Figure 9 and Table 5). In the cowpea, VungMSL23.1 showed better stability than VungMSL23.2. Accordingly, VungMSL23.1 in the cowpea, VangMSL23.1 in the adzuki bean, VumbMSL23.1 in the rice bean and VradMSL12 in the mung bean could form potential targets for genome editing for the non-shattering trait. It is, however, imperative to undertake wet-lab validation and detailed characterization of these candidates. The effect of a single amino acid change on protein stability has also been shown in other crops [67].
Recently, genome editing has been suggested as a potential tool to target genes for accelerating the domestication of semi-domesticated or wild plants. It has been advocated that, if the complete genome sequence of the desired plant is available, orthologs of established domestication genes can be identified in that species and targeted by genome editing to accelerate domestication [68]. Indeed, ‘De novo domestication’ or ‘Neo-domestication’ has recently been proposed as a novel strategy for crop breeding to gain the desired domestication traits [68,69,70].
The time may have come for ‘De novo domestication’ or ‘Neo-domestication’, that is, the accelerated and targeted introduction of domestication traits in crop plants using genome editing [68,71].

5. Conclusions

In this study, through genome-wide identification, trihelix gene families comprising 39, 35, 41 and 50 genes were identified in the cowpea, mung bean, adzuki bean and rice bean, respectively. The identified genes in the four Vigna crops were characterized using bioinformatics and comparative genomic tools. The gene structure and conserved motifs in proteins were relatively consistent within the sub-groups of the gene family. The non-synonymous to synonymous substitution rate for trihelix genes in the four Vigna crops suggested purifying or stabilizing selection. Further, potential candidate genes were identified to target the non-shattering trait in the Vigna crops. The proposed single nucleotide change, leading to the substitution of an amino acid, showed a stabilizing effect at the protein level. It is proposed to apply genome editing tools to the identified target trihelix genes in the four Vigna crops to gain further insight into their role in the non-shattering phenotype, thereby accelerating domestication. The study gives an elaborate understanding of various aspects of the trihelix transcription factor family in four Vigna crops.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agriculture12122172/s1. Figure S1: Sequence alignment of trihelix domain of GT group present in four Vigna species; Figure S2: Sequence alignment of trihelix domain of S1P1 group present in four Vigna species; Figure S3: Sequence alignment of trihelix domain of GTδ group present in four Vigna species; Figure S4: Sequence alignment of trihelix domain of GTγ group present in four Vigna species; Table S1: Detailed information on the RNA-Seq data of cowpea used for gene expression analysis of trihelix genes; Table S2: Non-synonymous/synonymous substitution rate for trihelix genes of cowpea; Table S3: Non-synonymous/synonymous substitution rate for trihelix genes of mung bean; Table S4: Non-synonymous/synonymous substitution rate for trihelix genes of adzuki bean; Table S5: Non-synonymous/synonymous substitution rate for trihelix genes of rice bean.

Author Contributions

Conceptualization, D.P.W. and S.A.; validation, S.K. and S.M.; formal analysis, S.K. and R.M.; investigation, S.K.; resources, S.A. and A.R.; writing—original draft preparation, S.K. and D.P.W.; writing—review and editing, D.P.W. and S.A.; supervision, S.A., A.R., S.J. and D.P.W.; funding acquisition, S.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding. The APC was funded by the ICAR National Fellow project.

Data Availability Statement

Not applicable.

Acknowledgments

During the period of study, Shweta Kumari was supported by a fellowship from ICAR-IARI, New Delhi. Sunil Archak was supported by an ICAR National fellowship. Computational facilities provided by ICAR-NBPGR and ICAR-IASRI are gratefully acknowledged.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Rice bean Trihelix genes and protein sequences.

References

  1. Ross-Ibarra, J.; Morrell, P.L.; Gaut, B.S. Plant domestication, a unique opportunity to identify the genetic basis of adaptation. Proc. Natl. Acad. Sci. USA 2007, 104 (Suppl. S1), 8641–8648.
  2. Olsen, K.M.; Wendel, J.F. A bountiful harvest: Genomic insights into crop domestication phenotypes. Annu. Rev. Plant. Biol. 2013, 64, 47–70.
  3. Dong, Y.; Wang, Y.Z. Seed shattering: From models to crops. Front. Plant. Sci. 2015, 6, 476.
  4. Vittori, V.D.; Gioia, T.; Rodriguez, M.; Bellucci, E.; Bitocchi, E.; Nanni, L.; Attene, G.; Rau, D.; Papa, R. Convergent evolution of the seed shattering trait. Genes 2019, 10, 68.
  5. Lin, Z.; Li, X.; Shannon, L.M.; Yeh, C.T.; Wang, M.L.; Bai, G.; Peng, Z.; Li, J.; Trick, H.N.; Clemente, T.E.; et al. Parallel domestication of the Shattering1 genes in cereals. Nat. Genet. 2012, 44, 720–724.
  6. Li, C.; Zhou, A.; Sang, T. Genetic analysis of rice domestication syndrome with the wild annual species, Oryza nivara. New Phytol. 2006, 170, 185–193.
  7. Li, C.; Zhou, A.; Sang, T. Rice domestication by reducing shattering. Science 2006, 311, 1936–1939.
  8. Lin, Z.; Griffith, M.E.; Li, X.; Zhu, Z.; Tan, L.; Fu, Y.; Zhang, W.; Wang, X.; Xie, D.; Sun, C. Origin of seed shattering in rice (Oryza sativa L.). Planta 2007, 226, 11–20.
  9. Zhou, D.X. Regulatory mechanism of plant gene transcription by GT-elements and GT-factors. Trends Plant. Sci. 1999, 4, 210–214.
  10. Qin, Y.; Ma, X.; Yu, G.; Wang, Q.; Wang, L.; Kong, L.; Kim, W.; Wang, H.W. Evolutionary history of trihelix family and their functional diversification. DNA Res. 2014, 21, 499–510.
  11. Nagano, Y. Several features of the GT-factor trihelix domain resemble those of the Myb DNA-binding domain. Plant Physiol. 2000, 124, 491–494.
  12. Riechmann, J.L.; Heard, J.; Martin, G.; Reuber, L.; Jiang, C.; Keddie, J.; Adam, L.; Pineda, O.; Ratcliffe, O.J.; Samaha, R.R.; et al. Sequence-specific interactions of a pea nuclear factor with light-responsive elements upstream of the rbcS-3A gene. EMBO J. 1987, 6, 2543–2549.
  13. Kaplan-Levy, R.N.; Brewer, P.B.; Quon, T.; Smyth, D.R. The trihelix family of transcription factors-light, stress and development. Trends Plant. Sci. 2012, 17, 163–171.
  14. Gilmartin, P.M.; Memelink, J.; Hiratsuka, K.; Kay, S.A.; Chua, N.H. Characterization of a gene encoding a DNA binding protein with specificity for a light-responsive element. Plant Cell. 1992, 4, 839–849.
  15. Kay, S.A.; Keith, B.; Shinozaki, K.; Chye, M.L.; Chua, N.H. The rice phytochrome gene: Structure, autoregulated expression, and binding of GT-1 to a conserved site in the 5’upstream region. Plant Cell. 1989, 1, 351–360.
  16. Dehesh, K.; Bruce, W.B.; Quail, P.H. A trans-acting factor that binds to a GT-motif in a phytochrome gene promoter. Science 1990, 250, 1397–1399.
  17. Dehesh, K.; Hung, H.; Tepperman, J.M.; Quail, P.H. GT-2: A transcription factor with twin autonomous DNA-binding domains of closely related but different target sequence specificity. EMBO J. 1992, 11, 4131–4144.
  18. Yu, C.; Cai, X.; Ye, Z.; Li, H. Genome-wide identification and expression profiling analysis of trihelix gene family in tomato. Biochem. Biophys. Res. Commun. 2015, 468, 653–659.
  19. Yasmeen, E.; Riaz, M.; Sultan, S.; Azeem, F.; Abbas, A.; Riaz, K.; Ali, M.A. Genome-wide analysis of trihelix transcription factor gene family in Arabidopsis thaliana. Pak. J. Agric. Sci. 2016, 53.
  20. Song, A.; Wu, D.; Fan, Q.; Tian, C.; Chen, S.; Guan, Z.; Xin, J.; Zhao, K.; Chen, F. Transcriptome-wide identification and expression profiling analysis of chrysanthemum trihelix transcription factors. Int. J. Mol. Sci. 2016, 17, 198.
  21. Li, J.; Zhang, M.; Sun, J.; Mao, X.; Wang, J.; Wang, J.; Liu, H.; Zheng, H.; Zhen, Z.; Zhao, H.; et al. Genome-Wide Characterization and Identification of Trihelix Transcription Factor and Expression Profiling in Response to Abiotic Stresses in Rice (Oryza sativa L.). Int. J. Mol. Sci. 2019, 20, 251.
  22. Brewer, P.B.; Howles, P.A.; Dorian, K.; Griffith, M.E.; Ishida, T.; Kaplan-Levy, R.N.; Kilinc, A.; Smyth, D.R. PETAL LOSS, a trihelix transcription factor gene, regulates perianth architecture in the Arabidopsis flower. Development 2004, 131, 4035–4045.
  23. Breuer, C.; Kawamura, A.; Ichikawa, T.; Tominaga-Wada, R.; Wada, T.; Kondou, Y.; Muto, S.; Matsui, M.; Sugimoto, K. The trihelix transcription factor GTL1 regulates ploidy-dependent cell growth in the Arabidopsis trichome. Plant Cell 2009, 21, 2307–2322.
  24. Gao, M.J.; Lydiate, D.J.; Li, X.; Lui, H.; Gjetvaj, B.; Hegedus, D.D.; Rozwadowski, K. Repression of seed maturation genes by a trihelix transcriptional repressor in Arabidopsis seedlings. Plant Cell 2009, 21, 54–71.
  25. Barr, M.S.; Willmann, M.R.; Jenik, P.D. Is there a role for trihelix transcription factors in embryo maturation? Plant Signal. Behav. 2012, 7, 205–209.
  26. Pratap, A.; Malviya, N.; Tomar, R.; Gupta, D.S.; Kumar, J. Vigna. In Alien Gene Transfer in Crop Plants., 1st ed.; Pratap, A., Kumar, J., Eds.; Springer: New York, NY, USA, 2014; Volume 2, pp. 163–189.
  27. Kang, Y.J.; Satyawan, D.; Shim, S.; Lee, T.; Lee, J.; Hwang, W.J.; Lee, S.H. Draft genome sequence of adzuki bean, Vigna angularis. Sci. Rep. 2015, 5.
  28. Kang, Y.J.; Kim, S.K.; Kim, M.Y.; Lestari, P.; Kim, K.H.; Ha, B.K.; Lee, S.H. Genome sequence of mung bean and insights into evolution within Vigna species. Nat. Commun. 2014, 5, 1–9.
  29. Lonardi, S.; Muñoz-Amatriaín, M.; Liang, Q.; Shu, S.; Wanamaker, S.I.; Lo, S.; Close, T.J. The genome of cowpea (Vigna unguiculata [L.] Walp.). Plant J. 2019, 98, 767–782.
  30. Kaul, T.; Eswaran, M.; Thangaraj, A.; Meyyazhagan, A.; Nehra, M.; Raman, N.M.; Balamurali, B. Rice Bean (Vigna umbellata) draft genome sequence: Unravelling the late flowering and unpalatability related genomic resources for efficient domestication of this underutilized crop. bioRxiv 2019, 816595.
  31. Goodstein, D.M.; Shu, S.; Howson, R.; Neupane, R.; Hayes, R.D.; Fazo, J.; Mitros, T.; Dirks, W.; Hellsten, U.; Putnam, N.; et al. Phytozome: A comparative platform for green plant genomics. Nucleic Acids Res. 2011, 40, D1178–D1186.
  32. Grabherr, M.G.; Haas, B.J.; Yassour, M.; Levin, J.Z.; Thompson, D.A.; Amit, I.; Adiconis, X.; Fan, L.; Raychowdhury, R.; Zeng, Q.; et al. Trinity: Reconstructing a full-length transcriptome without a genome from RNA-Seq data. Nat. Biotechnol. 2011, 29, 644–652.
  33. Li, W.; Godzik, A. Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences. J. Bioinform. 2006, 22, 1658–1659.
  34. Cantarel, B.L.; Korf, I.; Robb, S.M.; Parra, G.; Ross, E.; Moore, B.; Yandell, M. MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 2008, 18, 188–196.
  35. Stanke, M.; Tzvetkova, A.; Morgenstem, B. AUGUSTUS at EGASP: Using, EST, protein and genomic alignments for improved gene prediction in the human genome. Genome Biol. 2006, 7, S11.
  36. Notredame, C.; Higgins, D.G.; Heringa, J. T-Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000, 302, 205–217.
  37. Marchler-Bauer, A.; Derbyshire, M.K.; Gonzales, N.R.; Lu, S.; Chitsaz, F.; Geer, L.Y.; Geer, R.C.; He, J.; Gwadz, M.; Hurwitz, D.I.; et al. CDD: NCBI’s conserved domain database. Nucleic Acids Res. 2015, 43, D222–D226.
  38. Finn, R.D.; Bateman, A.; Clements, J.; Coggill, P.; Eberhardt, R.Y. Pfam: The protein families database. Nucleic Acids Res. 2014, 42, D222–D230.
  39. Kumar, S.; Nei, M.; Dudley, J.; Tamura, K. MEGA: A biologist-centric software for evolutionary analysis of DNA and protein sequences. Brief. Bioinform. 2008, 9, 299–306.
  40. Gasteiger, E.; Gattiker, A.; Hoogland, C.; Ivanyi, I.; Appel, R.D.; Bairoch, A. ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res. 2003, 31, 3784–3788.
  41. Bailey, T.L.; Boden, M.; Buske, F.A.; Frith, M.; Grant, C.E.; Clementi, L. MEME SUITE: Tools for motif discovery and searching. Nucleic Acids Res. 2009, 37 (Suppl. S2), W202–W208.
  42. Hu, B.; Jin, J.; Guo, A.Y.; Zhang, H.; Luo, J.; Gao, G. GSDS 2.0: An upgraded gene feature visualization server. Bioinformatics 2015, 31, 1296–1297.
  43. Yu, C.S.; Cheng, C.W.; Su, W.C.; Chang, K.C.; Huang, S.W.; Hwang, J.K.; Lu, C.H. CELLO2GO: A web server for protein subCELlular LOcalization prediction with functional gene ontology annotation. PLoS ONE 2014, 9, e99368.
  44. Voorrips, R.E. MapChart: Software for the graphical presentation of linkage maps and QTLs. J. Hered. 2002, 93, 77–78.
  45. Krzywinski, M.; Schein, J.; Birol, I.; Connors, J.; Gascoyne, R. Circos: An information aesthetic for comparative genomics. Genome Res. 2009, 19, 1639–1645.
  46. Suyama, M.; Torrents, D.; Bork, P. PAL2NAL: Robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006, 34 (Suppl. 2), W609–W612.
  47. Yao, S.; Jiang, C.; Huang, Z.; Torres-Jerez, I. The Vigna unguiculata Gene Expression Atlas (VuGEA) from de novo assembly and quantification of RNA-seq data provides insights into seed maturation mechanisms. Plant J. 2016, 88, 318–327.
  48. Trapnell, C.; Roberts, A.; Goff, L.; Pertea, G. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 2012, 7, 562–578.
  49. Yang, J.; Zhang, Y. I-TASSER server: New development for protein structure and function predictions. Nucleic Acids Res. 2015, 43, W174–W181.
  50. Abraham, M.J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J.C.; Hess, B.; Lindahl, E. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1, 19–25.
  51. Wang, W.; Wu, P.; Liu, T.; Ren, H.; Li, Y.; Hou, X. Genome-wide analysis and expression divergence of the trihelix family in Brassica rapa: Insight into the evolutionary patterns in plants. Sci. Rep. 2017, 7, 6463.
  52. Ma, Z.; Liu, M.; Sun, W.; Huang, L.; Wu, Q.; Bu, T.; Chen, H. Genome-wide identification and expression analysis of the trihelix transcription factor family in tartary buckwheat (Fagopyrum tataricum). BMC Plant Biol. 2019, 19, 344.
  53. Liu, W.; Zhang, Y.; Li, W.; Lin, Y.; Wang, C.; Xu, R.; Zhang, L. Genome-wide characterization and expression analysis of soybean trihelix gene family. PeerJ 2020, 8, 8753.
  54. Xiao, J.; Hu, R.; Gu, T.; Han, J.; Qiu, D. Genome-wide identification and expression profiling of trihelix gene family under abiotic stresses in wheat. BMC Genom. 2019, 20, 287.
  55. Hamel, L.P.; Nicole, M.C.; Sritubtim, S.; Morency, M.J.; Ellis, M.; Ehlting, J.; Beaudoin, N.; Barbazuk, B.; Klessig, D.; Lee, J.; et al. Ancient signals: Comparative genomics of plant MAPK and MAPKK gene families. Trends Plant Sci. 2006, 11, 192–198.
  56. Hamel, L.P.; Sheen, J.; Seguin, A. Ancient signals: Comparative genomics of green plant CDPKs. Trends Plant Sci. 2014, 19, 79–89.
  57. Wankhede, D.P.; Aravind, J.; Mishra, S.P. Identification of Genic SNPs from ESTs and Effect of Non-synonymous SNP on Proteins in Pigeonpea. Proc. Natl. Acad. Sci. USA 2019, 89, 595–603.
  58. Fang, Y.; Xie, K.; Hou, X.; Hu, H.; Xiong, L. Systematic analysis of GT factor family of rice reveals a novel subfamily involved in stress responses. Mol. Genet. Genom. 2010, 283, 157–169.
  59. Liu, X.; Zhang, H.; Ma, L.; Wang, Z.; Wang, K. Genome-Wide Identification and Expression Profiling Analysis of the Trihelix Gene Family Under Abiotic Stresses in Medicago truncatula. Genes 2020, 11, 1389.
  60. Xin, K.; Pan, T.; Gao, S.; Yan, S.A. Transcription Factor Regulates Gene Expression in Chloroplasts. Int. J. Mol. Sci. 2021, 22, 6769.
  61. Yadav, S.K.; Santosh Kumar, V.V.; Verma, R.K. Genome-wide identification and characterization of ABA receptor PYL gene family in rice. BMC Genom. 2020, 21, 676.
  62. Rau, D.; Murgia, M.L.; Rodriguez, M.; Bitocchi, E.; Bellucci, E. Genomic dissection of pod shattering in common bean: Mutations at non-orthologous loci at the basis of convergent phenotypic evolution under domestication of leguminous species. Plant J. 2019, 97, 693–714.
  63. Vittori, D.V.; Bitocchi, E.; Rodriguez, M.; Alseekh, S.; Bellucci, E. Pod indehiscence in common bean is associated to the fine regulation of PvMYB26 and a non-functional abscission layer. BioRxiv 2020, 021972.
  64. Takahashi, Y.; Kongjaimun, A.; Muto, C.; Kobayashi, Y.; Kumagai, M. Same locus for non-shattering seed pod in two independently domesticated legumes, Vigna angularis and Vigna unguiculata. Front. Genet. 2020, 11, 748.
  65. Isemura, T.; Kaga, A.; Konishi, S.; Ando, T.; Tomooka, N.; Han, O.K.; Vaughan, D.A. Genome dissection of traits related to domestication in azuki bean (Vigna angularis) and comparison with other warm-season legumes. Ann. Bot. 2007, 100, 1053–1071.
  66. Vairam, N.; Lavanya, S.A.; Vanniarajan, C. Screening for pod shattering in mutant population of mung bean (Vigna radiata (L.) Wilczek). J. Appl. Nat. Sci. 2017, 9, 1787–1791.
  67. Wankhede, D.P.; Kumari, M.; Richa, T.; Aravind, J.; Rajkumar, S. Genome wide identification and characterization of Calcium Dependent Protein Kinase gene family in Cajanus cajan. J. Environ. Biol. 2017, 38, 167.
  68. Osterberg, J.T.; Xiang, W.; Olsen, L.I.; Edenbrandt, A.K.; Vedel, S.E. Accelerating the domestication of new crops: Feasibility and approaches. Trends Plant Sci. 2017, 22, 373–384.
  69. Yu, H.; Li, J. Breeding future crops to feed the world through de novo domestication. Nat Commun. 2022, 13, 1171.
  70. Zsogon, A.; Cermak, T.; Voytas, D.; Peres, L.E. Genome editing as a tool to achieve the crop ideotype and de novo domestication of wild relatives: Case study in tomato. Plant Sci. 2017, 256, 120–130.
  71. Pisias, M.T.; Bakala, H.S.; McAlvay, A.C.; Mabry, M.E.; Birchler, J.A.; Yang, B.; Pires, J.C. Prospects of Feral Crop De Novo Re-Domestication. Plant Cell Physiol. 2022, 63, 1641–1653.
Figure 1. Phylogenetic tree based on protein sequence of trihelix genes in cowpea, mung bean, adzuki bean, rice bean and rice. The maximum likelihood tree was created using MEGAX (bootstrap value = 1000). Vung, V. unguiculata; Vrad, V. radiata; Vang, V. angularis; Vumb, V. umbellata; Os, Oryza sativa. MSL, Myb/SANT-LIKE. The clade in red represents GT family, blue represents GTγ family, brown represents S1P1 family, green represents SH4 family and magenta represents GTδ family. ‘*’, ‘**’ and ‘***’ represent member of GT, GTγ and SH4 family clustered elsewhere.
Figure 2. Gene structure of trihelix family genes in Vigna species along with its rice orthologs. The red bars represent exons and the black line between them represents introns. The scale at the bottom indicates the length of exons and introns. Vung, V. unguiculata; Vrad, V. radiata; Vang, V. angularis; Vumb, V. umbellata; Os, Oryza sativa. MSL, Myb/SANT-LIKE.
Figure 3. The conserved motif of trihelix family genes in Vigna species along with its rice orthologs. The different color box represents the different 10 motifs. Vung, V. unguiculata; Vrad, V. radiata; Vang, V. angularis; Vumb, V. umbellata; Os, Oryza sativa. MSL, Myb/SANT-LIKE.
Figure 4. Sequence alignment of trihelix domain of SH4 group present in Vigna species. The three boxes represent the three helices of trihelix domain and sequence included in orange box represents conservation of tryptophan (W) present in the domain.
Figure 5. Chromosomal locations of trihelix genes of cowpea, mung bean and adzuki bean. Chromosome number is mentioned at the top. The chromosomes are presented in the following order: cowpea (Vung, V. unguiculata), mung bean (Vrad, V. radiata) and then adzuki bean (Vang, V. angularis).
Figure 6. Circos map showing chromosomal position and collinearity of trihelix genes of cowpea, mung bean, adzuki bean and rice chromosomes. The red, green, blue and black represent chromosome of cowpea, mung bean, adzuki bean and rice, respectively. The black line shows connection of the respective orthologs of two species in Vigna and rice.
Figure 7. Expression of cowpea trihelix genes in major plant tissues at different developmental stages (S8: Seed 8 days after pollination, S10: Seed 10 days after pollination, S14: Seed 14 days after pollination, S18: Seed 18 days after pollination, P6: Pod 6 days after pollination, P10: Pod 10 days after pollination, P16: Pod 16 days after pollination, Mix_MiSeq: Mix tissues). The heatmap was generated using the cummeRbund R package based on the FPKM (Fragments per kilobase of transcript per million mapped reads) value. Expression levels are depicted by the same color with a varying gradient from dark to light on the scale. Expression high: Expression value > 2, Expression medium: Expression value > 1, Expression low: 0 < Expression value < 1.
Figure 8. Multiple sequence alignment of SHA1/Sh4 orthologs in Vigna crops. Alignment between the coding sequences (a) and amino acid sequences (b) of SHA1 of O. rufipogon, OsMSL23 of cultivated rice and respective orthologs in Vigna. The box with orange lines highlights single nucleotide (G to T) and amino acid (K to N) variations in respective crops.
Figure 9. The 3D structures of OsMSL23 homologs in Vigna crops, showing two variants of each protein (K/R to N substitution).
Table 1. Basic characteristics of Cowpea (Vigna unguiculata) Trihelix genes.
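| Subgroup | Assigned Name | Locus ID | Exon (No.) | Amino Acid (No.) | ᵃ MW (kDa) | ᵇ pI | GC% | ᶜ Loc. | Rice Orthologs |
|---|---|---|---|---|---|---|---|---|---|
| GT | VungMSL08.1 | Vigun06g111200 | 2 | 569 | 64.21 | 6.14 | 42.6 | N | OsMSL08 (Os02g01380.1) |
| | VungMSL08.2 | Vigun07g154500 | 2 | 617 | 70.95 | 6.78 | 40.9 | N | |
| | VungMSL21 | Vigun01g244400 | 4 | 377 | 43.14 | 5.83 | 37.6 | N, C | OsMSL21 (Os04g40930.1) |
| SH4 | VungMSL11.1 | Vigun05g020800 | 3 | 326 | 36.07 | 5.23 | 38 | N | OsMSL11 (Os02g31160.1) |
| | VungMSL11.2 | Vigun07g298000 | 2 | 349 | 38.86 | 6.5 | 37.4 | N | |
| | VungMSL11.3 | Vigun07g298000 | 2 | 260 | 29.00 | 6.03 | 37.4 | N | |
| | VungMSL11.4 | Vigun03g052500 | 2 | 339 | 38.02 | 6.87 | 41.9 | N | |
| | VungMSL12.1 | Vigun03g314600 | 17 | 883 | 98.29 | 8.76 | 33.7 | Ch | OsMSL12 (Os02g33610.1) |
| | VungMSL12.2 | Vigun09g101800 | 17 | 723 | 79.93 | 8.64 | 35.5 | M | |
| | VungMSL12.3 | Vigun09g101800 | 15 | 723 | 79.93 | 8.64 | 35.5 | M | |
| | VungMSL23.1 | Vigun07g196800 | 2 | 346 | 39.75 | 9.03 | 43.6 | N | OsMSL23 (Os04g57530.1) |
| | VungMSL23.2 | Vigun02g002400 | 2 | 344 | 39.14 | 7.73 | 32.8 | N | |
| S1P1 | VungMSL20 | Vigun10g160800 | 1 | 376 | 40.79 | 9.73 | 47.4 | N | OsMSL20 (Os04g36790.1) |
| | VungMSL25.1 | Vigun06g191000 | 2 | 301 | 33.97 | 5.15 | 39 | N | OsMSL25 (Os05g48690.1) |
| | VungMSL25.2 | Vigun05g208200 | 2 | 300 | 33.84 | 9.76 | 35.7 | N | |
| | VungMSL31 | Vigun10g048800 | 1 | 262 | 30.16 | 6.68 | 40.7 | N | OsMSL31 (Os08g37810.1) |
| | VungMSL34.1 | Vigun11g011300 | 2 | 339 | 38.16 | 9.25 | 30.5 | N | OsMSL34 (Os09g38570.1) |
| | VungMSL34.2 | Vigun01g027600 | 2 | 349 | 38.86 | 8.9 | 38.2 | N | |
| | VungMSL36.1 | Vigun05g050600 | 3 | 245 | 27.32 | 10.47 | 61.3 | N, M | OsMSL36 (Os10g41460.1) |
| | VungMSL36.2 | Vigun03g429900 | 1 | 310 | 34.73 | 9.34 | 49.3 | N | |
| GTδ | VungMSL16 | Vigun06g021100 | 5 | 496 | 57.10 | 7.99 | 35.6 | N, C | OsMSL16 (Os03g44130.1) |
| GTγ | VungMSL06.1 | Vigun04g017300 | 1 | 352 | 40.73 | 4.62 | 39.5 | N | OsMSL06 (Os01g52090.1) |
| | VungMSL06.2 | Vigun05g286700 | 1 | 390 | 44.55 | 4.7 | 39.3 | N | |
| | VungMSL13 | Vigun07g065400 | 1 | 474 | 54.34 | 6.12 | 44 | N | OsMSL13 (Os02g33770.1) |
| | VungMSL15.1 | Vigun07g217300 | 2 | 655 | 72.32 | 5.94 | 44.7 | N | OsMSL15 (Os02g43300.1) |
| | VungMSL15.2 | Vigun04g131900 | 2 | 661 | 73.01 | 5.3 | 41.5 | N | |
| | VungMSL15.3 | Vigun04g131800 | 2 | 584 | 66.44 | 5.61 | 38.4 | N | |
| | VungMSL15.4 | Vigun09g125900 | 2 | 503 | 56.99 | 6.44 | 43.6 | N | |
| | VungMSL15.5 | Vigun07g193200 | 2 | 519 | 59.5 | 5.76 | 41.3 | N | |
| | VungMSL15.6 | Vigun01g170400 | 2 | 586 | 67.6 | 6.38 | 41.4 | N | |
| | VungMSL15.7 | Vigun07g067700 | 2 | 602 | 69.77 | 6.32 | 41.7 | N | |
| | VungMSL22.1 | Vigun07g217100 | 2 | 815 | 92.69 | 5.24 | 41.3 | N | OsMSL22 (Os04g45750.1) |
| | VungMSL22.2 | Vigun06g151000 | 2 | 268 | 31.34 | 7.63 | 35.5 | N | |
| | VungMSL24.1 | Vigun10g080500 | 2 | 274 | 32.528 | 6.65 | 38.5 | N | OsMSL24 (Os05g03740.1) |
| | VungMSL24.2 | Vigun05g100400 | 2 | 289 | 34.71 | 6.99 | 32.9 | N | |
| | VungMSL24.3 | Vigun02g104900 | 2 | 309 | 36.26 | 7.74 | 39 | N | |
| | VungMSL40.1 | Vigun07g217400 | 1 | 450 | 51.34 | 6.2 | 38.8 | N | OsMSL40 (Os12g06640.1) |
| | VungMSL40.2 | Vigun05g263000 | 1 | 447 | 50.73 | 5.78 | 39.2 | N | |
| | VungMSL40.3 | Vigun01g168100 | 1 | 429 | 49.48 | 6.22 | 44.7 | N | |

ᵃ Molecular weight; ᵇ isoelectric point; ᶜ subcellular localization.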
Table 2. Basic characteristics of Vigna radiata Trihelix genes.
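| Subgroup | Assigned Name | Locus ID | Exon (No.) | Amino Acid (No.) | MW (kDa)ᵃ | pIᵇ | GC (%) | Loc.ᶜ | Rice Orthologs |
|---|---|---|---|---|---|---|---|---|---|
| GT | VradMSL03.1 | Vradi09g07980.1 | 2 | 203 | 23.45 | 7.69 | 42.5 | N | OsMSL03 (Os01g34400.1) |
| | VradMSL03.2 | Vradi01g07000.1 | 5 | 987 | 110.71 | 7.1 | 45 | N | |
| | VradMSL08.1 | Vradi10g07270.1 | 3 | 282 | 32.00 | 7.77 | 41.8 | N | OsMSL08 (Os02g01380.1) |
| | VradMSL08.2 | Vradi10g05520.1 | 3 | 556 | 62.56 | 8.18 | 43.9 | N | |
| | VradMSL08.3 | Vradi08g13180.1 | 2 | 590 | 67.75 | 7 | 42.9 | N | |
| | VradMSL21 | Vradi02g12120.1 | 7 | 396 | 45.15 | 5.9 | 49.37 | N, C | OsMSL21 (Os04g40930.1) |
| SH4 | VradMSL09 | Vradi06g09390.1 | 5 | 451 | 51.29 | 5.2 | 37.8 | N | OsMSL09 (Os02g07800.1) |
| | VradMSL11.1 | Vradi08g23420.1 | 3 | 231 | 26.23 | 9 | 35.4 | N | OsMSL11 (Os02g31160.1) |
| | VradMSL11.2 | Vradi04g10090.1 | 3 | 326 | 35.99 | 5.24 | 38.4 | N | |
| | VradMSL11.3 | Vradi07g29850.1 | 3 | 300 | 34.51 | 9.3 | 41.4 | N | |
| | VradMSL11.4 | Vradi07g01760.1 | 4 | 227 | 26.20 | 9.2 | 30.9 | N | |
| | VradMSL12 | Vradi0043s00060 | 15 | 873 | 96.84 | 8.5 | 35.6 | M | OsMSL12 (Os02g33610.1) |
| S1P1 | VradMSL05 | Vradi10g10130.1 | 4 | 258 | 29.10 | 10.06 | 33.1 | N, M | OsMSL05 (Os01g48320.1) |
| | VradMSL20 | Vradi09g02750.1 | 3 | 239 | 26.12 | 9.98 | 43.7 | N | OsMSL20 (Os04g36790.1) |
| | VradMSL25 | Vradi0023s00720.1 | 2 | 212 | 24.22 | 9.85 | 59.17 | N | OsMSL25 (Os05g48690.1) |
| | VradMSL34 | Vradi05g17220.1 | 3 | 149 | 16.78 | 9.86 | 33.2 | N | OsMSL34 (Os09g38570.1) |
| GTδ | VradMSL16 | Vradi05g11080.1 | 7 | 877 | 101.45 | 5.4 | 39.48 | N | OsMSL16 (Os03g44130.1) |
| | VradMSL27 | Vradi07g00070.1 | 6 | 762 | 87.92 | 6.25 | 40.37 | N, C | OsMSL27 (Os07g02500.1) |
| | VradMSL28 | Vradi03g06280.1 | 4 | 482 | 55.85 | 6.49 | 39.61 | N | OsMSL28 (Os07g10950.1) |
| GTγ | VradMSL06.1 | Vradi05g20210.1 | 2 | 278 | 31.72 | 5.36 | 41.34 | N | OsMSL06 (Os01g52090.1) |
| | VradMSL06.2 | Vradi01g04230.1 | 3 | 242 | 27.80 | 4.8 | 51.17 | N, C | |
| | VradMSL15.1 | Vradi08g18150.1 | 2 | 445 | 49.91 | 9.81 | 46.52 | N | OsMSL15 (Os02g43300.1) |
| | VradMSL15.2 | Vradi08g16250.1 | 2 | 518 | 59.58 | 5.78 | 45.54 | N | |
| | VradMSL15.3 | Vradi08g06550.1 | 3 | 497 | 58138 | 6.18 | 43.91 | N | |
| | VradMSL19 | Vradi08g05950.1 | 6 | 485 | 53.11 | 5.87 | 48.9 | N, Ch | OsMSL19 (Os04g30890.1) |
| | VradMSL22.1 | Vradi0160s00310.1 | 5 | 307 | 35.26 | 9.17 | 49.78 | N | OsMSL22 (Os04g45750.1) |
| | VradMSL22.2 | Vradi01g12370.1 | 4 | 446 | 51.30 | 9.3 | 44.82 | N | |
| | VradMSL22.3 | Vradi01g12360.1 | 4 | 277 | 32147 | 4.93 | 50.84 | N, C | |
| | VradMSL22.4 | Vradi08g18140.1 | 2 | 367 | 41.80 | 9.44 | 45.56 | N | |
| | VradMSL22.5 | Vradi03g04320.1 | 2 | 590 | 67.77 | 6.29 | 43.36 | N | |
| | VradMSL24 | Vradi04g00170.1 | 2 | 233 | 28.78 | 9.57 | 46.87 | N, M | OsMSL24 (Os05g03740.1) |
| | VradMSL40.1 | Vradi08g18160.1 | 3 | 351 | 40.07 | 5.49 | 42.05 | N | OsMSL40 (Os12g06640.1) |
| | VradMSL40.2 | Vradi03g04100.1 | 1 | 225 | 25.54 | 8.89 | 46.9 | N | |
| | VradMSL40.3 | Vradi04g01660.1 | 2 | 341 | 38.41 | 8.93 | 43.47 | N | |
| | VradMSL40.4 | Vradi08g05970.1 | 2 | 341 | 39.88 | 9.59 | 46.2 | N | |

ᵃ Molecular weight; ᵇ isoelectric point; ᶜ subcellular localization: N, nucleus; C, cytoplasm; Ch, chloroplast.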
Table 3. Basic Characteristics of Vigna angularis Trihelix genes.
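| Subgroup | Assigned Name | Locus ID | Exon (No.) | Amino Acid (No.) | MW (kDa)ᵃ | pIᵇ | GC (%) | Loc.ᶜ | Rice Orthologs |
|---|---|---|---|---|---|---|---|---|---|
| GT | VangMSL03 | Vigan10g080900 | 4 | 1005 | 112.7 | 8 | 51.0 | N | OsMSL03 (Os01g34400.1) |
| | VangMSL08.1 | Vigan09g161100 | 2 | 582 | 65.57 | 6.1 | 45.7 | N | OsMSL08 (Os02g01380.1) |
| | VangMSL08.2 | Vigan10g097100 | 2 | 588 | 67.28 | 6.5 | 46.5 | N | |
| | VangMSL21 | Vigan03g313100 | 5 | 392 | 44.86 | 5.9 | 49.1 | N, C | OsMSL21 (Os04g40930.1) |
| SH4 | VangMSL11.1 | Vigan03g008100 | 3 | 322 | 35.66 | 5.4 | 54.4 | N | OsMSL11 (Os02g31160.1) |
| | VangMSL11.2 | Vigan02g258700 | 2 | 357 | 39.58 | 6.1 | 39.3 | N | |
| | VangMSL11.3 | Vigan01g003800 | 3 | 359 | 40.52 | 5.6 | 48.3 | N | |
| | VangMSL12.1 | Vigan1112s000300 | 16 | 849 | 94.21 | 8.9 | 42.2 | M | OsMSL12 (Os02g33610.1) |
| | VangMSL12.2 | Vigan01g226300 | 2 | 167 | 18.56 | 5.3 | 38.8 | N | |
| | VangMSL23.1 | Vigan02g183900 | 2 | 346 | 39.30 | 9.2 | 52.8 | N | OsMSL23 (Os04g57530.1) |
| | VangMSL23.2 | Vigan06g146100 | 2 | 344 | 39.14 | 8.6 | 49.2 | N | |
| S1P1 | VangMSL20 | Vigan11g006000 | 1 | 374 | 40.64 | 9.7 | 56.8 | N | OsMSL20 (Os04g36790.1) |
| | VangMSL25.1 | Vigan05g139300 | 1 | 582 | 65.57 | 6.1 | 55.7 | N | OsMSL25 (Os05g48690.1) |
| | VangMSL25.2 | Vigan09g234100 | 1 | 330 | 37.53 | 5.1 | 55.7 | N | |
| | VangMSL31 | Vigan11g133500 | 1 | 261 | 29.99 | 6.3 | 57.6 | N | OsMSL31 (Os08g37810.1) |
| | VangMSL34.1 | Vigan05g004000 | 2 | 349 | 38.83 | 9.2 | 45.9 | N | OsMSL34 (Os09g38570.1) |
| | VangMSL34.2 | Vigan08g003100 | 2 | 325 | 36.90 | 8.9 | 47.6 | N | |
| | VangMSL36.1 | Vigan03g034300 | 1 | 317 | 35.20 | 9.5 | 61.6 | N | OsMSL36 (Os10g41460.1) |
| | VangMSL36.2 | Vigan01g314100 | 1 | 309 | 34.65 | 9.3 | 56.2 | N | |
| GTδ | VangMSL16 | Vigan04g227300 | 7 | 829 | 96.05 | 5.3 | 39.7 | N | OsMSL16 (Os03g44130.1) |
| GTγ | VangMSL06.1 | Vigan10g017200 | 1 | 349 | 40.51 | 4.6 | 52.5 | N | OsMSL06 (Os01g52090.1) |
| | VangMSL06.2 | Vigan05g214400 | 1 | 387 | 44.21 | 4.7 | 41.8 | N | |
| | VangMSL15.1 | Vigan10g174500 | 2 | 120 | 13.59 | 8.7 | 44.9 | N | OsMSL15 (Os02g43300.1) |
| | VangMSL15.2 | Vigan10g132200 | 2 | 657 | 72.52 | 5.3 | 50.7 | N | |
| | VangMSL15.3 | Vigan02g203500 | 3 | 682 | 75.06 | 6.0 | 50.4 | N | |
| | VangMSL15.4 | Vigan02g203300 | 2 | 489 | 54.30 | 9 | 45.9 | N | |
| | VangMSL15.5 | Vigan10g132400 | 3 | 625 | 71.03 | 5.5 | 45.2 | N | |
| | VangMSL15.6 | Vigan02g203400 | 3 | 605 | 68.21 | 5.6 | 46 | N | |
| | VangMSL15.7 | Vigan04g134500 | 2 | 514 | 57.98 | 6.6 | 49.6 | N | |
| | VangMSL15.8 | Vigan02g180100 | 2 | 516 | 59.46 | 5.7 | 45.3 | N | |
| | VangMSL15.9 | Vigan03g238700 | 2 | 325 | 36.68 | 8.4 | 48.5 | N | |
| | VangMSL15.10 | Vigan03g238600 | 2 | 557 | 63.51 | 6.3 | 46.8 | N | |
| | VangMSL22.1 | Vigan09g201300 | 2 | 329 | 37.71 | 6.3 | 47.2 | N | OsMSL22 (Os04g45750.1) |
| | VangMSL22.2 | Vigan02g080400 | 2 | 553 | 63.51 | 5.8 | 45.8 | N | |
| | VangMSL24.1 | Vigan11g113800 | 2 | 270 | 32.15 | 6.6 | 47.3 | N | OsMSL24 (Os05g03740.1) |
| | VangMSL24.2 | Vigan06g074900 | 2 | 309 | 36.33 | 8.3 | 49.3 | N | |
| | VangMSL37 | Vigan641s002000 | 1 | 410 | 47.62 | 6.4 | 46.5 | N | OsMSL37 (Os11g06410.1) |
| | VangMSL40.1 | Vigan02g203600 | 1 | 448 | 51.27 | 6.2 | 42.2 | N | OsMSL40 (Os12g06640.1) |
| | VangMSL40.2 | Vigan10g131700 | 1 | 442 | 50.57 | 6.2 | 42.5 | N | |
| | VangMSL40.3 | Vigan05g194200 | 1 | 448 | 51.02 | 5.9 | 42.6 | N | |
| | VangMSL40.4 | Vigan02g076800 | 1 | 430 | 49.57 | 6.1 | 46.9 | N | |

ᵃ Molecular weight; ᵇ isoelectric point; ᶜ subcellular localization: N, nucleus; C, cytoplasm.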
Table 4. Basic Characteristics of Vigna umbellata Trihelix genes.
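| Subgroup | Assigned Name | Locus ID | Length | MWᵃ | pIᵇ | GC (%) | Loc.ᶜ | Rice Orthologs |
|---|---|---|---|---|---|---|---|---|
| GT | VumbMSL08.1 | Gene_28670 | 424 | 49.13 | 7.24 | 46.4 | N | OsMSL08 (Os02g01380.1) |
| | VumbMSL08.2 | Gene_13448 | 582 | 65.56 | 6.18 | 45.7 | N | |
| | VumbMSL08.3 | Gene_13439 | 588 | 67.37 | 6.58 | 46.3 | N | |
| | VumbMSL08.4 | Gene_13440 | 459 | 52.32 | 7.77 | 45.6 | N | |
| | VumbMSL21 | Gene_22669 | 377 | 43.15 | 5.83 | 49.4 | N, C | OsMSL21 (Os04g40930.1) |
| SH4 | VumbMSL11.1 | Gene_1713 | 210 | 23.08 | 5.40 | 60.5 | N | OsMSL11 (Os02g31160.1) |
| | VumbMSL11.2 | Gene_1714 | 322 | 35.66 | 5.42 | 54.6 | N | |
| | VumbMSL11.3 | Gene_5885 | 363 | 40.07 | 6.10 | 50.4 | N | |
| | VumbMSL11.4 | Gene_30530 | 285 | 32.25 | 5.45 | 48 | N | |
| | VumbMSL12 | Gene_21287 | 867 | 96.37 | 8.76 | 41.8 | Ch | OsMSL12 (Os02g33610.1) |
| | VumbMSL23.1 | Gene_32537 | 340 | 38.61 | 9.03 | 53.2 | N | OsMSL23 (Os04g57530.1) |
| | VumbMSL23.2 | Gene_29588 | 340 | 38.77 | 8.85 | 49.5 | N | |
| S1P1 | VumbMSL20.1 | Gene_23931 | 374 | 40.64 | 9.73 | 56.8 | N | OsMSL20 (Os04g36790.1) |
| | VumbMSL20.2 | Gene_23932 | 374 | 40.64 | 9.73 | 56.8 | N | |
| | VumbMSL25.1 | Gene_14426 | 301 | 33.97 | 5.15 | 57.6 | N | OsMSL25 (Os05g48690.1) |
| | VumbMSL25.2 | Gene_14425 | 330 | 37.59 | 5.17 | 55.6 | N | |
| | VumbMSL31 | Gene_2041 | 261 | 30.14 | 6.62 | 57.5 | N | OsMSL31 (Os08g37810.1) |
| | VumbMSL34.1 | Gene_18487 | 206 | 22.51 | 9.16 | 49.5 | N | OsMSL34 (Os09g38570.1) |
| | VumbMSL34.2 | Gene_18488 | 349 | 38.83 | 9.15 | 46.4 | N | |
| | VumbMSL34.3 | Gene_18489 | 333 | 37.76 | 9.02 | 47.5 | N | |
| | VumbMSL36.1 | Gene_3480 | 317 | 35.20 | 9.55 | 61.6 | N | OsMSL36 (Os10g41460.1) |
| | VumbMSL36.2 | Gene_3479 | 309 | 34.68 | 9.34 | 56.6 | N | |
| GTδ | VumbMSL16.1 | Gene_26106 | 487 | 56.85 | 6.63 | 40.4 | N | OsMSL16 (Os03g44130.1) |
| | VumbMSL16.2 | Gene_26107 | 469 | 548.75 | 6.97 | 40.3 | N | |
| | VumbMSL16.3 | Gene_28699 | 505 | 58.67 | 9.26 | 39.2 | C | |
| GTγ | VumbMSL06.1 | Gene_25832 | 349 | 40.50 | 4.69 | 52.4 | N | OsMSL06 (Os01g52090.1) |
| | VumbMSL06.2 | Gene_25833 | 349 | 40.50 | 4.69 | 52.4 | N | |
| | VumbMSL06.3 | Gene_9047 | 387 | 44.21 | 4.68 | 41.7 | N | |
| | VumbMSL15.1 | Gene_16032 | 555 | 62.55 | 5.61 | 49.5 | N | OsMSL15 (Os02g43300.1) |
| | VumbMSL15.2 | Gene_16033 | 657 | 72.55 | 5.35 | 50.8 | N | |
| | VumbMSL15.3 | Gene_16028 | 549 | 61.88 | 6.25 | 48.6 | N | |
| | VumbMSL15.4 | Gene_16030 | 652 | 72.09 | 5.82 | 50.3 | N | |
| | VumbMSL15.5 | Gene_16029 | 130 | 15.13 | 4.43 | 52.7 | N, C | |
| | VumbMSL15.6 | Gene_16031 | 630 | 71.07 | 5.68 | 46.5 | N | |
| | VumbMSL15.7 | Gene_16026 | 601 | 68.34 | 5.69 | 45.5 | N | |
| | VumbMSL15.8 | Gene_24101 | 632 | 71.53 | 5.96 | 50 | N | |
| | VumbMSL15.9 | Gene_1955 | 518 | 59.55 | 5.80 | 45.5 | N | |
| | VumbMSL15.10 | Gene_18868 | 607 | 70.29 | 6.33 | 46.5 | N | |
| | VumbMSL15.11 | Gene_18869 | 589 | 67.60 | 6.22 | 46.6 | N | |
| | VumbMSL22 | Gene_16025 | 123 | 14.53 | 4.26 | 42.9 | N | OsMSL22 (Os04g45750.1) |
| | VumbMSL24.1 | Gene_28671 | 256 | 29.88 | 7.66 | 42.4 | N | OsMSL24 (Os05g03740.1) |
| | VumbMSL24.2 | Gene_18886 | 269 | 31.96 | 6.39 | 47.4 | N | |
| | VumbMSL24.3 | Gene_10842 | 309 | 36.33 | 8.36 | 49.4 | N | |
| | VumbMSL37 | Gene_5137 | 149 | 17.21 | 9.55 | 45.9 | N | OsMSL37 (Os11g06410.1) |
| | VumbMSL40.1 | Gene_11435 | 448 | 51.27 | 6.20 | 42.3 | N | OsMSL40 (Os12g06640.1) |
| | VumbMSL40.2 | Gene_11436 | 448 | 51.27 | 6.20 | 42.3 | N | |
| | VumbMSL40.3 | Gene_11437 | 448 | 51.27 | 6.20 | 42.3 | N | |
| | VumbMSL40.4 | Gene_11438 | 448 | 51.27 | 6.20 | 42.3 | N | |
| | VumbMSL40.5 | Gene_11439 | 442 | 50.58 | 6.17 | 42.7 | N | |
| | VumbMSL40.6 | Gene_28728 | 448 | 50.99 | 5.94 | 42.5 | N | |

ᵃ Molecular weight; ᵇ isoelectric point; ᶜ subcellular localization.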
Table 5. Minimum stable potential energy of candidate protein for shattering and its mutation variants.
| Candidate Protein | Minimum Potential Energy | Mutation Variant of Candidate Protein | Minimum Potential Energy | Effect of Mutation |
|---|---|---|---|---|
| VungMSL23.1 | −1,652,396.5 | VungMSL23.1_K70N | −1,878,625.2 | Stabilizing |
| VungMSL23.2 | −1,605,236.2 | VungMSL23.2_K39N | −1,676,144 | Stabilizing |
| VangMSL23.1 | −1,678,593.1 | VangMSL23.1_K69N | −1,734,201.9 | Stabilizing |
| VangMSL23.2 | −1,567,263.5 | VangMSL23.2_K39N | −1,546,699.9 | Destabilizing |
| VumbMSL23.1 | −1,555,491.4 | VumbMSL23.1_K68N | −1,579,698.2 | Stabilizing |
| VumbMSL23.2 | −1,669,826.2 | VumbMSL23.2_K39N | −1,653,073.4 | Destabilizing |
| VradMSL12 | −3,030,210 | VradMSL12_R794N | −3,738,055.5 | Stabilizing |
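The "Effect of Mutation" column in Table 5 is consistent with a simple rule: a variant whose minimum potential energy is lower (more negative) than that of the candidate protein is labelled stabilizing, and one whose energy is higher is labelled destabilizing. The following minimal Python sketch reproduces the column under that assumed rule. The function and variable names are illustrative, the energies are copied from Table 5, and no units are attached because the table does not state them.

```python
# Rows copied from Table 5: (candidate, candidate energy, variant, variant energy).
# Units are not stated in the table, so the values are treated as plain numbers.
TABLE5 = [
    ("VungMSL23.1", -1652396.5, "VungMSL23.1_K70N", -1878625.2),
    ("VungMSL23.2", -1605236.2, "VungMSL23.2_K39N", -1676144.0),
    ("VangMSL23.1", -1678593.1, "VangMSL23.1_K69N", -1734201.9),
    ("VangMSL23.2", -1567263.5, "VangMSL23.2_K39N", -1546699.9),
    ("VumbMSL23.1", -1555491.4, "VumbMSL23.1_K68N", -1579698.2),
    ("VumbMSL23.2", -1669826.2, "VumbMSL23.2_K39N", -1653073.4),
    ("VradMSL12", -3030210.0, "VradMSL12_R794N", -3738055.5),
]


def mutation_effect(candidate_energy: float, variant_energy: float) -> str:
    """Assumed rule: a lower (more negative) variant energy means stabilizing."""
    return "Stabilizing" if variant_energy < candidate_energy else "Destabilizing"


def classify(rows):
    """Map each variant name to (energy difference, effect label)."""
    return {
        variant: (e_var - e_cand, mutation_effect(e_cand, e_var))
        for _, e_cand, variant, e_var in rows
    }


if __name__ == "__main__":
    for variant, (delta, effect) in classify(TABLE5).items():
        print(f"{variant}: difference = {delta:,.1f} -> {effect}")
```

Only the sign of the difference determines the label in this sketch; it makes no claim about how the energies themselves were computed.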
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Kumari, S.; Wankhede, D.P.; Murmu, S.; Maurya, R.; Jaiswal, S.; Rai, A.; Archak, S. Genome-Wide Identification and Characterization of Trihelix Gene Family in Asian and African Vigna Species. Agriculture 2022, 12, 2172. https://doi.org/10.3390/agriculture12122172
