Phylogenetic and Expression Studies of Small GTP-Binding Proteins in Solanum lycopersicum Super Strain B

This investigation compared the small GTPase superfamily of S. lycopersicum super strain B with its analogues in leguminous and other non-leguminous species. Members of the small GTPase superfamily were identified by tBLASTn searches. Amino acid sequences were aligned using Clustal Omega, and the phylogenetic analysis was performed with the MEGA7 package. Protein alignments were produced for all studied species. Three-dimensional models of RABA2, ROP9, and ROP10 from Solanum lycopersicum “Super strain B” were built. The mRNA levels of the Rab, Arf, Rop, and Ran subfamilies were compared in aerial tissues vs. roots. Significant divergences were found in the number of members and groups comprising each subfamily of the small GTPases, and Glycine max had the highest count. Rab and Arf proteins were highly expressed in the roots of legumes, whilst in non-legume plants the highest values were recorded in aerial tissues. S. lycopersicum super strain B had the highest expression of Rab and Arf proteins in its aerial tissues, which may indicate that diazotroph strains are most active in the aerial tissues of strain B and act as associated N-fixing bacteria. The phylogenies of the small GTPase superfamily of the studied plants did not reveal asymmetric evolution of the Rab, Arf, Rop, and Ran subfamilies. Multiple sequence alignments of the Rab, Arf, and Rop proteins of S. lycopersicum super strain B showed a low frequency of substitutions in their domains. Members of the GTPase superfamily have definite functions during infection, delivery, and maintenance of N2-fixing diazotrophs but show some alterations in function between S. lycopersicum super strain B and other species.


Introduction
Solanum lycopersicum L. (tomato) is a vegetable crop cultivated all over the world for its high agro-economic importance [1]. It requires heavy manuring and an adequate nitrogen supply to obtain the highest yields [2]. It appears that S. lycopersicum obtains its nitrogen both from fertilization with organic and inorganic manure [3] and from nitrogen fixation, detected as acetylene reduction, by diazotrophic bacteria present on the rhizoplane and in the rhizosphere soil [4,5].
Various molecular components involved in diazotroph infection have been highlighted, including those facilitating intracellular membrane trafficking [6-8], cytoskeleton-related proteins [9,10], and cell-wall-degrading enzymes. Among the proteins related to vesicle membrane trafficking are small GTPases, which make essential contributions to plant growth and development, including the initial diazotroph infection process, root hair formation,

Results and Discussion
The number of members of each small GTPase subfamily in S. lycopersicum "super strain B" and in the non-leguminous and leguminous species is presented in Table 1. The number of RAB members diverged widely between legume and non-legume species: G. max recorded the highest number of RAB members (94), whereas L. japonicus and O. sativa possessed the fewest. Between these extremes, there were 64 members in M. truncatula, followed by 57 in A. thaliana, 53 in Z. mays, 50 in P. vulgaris, and 46 in S. lycopersicum "super strain B". In the ARF subfamily, G. max had the most members (41) and L. japonicus the fewest (13). The other species had about half as many members as G. max (19 to 21), except Z. mays, which had slightly more (25). In the ROP subfamily, G. max again had the highest number of members (20), and the other legume and non-legume species had fewer: P. vulgaris and A. thaliana had 11 members, S. lycopersicum had 10, Z. mays and S. lycopersicum "super strain B" had 9, L. japonicus and O. sativa had 8, and M. truncatula had the fewest (7). Members of the RAN subfamily were most numerous in G. max (7) and least numerous in L. japonicus and O. sativa (2); the other legume and non-legume species possessed three or four RAN members (Table 1). These results indicate significant variation in the number of members of each subfamily of the small GTPase superfamily among leguminous and non-leguminous species, with G. max having the highest number. According to Singh and Hymowitz [28], the strikingly large number of small GTPases in soybean may reflect its genomic nature as a partially diploidized tetraploid species.
Table 1. Number of members of each small GTPase subfamily in the studied species.
Subfamily   Sl-B   At    Os    Zm    Lj    Mt    Pv    Gm
RAB         46     57    37    53    30    64    50    94
ARF         21     21    21    25    13    19    20    41
ROP         9      11    8     9     8     7     11    20
RAN         4      4     2     3     2     4     3     7
Total       80     93    68    90    53    94    84    162
Sl-B, S. lycopersicum super strain B; At, A. thaliana; Os, O. sativa; Zm, Z. mays; Lj, L. japonicus; Mt, M. truncatula; Pv, P. vulgaris; Gm, G. max.

Table 2 gives the number of members in each group of the small GTPase Rab subfamily. Across all species, including S. lycopersicum super strain B, group A of the Rab subfamily had the highest number of members (186); the leguminous plants had more group A members (99) than the non-leguminous plants (87), owing to the large number in G. max (41). Each of the other Rab groups contained less than a quarter of the total members present in group A. Of all groups, group B had the lowest total number of members (26), with nearly equal numbers in legume and non-legume species. Table 2 also reveals that the members of Rab group C were asymmetrically split between the legume and non-legume plants: the members in legumes (26) were about triple those in non-legume species (9).

Table 2. Number of members in each group of the small GTPase Rab subfamily in the studied species.
Group       Sl-B   At    Os    Zm    Lj    Mt    Pv    Gm
A           21     26    17    23    12    23    23    41
B           2      3     3     4     2     7     1     4
C           3      3     0     3     4     6     5     11
D           5      4     4     6     1     4     4     7
E           5      5     3     5     3     6     5     8
F           4      3     4     3     3     5     4     7
G           3      8     4     5     3     9     4     8
H           3      5     2     4     2     4     4     8
Total       46     57    37    53    30    64    50    94

The number of members in each group of the small GTPase Arf subfamily is presented in Table 3. Group (B + C + D) had the highest number of members (58) among all studied species. Although the non-leguminous species had the same number of participants in the group (A + B + C), G. max still had the highest number of individuals (12). Group ARLC was the smallest, with a total of 9 members distributed nearly equally between leguminous and non-leguminous species. The majority of the Arf members in groups A, ARLA, and ARLB were from the non-leguminous plants rather than the leguminous ones.
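The legume versus non-legume totals quoted for the Rab groups can be re-derived from the per-species counts. The following is a minimal sketch, assuming the column order implied by the counts in the text (S. lycopersicum super strain B, A. thaliana, O. sativa, and Z. mays as non-legumes, followed by L. japonicus, M. truncatula, P. vulgaris, and G. max as legumes); the labels and function names are illustrative and not part of the original analysis.

```python
# Rab group counts per species, transcribed from Table 2.
# ASSUMPTION: column order inferred from the counts quoted in the text.
SPECIES = ["Sl-B", "At", "Os", "Zm", "Lj", "Mt", "Pv", "Gm"]
NON_LEGUMES = {"Sl-B", "At", "Os", "Zm"}

RAB_GROUPS = {
    "A": [21, 26, 17, 23, 12, 23, 23, 41],
    "B": [2, 3, 3, 4, 2, 7, 1, 4],
    "C": [3, 3, 0, 3, 4, 6, 5, 11],
    "D": [5, 4, 4, 6, 1, 4, 4, 7],
    "E": [5, 5, 3, 5, 3, 6, 5, 8],
    "F": [4, 3, 4, 3, 3, 5, 4, 7],
    "G": [3, 8, 4, 5, 3, 9, 4, 8],
    "H": [3, 5, 2, 4, 2, 4, 4, 8],
}


def split_totals(counts):
    """Return (non-legume total, legume total) for one row of counts."""
    non_leg = sum(c for s, c in zip(SPECIES, counts) if s in NON_LEGUMES)
    leg = sum(c for s, c in zip(SPECIES, counts) if s not in NON_LEGUMES)
    return non_leg, leg


def species_totals(groups):
    """Sum every group column-wise to get the Rab total per species."""
    return [sum(column) for column in zip(*groups.values())]


for name, counts in RAB_GROUPS.items():
    non_leg, leg = split_totals(counts)
    print(f"Rab group {name}: non-legumes={non_leg}, legumes={leg}, total={non_leg + leg}")
print(dict(zip(SPECIES, species_totals(RAB_GROUPS))))
```

The column sums reproduce the RAB row of Table 1, which is a useful consistency check on the transcribed counts.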
However, the opposite was recorded in group SARA, where the highest total number of Arf members was detected in leguminous plants (19), mainly from G. max (10) (Table 3).
Table 3. Number of small GTPase Arf subfamily members expressed in Solanum lycopersicum super strain "B" compared with some non-leguminous and leguminous plants retrieved from the database.
Gene expression of the GTP-binding protein families often appears to vary in its spatio-temporal control between species. Table 4 summarizes the mRNA levels of the members of each small GTPase subfamily of S. lycopersicum super strain B and of the other non-legume and legume plants in aerial tissue vs. root. Using root samples as a reference, normalized values were derived for comparison with other tissues. In all species in this study, GTPases mostly accumulated at higher levels in the roots than in aerial tissues. The members of the Rab and Rop subfamilies of S. lycopersicum super strain B showed higher mRNA levels in aerial tissues than in roots, consistent with those in the other species, except in P. vulgaris and G. max, where Rops also showed increased mRNA levels in aerial tissues (Table 4). Only one member of the Ran subfamily in both O. sativa and L. japonicus had a high mRNA level, while the other species had none. All species had members of the Rab, Arf, and Rop subfamilies with consistent mRNA levels in their aerial tissues, among which G. max had the highest (Table 4). The analysis of the expression of small GTPase subfamily members in leguminous and non-leguminous species indicated that more Rab and Arf members were upregulated in aerial tissues than in roots in non-leguminous plants, especially in S. lycopersicum super strain B.
However, the numbers of members of each subfamily (Rab, Arf, Rop, and Ran) with unchanged mRNA levels in aerial tissues were comparable in the leguminous and non-leguminous species of the study. In addition, the largest number of downregulated members of each small GTPase subfamily was observed in G. max. The high accumulation of mRNA of both Rab and Arf proteins in the roots of legumes may indicate that they are the main proteins involved in the symbiotic relation between legumes and rhizobia. A probable tissue-specific functionalization of Rab/Arf small GTP-binding genes/proteins has been suggested to participate in the genesis, development, and maintenance of nodules in the roots of legume plants, as reported by several investigators for Rab in soybean and Vigna aconitifolia [29], Lotus japonicus [30], soybean [31], Medicago sp. [12,32], and kidney bean [6], and for Rab/Arf in Medicago truncatula [13]. In non-leguminous plants, the high expression of Rab and Arf proteins in the aerial tissues, especially in S. lycopersicum super strain B, may point to a mechanism other than nodulation for N2 uptake and fixation. In this context, Mohandas [33] reported that some rhizobacteria become established in the roots and leaves and on the rhizoplane and phylloplane of tomato (L. esculentum Mill "Pusa Ruby") as associated N2-fixing bacteria. In addition, Dent and Cocking [34] reported that diazotroph strains can, under specific conditions, intracellularly colonize the roots (or root hairs) and shoots of non-legume plants without nodulation in cereals, such as wheat, maize, and rice, as well as in some other crops, such as potato, oilseed rape, and tomato. Moreover, Collavino et al. [35] reported that the diazotrophic populations inside the stem and root of tomato plants play a critical function in the early growth phases and are distinctively influenced by N fertilization.
From all the above, we can speculate that the high gene expression may mean that diazotroph strains colonize the aerial tissues more than the roots (either inside or on the surface) of S. lycopersicum super strain B.
Table 4. The number of members of each GTPase subfamily that show no change (N), reduced (−), or increased (+) levels of mRNA in aerial tissue vs. the root in Solanum lycopersicum super strain "B" compared with some non-leguminous and leguminous plants retrieved from the database.

Note: a compared to leaf; b compared to shoot (2-week-old); c compared to stem; d compared to leaf (6-week-old).
Amino acid sequences of monomeric GTPases obtained from tBLASTn searches were used to identify the members of the small GTPase superfamily of S. lycopersicum super strain B and of the species retrieved from the genomic databases (O. sativa, A. thaliana, S. lycopersicum, L. japonicus, Z. mays, M. truncatula, G. max, and P. vulgaris). The amino acid sequences of those proteins were used to build phylogenetic trees of those species, permitting their classification into the small GTPase subfamilies Rab (green), Arf (blue), Rop (pink), and Ran (violet) (Figure 1). The phylogenetic analysis of the small GTPase superfamily of the studied leguminous and non-leguminous plants did not reveal asymmetric evolution of the Rab, Arf, Rop, and Ran subfamilies. These results are in accordance with those of Flores et al. [27].
All alignments of ROP9 proteins showed strong amino acid sequence conservation across the studied plants, reflected in the presence of 7 domains (Figure 3). Three positions (amino acids 53, 129, and 130) showed conserved substitutions within the domains of ROP9 proteins, and 3 other substitutions lay outside them (amino acids 151, 164, and 175). The first substitution was at the border of domain II, where isoleucine (I) in leguminous species was replaced by threonine (T) in non-leguminous ones. The second and third were in the mid-region of domain V. In the second substitution, cysteine (C) was found in leguminous plants and phenylalanine (F) in non-leguminous ones. Interestingly, the third substitution was variable: a valine (V) residue was recorded in only 4 non-leguminous plants (O. sativa, S. lycopersicum, Z. mays, and S. lycopersicum super strain B). A. thaliana, however, differed from the other non-legume species and had an isoleucine residue like the leguminous ones, except for P. vulgaris, which had leucine (L) instead (Figure 3).
Figure caption (start truncated): [...] M. truncatula (Medtr4g064897), P. vulgaris (Phvul.002G106600), and G. max (Glyma01g36880). Black boxes represent identical residues, while gray ones represent conservative substitutions. Alignments were performed with Clustal Omega in MEGA7 followed by Boxshade. The red arrow designates a conservative amino acid substitution in legume against non-legume sequences. The conserved domains of Rabs are indicated by green lines. The conserved domains of ROPs are indicated by green lines.
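The kind of legume versus non-legume substitution described for ROP9 can be located programmatically in an alignment. The following is a minimal sketch, assuming an alignment held as a dictionary of equal-length strings; the sequence names, the toy sequences, and the function name are hypothetical and do not reproduce the actual ROP9 alignment or its residue numbering.

```python
def group_specific_columns(alignment, legumes):
    """Return (column, legume residues, non-legume residues) for every
    1-based alignment column where the two groups share no residue."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for i in range(length):
        # Collect the residues seen in each group, ignoring gap symbols.
        leg = {alignment[n][i] for n in names if n in legumes} - {"-"}
        non = {alignment[n][i] for n in names if n not in legumes} - {"-"}
        if leg and non and leg.isdisjoint(non):
            hits.append((i + 1, "".join(sorted(leg)), "".join(sorted(non))))
    return hits


# Hypothetical toy alignment (NOT real ROP9 sequences).
toy_alignment = {
    "legume_1": "MKIVC",
    "legume_2": "MKIVC",
    "nonlegume_1": "MKTVF",
    "nonlegume_2": "MKTIF",
}
toy_hits = group_specific_columns(toy_alignment, legumes={"legume_1", "legume_2"})
print(toy_hits)
```

In the toy data, column 4 is not reported because valine occurs in both groups, which mirrors the variable third substitution described above, where some non-legume species share the legume residue.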
Aligning the ROP10 protein sequences of all the species studied revealed the presence of 4 domains. The ROP10 alignments also contained substitutions in legume species (Figure 4), but only one of them affects a ROP domain. This substitution was at the beginning of domain III, where leucine was found in P. vulgaris and G. max; valine in M. truncatula, L. japonicus, S. lycopersicum, and Z. mays; and isoleucine in A. thaliana, O. sativa, and S. lycopersicum super strain B.
By comparing the sequence alignments of the RABA2, ROP9, and ROP10 proteins of S. lycopersicum super strain B with their analogues in non-legume and legume plants, we found that RABA2 has a single conserved substitution while ROP9 and ROP10 have three (Figures 2-4). Those substitutions were confirmed by the predicted 3D configurations of the RABA2, ROP9, and ROP10 proteins of S. lycopersicum super strain B (Figure 5). The multiple sequence alignments obtained for the Rab, Arf, and Rop proteins of S. lycopersicum super strain B and their analogues in non-legume and legume plants (Figures 2-4) showed a low frequency of substitutions in their domains. This may indicate strong conservation of amino acid sequence across the leguminous and non-leguminous plants analyzed, and it is proposed that those proteins were subject to strong selective pressure, as reported by Flores et al. [27]. Multiple sequence alignments of the leguminous plants' proteins revealed that the domains of RAB proteins had no conserved substitutions, while ROP9 and ROP10 had one each. ROP9 showed a conserved substitution (at amino acid 130) in domain V, where leucine was present in P. vulgaris (Phvul.002G106600) and isoleucine in the analogues of the other three legumes [L. japonicus (chr2.CM0272.860.r2.m), M. truncatula (Medtr4g064897), and G. max (Glyma01g36880)]. Moreover, ROP10 had a conserved substitution (at amino acid 115) in domain III, where leucine was found in kidney bean (Phvul.009G180800) and soybean (Glyma04g35110) and valine in M. truncatula ROP10 (Medtr3g078260) and L. japonicus (chr1.CM0166.830.r2.m). Jiang and Ramachandran [24] and Yuksel and Memon [36] reported that small GTP-binding proteins in most plants are functionally very well conserved, but some may have undergone functional variation in divergent lineages to regulate lineage-specific roles, such as nodulation in legume plants.
So, the variations in the conserved residues in those legume plants may reflect the specific contribution of those proteins to the legume-rhizobia symbiosis [27].

Identification of Small GTPases from Different Species
The members of the small GTPase superfamily were identified by tBLASTn searches [37] using the amino acid sequences of all small GTPase family members previously described and categorized in Arabidopsis [15,26]. These genes were selected and categorized manually following a systematic phylogenetic analysis.
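Before manual curation, candidate hits from such a search are typically collected per query. The following is a minimal sketch, assuming the search results were saved in the standard 12-column BLAST tabular format (`-outfmt 6`); the E-value cutoff, the query and subject identifiers, and the function name are assumptions for illustration and are not taken from the study.

```python
import csv
import io
from collections import defaultdict

# Default column order of BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def candidate_hits(tabular_text, max_evalue=1e-5):
    """Group hits by query, keep those at or below the (assumed) E-value
    cutoff, and sort each query's hits from best to worst E-value."""
    hits = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tabular_text), fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        if evalue <= max_evalue:
            hits[row["qseqid"]].append((row["sseqid"], evalue))
    for query in hits:
        hits[query].sort(key=lambda hit: hit[1])
    return dict(hits)


# Hypothetical rows; identifiers and scores are invented for the example.
rows = [
    ["query_rab", "contig_1", "85.0", "200", "30", "0", "1", "200", "10", "609", "1e-80", "250"],
    ["query_rab", "contig_2", "40.0", "180", "108", "2", "5", "184", "1", "540", "0.5", "30"],
    ["query_rop", "contig_3", "90.0", "190", "19", "0", "1", "190", "1", "570", "3e-95", "300"],
    ["query_rab", "contig_4", "70.0", "200", "60", "0", "1", "200", "1", "600", "2e-60", "200"],
]
example_text = "\n".join("\t".join(row) for row in rows)
example_hits = candidate_hits(example_text)
print(example_hits)
```

The filtered list is only a starting point; as stated above, final membership was decided manually after phylogenetic analysis.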

Phylogenetic Analysis
The amino acid sequences were aligned using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo (accessed on 15 March 2020)) [38], and the phylogenetic analysis was carried out with the MEGA7 package (http://www.megasoftware.net (accessed on 15 March 2020)) [39] using the neighbor-joining method [40]. The evolutionary distances were computed by the number of differences method [41]. All positions containing gaps and missing data were omitted from the dataset.
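The distance step can be illustrated on a small scale. The following is a minimal sketch, assuming the distance is a plain count of differing residues between each pair of aligned sequences after every column with a gap or missing-data symbol has been dropped; it does not reproduce MEGA7 or the neighbor-joining step, and the toy sequences and function names are hypothetical.

```python
from itertools import combinations


def complete_deletion(alignment, missing="-?"):
    """Drop every column that holds a gap or missing-data symbol in any sequence."""
    names = list(alignment)
    if len({len(alignment[n]) for n in names}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    columns = zip(*(alignment[n] for n in names))
    kept = [col for col in columns if not any(ch in missing for ch in col)]
    return {n: "".join(col[i] for col in kept) for i, n in enumerate(names)}


def number_of_differences(alignment):
    """Count differing residues for every pair of sequences after complete deletion."""
    cleaned = complete_deletion(alignment)
    return {
        (a, b): sum(x != y for x, y in zip(cleaned[a], cleaned[b]))
        for a, b in combinations(cleaned, 2)
    }


# Hypothetical toy alignment (not sequences from the study).
toy = {"seq1": "MKT-VLA", "seq2": "MKTAVIA", "seq3": "MRTAV?A"}
toy_distances = number_of_differences(toy)
print(toy_distances)
```

Note that complete deletion removes the only column in which seq1 and seq2 differ, so their distance becomes zero; this is the expected trade-off of discarding all positions with gaps or missing data.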

Protein Alignments
Amino acid sequences of small GTPases that participate in the initiation of the symbiotic relation between rhizobia and legumes were used to identify members from other species by BLASTP. PvRabA2 and PvArfA1 from Phaseolus vulgaris (common bean), LjRop6 from Lotus japonicus, and MtRab7, MtRop9, and MtRop10 from Medicago truncatula were selected as queries. The sequences with the lowest E value were used to create multiple

Statistical Analysis of Expression Data
Small GTPase members were retrieved from the public datasets for each species. Their expression was examined in roots (reference organ) and in other organs of the eight species in the study. The expression values were normalized for each tissue, and the statistical analyses were performed using CuffDiff [50] to identify differentially expressed genes.
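The no change (N), reduced (−), and increased (+) categories used for the expression counts can be illustrated with a simplified rule. The following is a minimal sketch that uses a fixed log2 fold-change threshold relative to root as a stand-in; it is not the statistical test applied by CuffDiff, and the threshold, the pseudocount, the gene names, and the expression values are all assumptions for illustration.

```python
import math
from collections import Counter


def classify(root_value, tissue_value, min_log2_fc=1.0, pseudocount=1.0):
    """Label a gene '+', '-', or 'N' from its tissue/root expression ratio.

    ASSUMPTION: a two-fold change (log2 >= 1) marks a difference; the
    pseudocount avoids division by zero for unexpressed genes.
    """
    log2_fc = math.log2((tissue_value + pseudocount) / (root_value + pseudocount))
    if log2_fc >= min_log2_fc:
        return "+"
    if log2_fc <= -min_log2_fc:
        return "-"
    return "N"


def tally(expression):
    """Count how many genes fall into each category; values are (root, aerial)."""
    return Counter(classify(root, aerial) for root, aerial in expression.values())


# Hypothetical normalized expression values as (root, aerial tissue).
toy_expression = {
    "gene_a": (10.0, 50.0),
    "gene_b": (40.0, 8.0),
    "gene_c": (20.0, 22.0),
}
toy_counts = tally(toy_expression)
print(toy_counts)
```

A per-subfamily table of N, −, and + counts then follows from running the tally separately on the Rab, Arf, Rop, and Ran members of each species.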

Conclusions
The highest numbers of expressed members were found for the Rab and Arf proteins in tomato super strain B and in all the species compared in this study. The levels of Rab and Rop mRNA in aerial tissues were higher than in roots, whereas Arf mRNA levels were higher in roots than in aerial tissues. The Ran subfamily showed the lowest expression in the different tissues of tomato super strain B.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/plants11050641/s1, Table S1: Expression data of the Arabidopsis thaliana small GTPase superfamily, Table S2: Expression data of the Oryza sativa small GTPase superfamily by RNA sequencing, Table S3: Expression data of the Solanum lycopersicum small GTPase superfamily, Table S4: Expression data of the Zea mays small GTPase superfamily by RNA sequencing, Table S5: Expression data of the Lotus japonicus small GTPase superfamily, Table S6: Expression data of the Medicago truncatula small GTPase superfamily by microarray, Table S7: Expression data of the Phaseolus vulgaris small GTPase superfamily by RNA sequencing, Table S8: Expression data of the Glycine max small GTPase superfamily by RNA sequencing, Table S9: RNA-seq data for M. truncatula small GTPases detected in different regions of the nodule.