<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Genes</journal-id>
<journal-title>Genes</journal-title>
<issn pub-type="epub">2073-4425</issn>
<publisher>
<publisher-name>Molecular Diversity Preservation International (MDPI)</publisher-name></publisher></journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3390/genes2041017</article-id>
<article-id pub-id-type="publisher-id">genes-02-01017</article-id>
<article-categories>
<subj-group>
<subject>Article</subject></subj-group></article-categories>
<title-group>
<article-title>Plant-Bacteria Association and Symbiosis: Are There Common Genomic Traits in <italic>Alphaproteobacteria</italic>?</article-title></title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Pini</surname><given-names>Francesco</given-names></name><xref ref-type="author-notes" rid="fn1-genes-02-01017"><sup>†</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Galardini</surname><given-names>Marco</given-names></name><xref ref-type="author-notes" rid="fn1-genes-02-01017"><sup>†</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Bazzicalupo</surname><given-names>Marco</given-names></name></contrib>
<contrib contrib-type="author">
<name><surname>Mengoni</surname><given-names>Alessio</given-names></name><xref ref-type="corresp" rid="c1-genes-02-01017"><sup>*</sup></xref></contrib>
<aff id="af1-genes-02-01017">Department of Evolutionary Biology, University of Florence, via Romana 17, 50125 Firenze, Italy; E-Mails: <email>francesco.pini@unifi.it</email> (F.P.); <email>marco.galardini@unifi.it</email> (M.G.); <email>marco.bazzicalupo@unifi.it</email> (M.B.)</aff></contrib-group>
<author-notes><fn id="fn1-genes-02-01017" fn-type="equal">
<label>†</label>
<p>These authors contributed equally to this work.</p></fn>
<corresp id="c1-genes-02-01017">
<label>*</label> Author to whom correspondence should be addressed; E-Mail: <email>alessio.mengoni@unifi.it</email>; Tel. +39-0552288246; Fax +39-0552288250.</corresp></author-notes>
<pub-date pub-type="collection">
<year>2011</year></pub-date>
<pub-date pub-type="epub">
<day>29</day>
<month>11</month>
<year>2011</year></pub-date>
<volume>2</volume>
<issue>4</issue>
<fpage>1017</fpage>
<lpage>1032</lpage>
<history>
<date date-type="received">
<day>29</day>
<month>09</month>
<year>2011</year></date>
<date date-type="rev-recd">
<day>08</day>
<month>11</month>
<year>2011</year></date>
<date date-type="accepted">
<day>09</day>
<month>11</month>
<year>2011</year></date></history>
<permissions>
<copyright-statement>© 2011 by the authors; licensee MDPI, Basel, Switzerland.</copyright-statement>
<copyright-year>2011</copyright-year>
<license>
<p>This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).</p></license></permissions>
<abstract>
<p><italic>Alphaproteobacteria</italic> show great versatility in adapting to a broad range of environments and lifestyles. Their association with plants is among the most intriguing, spanning from relatively unspecific nonsymbiotic association (as rhizospheric or endophytic strains) to the highly species-specific interaction of rhizobia. To shed some light on possible common genetic features in such a heterogeneous set of plant associations, the genomes of 92 <italic>Alphaproteobacteria</italic> strains were analyzed with a fuzzy orthologs-species detection approach. This showed that the different habitats and lifestyles of plant-associated bacteria (soil, plant colonizers, symbiont) are partially reflected by a trend towards larger genomes than those of nonplant-associated species. A relatively large set of genes specific to symbiotic bacteria (73 orthologous groups) was found, with a remarkable presence of regulators, sugar transporters, metabolic enzymes, nodulation genes and several genes with unknown function that could be good candidates for further characterization. Interestingly, 15 orthologous groups present in all plant-associated bacteria (symbiotic and nonsymbiotic), but absent in nonplant-associated bacteria, were also found, whose functions were mainly related to regulation of gene expression and electron transport. Two of these orthologous groups were also detected in fully sequenced plant-associated <italic>Betaproteobacteria</italic> and <italic>Gammaproteobacteria.</italic> Overall, these results lead us to hypothesize that plant-bacteria associations, though quite variable, are partially supported by a conserved set of unsuspected gene functions.</p></abstract>
<kwd-group>
<kwd>bacterial genomes</kwd>
<kwd>plant</kwd>
<kwd>symbiosis</kwd></kwd-group></article-meta></front>
<body>
<sec sec-type="intro">
<label>1.</label>
<title>Introduction</title>
<p>The phylum <italic>Proteobacteria</italic> is the most numerous group currently recognized in the domain Bacteria [<xref ref-type="bibr" rid="b1-genes-02-01017">1</xref>]. Within this group, the class <italic>Alphaproteobacteria</italic> harbors a miscellaneous set of metabolisms and cellular phenotypes and occupies a wide range of habitats; it includes phototrophic genera (<italic>Rhodobacter</italic>), symbionts of plants (<italic>Rhizobium</italic>, <italic>Sinorhizobium</italic>, <italic>Mesorhizobium</italic> and <italic>Azorhizobium</italic> [<xref ref-type="bibr" rid="b2-genes-02-01017">2</xref>]), animal and plant pathogens (<italic>Rickettsia</italic>, <italic>Brucella</italic>, <italic>Agrobacterium</italic>) and genera able to metabolize C1 compounds (<italic>Methylobacterium</italic>). In addition, mitochondria have a common origin with the SAR11 clade, as a sister group of the order <italic>Rickettsiales</italic> [<xref ref-type="bibr" rid="b3-genes-02-01017">3</xref>]. Habitats colonized by <italic>Alphaproteobacteria</italic> range from ocean-floor volcanic environments, to soil, in which they may interact with plant roots, to the surface waters of oceans [<xref ref-type="bibr" rid="b1-genes-02-01017">1</xref>].</p>
<p><italic>Alphaproteobacteria</italic>, with nearly 600 completely sequenced genomes, is one of the most studied bacterial classes [<xref ref-type="bibr" rid="b1-genes-02-01017">1</xref>], showing a large heterogeneity in genome size, from 1.1 to 9.2 Mbp [<xref ref-type="bibr" rid="b4-genes-02-01017">4</xref>] and genome architecture, with the presence of additional replicons, such as chromids [<xref ref-type="bibr" rid="b5-genes-02-01017">5</xref>], and plasmids [<xref ref-type="bibr" rid="b6-genes-02-01017">6</xref>]. Because of these genomic traits, and also thanks to their versatility in adapting to different habitats, <italic>Alphaproteobacteria</italic> constitute an excellent model system to study how bacterial genomes evolve and how genomic features are related to environmental adaptation [<xref ref-type="bibr" rid="b1-genes-02-01017">1</xref>,<xref ref-type="bibr" rid="b4-genes-02-01017">4</xref>].</p>
<p>Particularly intriguing is the ability of <italic>Alphaproteobacteria</italic> to interact with plants, as pathogens and as nonpathogenic mutualists/commensals (symbionts/nonsymbionts) (e.g., <italic>Rhizobium</italic>, <italic>Azospirillum</italic>). Plant-associated bacteria <italic>sensu lato</italic> can be found in and around roots, in the vasculature, on aerial tissues or in specifically developed organs (e.g., root nodules) [<xref ref-type="bibr" rid="b7-genes-02-01017">7</xref>], which allows strains to be categorized as phyllospheric, rhizospheric and endophytic.</p>
<p>Phyllospheric bacteria inhabit the aerial parts of the plant (leaves, stems, buds, flowers and fruits), possibly affecting plant fitness and the productivity of agricultural crops [<xref ref-type="bibr" rid="b8-genes-02-01017">8</xref>]. The rhizosphere is the part of soil around plant roots populated by microbes (bacteria and fungi); microorganisms from the rhizosphere interact with roots in several processes, such as the decomposition of organic matter and the maintenance of soil structure and water relationships. As a consequence, the rhizosphere is a fundamental niche of the soil ecosystem [<xref ref-type="bibr" rid="b9-genes-02-01017">9</xref>]. Endophytic bacteria can be defined as those bacteria that colonize the internal tissue of the plants (endosphere) with no external sign of infection or negative effect on their host [<xref ref-type="bibr" rid="b10-genes-02-01017">10</xref>]; they can be classified as ‘obligate’ or ‘facultative’ endophytes in accordance with their life strategies. Obligate endophytes are strictly dependent on the host plant for their growth and survival, and transmission to other plants can occur only by seeds or via vectors, while facultative endophytes can grow outside host plants [<xref ref-type="bibr" rid="b11-genes-02-01017">11</xref>]. Finally, a noteworthy endophytic example within <italic>Alphaproteobacteria</italic> is the nitrogen-fixing symbiosis established with leguminous plants by rhizobia, which is coupled with the development of a new plant structure, the nodule, in the root or in the stem of the plant [<xref ref-type="bibr" rid="b12-genes-02-01017">12</xref>]. All these heterogeneous phenotypes suggest that it could be difficult to find common genetic traits for plant-associated bacteria.</p>
<p>An additional degree of complexity is given by the fact that single species or even single strains inhabit both soil and plant tissues and can show multiple types of plant association. For example, <italic>Azospirillum</italic> strains are known as model plant growth-promoting rhizobacteria (PGPR), but they have also been found within plant tissue, as endophytes of cereals [<xref ref-type="bibr" rid="b13-genes-02-01017">13</xref>]; on the other hand, the specific alfalfa symbiont <italic>Sinorhizobium meliloti</italic> is also able to grow in the rhizosphere of nonhost plants and behaves as an endophyte in cereals like rice [<xref ref-type="bibr" rid="b14-genes-02-01017">14</xref>], besides living free in bulk soil. Such observations raise doubts as to whether a common genetic background is present within all plant-associated <italic>Alphaproteobacteria</italic>. In fact, concerning symbiotic species, it is generally accepted that the symbiotic lifestyle needs some specific genetic functions (e.g., <italic>nod</italic> genes), which are not present in nonsymbiotic nitrogen fixers [<xref ref-type="bibr" rid="b15-genes-02-01017">15</xref>]. However, stem-nodulating bradyrhizobia have shown that a <italic>nod</italic>-independent symbiosis can be established [<xref ref-type="bibr" rid="b15-genes-02-01017">15</xref>,<xref ref-type="bibr" rid="b16-genes-02-01017">16</xref>]. Two questions therefore arise: (i) Is the symbiotic lifestyle in α-rhizobia characterized by the presence of a common gene set? (ii) Do all plant-associated species (both symbiotic and nonsymbiotic) share some common genes conferring the ability to associate with plants? One way to begin to answer these questions is to apply a comparative genomics approach. 
Comparisons of α- and β-rhizobia have been performed previously [<xref ref-type="bibr" rid="b15-genes-02-01017">15</xref>,<xref ref-type="bibr" rid="b17-genes-02-01017">17</xref>], as has a comparison of some plant-associated endophytes in <italic>Gammaproteobacteria</italic> [<xref ref-type="bibr" rid="b18-genes-02-01017">18</xref>]; however, no systematic analyses have been attempted in <italic>Alphaproteobacteria</italic>.</p>
<p>Here we report a bioinformatic analysis that scans all the sequenced alphaproteobacterial genomes to sort out the possible exclusive or distinctive genes which enable some of the <italic>Alphaproteobacteria</italic> to be associated with plants. We evaluate whether plant-bacteria association needs a specific assortment of gene functions or whether, as suggested by its phenotypic heterogeneity, it is rather unrelated to the presence of a dedicated set of genes.</p></sec>
<sec sec-type="results|discussion">
<label>2.</label>
<title>Results and Discussion</title>
<sec>
<label>2.1.</label>
<title>Plant-Associated Bacteria Have Larger Genomes than Nonplant-Associated</title>
<p>First, a dataset of the relevant <italic>Alphaproteobacteria</italic> (“alphas” for short) was constructed by downloading all the alphaproteobacterial genomes available in the NCBI genome database. All animal obligate pathogens were excluded, since they show extensive genome reductions linked with the intracellular lifestyle [<xref ref-type="bibr" rid="b4-genes-02-01017">4</xref>,<xref ref-type="bibr" rid="b19-genes-02-01017">19</xref>], as was the SAR11 clade, due to the extensive gene loss described for this group [<xref ref-type="bibr" rid="b1-genes-02-01017">1</xref>]. A total of 92 genomes were then analyzed (<xref ref-type="fig" rid="f1-genes-02-01017">Figure 1</xref> and Supplementary Material S1) and divided into three groups: (i) solely free-living, (ii) plant-associated and (iii) symbiont (a subset of plant-associated), combining the information available in the GOLD database [<xref ref-type="bibr" rid="b20-genes-02-01017">20</xref>,<xref ref-type="bibr" rid="b21-genes-02-01017">21</xref>], Bergey's Manual of Systematic Bacteriology [<xref ref-type="bibr" rid="b22-genes-02-01017">22</xref>] and a bibliographic search on PubMed. Plant-associated bacteria include 27 genomes (2 pathogens, 7 associated and 18 symbionts), all but two (25/27) grouped within the order <italic>Rhizobiales</italic> (<xref ref-type="fig" rid="f1-genes-02-01017">Figure 1</xref>). The only exceptions are <italic>Gluconacetobacter diazotrophicus</italic> and <italic>Azospirillum</italic> B510, which fall in the order <italic>Rhodospirillales</italic>. Of course, we cannot exclude that some of the 65 nonplant-associated bacteria could also have experienced the plant environment, even if those putative events have not been reported.</p>
<p>A quick look at these genomes shows a wide range of genome sizes, spanning from 2.9 Mbp (<italic>Parvularcula bermudensis</italic> HTCC2503) to 9.2 Mbp (the magnetotactic bacterium <italic>Magnetospirillum magnetotacticum</italic> MS1). The average genome size (±standard deviation) of the dataset is 5.04 ± 1.53 Mbp. Plant-associated bacteria have significantly (<italic>P</italic> &lt; 0.0001, one-way ANOVA) larger genomes (6.73 ± 1.26 Mbp) than nonplant-associated ones (4.34 ± 0.99 Mbp), as previously noticed [<xref ref-type="bibr" rid="b19-genes-02-01017">19</xref>]. The same trend was observed considering only the order <italic>Rhizobiales</italic>, which accounts for nearly half of the entire dataset (42 out of 92 genomes, with an average length of 5.83 ± 1.53 Mbp), with plant-associated <italic>Rhizobiales</italic> having genomes larger than those of nonplant-associated <italic>Rhizobiales</italic> (6.81 Mbp and 4.47 Mbp, respectively, <italic>P</italic> &lt; 0.0001). Average GC content is 63.1 ± 4.4% and is similar for plant-associated and nonplant-associated genomes (63.03% and 62.71%, respectively, <italic>P</italic> &lt; 0.8); it ranges from 45.2% (<italic>Hirschia baltica</italic> ATCC49814) to 71.1% (<italic>Phenylobacterium zucineum</italic> HLK1). Within the <italic>Rhizobiales</italic> we observed the same trend: an average of 63.1%, with 63.0% for plant-associated and 63.1% for nonplant-associated genomes.</p></sec>
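<p>As an illustration of the test named above, the following minimal sketch computes the one-way ANOVA F statistic for two groups of genome sizes. The function name <monospace>one_way_anova_f</monospace> and the genome sizes are hypothetical and invented for illustration only; they are not the values of the 92-genome dataset, and the sketch is not the software used for the analysis.</p>
<preformat>
```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-group sum of squares: group means against the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations against their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical genome sizes in Mbp, invented for illustration only.
plant_associated = [6.7, 7.8, 5.4, 9.1, 6.5]
nonplant_associated = [4.3, 3.2, 5.0, 4.0, 4.9]

f_stat = one_way_anova_f([plant_associated, nonplant_associated])
print(f"F = {f_stat:.2f}")
```
</preformat>
<p>The P value would then be read from the F distribution with (k - 1, N - k) degrees of freedom, a step left out here to keep the sketch free of external libraries.</p>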
<sec>
<label>2.2.</label>
<title>Are There Life-Style Specific Genes in Alphaproteobacteria?</title>
<p>To answer this question we first performed a genome clusterization of all protein coding genes present in the 92 genomes, obtaining 40,960 groups of orthologs (out of a total number of 434,411 proteins analyzed). Next, starting from these groups of orthologs, we sought to extract four subsets, named: (i) Alpha Core (common to all the analyzed organisms), (ii) Plant-Associated (common to and exclusive of plant-associated bacteria), (iii) Plant-Symbionts (common to and exclusive of plant symbionts), and (iv) NonPlant-Associated (common to and exclusive of nonplant-associated bacteria). Genetic elements inside bacteria are prone to horizontal gene transfer; genes that may be specific for a certain life-style may therefore also be found in other bacterial species, by chance or because they carry out a different function. For this reason we developed an orthologs-species clustering approach capable of taking this dynamic behavior into account. This “Fuzzy orthologs-species clusterization” analysis (see Materials and Methods) sorted out 998 orthologous groups for the Alpha Core subset, while the life-style specific subsets NonPlant-Associated, Plant-Associated and Plant-Symbionts accounted for 88, 15 and 73 orthologous groups, respectively (<xref ref-type="fig" rid="f2-genes-02-01017">Figure 2</xref> and Supplementary Material S2). As expected, the NonPlant-Associated subset of orthologous groups was found to be inconsistent, since a series of random subsets having the same number of species (see Materials and Methods) showed a similar number of orthologous groups. This finding is not surprising, since the NonPlant-Associated subset consists of species with no unique distinctive habitat, varying from soil to marine and freshwater organisms. 
On the contrary, for the other subsets, the number of orthologous groups generated from random species lists was zero; a similar result was observed when sampling only inside the order <italic>Rhizobiales</italic> (where most of the plant-associated species are), retrieving 1.7 orthologous groups on average (data not shown). Notably, when we did not apply fuzziness to the orthologs-species clusterization, we did not find any orthologous groups in any subset except the Alpha Core one. This discrepancy suggests that the plant association behavior may not depend on a strongly conserved, defined set of genes strictly common to all the species of the subset. Interestingly, however, by applying the “Fuzzy orthologs-species clusterization”, the larger Plant-Associated subset was found to contain only 15 orthologous groups, while the smaller Plant-Symbionts subset contains 73 orthologous groups. This finding again suggests that the generic plant association behavior does not require a large repertoire of specific genes, while the symbiotic interaction in α-rhizobia is more dependent on the presence of specific gene traits [<xref ref-type="bibr" rid="b23-genes-02-01017">23</xref>].</p></sec>
<sec>
<label>2.3.</label>
<title>Which Biological Functions Are Encoded by Life-Style Associated Orthologous Groups?</title>
<p>An overview of the distribution among the COG (Clusters of Orthologous Groups) categories of the genes present in the Alpha Core, Plant-Associated and Plant Symbionts subsets is reported in <xref ref-type="fig" rid="f3-genes-02-01017">Figure 3</xref>. Regarding the Plant-Associated and the Plant Symbionts subsets, COG categories related to basic cell functions (L, Replication, recombination and repair; B, Chromatin structure and dynamics; D, Cell cycle control, cell division, chromosome partitioning; V, Defense mechanisms) are not represented, since they are mostly present in the Alpha Core subset, as expected; on the other hand, the poorly characterized or uncharacterized COG categories (S and X) are the most represented in both plant-related subsets. The categories related to regulation of gene expression (K and J) and energy production and conversion (C) show a slightly higher proportion in the Plant-Associated subset. Regarding the Plant Symbionts subset, a slight over-representation of the carbohydrate transport and metabolism category (G) was observed, suggesting a key role of carbohydrate metabolism in establishing nitrogen-fixing symbiosis (<italic>i.e</italic>., for the formation of the so-called Nod factors as well as for bacteroid trophism [<xref ref-type="bibr" rid="b23-genes-02-01017">23</xref>]). Indeed, some plant symbionts, for instance <italic>Sinorhizobium meliloti</italic>, contain large genomic regions or replicons mainly devoted to carbohydrate transport and metabolism [<xref ref-type="bibr" rid="b24-genes-02-01017">24</xref>-<xref ref-type="bibr" rid="b26-genes-02-01017">26</xref>]. However, the percentages of COG categories represented in the different subsets are not statistically different (Spearman Rank Correlation and Chi-square test with Monte Carlo simulation, data not shown).</p>
<p>The analysis of the GO (Gene Ontology) categories in the Plant-Associated subset (Supplementary Material S3) shows that the most represented category is electron carrier activity (4 groups), while for the Plant Symbionts the functions encoded appear to be more heterogeneous (<xref ref-type="fig" rid="f4-genes-02-01017">Figure 4</xref>). In particular, proteins involved in many processes were found, ranging from symbiosis-specific functions, like nodulation (4 orthologous groups: 3672, 4012 and 4954, encoding NodA, NodC and NodJ, respectively, plus group 3700, encoding NfeD, another protein necessary for nodulation [<xref ref-type="bibr" rid="b27-genes-02-01017">27</xref>]), to transcriptional regulation and to more general biological functions (especially oxidation-reduction), with a slightly higher presence of transport-related functions (11 groups, with 3 of them probably involved in osmolarity control). As reported in <xref ref-type="fig" rid="f4-genes-02-01017">Figure 4</xref>, sugar transport (4 groups out of 11 transporters) and metabolism (5 groups) are highly represented in the symbiont subset. Within this category, of particular interest is group 2572, encoding MocE, a Rieske non-heme iron oxygenase essential in the catabolism of rhizopines (3-<italic>O</italic>-methyl-<italic>scyllo</italic>-inosamine, 3-<italic>O</italic>-M<italic>S</italic>I), nodule-specific compounds that confer an intraspecies competitive nodulation advantage to strains able to utilize them [<xref ref-type="bibr" rid="b28-genes-02-01017">28</xref>]. 
Another intriguing group is 3933, which encodes a protein belonging to the senescence marker protein 30 (SMP-30)/gluconolactonase superfamily, which contains many mammalian sequences [<xref ref-type="bibr" rid="b29-genes-02-01017">29</xref>]; this protein was found to accommodate multiple functions [<xref ref-type="bibr" rid="b30-genes-02-01017">30</xref>], among which is calcium regulation (as a regucalcin) [<xref ref-type="bibr" rid="b31-genes-02-01017">31</xref>]. This is particularly intriguing, as Ca<sup>2+</sup> was found to be involved in the symbiotic signaling pathway activated by flavonoids in <italic>Rhizobium leguminosarum</italic> bv. viciae [<xref ref-type="bibr" rid="b32-genes-02-01017">32</xref>]. Another interesting function which could be related to plant symbiosis is cell motility, encoded by orthologous group 4140 (protein FliG). No nitrogen-fixation related proteins were found to be exclusive, due to the presence in alphas of free-living nitrogen-fixing species (<italic>Xanthobacter autotrophicus</italic> [<xref ref-type="bibr" rid="b22-genes-02-01017">22</xref>]) and to the presence of the <italic>fix</italic> signaling module also in <italic>Caulobacter</italic> [<xref ref-type="bibr" rid="b33-genes-02-01017">33</xref>].</p>
<p>Interestingly, none of the functions previously putatively associated with the endophytic life-style, such as type IV pili [<xref ref-type="bibr" rid="b34-genes-02-01017">34</xref>] or other metabolic or hormone-related activities [<xref ref-type="bibr" rid="b35-genes-02-01017">35</xref>], was detected. This is possibly due to the wide range of associations in our Plant-Associated subset, which also includes symbiotic and rhizospheric interactions; the absence of type IV pili could also be linked to their involvement in many other processes in other species not engaged in plant interactions.</p></sec>
<sec>
<label>2.4.</label>
<title>Taxonomic Range of the Life-Style Associated Genes Outside Alphaproteobacteria</title>
<p>Once the list of life-style related orthologous groups was defined, we looked for their presence in the other branches of the bacterial taxonomic tree, in order to understand whether such functions are specific to alphas or are widespread in other taxa, thus giving an insight into the evolutionary pathways of those functions. Each life-style associated orthologous group was then used as a query on the GenBank database to find homologous sequences in all bacterial taxa; results of the analysis are reported in <xref ref-type="fig" rid="f5-genes-02-01017">Figure 5</xref> and Supplementary Material S4. <xref ref-type="fig" rid="f5-genes-02-01017">Figure 5</xref> offers an overview of the proportion of the orthologous groups occurring in each subset which have hits in the different bacterial classes. As expected, most of the hits were scored within <italic>Proteobacteria</italic>, in particular in the classes <italic>Beta</italic> and <italic>Gamma</italic>, possibly reflecting both a higher phylogenetic proximity and a general bias of the database, which is abundant in sequences from members of such classes (<italic>Alphaproteobacteria</italic> were excluded from the analysis).</p>
<p>A high proportion of Alpha Core genes have at least one hit in almost all the taxa probed by the analysis, with an average of 10.5% of the Alpha Core subset having a hit in the selected taxa, while on average only 4.2% and 5.0% of the Plant-Associated and Plant Symbionts orthologous groups have at least one hit in each taxon. This observation suggests that plant association genes tend to be conserved only inside <italic>Alphaproteobacteria</italic> and, to a lesser extent, inside <italic>Beta</italic>- and <italic>Gamma-Proteobacteria</italic> (36% average), while just a few genes have a homolog in phylogenetically distant species, where they might not be related to a plant association behavior.</p>
<p>To further elucidate this point, all the taxa found by this approach were investigated for their plant-association life-style, according to the GOLD database annotation; 33.3% of the Plant-Associated orthologous groups have at least one hit in species associated with plants, followed by the Plant Symbionts (30.1%) and the Alpha Core (24.6%). The two plant-related subsets have plant association hits only inside the <italic>Proteobacteria</italic> phylum (<italic>Beta-</italic> and <italic>Gamma-Proteobacteria</italic>), while the Alpha Core hits are distributed over a broader range, including <italic>Actinobacteria</italic>, <italic>Cyanobacteria</italic> and <italic>Firmicutes</italic> (Supplementary Material S5). The results of this analysis thus imply that the plant association related genes are rather specific to the <italic>Proteobacteria</italic> phylum, while the housekeeping genes exhibit a higher degree of sequence conservation across the whole bacterial phylogenetic tree.</p>
<p>To shed some light on the possibility that plant-associated specific genes are involved in plant association at a broader taxonomic level, the protein coding genes found to be exclusively present in Plant-Associated alphas were checked for their presence in the genomes of four known and fully sequenced plant-associated <italic>Proteobacteria</italic>: in the class <italic>Beta-Proteobacteria</italic>, <italic>Cupriavidus taiwanensis</italic> [<xref ref-type="bibr" rid="b17-genes-02-01017">17</xref>] and <italic>Azoarcus</italic> sp. BH72 [<xref ref-type="bibr" rid="b34-genes-02-01017">34</xref>], and in the class <italic>Gamma-Proteobacteria</italic>, <italic>Enterobacter</italic> sp. 638 [<xref ref-type="bibr" rid="b35-genes-02-01017">35</xref>] and <italic>Klebsiella pneumoniae</italic> 342 [<xref ref-type="bibr" rid="b36-genes-02-01017">36</xref>]. Results of the comparison are shown in <xref ref-type="table" rid="t1-genes-02-01017">Table 1</xref>. Interestingly, two out of 15 orthologous groups are present in all four selected strains, namely orthologous groups 2149 (transcriptional regulator) and 2774 (endoribonuclease L-PSP), suggesting that regulation (through either transcriptional regulation or RNA stability) may play a pivotal role in establishing the association with plants. Moreover, 5 other genes were found to be present in at least 2 of the 4 species investigated. A previous work describing the genome sequence of the β-rhizobium <italic>Cupriavidus taiwanensis</italic> [<xref ref-type="bibr" rid="b17-genes-02-01017">17</xref>] found no gene both common and specific to all rhizobia, suggesting that symbiotic association with plants evolved through multiple strategies, even though genes preferentially associated (on a statistical basis) with plant symbiosis were detected. 
However, our findings suggest that these two genes, putatively needed in <italic>Alphaproteobacteria</italic> to establish a successful interaction with plants, are also present in the <italic>Beta</italic>- and <italic>Gamma-Proteobacteria</italic> model organisms for plant association. Endoribonuclease L-PSP is involved in single-stranded mRNA cleavage in <italic>Leishmania infantum</italic>, a parasite that alternates between two stages in its life cycle, and is hypothesized to be involved in specific post-transcriptional regulation of gene expression [<xref ref-type="bibr" rid="b37-genes-02-01017">37</xref>]; characterization of mutants for those orthologs is, however, necessary to fully elucidate their role in plant association in such a broad taxonomic background.</p></sec></sec>
<sec>
<label>3.</label>
<title>Experimental Section</title>
<sec>
<label>3.1.</label>
<title>Phylogenetic Tree</title>
<p>To construct our reference phylogenetic tree, all 16S rRNA gene sequences were aligned using MUSCLE [<xref ref-type="bibr" rid="b38-genes-02-01017">38</xref>], and the alignment was manually checked. The alignment was used with the software Mega 5.05 [<xref ref-type="bibr" rid="b39-genes-02-01017">39</xref>] to generate a phylogenetic tree. A model test (Supplementary Material S6) was performed before running the Maximum Likelihood algorithm, with 1,000 bootstrap replicates and the Tamura-Nei model of evolution.</p></sec>
<sec>
<label>3.2.</label>
<title>Genomes Clusterization</title>
<p>The 92 genomes were clustered together following the approach proposed by Kim and collaborators [<xref ref-type="bibr" rid="b40-genes-02-01017">40</xref>], using the PanGenomer software (available upon request); a total of 8,464 pairwise InParanoid analyses with no thresholds were generated and the results were merged in a single file as an input for MCL [<xref ref-type="bibr" rid="b41-genes-02-01017">41</xref>], using an inflation factor of 5.0, a pruning threshold of 30,000 and a selection number of 5,000. To test the clusterization, the presence of 5 well-known orthologous genes involved in cell cycle regulation and DNA replication (<italic>ctrA</italic>, <italic>dnaA</italic>, <italic>rpoE</italic>, <italic>gyrB</italic>, <italic>dnaQ</italic>) was used as a positive control (referred to as groups 4, 315, 158, 524 and 26, respectively).</p></sec>
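<p>The figure of 8,464 pairwise analyses equals 92 × 92. A short sketch of that count follows, under the assumption (not stated in the text) that every ordered pair of genomes, self-comparisons included, corresponds to one InParanoid run; the genome labels are hypothetical placeholders.</p>
<preformat>
```python
from itertools import product

n_genomes = 92
# Hypothetical placeholder labels standing in for the 92 genomes.
genomes = [f"genome_{i}" for i in range(n_genomes)]

# Ordered pairs with self-comparisons included: n * n combinations.
pairs = list(product(genomes, repeat=2))
print(len(pairs))
```
</preformat>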
<sec>
<label>3.3.</label>
<title>Fuzzy Orthologs-Species Clusterization</title>
<p>The obtained orthologous groups were mapped to the four species subsets (<xref ref-type="fig" rid="f1-genes-02-01017">Figure 1</xref>) by looking at the species from which each protein belonging to that group came, using a so-called “Fuzzy” approach: an orthologous group was regarded as specific for one of the four subsets when its species list amounted to between 80% and 110% of the subset list. The biological value of each subset was tested by generating 10 random organism lists with the same length as the subset and looking at how many orthologous groups were then retrieved.</p></sec>
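<p>The fuzzy criterion and its random control can be sketched as follows. This is one possible reading of the 80% to 110% rule, stated here as an assumption: a group is taken as subset-specific when at least 80% of the subset species carry it and its whole species list is no larger than 110% of the subset size. All function names and the toy data are hypothetical, and the sketch is not the PanGenomer implementation.</p>
<preformat>
```python
import random


def is_subset_specific(group_species, subset, low=0.80, high=1.10):
    """Fuzzy test (assumed reading of the 80%-110% rule)."""
    group_species = set(group_species)
    subset = set(subset)
    shared = len(group_species.intersection(subset))
    # At least 80% of the subset species must carry the group ...
    covers_subset = shared >= low * len(subset)
    # ... and the full species list may exceed the subset size by at most 10%.
    not_too_broad = high * len(subset) >= len(group_species)
    return covers_subset and not_too_broad


def count_specific_groups(groups, subset):
    """Count the orthologous groups that pass the fuzzy test for a subset."""
    return sum(is_subset_specific(sp, subset) for sp in groups.values())


def random_control(groups, all_species, subset_size, n_lists=10, seed=0):
    """Repeat the count on random organism lists of the same length."""
    rng = random.Random(seed)
    pool = sorted(all_species)
    return [
        count_specific_groups(groups, rng.sample(pool, subset_size))
        for _ in range(n_lists)
    ]


# Toy data, invented for illustration: 20 species, a subset of 10.
all_species = [f"s{i}" for i in range(20)]
subset = all_species[:10]
groups = {
    "g1": all_species[:9],   # 9 of 10 subset species: passes
    "g2": all_species[:11],  # whole subset plus one outsider: passes
    "g3": all_species,       # present everywhere (core-like): too broad
    "g4": all_species[:5],   # only half of the subset: too sparse
}

n_specific = count_specific_groups(groups, subset)
control = random_control(groups, all_species, len(subset))
print(n_specific, control)
```
</preformat>
<p>A subset is informative under this scheme when the count obtained for the real subset clearly exceeds the counts obtained for the random lists.</p>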
<sec>
<label>3.4.</label>
<title>Orthologous Groups Annotation</title>
<p>The orthologous groups belonging to the four subsets were annotated using ten randomly selected proteins from each group, to speed up the analysis. Each protein was mapped to the COG database [<xref ref-type="bibr" rid="b42-genes-02-01017">42</xref>] using rpsblast 2.2.25+ and an e-value threshold of 1e-10; the domain content and the GO [<xref ref-type="bibr" rid="b43-genes-02-01017">43</xref>] annotation were obtained using Iprscan 4.8 [<xref ref-type="bibr" rid="b44-genes-02-01017">44</xref>] with the InterPro database release 33.0.</p></sec>
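<sec>
<title>Illustrative Sketch: Sampling Representative Proteins</title>
<p>The sampling step used to reduce the annotation workload can be sketched as follows. This is a minimal illustration under stated assumptions: groups are held in a dictionary of protein identifier lists, a fixed seed is used only to make the sketch reproducible, hits are simple (query, subject, e-value) tuples, and the e-value threshold is treated as inclusive. None of these details is specified in the text, and the names are hypothetical.</p>
<preformat><![CDATA[
```python
import random


def pick_representatives(groups, k=10, seed=0):
    """Pick up to k random proteins from each orthologous group."""
    rng = random.Random(seed)
    picked = {}
    for group_id, proteins in sorted(groups.items()):
        members = sorted(proteins)
        # Groups smaller than k are kept whole.
        picked[group_id] = rng.sample(members, min(k, len(members)))
    return picked


def keep_significant(hits, max_evalue=1e-10):
    """Keep (query, subject, evalue) hits at or below the e-value threshold."""
    return [hit for hit in hits if hit[2] <= max_evalue]
```
]]></preformat>
<p>Sampling without replacement guarantees that no protein is annotated twice, and groups with fewer than ten members are simply used in full.</p>
</sec>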
<sec>
<label>3.5.</label>
<title>Taxonomic Analysis</title>
<p>For each orthologous group, protein sequence similarity across the bacterial kingdom was inspected using TaxonomyBlaster (available upon request). The same proteins used for the annotation were searched against a series of taxonomically restricted portions of the NCBI nr database (downloaded on 1 July 2011): all the taxonomic classes inside the <italic>Bacteria</italic> kingdom (excluding the “environmental samples”) were used iteratively, and the <italic>Proteobacteria</italic> were further divided into their distinct classes, excluding the <italic>Alphaproteobacteria</italic> and the proteobacterial “environmental samples”. BLAST was run using the BLOSUM45 matrix, the soft masking option, a fixed database size of 500,000,000 and the Smith-Waterman local optimal alignments option. Hits showing an e-value below 1e-10, a query coverage above 66% and a homology index above 0.33 were retained. The species obtained were marked as Plant-Associated on the basis of the information available in the GOLD database [<xref ref-type="bibr" rid="b20-genes-02-01017">20</xref>].</p></sec></sec>
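<sec>
<title>Illustrative Sketch: Hit Filtering for the Taxonomic Analysis</title>
<p>The three retention criteria of the taxonomic analysis (e-value below 1e-10, query coverage above 66% and homology index above 0.33) can be expressed as a simple filter. The sketch below is not the TaxonomyBlaster code: the record layout and function names are hypothetical, and the homology index is assumed to be a value already computed for each hit, since its formula is not given here. The last function computes, for one taxonomic division, the proportion of orthologous groups having at least one retained hit.</p>
<preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str
    subject: str
    evalue: float
    query_coverage: float  # percent of the query covered by the alignment
    homology_index: float  # assumed to be precomputed for each hit


def passes_filters(hit, max_evalue=1e-10, min_coverage=66.0, min_homology=0.33):
    """Apply the three retention criteria as strict inequalities."""
    return (hit.evalue < max_evalue
            and hit.query_coverage > min_coverage
            and hit.homology_index > min_homology)


def groups_with_hit(hits_by_group):
    """Return the groups that keep at least one hit after filtering."""
    return {group for group, hits in hits_by_group.items()
            if any(passes_filters(hit) for hit in hits)}


def shared_proportion(hits_by_group):
    """Proportion of groups with at least one retained hit (0.0 if empty)."""
    if not hits_by_group:
        return 0.0
    return len(groups_with_hit(hits_by_group)) / len(hits_by_group)
```
]]></preformat>
<p>Running the last function once per taxonomically restricted database would give one proportion per division, which is the kind of quantity summarized in a taxonomic sharing plot.</p>
</sec>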
<sec sec-type="conclusions">
<label>4.</label>
<title>Conclusions</title>
<p>Whereas vector-borne intracellular <italic>Alphaproteobacteria</italic> have evolved towards smaller genomes [<xref ref-type="bibr" rid="b4-genes-02-01017">4</xref>], the trend in plant-associated <italic>Alphaproteobacteria</italic> seems to run in the opposite direction, towards an increase in genome size, probably owing to the different habitats colonized, including soil and plant tissues [<xref ref-type="bibr" rid="b19-genes-02-01017">19</xref>]. Here, for the first time to the best of our knowledge, we report an investigation of the genomic features, in terms of gene content, that could be related to symbiotic and nonsymbiotic plant association in <italic>Alphaproteobacteria</italic>. This analysis was carried out with a novel orthologs-species clusterization approach that takes into account natural horizontal gene transfer dynamics, allowing us to also identify those genes that are (partially) shared with other species or that are not present in all the life-style related species.</p>
<p>Interestingly, a relatively large set of genes shared exclusively by symbiotic alphaproteobacterial species was found, suggesting that a common genomic basis is indeed present, even though multiple “recipes” for plant association exist [<xref ref-type="bibr" rid="b15-genes-02-01017">15</xref>]. This set includes functions previously known to be linked to the symbiotic interaction, as well as others that were previously unsuspected. In particular, genes necessary for plant-bacteria communication were retrieved, together with an enrichment in protein-coding genes involved in sugar transport and metabolism. Most of these orthologs could likely be associated with metabolic exchanges and communication between plant cells and bacteroids; interestingly, an ortholog of the SMP30/gluconolactonase family (regucalcin) was also found, suggesting a link between nodulation and calcium spiking in the rhizobial cell, in agreement with recent experimental findings [<xref ref-type="bibr" rid="b32-genes-02-01017">32</xref>].</p>
<p>Contrary to what might be expected for such a highly heterogeneous phenotype, the results for all plant-associated species showed a numerically small, but computationally consistent, set of genes that could account for their ability to associate with plants as either symbionts or nonsymbionts (<italic>i.e</italic>., rhizospheric, pathogen, endophyte). Interestingly, several functions were related to the regulation of gene expression, which is plausible given the pivotal role of the perception of environmental signals in association with plants. This set of putatively plant-associated genes showed two apparently contradictory properties: a relatively high degree of conservation of these few genes inside <italic>Proteobacteria</italic> (when compared to the other branches of the bacterial tree) but also a certain degree of conservation across phylogenetically distant plant-associated species. This evidence could mean that, even though there are no common genetic traits that distinguish this ecologically heterogeneous group of species, single genetic “pieces” may be shared, across a vast phylogenetic range, with other plant-associated species. We can therefore speculate that association with plants is achieved through several pathways and mechanisms (which mirror the different types of association), even within a relatively narrow taxonomic range.</p>
<p>In conclusion, while a symbiotic lifestyle requires a defined gene set, nonsymbiotic plant-bacteria association can occur through multiple strategies, with functions specific to each single interaction.</p></sec>
<sec>
<title>Supplementary Material</title>
<list list-type="simple">
<list-item>
<p><bold>S1.</bold> Alphaproteobacteria dataset.</p></list-item>
<list-item>
<p><bold>S2.</bold> Clusterization of orthologous groups in the four subsets analyzed (Alphas, NonPlant-Associated, Plant-Associated, Symbionts).</p></list-item>
<list-item>
<p><bold>S3.</bold> COG names list, Gene Ontology and COGs categories for Plant-associated and Symbionts.</p></list-item>
<list-item>
<p><bold>S4.</bold> Taxonomic sharing of life-style associated genes.</p></list-item>
<list-item>
<p><bold>S5.</bold> Taxonomic sharing of life-style associated genes in other plant-associated species.</p></list-item>
<list-item>
<p><bold>S6.</bold> Model test.</p></list-item></list></sec></body>
<back>
<sec sec-type="display-objects">
<title>Figures and Table</title>
<fig id="f1-genes-02-01017" position="float">
<label>Figure 1.</label>
<caption>
<p>Phylogenetic tree based on 16S rRNA gene sequences for the 92 selected organisms. Names in green and cyan indicate plant-associated species (green, symbionts; cyan, nonsymbionts). The size of the circles is proportional to the genome size, while the color of the circles indicates the GC content.</p></caption>
<graphic xlink:href="genes-02-01017f1.gif"/></fig>
<fig id="f2-genes-02-01017" position="float">
<label>Figure 2.</label>
<caption>
<p>Number of orthologous groups found inside each life-style species list. Circle sizes are not to scale.</p></caption>
<graphic xlink:href="genes-02-01017f2.gif"/></fig>
<fig id="f3-genes-02-01017" position="float">
<label>Figure 3.</label>
<caption>
<p>Percent distribution of orthologous groups belonging to the different subsets (Core, Plant-Associated and Plant symbionts) among Clusters of Orthologous Groups (COG) categories. Note that each orthologous group can be mapped to more than one category. The list of COG codes is reported in Supplementary Material S3.</p></caption>
<graphic xlink:href="genes-02-01017f3.gif"/></fig>
<fig id="f4-genes-02-01017" position="float">
<label>Figure 4.</label>
<caption>
<p>Overview of the cellular functions of the Plant Symbionts gene set. GO categories are color coded. Numbers represent orthologous groups (see Supplementary Material S2).</p></caption>
<graphic xlink:href="genes-02-01017f4.gif"/></fig>
<fig id="f5-genes-02-01017" position="float">
<label>Figure 5.</label>
<caption>
<p>Taxonomic sharing of life-style associated genes. For each taxonomic division (according to NCBI), the proportion of the life-style related orthologous groups having at least one significant hit is shown.</p></caption>
<graphic xlink:href="genes-02-01017f5.gif"/></fig>
<table-wrap id="t1-genes-02-01017" position="float">
<label>Table 1.</label>
<caption>
<p>Phylogenetic conservation of plant-associated orthologous groups in other plant-associated bacteria. The hit for each genome is indicated as GenBank accession number of the corresponding protein.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top"><bold>Orthologous group</bold></th>
<th align="left" valign="top"><bold>Function</bold></th>
<th align="left" valign="top"><bold><italic>Azoarcus</italic> BH72</bold></th>
<th align="left" valign="top"><bold><italic>Cupriavidus taiwanensis</italic></bold></th>
<th align="left" valign="top"><bold><italic>Enterobacter</italic> 638</bold></th>
<th align="left" valign="top"><bold><italic>Klebsiella pneumoniae</italic></bold></th></tr></thead>
<tbody>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="middle" rowspan="3"><bold>2149</bold></td>
<td align="left" valign="middle" rowspan="3">Transcriptional regulator</td>
<td align="left" valign="middle" rowspan="3">YP_932298</td>
<td align="left" valign="middle" rowspan="3">YP_002005188<break/>YP_002007781</td>
<td align="left" valign="top">YP_001177142</td>
<td align="left" valign="middle" rowspan="3">YP_002237096<break/>YP_002237759</td></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="top">YP_001177763</td></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="top">YP_001177947</td></tr>
<tr>
<td align="left" valign="middle" rowspan="3"><bold>2248</bold></td>
<td align="left" valign="middle" rowspan="3">Transcriptional regulator</td>
<td align="left" valign="middle" rowspan="3"/>
<td align="left" valign="middle" rowspan="3">YP_001795747</td>
<td align="left" valign="top">YP_001177733</td>
<td align="left" valign="middle" rowspan="3">YP_002239091<break/>YP_002240771</td></tr>
<tr>
<td align="left" valign="top">YP_001177763</td></tr>
<tr>
<td align="left" valign="top">YP_001177947</td></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="top"><bold>2654</bold></td>
<td align="left" valign="top">Adenylate cyclase</td>
<td align="left" valign="top">YP_932132</td>
<td align="left" valign="top">YP_002008552</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr>
<td align="left" valign="middle" rowspan="2"><bold>2734</bold></td>
<td align="left" valign="middle" rowspan="2">ABC transporter</td>
<td align="left" valign="top" rowspan="2"/>
<td align="left" valign="top" rowspan="2"/>
<td align="left" valign="top">YP_001175837</td>
<td align="left" valign="top">YP_002237478</td></tr>
<tr>
<td align="left" valign="top">YP_001177423</td>
<td align="left" valign="top">YP_002239806</td></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="top"><bold>2737</bold></td>
<td align="left" valign="top">Unknown</td>
<td align="left" valign="top"/>
<td align="left" valign="top">YP_002005759</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr>
<td align="left" valign="middle" rowspan="2"><bold>2774</bold></td>
<td align="left" valign="middle" rowspan="2">Endoribonuclease L-PSP</td>
<td align="left" valign="middle" rowspan="2">YP_931980</td>
<td align="left" valign="top">YP_002008711</td>
<td align="left" valign="middle" rowspan="2">YP_001178228</td>
<td align="left" valign="top">YP_002237662</td></tr>
<tr>
<td align="left" valign="top">YP_002008874</td>
<td align="left" valign="top">YP_002238064</td></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="top"><bold>2791</bold></td>
<td align="left" valign="top">Phosphoesterase</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr>
<td align="left" valign="middle"><bold>2853</bold></td>
<td align="left" valign="top">Electron transfer flavoprotein, beta subunit</td>
<td align="left" valign="top"/>
<td align="left" valign="middle">YP_001796225</td>
<td align="left" valign="middle"/>
<td align="left" valign="middle"/></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="top"><bold>2898</bold></td>
<td align="left" valign="top">Unknown</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top">YP_002236173</td></tr>
<tr>
<td align="left" valign="middle"><bold>2908</bold></td>
<td align="left" valign="top">Electron transfer flavoprotein, alpha subunit</td>
<td align="left" valign="top"/>
<td align="left" valign="middle">YP_001796224</td>
<td align="left" valign="middle"/>
<td align="left" valign="middle"/></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="middle" rowspan="2"><bold>2912</bold></td>
<td align="left" valign="middle" rowspan="2">Methyltransferase</td>
<td align="left" valign="middle" rowspan="2">YP_935409</td>
<td align="left" valign="top" rowspan="2"/>
<td align="left" valign="top">YP_001176000</td>
<td align="left" valign="top">YP_002237443</td></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="top">YP_001177854</td>
<td align="left" valign="top">YP_002239590</td></tr>
<tr>
<td align="left" valign="middle"><bold>2927</bold></td>
<td align="left" valign="top">Amino acid aldolase or racemase</td>
<td align="left" valign="top"/>
<td align="left" valign="middle">YP_002007445</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="middle" rowspan="2"><bold>2981</bold></td>
<td align="left" valign="top" rowspan="2">Mg<sup>2+</sup> and Co<sup>2+</sup> transporters</td>
<td align="left" valign="top" rowspan="2"/>
<td align="left" valign="middle" rowspan="2">YP_002006900</td>
<td align="left" valign="middle" rowspan="2">YP_001176870</td>
<td align="left" valign="top">YP_002238775</td></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="top">YP_002238859</td></tr>
<tr>
<td align="left" valign="top"><bold>3082</bold></td>
<td align="left" valign="top">Ferredoxin-like protein</td>
<td align="left" valign="top"/>
<td align="left" valign="top">YP_001796222</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/></tr>
<tr content-type="background-color:#C7C9CB">
<td align="left" valign="top"><bold>3137</bold></td>
<td align="left" valign="top">Unknown</td>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top"/>
<td align="left" valign="top">YP_002236905</td></tr></tbody></table></table-wrap></sec>
<ack>
<p>This work has been partially supported by the Italian Ministry of Research (PRIN 2008 research grant contract No. TCKNJL, “Il pangenoma di <italic>Sinorhizobium meliloti</italic>: L'uso della genomica per il miglioramento agronomico dell'erba medica”) and by intramural funding from the University of Florence to M.B. and A.M. F.P. is supported by a research fellowship from the Ente Cassa di Risparmio di Firenze charity trust. M.G. is supported by a PhD fellowship from the University of Florence.</p></ack>
<ref-list>
<title>References</title>
<ref id="b1-genes-02-01017"><label>1.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ettema</surname><given-names>T.J.</given-names></name><name><surname>Andersson</surname><given-names>S.G.</given-names></name></person-group><article-title>The α-proteobacteria: The Darwin finches of the bacterial world</article-title><source>Biol. Lett.</source><year>2009</year><volume>5</volume><fpage>429</fpage><lpage>432</lpage><pub-id pub-id-type="doi">10.1098/rsbl.2008.0793</pub-id><pub-id pub-id-type="pmid">19324639</pub-id></citation></ref>
<ref id="b2-genes-02-01017"><label>2.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>van Rhijn</surname><given-names>P.</given-names></name><name><surname>Vanderleyden</surname><given-names>J.</given-names></name></person-group><article-title>The <italic>Rhizobium-plant</italic> symbiosis</article-title><source>Microbiol. Rev.</source><year>1995</year><volume>59</volume><fpage>124</fpage><lpage>142</lpage><pub-id pub-id-type="pmid">7708010</pub-id></citation></ref>
<ref id="b3-genes-02-01017"><label>3.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Giovannoni</surname><given-names>S.J.</given-names></name><name><surname>Tripp</surname><given-names>H.J.</given-names></name><name><surname>Givan</surname><given-names>S.</given-names></name><name><surname>Podar</surname><given-names>M.</given-names></name><name><surname>Vergin</surname><given-names>K.L.</given-names></name><name><surname>Baptista</surname><given-names>D.</given-names></name><name><surname>Bibbs</surname><given-names>L.</given-names></name><name><surname>Eads</surname><given-names>J.</given-names></name><name><surname>Richardson</surname><given-names>T.H.</given-names></name><name><surname>Noordewier</surname><given-names>M.</given-names></name><etal/></person-group><article-title>Genome streamlining in a cosmopolitan oceanic bacterium</article-title><source>Science</source><year>2005</year><volume>309</volume><fpage>1242</fpage><lpage>1245</lpage><pub-id pub-id-type="doi">10.1126/science.1114057</pub-id><pub-id pub-id-type="pmid">16109880</pub-id></citation></ref>
<ref id="b4-genes-02-01017"><label>4.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sallstrom</surname><given-names>B.</given-names></name><name><surname>Andersson</surname><given-names>S.G.</given-names></name></person-group><article-title>Genome reduction in the α-Proteobacteria</article-title><source>Curr. Opin. Microbiol.</source><year>2005</year><volume>8</volume><fpage>579</fpage><lpage>585</lpage><pub-id pub-id-type="doi">10.1016/j.mib.2005.08.002</pub-id><pub-id pub-id-type="pmid">16099701</pub-id></citation></ref>
<ref id="b5-genes-02-01017"><label>5.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Harrison</surname><given-names>P.W.</given-names></name><name><surname>Lower</surname><given-names>R.P.</given-names></name><name><surname>Kim</surname><given-names>N.K.</given-names></name><name><surname>Young</surname><given-names>J.P.</given-names></name></person-group><article-title>Introducing the bacterial ‘chromid’: Not a chromosome, not a plasmid</article-title><source>Trends Microbiol.</source><year>2010</year><volume>18</volume><fpage>141</fpage><lpage>148</lpage><pub-id pub-id-type="doi">10.1016/j.tim.2009.12.010</pub-id><pub-id pub-id-type="pmid">20080407</pub-id></citation></ref>
<ref id="b6-genes-02-01017"><label>6.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moreno</surname><given-names>E.</given-names></name></person-group><article-title>Genome evolution within the alpha-<italic>Proteobacteria</italic>: Why do some bacteria not possess plasmids and others exhibit more than one different chromosome?</article-title><source>FEMS Microbiol. Rev.</source><year>1998</year><volume>22</volume><fpage>255</fpage><lpage>275</lpage><pub-id pub-id-type="doi">10.1111/j.1574-6976.1998.tb00370.x</pub-id><pub-id pub-id-type="pmid">9862123</pub-id></citation></ref>
<ref id="b7-genes-02-01017"><label>7.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Danhorn</surname><given-names>T.</given-names></name><name><surname>Fuqua</surname><given-names>C.</given-names></name></person-group><article-title>Biofilm formation by plant-associated bacteria</article-title><source>Annu. Rev. Microbiol.</source><year>2007</year><volume>61</volume><fpage>401</fpage><lpage>422</lpage><pub-id pub-id-type="doi">10.1146/annurev.micro.61.080706.093316</pub-id><pub-id pub-id-type="pmid">17506679</pub-id></citation></ref>
<ref id="b8-genes-02-01017"><label>8.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Whipps</surname><given-names>J.M.</given-names></name><name><surname>Hand</surname><given-names>P.</given-names></name><name><surname>Pink</surname><given-names>D.</given-names></name><name><surname>Bending</surname><given-names>G.D.</given-names></name></person-group><article-title>Phyllosphere microbiology with special reference to diversity and plant genotype</article-title><source>J. Appl. Microbiol.</source><year>2008</year><volume>105</volume><fpage>1744</fpage><lpage>1755</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2672.2008.03906.x</pub-id><pub-id pub-id-type="pmid">19120625</pub-id></citation></ref>
<ref id="b9-genes-02-01017"><label>9.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Singh</surname><given-names>B.K.</given-names></name><name><surname>Millard</surname><given-names>P.</given-names></name><name><surname>Whiteley</surname><given-names>A.S.</given-names></name><name><surname>Murrell</surname><given-names>J.C.</given-names></name></person-group><article-title>Unravelling rhizosphere-microbial interactions: Opportunities and limitations</article-title><source>Trends Microbiol.</source><year>2004</year><volume>12</volume><fpage>386</fpage><lpage>393</lpage><pub-id pub-id-type="doi">10.1016/j.tim.2004.06.008</pub-id><pub-id pub-id-type="pmid">15276615</pub-id></citation></ref>
<ref id="b10-genes-02-01017"><label>10.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ryan</surname><given-names>R.P.</given-names></name><name><surname>Germaine</surname><given-names>K.</given-names></name><name><surname>Franks</surname><given-names>A.</given-names></name><name><surname>Ryan</surname><given-names>D.J.</given-names></name><name><surname>Dowling</surname><given-names>D.N.</given-names></name></person-group><article-title>Bacterial endophytes: Recent developments and applications</article-title><source>FEMS Microbiol. Lett.</source><year>2008</year><volume>278</volume><fpage>1</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1111/j.1574-6968.2007.00918.x</pub-id><pub-id pub-id-type="pmid">18034833</pub-id></citation></ref>
<ref id="b11-genes-02-01017"><label>11.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rajkumar</surname><given-names>M.</given-names></name><name><surname>Ae</surname><given-names>N.</given-names></name><name><surname>Freitas</surname><given-names>H.</given-names></name></person-group><article-title>Endophytic bacteria and their potential to enhance heavy metal phytoextraction</article-title><source>Chemosphere</source><year>2009</year><volume>77</volume><fpage>153</fpage><lpage>160</lpage><pub-id pub-id-type="doi">10.1016/j.chemosphere.2009.06.047</pub-id><pub-id pub-id-type="pmid">19647283</pub-id></citation></ref>
<ref id="b12-genes-02-01017"><label>12.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Sadowsky</surname><given-names>M.</given-names></name><name><surname>Graham</surname><given-names>P.</given-names></name></person-group><article-title>Root and Stem Nodule Bacteria of Legumes</article-title><source>The Prokaryotes</source><person-group person-group-type="editor"><name><surname>Dworkin</surname><given-names>M.</given-names></name><name><surname>Falkow</surname><given-names>S.</given-names></name><name><surname>Rosenberg</surname><given-names>E.</given-names></name><name><surname>Schleifer</surname><given-names>K.-H.</given-names></name><name><surname>Stackebrandt</surname><given-names>E.</given-names></name></person-group><publisher-name>Springer</publisher-name><publisher-loc>New York, NY, USA</publisher-loc><year>2006</year><volume>2</volume><fpage>818</fpage><lpage>841</lpage></citation></ref>
<ref id="b13-genes-02-01017"><label>13.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bashan</surname><given-names>Y.</given-names></name><name><surname>Holguin</surname><given-names>G.</given-names></name><name><surname>de-Bashan</surname><given-names>L.E.</given-names></name></person-group><article-title><italic>Azospirillum-plant</italic> relationships: Physiological, molecular, agricultural, and environmental advances (1997-2003)</article-title><source>Can. J. Microbiol.</source><year>2004</year><volume>50</volume><fpage>521</fpage><lpage>577</lpage><pub-id pub-id-type="doi">10.1139/w04-035</pub-id><pub-id pub-id-type="pmid">15467782</pub-id></citation></ref>
<ref id="b14-genes-02-01017"><label>14.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chi</surname><given-names>F.</given-names></name><name><surname>Shen</surname><given-names>S.H.</given-names></name><name><surname>Cheng</surname><given-names>H.P.</given-names></name><name><surname>Jing</surname><given-names>Y.X.</given-names></name><name><surname>Yanni</surname><given-names>Y.G.</given-names></name><name><surname>Dazzo</surname><given-names>F.B.</given-names></name></person-group><article-title>Ascending migration of endophytic rhizobia, from roots to leaves, inside rice plants and assessment of benefits to rice growth physiology</article-title><source>Appl. Environ. Microbiol.</source><year>2005</year><volume>71</volume><fpage>7271</fpage><lpage>7278</lpage><pub-id pub-id-type="doi">10.1128/AEM.71.11.7271-7278.2005</pub-id><pub-id pub-id-type="pmid">16269768</pub-id></citation></ref>
<ref id="b15-genes-02-01017"><label>15.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Masson-Boivin</surname><given-names>C.</given-names></name><name><surname>Giraud</surname><given-names>E.</given-names></name><name><surname>Perret</surname><given-names>X.</given-names></name><name><surname>Batut</surname><given-names>J.</given-names></name></person-group><article-title>Establishing nitrogen-fixing symbiosis with legumes: How many rhizobium recipes?</article-title><source>Trends Microbiol.</source><year>2009</year><volume>17</volume><fpage>458</fpage><lpage>466</lpage><pub-id pub-id-type="doi">10.1016/j.tim.2009.07.004</pub-id><pub-id pub-id-type="pmid">19766492</pub-id></citation></ref>
<ref id="b16-genes-02-01017"><label>16.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Giraud</surname><given-names>E.</given-names></name><name><surname>Moulin</surname><given-names>L.</given-names></name><name><surname>Vallenet</surname><given-names>D.</given-names></name><name><surname>Barbe</surname><given-names>V.</given-names></name><name><surname>Cytryn</surname><given-names>E.</given-names></name><name><surname>Avarre</surname><given-names>J.C.</given-names></name><name><surname>Jaubert</surname><given-names>M.</given-names></name><name><surname>Simon</surname><given-names>D.</given-names></name><name><surname>Cartieaux</surname><given-names>F.</given-names></name><name><surname>Prin</surname><given-names>Y.</given-names></name><etal/></person-group><article-title>Legumes symbioses: Absence of <italic>nod</italic> genes in photosynthetic bradyrhizobia</article-title><source>Science</source><year>2007</year><volume>316</volume><fpage>1307</fpage><lpage>1312</lpage><pub-id pub-id-type="doi">10.1126/science.1139548</pub-id><pub-id pub-id-type="pmid">17540897</pub-id></citation></ref>
<ref id="b17-genes-02-01017"><label>17.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Amadou</surname><given-names>C.</given-names></name><name><surname>Pascal</surname><given-names>G.</given-names></name><name><surname>Mangenot</surname><given-names>S.</given-names></name><name><surname>Glew</surname><given-names>M.</given-names></name><name><surname>Bontemps</surname><given-names>C.</given-names></name><name><surname>Capela</surname><given-names>D.</given-names></name><name><surname>Carrere</surname><given-names>S.</given-names></name><name><surname>Cruveiller</surname><given-names>S.</given-names></name><name><surname>Dossat</surname><given-names>C.</given-names></name><name><surname>Lajus</surname><given-names>A.</given-names></name><etal/></person-group><article-title>Genome sequence of the β-rhizobium <italic>Cupriavidus taiwanensis</italic> and comparative genomics of rhizobia</article-title><source>Genome Res.</source><year>2008</year><volume>18</volume><fpage>1472</fpage><lpage>1483</lpage><pub-id pub-id-type="doi">10.1101/gr.076448.108</pub-id><pub-id pub-id-type="pmid">18490699</pub-id></citation></ref>
<ref id="b18-genes-02-01017"><label>18.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Taghavi</surname><given-names>S.</given-names></name><name><surname>Garafola</surname><given-names>C.</given-names></name><name><surname>Monchy</surname><given-names>S.</given-names></name><name><surname>Newman</surname><given-names>L.</given-names></name><name><surname>Hoffman</surname><given-names>A.</given-names></name><name><surname>Weyens</surname><given-names>N.</given-names></name><name><surname>Barac</surname><given-names>T.</given-names></name><name><surname>Vangronsveld</surname><given-names>J.</given-names></name><name><surname>van der Lelie</surname><given-names>D.</given-names></name></person-group><article-title>Genome survey and characterization of endophytic bacteria exhibiting a beneficial effect on growth and development of poplar trees</article-title><source>Appl. Environ. Microbiol.</source><year>2009</year><volume>75</volume><fpage>748</fpage><lpage>757</lpage><pub-id pub-id-type="doi">10.1128/AEM.02239-08</pub-id><pub-id pub-id-type="pmid">19060168</pub-id></citation></ref>
<ref id="b19-genes-02-01017"><label>19.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Batut</surname><given-names>J.</given-names></name><name><surname>Andersson</surname><given-names>S.G.</given-names></name><name><surname>O'Callaghan</surname><given-names>D.</given-names></name></person-group><article-title>The evolution of chronic infection strategies in the alpha-Proteobacteria</article-title><source>Nat. Rev. Microbiol.</source><year>2004</year><volume>2</volume><fpage>933</fpage><lpage>945</lpage><pub-id pub-id-type="doi">10.1038/nrmicro1044</pub-id><pub-id pub-id-type="pmid">15550939</pub-id></citation></ref>
<ref id="b20-genes-02-01017"><label>20.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bernal</surname><given-names>A.</given-names></name><name><surname>Ear</surname><given-names>U.</given-names></name><name><surname>Kyrpides</surname><given-names>N.</given-names></name></person-group><article-title>Genomes OnLine Database (GOLD): A monitor of genome projects world-wide</article-title><source>Nucleic Acids Res.</source><year>2001</year><volume>29</volume><fpage>126</fpage><lpage>127</lpage><pub-id pub-id-type="doi">10.1093/nar/29.1.126</pub-id><pub-id pub-id-type="pmid">11125068</pub-id></citation></ref>
<ref id="b21-genes-02-01017"><label>21.</label><citation citation-type="web"><person-group person-group-type="author"><collab>GOLD database</collab></person-group><comment>Available online: <ext-link xlink:href="http://www.genomesonline.org/cgi-bin/index.cgi/" ext-link-type="uri">http://www.genomesonline.org/cgi-bin/index.cgi/</ext-link> (accessed on 8 November 2011)</comment></citation></ref>
<ref id="b22-genes-02-01017"><label>22.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Krieg</surname><given-names>N.R.</given-names></name><name><surname>Holt</surname><given-names>J.G.</given-names></name><name><surname>Bergey</surname><given-names>D.H.</given-names></name></person-group><source>Bergey's Manual of Systematic Bacteriology</source><publisher-name>Williams &amp; Wilkins</publisher-name><publisher-loc>Baltimore, MD, USA</publisher-loc><year>1984</year><volume>2</volume><fpage>1</fpage><lpage>574</lpage></citation></ref>
<ref id="b23-genes-02-01017"><label>23.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gibson</surname><given-names>K.E.</given-names></name><name><surname>Kobayashi</surname><given-names>H.</given-names></name><name><surname>Walker</surname><given-names>G.C.</given-names></name></person-group><article-title>Molecular determinants of a symbiotic chronic infection</article-title><source>Annu. Rev. Genet.</source><year>2008</year><volume>42</volume><fpage>413</fpage><lpage>441</lpage><pub-id pub-id-type="doi">10.1146/annurev.genet.42.110807.091427</pub-id><pub-id pub-id-type="pmid">18983260</pub-id></citation></ref>
<ref id="b24-genes-02-01017"><label>24.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Galardini</surname><given-names>M.</given-names></name><name><surname>Mengoni</surname><given-names>A.</given-names></name><name><surname>Brilli</surname><given-names>M.</given-names></name><name><surname>Pini</surname><given-names>F.</given-names></name><name><surname>Fioravanti</surname><given-names>A.</given-names></name><name><surname>Lucas</surname><given-names>S.</given-names></name><name><surname>Lapidus</surname><given-names>A.</given-names></name><name><surname>Cheng</surname><given-names>J.F.</given-names></name><name><surname>Goodwin</surname><given-names>L.</given-names></name><name><surname>Pitluck</surname><given-names>S.</given-names></name><etal/></person-group><article-title>Exploring the symbiotic pangenome of the nitrogen-fixing bacterium <italic>Sinorhizobium meliloti</italic></article-title><source>BMC Genomics</source><year>2011</year><volume>12</volume><fpage>235</fpage><pub-id pub-id-type="doi">10.1186/1471-2164-12-235</pub-id><pub-id pub-id-type="pmid">21569405</pub-id></citation></ref>
<ref id="b25-genes-02-01017"><label>25.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Finan</surname><given-names>T.M.</given-names></name><name><surname>Weidner</surname><given-names>S.</given-names></name><name><surname>Wong</surname><given-names>K.</given-names></name><name><surname>Buhrmester</surname><given-names>J.</given-names></name><name><surname>Chain</surname><given-names>P.</given-names></name><name><surname>Vorholter</surname><given-names>F.J.</given-names></name><name><surname>Hernandez-Lucas</surname><given-names>I.</given-names></name><name><surname>Becker</surname><given-names>A.</given-names></name><name><surname>Cowie</surname><given-names>A.</given-names></name><name><surname>Gouzy</surname><given-names>J.</given-names></name><etal/></person-group><article-title>The complete sequence of the 1,683-kb pSymB megaplasmid from the N<sub>2</sub>-fixing endosymbiont <italic>Sinorhizobium meliloti</italic></article-title><source>Proc. Natl. Acad. Sci. USA</source><year>2001</year><volume>98</volume><fpage>9889</fpage><lpage>9894</lpage><pub-id pub-id-type="doi">10.1073/pnas.161294698</pub-id><pub-id pub-id-type="pmid">11481431</pub-id></citation></ref>
<ref id="b26-genes-02-01017"><label>26.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Galibert</surname><given-names>F.</given-names></name><name><surname>Finan</surname><given-names>T.M.</given-names></name><name><surname>Long</surname><given-names>S.R.</given-names></name><name><surname>Puhler</surname><given-names>A.</given-names></name><name><surname>Abola</surname><given-names>P.</given-names></name><name><surname>Ampe</surname><given-names>F.</given-names></name><name><surname>Barloy-Hubler</surname><given-names>F.</given-names></name><name><surname>Barnett</surname><given-names>M.J.</given-names></name><name><surname>Becker</surname><given-names>A.</given-names></name><name><surname>Boistard</surname><given-names>P.</given-names></name><etal/></person-group><article-title>The composite genome of the legume symbiont <italic>Sinorhizobium meliloti</italic></article-title><source>Science</source><year>2001</year><volume>293</volume><fpage>668</fpage><lpage>672</lpage><pub-id pub-id-type="doi">10.1126/science.1060966</pub-id><pub-id pub-id-type="pmid">11474104</pub-id></citation></ref>
<ref id="b27-genes-02-01017"><label>27.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Garcia-Rodriguez</surname><given-names>F.M.</given-names></name><name><surname>Toro</surname><given-names>N.</given-names></name></person-group><article-title><italic>Sinorhizobium meliloti</italic> nfe (nodulation formation efficiency) genes exhibit temporal and spatial expression patterns similar to those of genes involved in symbiotic nitrogen fixation</article-title><source>Mol. Plant Microbe Interact.</source><year>2000</year><volume>13</volume><fpage>583</fpage><lpage>591</lpage><pub-id pub-id-type="doi">10.1094/MPMI.2000.13.6.583</pub-id><pub-id pub-id-type="pmid">10830257</pub-id></citation></ref>
<ref id="b28-genes-02-01017"><label>28.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bahar</surname><given-names>M.</given-names></name><name><surname>de Majnik</surname><given-names>J.</given-names></name><name><surname>Wexler</surname><given-names>M.</given-names></name><name><surname>Fry</surname><given-names>J.</given-names></name><name><surname>Poole</surname><given-names>P.S.</given-names></name><name><surname>Murphy</surname><given-names>P.J.</given-names></name></person-group><article-title>A model for the catabolism of rhizopine in <italic>Rhizobium leguminosarum</italic> involves a ferredoxin oxygenase complex and the inositol degradative pathway</article-title><source>Mol. Plant Microbe Interact.</source><year>1998</year><volume>11</volume><fpage>1057</fpage><lpage>1068</lpage><pub-id pub-id-type="doi">10.1094/MPMI.1998.11.11.1057</pub-id><pub-id pub-id-type="pmid">9805393</pub-id></citation></ref>
<ref id="b29-genes-02-01017"><label>29.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname><given-names>C.N.</given-names></name><name><surname>Chin</surname><given-names>K.H.</given-names></name><name><surname>Wang</surname><given-names>A.H.</given-names></name><name><surname>Chou</surname><given-names>S.H.</given-names></name></person-group><article-title>The first crystal structure of gluconolactonase important in the glucose secondary metabolic pathways</article-title><source>J. Mol. Biol.</source><year>2008</year><volume>384</volume><fpage>604</fpage><lpage>614</lpage><pub-id pub-id-type="doi">10.1016/j.jmb.2008.09.055</pub-id><pub-id pub-id-type="pmid">18848569</pub-id></citation></ref>
<ref id="b30-genes-02-01017"><label>30.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fujita</surname><given-names>T.</given-names></name></person-group><article-title>Senescence marker protein-30 (SMP30): Structure and biological function</article-title><source>Biochem. Biophys. Res. Commun.</source><year>1999</year><volume>254</volume><fpage>1</fpage><lpage>4</lpage><pub-id pub-id-type="doi">10.1006/bbrc.1998.9841</pub-id><pub-id pub-id-type="pmid">9920722</pub-id></citation></ref>
<ref id="b31-genes-02-01017"><label>31.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yamaguchi</surname><given-names>M.</given-names></name></person-group><article-title>Role of regucalcin in calcium signaling</article-title><source>Life Sci.</source><year>2000</year><volume>66</volume><fpage>1769</fpage><lpage>1780</lpage><pub-id pub-id-type="doi">10.1016/S0024-3205(99)00602-5</pub-id><pub-id pub-id-type="pmid">10809175</pub-id></citation></ref>
<ref id="b32-genes-02-01017"><label>32.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moscatiello</surname><given-names>R.</given-names></name><name><surname>Squartini</surname><given-names>A.</given-names></name><name><surname>Mariani</surname><given-names>P.</given-names></name><name><surname>Navazio</surname><given-names>L.</given-names></name></person-group><article-title>Flavonoid-induced calcium signalling in <italic>Rhizobium leguminosarum</italic> bv. <italic>viciae</italic></article-title><source>New Phytol.</source><year>2010</year><volume>188</volume><fpage>814</fpage><lpage>823</lpage><pub-id pub-id-type="doi">10.1111/j.1469-8137.2010.03411.x</pub-id></citation></ref>
<ref id="b33-genes-02-01017"><label>33.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Crosson</surname><given-names>S.</given-names></name><name><surname>McGrath</surname><given-names>P.T.</given-names></name><name><surname>Stephens</surname><given-names>C.</given-names></name><name><surname>McAdams</surname><given-names>H.H.</given-names></name><name><surname>Shapiro</surname><given-names>L.</given-names></name></person-group><article-title>Conserved modular design of an oxygen sensory/signaling network with species-specific output</article-title><source>Proc. Natl. Acad. Sci. USA</source><year>2005</year><volume>102</volume><fpage>8018</fpage><lpage>8023</lpage><pub-id pub-id-type="doi">10.1073/pnas.0503022102</pub-id><pub-id pub-id-type="pmid">15911751</pub-id></citation></ref>
<ref id="b34-genes-02-01017"><label>34.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Krause</surname><given-names>A.</given-names></name><name><surname>Ramakumar</surname><given-names>A.</given-names></name><name><surname>Bartels</surname><given-names>D.</given-names></name><name><surname>Battistoni</surname><given-names>F.</given-names></name><name><surname>Bekel</surname><given-names>T.</given-names></name><name><surname>Boch</surname><given-names>J.</given-names></name><name><surname>Bohm</surname><given-names>M.</given-names></name><name><surname>Friedrich</surname><given-names>F.</given-names></name><name><surname>Hurek</surname><given-names>T.</given-names></name><name><surname>Krause</surname><given-names>L.</given-names></name><etal/></person-group><article-title>Complete genome of the mutualistic, N<sub>2</sub>-fixing grass endophyte <italic>Azoarcus</italic> sp. strain BH72</article-title><source>Nat. Biotechnol.</source><year>2006</year><volume>24</volume><fpage>1385</fpage><lpage>1391</lpage><pub-id pub-id-type="doi">10.1038/nbt1243</pub-id><pub-id pub-id-type="pmid">17057704</pub-id></citation></ref>
<ref id="b35-genes-02-01017"><label>35.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Taghavi</surname><given-names>S.</given-names></name><name><surname>van der Lelie</surname><given-names>D.</given-names></name><name><surname>Hoffman</surname><given-names>A.</given-names></name><name><surname>Zhang</surname><given-names>Y.-B.</given-names></name><name><surname>Walla</surname><given-names>M.D.</given-names></name><name><surname>Vangronsveld</surname><given-names>J.</given-names></name><name><surname>Newman</surname><given-names>L.</given-names></name><name><surname>Monchy</surname><given-names>S.</given-names></name></person-group><article-title>Genome sequence of the plant growth promoting endophytic bacterium <italic>Enterobacter</italic> sp. 638</article-title><source>PLoS Genet.</source><year>2010</year><volume>6</volume><fpage>e1000943</fpage><pub-id pub-id-type="doi">10.1371/journal.pgen.1000943</pub-id><pub-id pub-id-type="pmid">20485560</pub-id></citation></ref>
<ref id="b36-genes-02-01017"><label>36.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fouts</surname><given-names>D.E.</given-names></name><name><surname>Tyler</surname><given-names>H.L.</given-names></name><name><surname>DeBoy</surname><given-names>R.T.</given-names></name><name><surname>Daugherty</surname><given-names>S.</given-names></name><name><surname>Ren</surname><given-names>Q.</given-names></name><name><surname>Badger</surname><given-names>J.H.</given-names></name><name><surname>Durkin</surname><given-names>A.S.</given-names></name><name><surname>Huot</surname><given-names>H.</given-names></name><name><surname>Shrivastava</surname><given-names>S.</given-names></name><name><surname>Kothari</surname><given-names>S.</given-names></name><etal/></person-group><article-title>Complete genome sequence of the N<sub>2</sub>-fixing broad host range endophyte <italic>Klebsiella pneumoniae</italic> 342 and virulence predictions verified in mice</article-title><source>PLoS Genet.</source><year>2008</year><volume>4</volume><fpage>e1000141</fpage><pub-id pub-id-type="doi">10.1371/journal.pgen.1000141</pub-id><pub-id pub-id-type="pmid">18654632</pub-id></citation></ref>
<ref id="b37-genes-02-01017"><label>37.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alcolea</surname><given-names>P.J.</given-names></name><name><surname>Alonso</surname><given-names>A.</given-names></name><name><surname>Gomez</surname><given-names>M.J.</given-names></name><name><surname>Moreno</surname><given-names>I.</given-names></name><name><surname>Dominguez</surname><given-names>M.</given-names></name><name><surname>Parro</surname><given-names>V.</given-names></name><name><surname>Larraga</surname><given-names>V.</given-names></name></person-group><article-title>Transcriptomics throughout the life cycle of <italic>Leishmania infantum</italic>: High down-regulation rate in the amastigote stage</article-title><source>Int. J. Parasitol.</source><year>2010</year><volume>40</volume><fpage>1497</fpage><lpage>1516</lpage><pub-id pub-id-type="doi">10.1016/j.ijpara.2010.05.013</pub-id><pub-id pub-id-type="pmid">20654620</pub-id></citation></ref>
<ref id="b38-genes-02-01017"><label>38.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Edgar</surname><given-names>R.C.</given-names></name></person-group><article-title>MUSCLE: A multiple sequence alignment method with reduced time and space complexity</article-title><source>BMC Bioinformatics</source><year>2004</year><volume>5</volume><fpage>113</fpage><pub-id pub-id-type="doi">10.1186/1471-2105-5-113</pub-id><pub-id pub-id-type="pmid">15318951</pub-id></citation></ref>
<ref id="b39-genes-02-01017"><label>39.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tamura</surname><given-names>K.</given-names></name><name><surname>Peterson</surname><given-names>D.</given-names></name><name><surname>Peterson</surname><given-names>N.</given-names></name><name><surname>Stecher</surname><given-names>G.</given-names></name><name><surname>Nei</surname><given-names>M.</given-names></name><name><surname>Kumar</surname><given-names>S.</given-names></name></person-group><article-title>MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods</article-title><source>Mol. Biol. Evol.</source><year>2011</year><volume>28</volume><fpage>2731</fpage><lpage>2739</lpage><pub-id pub-id-type="doi">10.1093/molbev/msr121</pub-id><pub-id pub-id-type="pmid">21546353</pub-id></citation></ref>
<ref id="b40-genes-02-01017"><label>40.</label><citation citation-type="book"><person-group person-group-type="author"><name><surname>Kim</surname><given-names>S.</given-names></name><name><surname>Jung</surname><given-names>K.</given-names></name><name><surname>Ryu</surname><given-names>K.</given-names></name></person-group><article-title>Automatic Orthologous-Protein-Clustering from Multiple Complete-Genomes by the Best Reciprocal BLAST Hits</article-title><source>Data Mining for Biomedical Applications</source><publisher-name>Springer</publisher-name><publisher-loc>Berlin/Heidelberg, Germany</publisher-loc><year>2006</year><volume>3916</volume><fpage>60</fpage><lpage>70</lpage></citation></ref>
<ref id="b41-genes-02-01017"><label>41.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Enright</surname><given-names>A.J.</given-names></name><name><surname>van Dongen</surname><given-names>S.</given-names></name><name><surname>Ouzounis</surname><given-names>C.A.</given-names></name></person-group><article-title>An efficient algorithm for large-scale detection of protein families</article-title><source>Nucleic Acids Res.</source><year>2002</year><volume>30</volume><fpage>1575</fpage><lpage>1584</lpage><pub-id pub-id-type="doi">10.1093/nar/30.7.1575</pub-id><pub-id pub-id-type="pmid">11917018</pub-id></citation></ref>
<ref id="b42-genes-02-01017"><label>42.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tatusov</surname><given-names>R.</given-names></name><name><surname>Fedorova</surname><given-names>N.</given-names></name><name><surname>Jackson</surname><given-names>J.</given-names></name><name><surname>Jacobs</surname><given-names>A.</given-names></name><name><surname>Kiryutin</surname><given-names>B.</given-names></name><name><surname>Koonin</surname><given-names>E.</given-names></name><name><surname>Krylov</surname><given-names>D.</given-names></name><name><surname>Mazumder</surname><given-names>R.</given-names></name><name><surname>Mekhedov</surname><given-names>S.</given-names></name><name><surname>Nikolskaya</surname><given-names>A.</given-names></name><etal/></person-group><article-title>The COG database: An updated version includes eukaryotes</article-title><source>BMC Bioinformatics</source><year>2003</year><volume>4</volume><fpage>41</fpage><pub-id pub-id-type="doi">10.1186/1471-2105-4-41</pub-id><pub-id pub-id-type="pmid">12969510</pub-id></citation></ref>
<ref id="b43-genes-02-01017"><label>43.</label><citation citation-type="journal"><person-group person-group-type="author"><collab>The Gene Ontology Consortium</collab><name><surname>Ashburner</surname><given-names>M.</given-names></name><name><surname>Ball</surname><given-names>C.A.</given-names></name><name><surname>Blake</surname><given-names>J.A.</given-names></name><name><surname>Botstein</surname><given-names>D.</given-names></name><name><surname>Butler</surname><given-names>H.</given-names></name><name><surname>Cherry</surname><given-names>J.M.</given-names></name><name><surname>Davis</surname><given-names>A.P.</given-names></name><name><surname>Dolinski</surname><given-names>K.</given-names></name><name><surname>Dwight</surname><given-names>S.S.</given-names></name><etal/></person-group><article-title>Gene ontology: Tool for the unification of biology</article-title><source>Nat. Genet.</source><year>2000</year><volume>25</volume><fpage>25</fpage><lpage>29</lpage><pub-id pub-id-type="doi">10.1038/75556</pub-id><pub-id pub-id-type="pmid">10802651</pub-id></citation></ref>
<ref id="b44-genes-02-01017"><label>44.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hunter</surname><given-names>S.</given-names></name><name><surname>Apweiler</surname><given-names>R.</given-names></name><name><surname>Attwood</surname><given-names>T.K.</given-names></name><name><surname>Bairoch</surname><given-names>A.</given-names></name><name><surname>Bateman</surname><given-names>A.</given-names></name><name><surname>Binns</surname><given-names>D.</given-names></name><name><surname>Bork</surname><given-names>P.</given-names></name><name><surname>Das</surname><given-names>U.</given-names></name><name><surname>Daugherty</surname><given-names>L.</given-names></name><name><surname>Duquenne</surname><given-names>L.</given-names></name><etal/></person-group><article-title>InterPro: The integrative protein signature database</article-title><source>Nucleic Acids Res.</source><year>2009</year><volume>37</volume><fpage>D211</fpage><lpage>D215</lpage><pub-id pub-id-type="doi">10.1093/nar/gkn785</pub-id><pub-id pub-id-type="pmid">18940856</pub-id></citation></ref></ref-list></back></article>
