<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xml:lang="en" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Genes</journal-id>
<journal-title>Genes</journal-title>
<issn pub-type="epub">2073-4425</issn>
<publisher>
<publisher-name>Molecular Diversity Preservation International (MDPI)</publisher-name></publisher></journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3390/genes2040925</article-id>
<article-id pub-id-type="publisher-id">genes-02-00925</article-id>
<article-categories>
<subj-group>
<subject>Article</subject></subj-group></article-categories>
<title-group>
<article-title>Conservation and Occurrence of Trans-Encoded sRNAs in the Rhizobiales</article-title></title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Reinkensmeier</surname><given-names>Jan</given-names></name><xref ref-type="aff" rid="af1-genes-02-00925"><sup>1</sup></xref><xref ref-type="author-notes" rid="fn1-genes-02-00925"><sup>†</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Schlüter</surname><given-names>Jan-Philip</given-names></name><xref ref-type="aff" rid="af2-genes-02-00925"><sup>2</sup></xref><xref ref-type="author-notes" rid="fn1-genes-02-00925"><sup>†</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Giegerich</surname><given-names>Robert</given-names></name><xref ref-type="aff" rid="af1-genes-02-00925"><sup>1</sup></xref></contrib>
<contrib contrib-type="author">
<name><surname>Becker</surname><given-names>Anke</given-names></name><xref ref-type="aff" rid="af2-genes-02-00925"><sup>2</sup></xref><xref ref-type="corresp" rid="c1-genes-02-00925"><sup>*</sup></xref></contrib></contrib-group>
<aff id="af1-genes-02-00925">
<label>1</label> Center for Biotechnology (CeBiTec), Bielefeld University, Universitätsstraße 27, 33615 Bielefeld, Germany; E-Mails: <email>jreinken@cebitec.uni-bielefeld.de</email> (J.R.); <email>robert@techfak.uni-bielefeld.de</email> (R.G.)</aff>
<aff id="af2-genes-02-00925">
<label>2</label> Institute of Biology III, Faculty of Biology, Albert-Ludwigs-University Freiburg, Schanzlestraße 1, 79102 Freiburg, Germany; E-Mail: <email>jan-philip.schlueter@biologie.uni-freiburg.de</email></aff>
<author-notes><fn id="fn1-genes-02-00925" fn-type="equal">
<label>†</label>
<p>These authors contributed equally to this work.</p></fn>
<corresp id="c1-genes-02-00925">
<label>*</label> Author to whom correspondence should be addressed; E-Mail: <email>anke.becker@uni-freiburg.de</email>; Tel.: +49-761-203-6948.</corresp></author-notes>
<pub-date pub-type="collection">
<year>2011</year></pub-date>
<pub-date pub-type="epub">
<day>18</day>
<month>11</month>
<year>2011</year></pub-date>
<volume>2</volume>
<issue>4</issue>
<fpage>925</fpage>
<lpage>956</lpage>
<history>
<date date-type="received">
<day>31</day>
<month>08</month>
<year>2011</year></date>
<date date-type="rev-recd">
<day>24</day>
<month>10</month>
<year>2011</year></date>
<date date-type="accepted">
<day>26</day>
<month>10</month>
<year>2011</year></date></history>
<permissions>
<copyright-statement>© 2011 by the authors; licensee MDPI, Basel, Switzerland.</copyright-statement>
<copyright-year>2011</copyright-year>
<license>
<p>This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).</p></license></permissions>
<abstract>
<p>Post-transcriptional regulation by trans-encoded sRNAs, for example via base-pairing with target mRNAs, is a common feature in bacteria and influences various cell processes, e.g., response to stress factors. Several studies based on computational and RNA-seq approaches identified approximately 180 trans-encoded sRNAs in <italic>Sinorhizobium meliloti</italic>. The starting point of this report is a set of 52 trans-encoded sRNAs derived from these studies. Sequence homology combined with structural conservation analyses were applied to elucidate the occurrence and distribution of conserved trans-encoded sRNAs in the order of Rhizobiales. This approach resulted in 39 RNA family models (RFMs) which showed various taxonomic distribution patterns. Whereas the majority of RFMs was restricted to <italic>Sinorhizobium</italic> species or the <italic>Rhizobiaceae</italic>, members of a few RFMs were more widely distributed in the Rhizobiales. Access to this data is provided via the RhizoGATE portal [<xref ref-type="bibr" rid="b1-genes-02-00925">1</xref>,<xref ref-type="bibr" rid="b2-genes-02-00925">2</xref>].</p></abstract>
<kwd-group>
<kwd>trans-encoded sRNAs</kwd>
<kwd>comparative analyses</kwd>
<kwd>Rhizobiales</kwd></kwd-group></article-meta></front>
<body>
<sec sec-type="intro">
<label>1.</label>
<title>Introduction</title>
<p>Over the past two decades, small noncoding RNAs (sRNAs) have gone from being regarded as an exceptional occurrence to being recognized as a general and ubiquitous feature of gene regulation in prokaryotic and eukaryotic life. sRNAs have been shown to be involved in several cellular processes, e.g., response to a variety of cell stresses, regulation of quorum sensing, and toxin-antitoxin systems [<xref ref-type="bibr" rid="b3-genes-02-00925">3</xref>–<xref ref-type="bibr" rid="b5-genes-02-00925">5</xref>]. Two major classes are distinguished by their genomic location and by perfect or imperfect sequence complementarity to specific mRNA targets: (i) cis-encoded sRNAs, located in antisense to their target mRNA and thus possessing perfect sequence complementarity, and (ii) trans-encoded sRNAs, located independently from potential targets, commonly in intergenic regions (IGRs), where sequence complementarity to a possible target can be imperfect or disrupted [<xref ref-type="bibr" rid="b6-genes-02-00925">6</xref>]. Noncoding transcripts, generally 50–250 nt in length, (i) act as activators or repressors of translation (OxyS/fhlA; DsrA/rpoS), (ii) are involved in mRNA stabilization or degradation (GadY/GadX; RyhB/sodB), or (iii) act by target mimicry (6S RNA/CsrB and CsrC) [<xref ref-type="bibr" rid="b6-genes-02-00925">6</xref>–<xref ref-type="bibr" rid="b14-genes-02-00925">14</xref>].</p>
<sec>
<label>1.1.</label>
<title><italic>In Silico</italic> Prediction of sRNAs</title>
<p>Functional analyses of RNAs revealed the relevance of secondary structure for their function. Various bioinformatics approaches that include RNA secondary structure information were developed to identify and analyze sRNAs, and <italic>in silico</italic> screens were performed in several bacteria. In <italic>Escherichia coli</italic>, initially four comprehensive analyses of IGRs were conducted, based on comparative sequence and secondary structure analyses as well as promoter and terminator predictions. Several hundred sRNA candidates were identified and 36 validated experimentally [<xref ref-type="bibr" rid="b15-genes-02-00925">15</xref>–<xref ref-type="bibr" rid="b18-genes-02-00925">18</xref>]. Following these studies, dozens of sRNA candidates were predicted in other bacteria using similar approaches, e.g., in <italic>Helicobacter pylori</italic> [<xref ref-type="bibr" rid="b19-genes-02-00925">19</xref>], <italic>Pseudomonas aeruginosa</italic> [<xref ref-type="bibr" rid="b20-genes-02-00925">20</xref>], and <italic>Nitrosomonas europaea</italic> [<xref ref-type="bibr" rid="b21-genes-02-00925">21</xref>]. Tools for de novo sRNA gene finding, such as RNA<sc>z</sc> [<xref ref-type="bibr" rid="b22-genes-02-00925">22</xref>] and E<sc>vo</sc>F<sc>old</sc> [<xref ref-type="bibr" rid="b23-genes-02-00925">23</xref>], use multiple genome alignments and focus on sequence and structure conservation. In contrast, the application of <sc>cmsearch</sc> [<xref ref-type="bibr" rid="b24-genes-02-00925">24</xref>] requires prior knowledge about a family of related sRNAs in different species. Scans with <sc>cmsearch</sc> are based on a combination of HMMs and covariance models. Agreement between approaches is low, and a potentially large number of false positives is predicted [<xref ref-type="bibr" rid="b25-genes-02-00925">25</xref>]. As validation of candidates by experimental methods is usually required anyway, researchers have increasingly turned towards experimental screens.</p></sec>
<sec>
<label>1.2.</label>
<title>Experimental Screens</title>
<p>High-throughput studies based on deep sequencing and tiling array technologies greatly increased the potential for sRNA identification. Transcriptome studies of, e.g., <italic>H. pylori</italic>, <italic>Caulobacter crescentus</italic>, and <italic>Synechocystis sp</italic>. PCC6803 revealed hundreds of new sRNAs [<xref ref-type="bibr" rid="b26-genes-02-00925">26</xref>–<xref ref-type="bibr" rid="b28-genes-02-00925">28</xref>]. In the order of Rhizobiales, transcriptome analyses focused on sRNA identification were reported for <italic>Rhizobium etli</italic>, <italic>Agrobacterium tumefaciens</italic>, and <italic>S. meliloti</italic> 1021. A tiling array study of the <italic>R. etli</italic> transcriptome resulted in identification of 17 putative trans-encoded sRNAs and 49 cis-encoded antisense sRNAs [<xref ref-type="bibr" rid="b29-genes-02-00925">29</xref>,<xref ref-type="bibr" rid="b30-genes-02-00925">30</xref>]. A deep sequencing approach in <italic>A. tumefaciens</italic> C58 identified 228 sRNA transcripts, 22 of which were experimentally confirmed via Northern blot experiments [<xref ref-type="bibr" rid="b31-genes-02-00925">31</xref>]. Besides individually detected and characterized sRNAs in <italic>S. meliloti</italic>, e.g., IncA, tmRNA, 4.5S RNA, and RNase P, bioinformatics-based studies identified a set of sRNAs which were further validated by Northern blot experiments [<xref ref-type="bibr" rid="b32-genes-02-00925">32</xref>–<xref ref-type="bibr" rid="b37-genes-02-00925">37</xref>]. Recently, a comprehensive deep sequencing approach combined with microarray analyses extended the number of trans-encoded sRNAs to approximately 180 [<xref ref-type="bibr" rid="b38-genes-02-00925">38</xref>].</p>
<p>An experimental screen delivers <italic>bona fide</italic> sRNA transcripts, with no obvious hints towards a potential functional role. By necessity, it starts from a single species and does not by itself incorporate phylogenetic information. Hence, it calls for a subsequent <italic>in silico</italic> study where the transcripts obtained for one species are taken as pivot elements to study their conservation and distribution in larger phylogenetic units. One intrinsic limitation of this approach is clear: an sRNA widely distributed, e.g., in the Rhizobiales, but lacking in <italic>S. meliloti</italic>, cannot be found. Hence, a complete survey of the phylogenetic order or even class from a single pivot organism is not possible.</p></sec>
<sec sec-type="methods">
<label>1.3.</label>
<title>Overview of the Present Study</title>
<p>The present study starts from <italic>S. meliloti</italic> 1021 as the pivot organism and from 52 trans-encoded sRNA transcripts obtained in our aforementioned study [<xref ref-type="bibr" rid="b38-genes-02-00925">38</xref>]. For each transcript, we performed homology searches and constructed RNA family models (RFMs). Our goals are twofold:
<list list-type="bullet">
<list-item>
<p>We want to increase our knowledge about the distribution pattern of potential sRNAs conserved in the Rhizobiales;</p></list-item>
<list-item>
<p>We want to automate the bioinformatics steps that are necessary for RFM construction, as far as it is possible utilizing present-day bioinformatics tools.</p></list-item></list></p>
<p>The present article describes the RFM construction process, and discusses our observations made when applying these models to the Rhizobiales.</p></sec>
<sec>
<label>1.4.</label>
<title>Our Pivot Organism and Its Relatives</title>
<p>The endosymbiont <italic>S. meliloti</italic> exists in two different life forms, either in a free-living state as a soil bacterium or in a symbiotic relationship with its leguminous host plants, e.g., <italic>Medicago sativa</italic>. In response to flavonoids secreted by the host plant <italic>S. meliloti</italic> induces the formation of root nodules. These are colonized by the bacteria which inside the nodules differentiate to endosymbiotic bacteroids that are capable of nitrogen fixation. Bacteroids support the plant with ammonia and in turn receive C4-metabolites, e.g., succinate, from the host [<xref ref-type="bibr" rid="b39-genes-02-00925">39</xref>].</p>
<p>The genome of <italic>S. meliloti</italic> consists of three replicons, a single chromosome (3.65 Mbp) and two megaplasmids pSymA (1.35 Mbp) and pSymB (1.68 Mbp). The chromosome encodes 3,351 genes predominantly involved in housekeeping functions. The 1,293 genes on megaplasmid pSymA encode, among other functions, the symbiotic apparatus. pSymB carries 1,583 genes mainly involved in exopolysaccharide synthesis and transporter functions [<xref ref-type="bibr" rid="b40-genes-02-00925">40</xref>–<xref ref-type="bibr" rid="b42-genes-02-00925">42</xref>].</p>
<p>Within the order of Rhizobiales, sequenced plant symbionts include <italic>Mesorhizobium loti</italic>, <italic>Sinorhizobium fredii</italic>, <italic>R. etli</italic>, and <italic>Sinorhizobium medicae</italic>. The order of Rhizobiales also comprises completely sequenced human, animal, and plant pathogens. The animal pathogen <italic>Brucella melitensis</italic>, for example, generally infects sheep and goats, but can act as a human pathogen as well [<xref ref-type="bibr" rid="b43-genes-02-00925">43</xref>]. <italic>Bartonella henselae</italic> is responsible for cat-scratch disease in humans [<xref ref-type="bibr" rid="b44-genes-02-00925">44</xref>]. A well-studied plant pathogen is <italic>A. tumefaciens</italic>, which infects several dicotyledons and serves as a powerful tool in plant genetics [<xref ref-type="bibr" rid="b45-genes-02-00925">45</xref>].</p></sec></sec>
<sec sec-type="results|discussion">
<label>2.</label>
<title>Results and Discussion</title>
<sec>
<label>2.1.</label>
<title>From sRNA Transcripts to Family Models</title>
<p>We define our notion of RNA family models and give an informal overview of how they are constructed, before we proceed to report on the findings obtained with these models. Details of family model construction are presented in the Methods section.</p>
<sec>
<label>2.1.1.</label>
<title>RNA Family Models: Terminology</title>
<p>The deep sequencing approach by Schlüter <italic>et al.</italic> [<xref ref-type="bibr" rid="b38-genes-02-00925">38</xref>] elucidated the existence of approximately 1,100 noncoding transcripts encoded on the <italic>S. meliloti</italic> genome, about 180 of which were trans-encoded. Due to the presumed function as regulatory sRNAs, a subset of 52 trans-encoded transcripts was chosen for a first comparative study (see Section 3.1). Our pivotal transcripts are named <italic>SmelXnnn</italic>, consistent with Schlüter <italic>et al.</italic> [<xref ref-type="bibr" rid="b38-genes-02-00925">38</xref>], where <italic>X</italic> ∈ {<italic>A</italic>, <italic>B</italic>, <italic>C</italic>} denotes the location on pSymA, pSymB, or chromosome, respectively. Potentially related sRNAs found by candidate search are simply named RNA1, RNA2, <italic>etc</italic>.</p>
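<p>The naming convention can be illustrated with a minimal sketch. The helper below is hypothetical and not part of the pipeline used in this study; the function name, the lookup table, and the assumption of exactly three digits are ours. It only decodes the replicon letter and running number of a <italic>SmelXnnn</italic> identifier.</p>
<preformat>
```python
import re

# Replicon letters as defined in the text: A = pSymA, B = pSymB, C = chromosome.
REPLICON_BY_LETTER = {"A": "pSymA", "B": "pSymB", "C": "chromosome"}

# Assumed identifier shape: "Smel", one replicon letter, three digits.
NAME_PATTERN = re.compile(r"Smel([ABC])(\d{3})")


def parse_transcript_name(name):
    """Return (replicon, running number) for a SmelXnnn identifier."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a SmelXnnn identifier: {name!r}")
    letter, number = match.groups()
    return REPLICON_BY_LETTER[letter], int(number)
```
</preformat>
<p>Under these assumptions, SmelC749 decodes to the chromosome and SmelA001 to pSymA, whereas candidate names such as RNA1 are rejected.</p>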
<p>For all transcripts, we constructed RNA family models. Informally speaking, an RFM is a set of related sRNAs combined with a method to scan genomes to search for additional family members. Creating RNA family models is not a fully automated process, but requires both high computational effort and human curation.</p>
<p>We constructed two types of RFMs:
<list list-type="bullet">
<list-item>
<p><italic>Covariance models</italic> (CMs) are stochastic models, capturing sequence and structure conservation in an alignment of family members. CMs can be automatically constructed by <sc>infernal</sc> [<xref ref-type="bibr" rid="b24-genes-02-00925">24</xref>], given such an alignment;</p></list-item>
<list-item>
<p><italic>Thermodynamic matchers</italic> (TDMs) are RNA folding programs, based on the established thermodynamic model, but tailored to a specific structural motif [<xref ref-type="bibr" rid="b46-genes-02-00925">46</xref>]. Production of such matchers is supported by the graphical editor L<sc>ocomotif</sc> [<xref ref-type="bibr" rid="b47-genes-02-00925">47</xref>].</p></list-item></list></p>
<p>Both approaches to RFM construction are complementary. When sequence conservation is high enough such that a trustworthy multiple sequence alignment and consensus structure can be established, CMs can be constructed automatically. TDMs are appropriate if sequence conservation is much weaker than structure conservation, such that no candidates are found by sequence similarity search, or they cannot be aligned well. TDMs focus on structure and folding energy; they can ignore sequence conservation in some parts, e.g., in helices, and yet insist on conserved sequence motifs elsewhere, e.g., in loops. Building such a matcher requires human design decisions and some experimentation, and hence, it is more laborious. In this study, we constructed CMs as a rule and TDMs for selected families of special interest to promote identification of further family members.</p></sec>
<sec>
<label>2.1.2.</label>
<title>Overview of the Model Construction Process</title>
<p><xref ref-type="fig" rid="f1-genes-02-00925">Figure 1</xref> gives an overview of our CM construction pipeline. Phase 1 identifies putative homologous RNAs by iterative searches focusing on sequence similarity. Phase 2 constructs an initial family model based on sequence and conserved structure, and uses this model to search all Rhizobiales for further homologs. After adding these to the family, Phase 2 is also iterated.</p>
<p><xref ref-type="fig" rid="f2-genes-02-00925">Figure 2</xref> gives an overview of the TDM construction process. Here, we start from a transcript that has a well-defined secondary structure. First, we create a graphical description of this structure, using the L<sc>ocomotif</sc> editor. The graphics can be annotated with size constraints for structural components, and with required sequence motifs. The graphics is then compiled by <sc>locomotif</sc> into a TDM. We use the TDM to scan other bacterial genomes in order to find subsequences that fold well into the described structure motif. The assessment of candidates is used to adapt the design of the TDM to be more restrictive or more relaxed.</p>
<p>Both methods of family model construction use the same assessment step, which checks for further evidence: preservation of synteny, quality of the alignment against the pivotal transcript, and energy of a free folding. For details, we refer the reader to the Methods section.</p>
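<p>How the three kinds of evidence might be combined can be sketched as follows. The record type, the field names, and both thresholds are illustrative assumptions and are not values taken from this study; the actual criteria are those described in the Methods section.</p>
<preformat>
```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    synteny_preserved: bool    # is the genomic neighbourhood conserved?
    alignment_identity: float  # fraction identical to the pivot, 0..1
    folding_energy: float      # free-folding energy in kcal/mol


def count_evidence(candidate, min_identity=0.6, max_energy=-20.0):
    """Count how many of the three checks a candidate passes.

    Both thresholds are illustrative assumptions, not values from the study.
    """
    checks = [
        candidate.synteny_preserved,
        candidate.alignment_identity >= min_identity,
        max_energy >= candidate.folding_energy,  # lower energy is better
    ]
    return sum(checks)
```
</preformat>
<p>A candidate that passes all three checks scores 3. A simple count is only one possible design; a curator may weight synteny more strongly than folding energy.</p>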
<p>The family models are named after their pivotal elements, e.g., <italic>RFM<sub>SmelA</sub></italic><sub>001</sub>. These RFMs by themselves constitute an essential part of the results of this study, as they can be used (and extended further) to increase our knowledge about sRNAs in bacteria—beyond the findings that are reported here.</p></sec></sec>
<sec>
<label>2.2.</label>
<title>Distribution Pattern of Trans-Encoded sRNAs in the Rhizobiales</title>
<p>The 52 analyzed trans-encoded sRNAs from <italic>S. meliloti</italic> and their relatives are collected in 39 RFMs (<xref ref-type="fig" rid="f3-genes-02-00925">Figure 3</xref>, Table S1). At first glance, they show a distribution of sRNAs in good accordance with phylogeny. In this subsection, we study their distribution in detail, moving from <italic>S. meliloti</italic> strains to higher taxonomic levels. We will refer to <xref ref-type="fig" rid="f3-genes-02-00925">Figure 3</xref> and Table S1 throughout this discussion.</p>
<p>Among our 52 transcripts, 34 map to a single origin in <italic>S. meliloti</italic> 1021 and give rise to 34 RFMs. Of these 34 transcripts, 50% (17), 29.4% (10) and 20.6% (7) are located on the chromosome, pSymA and pSymB, respectively. The remaining 18 transcripts show strong sequence similarity to other transcripts, originate from multiple loci and replicons, and were grouped into five RFMs: <italic>RFM<sub>SmelA</sub></italic><sub>003</sub>, <italic>RFM<sub>SmelA</sub></italic><sub>075</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>044</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>053</sub>, and <italic>RFM<sub>SmelB</sub></italic><sub>126</sub>.</p>
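<p>The figures in this paragraph can be checked with a few lines of arithmetic. The snippet uses only the counts stated above; the variable names are ours.</p>
<preformat>
```python
# Counts of the 34 single-origin transcripts per replicon, as stated in the text.
counts = {"chromosome": 17, "pSymA": 10, "pSymB": 7}

total = sum(counts.values())

# Relative distribution in percent, rounded to one decimal place.
shares = {replicon: round(100 * n / total, 1) for replicon, n in counts.items()}

# 34 single-origin RFMs plus 5 multi-copy RFMs give the 39 RFMs of the study,
# and 34 single-origin plus 18 multi-copy transcripts give the 52 transcripts.
total_rfms = total + 5
total_transcripts = total + 18
```
</preformat>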
<sec>
<label>2.2.1.</label>
<title>Trans-Encoded sRNAs Delimited to the <italic>S. meliloti</italic> Strains 1021, BL225C, and AK83</title>
<p>Eleven of our transcripts appear to be restricted to <italic>S. meliloti</italic> strains, which share a core genome of approximately 5,100 genes dispersed on three replicons: a single chromosome, a second chromosome/megaplasmid and a symbiotic megaplasmid [<xref ref-type="bibr" rid="b40-genes-02-00925">40</xref>–<xref ref-type="bibr" rid="b42-genes-02-00925">42</xref>,<xref ref-type="bibr" rid="b49-genes-02-00925">49</xref>]. However, in AK83 two additional small plasmids were identified with a few genetic features corresponding to syntenic regions of the 1021 and BL225C symbiotic replicons [<xref ref-type="bibr" rid="b49-genes-02-00925">49</xref>]. RFMs of SmelA001, SmelA018, SmelA019, SmelA020, SmelA054, SmelA056, SmelB064, and SmelC032 reveal homologous sequences in the <italic>S. meliloti</italic> strains 1021, BL225C and AK83, while relatives of SmelA014 and SmelA022 are limited to BL225C, the strain most closely related to <italic>S. meliloti</italic> 1021 [<xref ref-type="bibr" rid="b49-genes-02-00925">49</xref>]. No homologous sequences were identified for SmelC749. Thus, it represents the only trans-encoded sRNA specific to <italic>S. meliloti</italic> 1021 identified in our study. RFMs deduced from pSymA-located sRNAs are composed of relatives located on the replicons psiNMEB01 and chromosome 3 of <italic>S. meliloti</italic> BL225C and AK83, respectively. Both psiNMEB01 of <italic>S. meliloti</italic> BL225C and chromosome 3 of AK83 share functional similarities with the symbiotic megaplasmid pSymA. Along this line, the RFM members of SmelB064 (pSymB) are located on pSymB-like replicons (BL225C psiNMEB02 and AK83 chromosome 2), while SmelC032 relatives are located on chromosome-like replicons (BL225C chromosome and AK83 chromosome 1) [<xref ref-type="bibr" rid="b40-genes-02-00925">40</xref>,<xref ref-type="bibr" rid="b41-genes-02-00925">41</xref>,<xref ref-type="bibr" rid="b49-genes-02-00925">49</xref>].</p>
<p>The RFMs of SmelA001, SmelA014, SmelA018, SmelA019, SmelA020, SmelA022, SmelA054, SmelA056, SmelB064, and SmelC032 contain relatives delimited to the three <italic>S. meliloti</italic> strains, which predominantly map to their particular symbiotic plasmids. Consequently, on an evolutionary scale the emergence of these transcripts is more likely a recent <italic>S. meliloti</italic>-specific event than a loss of the sRNA during evolution of all non-<italic>S. meliloti</italic> strains. This is in good agreement with Galibert <italic>et al</italic>. [<xref ref-type="bibr" rid="b40-genes-02-00925">40</xref>], who concluded that pSymA was acquired more recently in the evolutionary history of <italic>S. meliloti</italic>. Genomic analyses revealed a divergence of genome contents between pSymA and the two remaining replicons [<xref ref-type="bibr" rid="b40-genes-02-00925">40</xref>]. González <italic>et al.</italic> hypothesized that the emergence, remodeling, and annihilation of accessory plasmids in general are highly variable within the Rhizobiales [<xref ref-type="bibr" rid="b50-genes-02-00925">50</xref>,<xref ref-type="bibr" rid="b51-genes-02-00925">51</xref>]. Our findings support the conclusion that emergence of trans-encoded sRNAs on accessory plasmids in general and on <italic>S. meliloti</italic>-specific symbiotic plasmids in particular is predominantly a recent evolutionary event. However, SmelB044 and SmelB064 have emerged on the pSymB-like replicon while SmelC032 and SmelC749 evolved on the ancestral chromosome. Thus, <italic>S. meliloti</italic>-specific sRNA emergence is not restricted to the symbiotic plasmids.</p></sec>
<sec>
<label>2.2.2.</label>
<title>Trans-Encoded sRNAs in the Genus <italic>Sinorhizobium</italic></title>
<p>Eighteen RFMs show an extended set of relatives in the <italic>Sinorhizobium/Ensifer</italic> group. The genus comprises, among others, the most closely related bacteria <italic>S. meliloti</italic> 1021, BL225C, and AK83, the next related <italic>S. medicae</italic> WSM419, and <italic>S. fredii</italic> NGR234, which has the largest phylogenetic distance to <italic>S. meliloti</italic>.</p>
<p>Trans-encoded sRNAs of <italic>RFM<sub>SmelA</sub></italic><sub>003</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>003</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>008</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>009</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>033</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>044</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>075</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>095</sub>, and <italic>RFM<sub>SmelB</sub></italic><sub>126</sub> were identified in <italic>S. medicae</italic> but not in <italic>S. fredii</italic>. The <italic>S. medicae</italic> WSM419 genome consists of four replicons, a circular chromosome and three plasmids, pSMED01 (1.5 Mbp), pSMED02 (1.2 Mbp), and pSMED03 (0.2 Mbp) [<xref ref-type="bibr" rid="b52-genes-02-00925">52</xref>]. The genomic distribution of these sRNAs reveals a strong overrepresentation (89%) on pSymB-like replicons (which are represented by pSMED01 in <italic>S. medicae</italic>).</p>
<p>In contrast, <italic>RFM<sub>SmelC</sub></italic><sub>055</sub>, <italic>RFM<sub>SmelC</sub></italic><sub>416</sub>, <italic>RFM<sub>SmelC</sub></italic><sub>434</sub>, <italic>RFM<sub>SmelC</sub></italic><sub>500</sub>, <italic>RFM<sub>SmelC</sub></italic><sub>507</sub>, <italic>RFM<sub>SmelC</sub></italic><sub>549</sub>, <italic>RFM<sub>SmelC</sub></italic><sub>601</sub>, <italic>RFM<sub>SmelC</sub></italic><sub>775</sub>, and <italic>RFM<sub>SmelC</sub></italic><sub>776</sub> commonly include additional relatives in both <italic>S. medicae</italic> and <italic>S. fredii</italic>. <italic>S. fredii</italic> NGR234 has a single chromosome (3.93 Mbp) and two additional plasmids, pNGR234a (0.54 Mbp) and pNGR234b (2.43 Mbp), of which the smaller plasmid encodes the symbiotic features [<xref ref-type="bibr" rid="b51-genes-02-00925">51</xref>]. RFMs derived from trans-encoded sRNAs occurring in all <italic>Sinorhizobium</italic> strains, including <italic>S. fredii</italic>, are composed of members that predominantly map to the chromosomal replicons. This is a notable difference from the 20 RFMs which are restricted to <italic>S. meliloti</italic> and <italic>S. medicae</italic> strains and whose members predominantly map to the megaplasmids. An exception is a single relative of SmelC434 that is located on megaplasmid pNGR234b in <italic>S. fredii</italic>. With the exception of the multicopy sRNAs SmelA003 and SmelB126, no unique sRNAs were found on the symbiotic plasmids in any case. This is in good agreement with the strong fluctuation of accessory plasmids [<xref ref-type="bibr" rid="b53-genes-02-00925">53</xref>]. Furthermore, sizes of the symbiotic plasmids in <italic>S. fredii</italic> (0.54 Mbp) and <italic>S. meliloti</italic> strains (1.3 Mbp) differ by approximately 0.8 Mbp [<xref ref-type="bibr" rid="b41-genes-02-00925">41</xref>,<xref ref-type="bibr" rid="b51-genes-02-00925">51</xref>] and thus indicate a broad remodeling pattern even in the closely related members of the <italic>Sinorhizobium/Ensifer</italic> group.</p></sec>
<sec>
<label>2.2.3.</label>
<title>Trans-Encoded sRNAs in the <italic>Rhizobiaceae</italic></title>
<p>Most of the analyzed sRNA families (32 out of 39) are restricted to the <italic>Rhizobiaceae</italic>. RFMs of SmelA033, SmelC151, and SmelC165 hold members with origins in the <italic>Sinorhizobium</italic>, <italic>Rhizobium</italic>, and <italic>Agrobacterium</italic> species. <italic>RFM<sub>SmelA</sub></italic><sub>033</sub> shows an unusual distribution pattern with representatives in <italic>S. meliloti</italic>, <italic>S. medicae</italic>, and <italic>R. etli</italic>. <italic>RFM<sub>SmelC</sub></italic><sub>151</sub> comprises members distributed in the whole <italic>Rhizobiaceae</italic> except for the <italic>Candidatus Liberibacter</italic> genus and <italic>S. meliloti</italic> AK83. <italic>RFM<sub>SmelC</sub></italic><sub>165</sub> displays a similar pattern but lacks relatives in <italic>A. tumefaciens</italic> and <italic>A. radiobacter</italic>.</p></sec>
<sec>
<label>2.2.4.</label>
<title>Complex Distribution of Trans-Encoded sRNAs in the Order of Rhizobiales</title>
<p>Slater <italic>et al.</italic> [<xref ref-type="bibr" rid="b53-genes-02-00925">53</xref>] proposed that the ancestor of the Rhizobiales was a unichromosomal organism that acquired an additional, ancestral plasmid. This was underlined by a high proportion of conserved primary chromosomes in Rhizobiales genera, e.g., <italic>Rhizobium</italic>, <italic>Sinorhizobium</italic>, <italic>Brucella</italic>, <italic>Bradyrhizobium</italic>, and <italic>Mesorhizobium</italic>. The theory of an ancestral plasmid was supported by the existence of several gene clusters that are conserved on the second chromosomes/megaplasmids while generally missing on the primary chromosomes [<xref ref-type="bibr" rid="b53-genes-02-00925">53</xref>]. Further, due to intragenomic gene transfers, essential genes, e.g., the tRNA-Arg encoding gene in <italic>S. meliloti</italic>, are sporadically relocated to the second chromosomes/megaplasmids [<xref ref-type="bibr" rid="b40-genes-02-00925">40</xref>,<xref ref-type="bibr" rid="b53-genes-02-00925">53</xref>]. Further replicons, in addition to the ancestral chromosome and plasmid, were determined as accessory plasmids with beneficial but non-essential features [<xref ref-type="bibr" rid="b53-genes-02-00925">53</xref>]. All this is in good agreement with our findings about sRNAs in the Rhizobiales.</p>
<p>The comprehensive RFMs of SmelA075, SmelA099, SmelB053, SmelC023, SmelC289, SmelC291, and SmelC671 comprise members in the <italic>Phyllobacteriaceae</italic>, <italic>Brucellaceae</italic>, <italic>Bartonellaceae</italic>, <italic>Bradyrhizobiaceae</italic>, <italic>Methylobacteriaceae</italic>, <italic>Beijerinckiaceae</italic>, and <italic>Hyphomicrobiaceae</italic>. The occurrence of these transcripts is restricted to the chromosomal replicons, with the exception of <italic>RFM<sub>SmelA</sub></italic><sub>075</sub>, <italic>RFM<sub>SmelA</sub></italic><sub>099</sub>, and <italic>RFM<sub>SmelB</sub></italic><sub>053</sub>, whose members occur several times in each genome with copies on each replicon. RFMs of SmelC023 and SmelC289 show an identical distribution pattern in the Rhizobiales. Both are represented by 29 sRNA relatives in the <italic>Rhizobiaceae</italic>, <italic>Brucellaceae</italic>, and <italic>Phyllobacteriaceae</italic>. SmelC671 has additional relatives in <italic>Bradyrhizobiaceae</italic> and <italic>Methylobacteriaceae</italic>. <italic>RFM<sub>SmelC</sub></italic><sub>291</sub> exhibits a more fragmentary occurrence in the Rhizobiales with relatives in the <italic>Rhizobiaceae</italic>, <italic>Phyllobacteriaceae</italic>, <italic>Xanthobacteraceae</italic>, <italic>Beijerinckiaceae</italic>, and <italic>Hyphomicrobiaceae</italic>.</p>
<p>RFMs of SmelA075, SmelC023, SmelC289, SmelC291, and SmelC671 show a broad distribution pattern within the Rhizobiales. Each sRNA family has relatives on primary chromosomes in the Rhizobiales strains; thus, an ancestral trans-encoded sRNA for each of these models presumably arose early in Rhizobiales evolution. However, <italic>RFM<sub>SmelA</sub></italic><sub>075</sub> consists of presumably paralogous copies on different replicons, e.g., each replicon in <italic>S. meliloti</italic> 1021 harbors at least one copy. The strong conservation in the Rhizobiales and the occurrence of at least a subset of copies on primary chromosomes suggest that the duplication and transfer events originated from loci on ancestral chromosomes. Members of <italic>RFM<sub>SmelC</sub></italic><sub>291</sub> and <italic>RFM<sub>SmelC</sub></italic><sub>671</sub> are indeed distributed across the whole Rhizobiales, but are absent from several taxonomic families, e.g., <italic>Bartonellaceae</italic>, <italic>Bradyrhizobiaceae</italic> and <italic>Xanthobacteriaceae</italic>.</p>
<p>Similarly, the <italic>Rhizobiaceae</italic>-specific <italic>RFM<sub>SmelA</sub></italic><sub>033</sub> and <italic>RFM<sub>SmelC</sub></italic><sub>165</sub> show a dispersed occurrence pattern, the former with representatives only in the <italic>R. etli</italic> strains. The latter is widely distributed in the <italic>Rhizobiaceae</italic> but does not occur in <italic>A. tumefaciens</italic> C58 and <italic>A. sp.</italic> H13-3. Presumably, the functional relevance of these transcripts has been lost since the specific emergence of these taxonomic families in their ecological niche. Generally, trans-encoded sRNAs act via base pairing with their mRNA targets or interact with RNA binding proteins [<xref ref-type="bibr" rid="b6-genes-02-00925">6</xref>]. In a preceding evolutionary step, the target mRNA was presumably removed from the genome or otherwise disrupted. This event in turn left a redundant, non-functional sRNA that was removed over time. <italic>RFM<sub>SmelC</sub></italic><sub>151</sub> has relatives in the <italic>Rhizobium/Agrobacterium</italic> as well as the <italic>Sinorhizobium/Ensifer</italic> group and serves as a good example of a <italic>Rhizobia</italic>-specific sRNA.</p>
<p>Our study reveals relatives of SmelA033, SmelA075, SmelA099, SmelB053, SmelC023, SmelC151, SmelC165, SmelC289, SmelC291, and SmelC671 in both <italic>Rhizobium etli</italic> strains. A genome-wide tiling array study for <italic>R. etli</italic> CFN42 [<xref ref-type="bibr" rid="b30-genes-02-00925">30</xref>] identified, among others, 17 noncoding RNAs. ReC06, ReC25, ReC26, and ReC71 were re-identified within this study as relatives of SmelC023, SmelC289, SmelC291, and SmelC671, respectively. Related transcripts of <italic>RFM<sub>SmelC</sub></italic><sub>291</sub> were confirmed via Northern blot analyses in <italic>S. meliloti</italic>, <italic>S. fredii</italic>, <italic>R. etli</italic> and <italic>R. leguminosarum</italic> strains, which underlines the informative value of the comparative approach applied in this study [<xref ref-type="bibr" rid="b54-genes-02-00925">54</xref>]. Recently, a deep sequencing study using the 454-pyrosequencing technology identified about 228 noncoding transcripts located on the three <italic>A. tumefaciens</italic> C58 replicons. A subset of 22 sRNAs was additionally confirmed via Northern blot analyses. Eight of the RFMs computed in our study comprise members in <italic>A. tumefaciens</italic>, and five of these members, namely C1 (RNA8 of <italic>RFM<sub>SmelA</sub></italic><sub>075</sub>), C2 (RNA12 of <italic>RFM<sub>SmelC</sub></italic><sub>023</sub>), C5 (RNA11 of <italic>RFM<sub>SmelC</sub></italic><sub>289</sub>), C6 (RNA12 of <italic>RFM<sub>SmelC</sub></italic><sub>291</sub>), and L5 (RNA25 of <italic>RFM<sub>SmelA</sub></italic><sub>099</sub>), were experimentally verified in <italic>A. tumefaciens</italic> by both deep sequencing and Northern blot experiments [<xref ref-type="bibr" rid="b31-genes-02-00925">31</xref>].</p>
<p>Remarkably, none of the RFMs with relatives beyond the <italic>Sinorhizobium/Ensifer</italic> group has a corresponding sRNA in the genus <italic>Liberibacter</italic>. Phylogenetic analyses identified the <italic>Liberibacter</italic> species as the most divergent species within the <italic>Rhizobiaceae</italic>, with the largest distance to the root of this family. This might explain the lack of homologs of <italic>Rhizobiaceae</italic>-specific sRNAs in this genus [<xref ref-type="bibr" rid="b55-genes-02-00925">55</xref>]. Even for the most conserved models, e.g., <italic>RFM<sub>SmelC</sub></italic><sub>671</sub>, relatives in the <italic>Liberibacter</italic> genus were not found. In this context, it has to be noted that the genome of the <italic>Liberibacter</italic> genus consists of a relatively small chromosome, only 1.2 Mbp in size. Compared to the remaining <italic>Rhizobiaceae</italic>, enormous genomic capacity, including open reading frames and sRNA genes, has been lost within the <italic>Liberibacter</italic> lineage.</p></sec></sec>
<sec>
<label>2.3.</label>
<title>Microsynteny</title>
<p>Microsynteny means the preservation of the adjacent protein-coding gene upstream or downstream of a putative sRNA locus. Gene function is usually not affected by a gene's position relative to its genomic neighborhood. Consequently, synteny is lost much faster than sequence similarity and represents a sensitive indicator of genome evolution [<xref ref-type="bibr" rid="b56-genes-02-00925">56</xref>–<xref ref-type="bibr" rid="b58-genes-02-00925">58</xref>].</p>
<p>To classify the dimension of microsynteny for the 39 RFMs, four categories were specified.
<list list-type="bullet">
<list-item>
<p>Complete microsynteny (type I) is determined for relatives of both neighboring genes;</p></list-item>
<list-item>
<p>extensive microsynteny (type II) means the majority of genes shares homology but with a few exceptions;</p></list-item>
<list-item>
<p>partial microsynteny (type III) specifies the homology to a single adjacent gene and</p></list-item>
<list-item>
<p>fragmented microsynteny (type IV) is given by subsets of homologous genes within an RFM.</p></list-item></list></p>
<p>According to this definition, microsynteny of type I, II, III, and IV was observed for 9, 17, 1, and 11 RFMs, respectively (<xref ref-type="fig" rid="f3-genes-02-00925">Figure 3</xref>, Table S1). Microsynteny analysis for the stand-alone sRNA SmelC749 was not performed.</p>
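<p>The four categories can be illustrated with a minimal sketch. It is not the procedure used in this study: the <monospace>Member</monospace> record, the function name, the majority threshold of 0.5, and the reading of type III as "only one flanking gene conserved in every member" are assumptions made purely for illustration.</p>
<preformat preformat-type="code">
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """One sRNA of a family (hypothetical record).

    Each flag states whether the flanking gene on that side is homologous
    to the corresponding flanking gene of the reference sRNA.
    """
    upstream_conserved: bool
    downstream_conserved: bool


def classify_microsynteny(members, majority=0.5):
    """Return "I", "II", "III" or "IV" for a list of Member records.

    The threshold `majority` is an assumed cutoff, not a value from the text.
    """
    if not members:
        raise ValueError("family has no members")
    n = len(members)
    both = sum(1 for m in members
               if m.upstream_conserved and m.downstream_conserved)
    up = sum(1 for m in members if m.upstream_conserved)
    down = sum(1 for m in members if m.downstream_conserved)
    if both == n:
        return "I"    # complete: both neighbors conserved everywhere
    if both / n > majority:
        return "II"   # extensive: conserved in most members
    if both == 0 and (up == n or down == n):
        return "III"  # partial: a single adjacent gene is conserved
    return "IV"       # fragmented: only subsets share neighbors
```
</preformat>
<p>Under these assumptions, a family in which three of four members retain both neighbors would be reported as type II, whereas a family in which every member retains only the upstream gene would be reported as type III.</p>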
<p>Complete microsynteny was observed for RFMs SmelA022, SmelA054, SmelB095, SmelC032, SmelC055, SmelC165, SmelC549, SmelC775, and SmelC776, which are predominantly (8 out of 9) restricted to the <italic>Sinorhizobium/Ensifer</italic> group. The exception is <italic>RFM<sub>SmelC</sub></italic><sub>165</sub>, which has relatives in the <italic>Rhizobiaceae</italic>. Extensive microsynteny was observed for RFMs of SmelA001, SmelA020, SmelA056, SmelB003, SmelB008, SmelB009, SmelB075, SmelC289, SmelC416, SmelC500, SmelC507, SmelC601, SmelC671, SmelC023, SmelC434, SmelA033, and SmelC291 (<xref ref-type="fig" rid="f3-genes-02-00925">Figure 3</xref>, Table S1). Similar to the aforementioned RFMs, and with the exception of SmelA033, SmelC023, SmelC289, SmelC291, and SmelC671, this type of microsynteny was also predominantly observed in the <italic>Sinorhizobium/Ensifer</italic> group. As expected, the degree of microsynteny is higher for RFMs that are restricted to closely related organisms and for RFMs with a predominant occurrence on descendants of the ancestral chromosome and megaplasmid [<xref ref-type="bibr" rid="b53-genes-02-00925">53</xref>,<xref ref-type="bibr" rid="b56-genes-02-00925">56</xref>].</p>
<p>All RNAs of <italic>RFM<sub>SmelC</sub></italic><sub>023</sub> are located adjacent to a DNA polymerase I encoding gene, except for RNA12 of <italic>A. tumefaciens</italic> str. C58. Furthermore, for the 14 RNAs identified in the <italic>Rhizobiaceae</italic>, a gene encoding a MarR-type transcriptional regulator is situated next to and in case of RNA4 of <italic>S. fredii</italic> overlaps the sRNA gene (<xref ref-type="fig" rid="f5-genes-02-00925">Figure 5c</xref>, Table S1). Due to the aberrant length of the overlapping transcriptional regulator gene (compared to its homologous genes) and the presence of alternative start codons approximately 200 nt downstream of the predicted start we presume an annotation mistake. Except for RNA22, RNA25, and RNA28 of <italic>B. abortus</italic> S19, <italic>B. ovis</italic> ATCC25840 and <italic>B. melitensis</italic> M28, all <italic>RFM<sub>SmelC</sub></italic><sub>023</sub> members that occur in the <italic>Brucellaceae</italic> are located antisense to a predicted small peptide encoding region. Due to the fact that the sequence of the predicted ORFs is also present in other <italic>Brucellaceae</italic> which lack this annotation, most likely this ORF was missed during gene prediction.</p>
<p>Except for RNA3 of <italic>S. medicae</italic> WSM419, all relatives of SmelC289 are located next to a prolyl-tRNA synthetase gene (<xref ref-type="fig" rid="f6-genes-02-00925">Figure 6c</xref>, Table S1). In case of RNA11, RNA13, RNA14, RNA15, RNA17, RNA19, RNA20, and RNA21 of the <italic>Brucellaceae</italic>, the prolyl-tRNA synthetase gene is indeed located adjacent to the corresponding sRNA genes, but these sRNA genes are overlapped in antisense by one or two presumably misannotated, small hypothetical genes. For 18 RFMs, overlapping genes were predicted of which the majority is annotated as hypothetical and thus their function and existence remain in question.</p>
<p>Partial microsynteny was only observed for <italic>RFM<sub>SmelA</sub></italic><sub>014</sub>, while fragmented microsynteny is given for RFMs of SmelA003, SmelA018, SmelA019, SmelA075, SmelA099, SmelB033, SmelB044, SmelB053, SmelB064, SmelB126, and SmelC151 (<xref ref-type="fig" rid="f3-genes-02-00925">Figure 3</xref>, Table S1).</p>
<p>In the case of RFMs of SmelA003, SmelA075, SmelB053, SmelB044, and SmelB126, fragmented microsynteny is explained by their multiple copy numbers per genome. Members of these RFMs indicate a higher rate of intragenomic transfers. <italic>RFM<sub>SmelA</sub></italic><sub>075</sub> and <italic>RFM<sub>SmelA</sub></italic><sub>099</sub> occur with three and four hairpin loops, respectively, with similar loop motifs (see Section 2.7). Considering both RFMs, fragmented microsynteny is the dominant observation, but a closer look at specific species, e.g., <italic>R. leguminosarum</italic>, reveals “local” microsynteny. In detail, the <italic>R. leguminosarum</italic> relatives RNA9, RNA14, RNA16, RNA17, RNA18, RNA21, and RNA24 of <italic>RFM<sub>SmelA</sub></italic><sub>099</sub> and RNA12, RNA17, RNA19, RNA26, RNA35, RNA42, and RNA54 of <italic>RFM<sub>SmelA</sub></italic><sub>075</sub> occur in complete and extensive microsynteny, respectively. <italic>RFM<sub>SmelB</sub></italic><sub>126</sub> contains 4, 3, 2, and 5 copies in <italic>S. meliloti</italic> 1021, BL225c, Ak83, and <italic>S. medicae</italic> WSM419, respectively. An association with a potassium transporter encoding gene was observed for at least one sRNA copy in each genome. Thus, it is tempting to speculate that the potassium transporter associated sRNAs represent the ancestral version of this RFM.</p></sec>
<sec>
<label>2.4.</label>
<title>Copy Numbers and Association with Mobile Genetic Elements</title>
<p>Multiple copy numbers per genome as well as the scarce microsynteny of RFMs of SmelA003, SmelA075, SmelB044, SmelB053, and SmelB126 are in good agreement with the scattered occurrence of mobile genetic elements next to the sRNA loci. Mobile genetic elements probably contribute significantly to the genetic polymorphism in <italic>S. meliloti</italic> natural populations [<xref ref-type="bibr" rid="b59-genes-02-00925">59</xref>], since mobile genetic elements are able to copy sRNA loci and uncouple them from their genomic context. Repeats and mobile genetic elements were also associated with members of <italic>RFM<sub>SmelA</sub></italic><sub>014</sub>, <italic>RFM<sub>SmelA</sub></italic><sub>054</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>003</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>008</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>009</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>064</sub>, <italic>RFM<sub>SmelB</sub></italic><sub>075</sub>, and <italic>RFM<sub>SmelC</sub></italic><sub>500</sub>.</p></sec>
<sec>
<label>2.5.</label>
<title>Structural Features Conserved in RFMs</title>
<p>Generally, transcripts have a varying number of sub-structural RNA domains, defined as stacked base pairs, internal loops, bulges and hairpin loops [<xref ref-type="bibr" rid="b60-genes-02-00925">60</xref>]. A number of sRNAs, e.g., Yfr1 of several cyanobacteria, reveal typical Rho-independent terminator-like features with a 3′-located, GC-rich hairpin followed by a poly-U-tail [<xref ref-type="bibr" rid="b61-genes-02-00925">61</xref>,<xref ref-type="bibr" rid="b62-genes-02-00925">62</xref>]. Additional examples of sRNAs with typical terminator features are RprA and Qrr1 of <italic>E. coli</italic> and <italic>Vibrio cholerae</italic>, respectively [<xref ref-type="bibr" rid="b63-genes-02-00925">63</xref>]. RFMs of SmelB126, SmelB053, SmelC434, SmelC507, SmelC151, SmelC023, SmelC289, and SmelC671 include typical terminator structures. In contrast, <italic>RFM<sub>SmelB</sub></italic><sub>075</sub>, <italic>RFM<sub>SmelA</sub></italic><sub>014</sub>, <italic>RFM<sub>SmelA</sub></italic><sub>054</sub>, <italic>RFM<sub>SmelC</sub></italic><sub>416</sub>, <italic>RFM<sub>SmelC</sub></italic><sub>601</sub>, and <italic>RFM<sub>SmelC</sub></italic><sub>165</sub> contain stems with atypical hairpin loops, e.g., disrupted by internal loops, followed by poly-U-tails (<xref ref-type="fig" rid="f5-genes-02-00925">Figures 5b</xref> and <xref ref-type="fig" rid="f6-genes-02-00925">6b</xref>, Supplement S1). Otaka <italic>et al.</italic> [<xref ref-type="bibr" rid="b64-genes-02-00925">64</xref>] reported that, besides transcription termination, terminator poly-U-tails of the noncoding transcripts SgrS and RyhB in <italic>E. coli</italic> are essential for Hfq interaction and riboregulation. A similar pattern could be presumed for the aforementioned trans-encoded sRNA models. However, the remaining models reveal no terminator-like features (Supplement S1).</p>
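<p>The terminator-like signature described above (a GC-rich 3′ region followed by a poly-U-tail) can be approximated by a crude sequence-only check. The sketch below is an illustration, not the structure-based analysis used in this study: it does not fold the RNA, and the function names, the window of 20 nt, the minimum U-run of 4 and the GC cutoff of 0.6 are all assumed values.</p>
<preformat preformat-type="code">
```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def looks_like_terminator_tail(rna: str, window: int = 20,
                               min_u_run: int = 4,
                               min_gc: float = 0.6) -> bool:
    """Heuristic: poly-U 3' tail preceded by a GC-rich window.

    All thresholds are assumptions chosen for illustration only.
    """
    rna = rna.upper().replace("T", "U")
    body = rna.rstrip("U")          # sequence without the trailing U run
    u_run = len(rna) - len(body)    # length of the 3' poly-U tail
    if min_u_run > u_run:
        return False
    upstream = body[-window:]       # region expected to hold the hairpin
    return gc_fraction(upstream) >= min_gc
```
</preformat>
<p>A sequence passing this check is only a candidate; whether the GC-rich region actually forms a hairpin still has to be confirmed by secondary structure prediction.</p>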
<p>A complex situation is given for SmelB050, SmelB053, and SmelC691. The pSymB-located sRNA genes share typical trans-encoded sRNA gene features with a long distance to neighboring genes. Their transcripts have distinct 5′- and 3′-ends and form a triple stem loop structure [<xref ref-type="bibr" rid="b38-genes-02-00925">38</xref>]. SmelC691 has a similar pattern except for the first stem loop. The identified sRNA relatives, all collected in <italic>RFM<sub>SmelB</sub></italic><sub>053</sub>, are mainly found in the <italic>Rhizobiaceae</italic>; only a single member occurs in <italic>Ochrobactrum anthropi</italic> in the <italic>Brucellaceae</italic> (<xref ref-type="fig" rid="f3-genes-02-00925">Figure 3</xref>, Table S1). Comparison of all homologous sequences indicates that the first stem loop is the most variable, sometimes completely missing, domain, while the second stem loop shows strong conservation, at least in the loop motif (GGAUGUA). The third stem loop has typical Rho-independent terminator-like features (<xref ref-type="fig" rid="f4-genes-02-00925">Figure 4a</xref>, Supplement S1). In addition to the identified sRNA relatives of <italic>RFM<sub>SmelB</sub></italic><sub>053</sub>, a number of putative 3′-UTRs were identified in the Rhizobiales, which occur in a sequence and structure pattern similar to the second and third stem loop of the SmelB053 relatives (<xref ref-type="fig" rid="f4-genes-02-00925">Figure 4b</xref>). The majority of the identified 3′-UTRs (29 out of 33) are connected to genes coding for proteins involved in cold shock adaptation, e.g., SmelC521 in <italic>S. meliloti</italic> 1021 (<xref ref-type="fig" rid="f4-genes-02-00925">Figure 4c</xref>, Table S2) [<xref ref-type="bibr" rid="b38-genes-02-00925">38</xref>]. Post-transcriptional regulation of cold shock genes via special 3′-UTR structures was reported for the 428 nt long <italic>cspA</italic> mRNA in <italic>E. coli</italic>.
The mRNA has two stem loop structures at its 3′-end connected to regulation of degradation via binding of Hfq and poly(A) polymerase I (PAP I), which prevents binding of polynucleotide phosphorylase and RNase E [<xref ref-type="bibr" rid="b66-genes-02-00925">66</xref>,<xref ref-type="bibr" rid="b67-genes-02-00925">67</xref>]. Owing to the structural similarity of the homologous sequences identified in our study to the <italic>cspA</italic> 3′-UTR of <italic>E. coli</italic>, and the predominant connection of the homologous 3′-UTRs to cold shock genes in the Rhizobiales, we suggest that the 3′-UTRs have similar functions in an Hfq-dependent manner. The functional characteristics of the trans-encoded sRNA SmelB053 and its relatives remain unclear, but owing to their conspicuous similarity to the conserved 3′-domains of cold shock genes, they might act as interceptor transcripts that sequester RNA degradation complexes and thus protect and stabilize mRNAs. Sequestration is a well-characterized phenomenon in bacteria, exemplified by CsrB, 6S and GlmY RNA [<xref ref-type="bibr" rid="b14-genes-02-00925">14</xref>,<xref ref-type="bibr" rid="b63-genes-02-00925">63</xref>,<xref ref-type="bibr" rid="b68-genes-02-00925">68</xref>].</p>
<p>Strong sequence and thus structure conservation is a general feature among RFM members of <italic>S. meliloti</italic>-specific sRNAs (SmelA001, SmelA014, SmelA018, SmelA019, SmelA020, SmelA022, SmelA054, SmelA056, SmelB064, SmelC032). The lowest conservation is found in <italic>RFM<sub>SmelB</sub></italic><sub>064</sub> with a structure conservation index (SCI, see Methods section) of 0.91, while the transcripts of the remaining models reveal strong sequence and thus structure conservation, with SCI values of approximately 1 (Supplement S1). Functional characteristics of trans-encoded sRNAs are commonly provided by sub-sequences and structures, e.g., in the case of DsrA and RyhB in <italic>E. coli</italic> [<xref ref-type="bibr" rid="b63-genes-02-00925">63</xref>]. Consequently, in sRNA relatives these crucial domains should be conserved more stringently than non-functional domains. However, owing to the kinship between and the high sequence similarity of the <italic>S. meliloti</italic> sRNAs, conclusions about conspicuous functional sub-structures of these transcripts remain impractical. As expected, we see more sequence divergence for a subset of 22 RFMs with additional relatives in the <italic>Rhizobiaceae</italic>. <italic>RFM<sub>SmelA</sub></italic><sub>003</sub> and <italic>RFM<sub>SmelB</sub></italic><sub>126</sub> have several intragenomic copies in each strain (<xref ref-type="fig" rid="f3-genes-02-00925">Figure 3</xref>, Table S1). Presumably owing to the duplication events, transcripts of these RFMs show strong sequence divergence. However, the structure as well as a motif section of the second hairpin loop of <italic>RFM<sub>SmelA</sub></italic><sub>003</sub> shows conservation (Supplement S1). <italic>RFM<sub>SmelB</sub></italic><sub>126</sub> reveals piecewise sequence conservation at more or less conserved positions, as well as a conserved 3′-located terminator hairpin.
The best structural conservation for the 22 RFMs is represented by <italic>RFM<sub>SmelC</sub></italic><sub>055</sub> (SCI = 1.04), <italic>RFM<sub>SmelC</sub></italic><sub>151</sub> (SCI = 1.02), <italic>RFM<sub>SmelB</sub></italic><sub>009</sub> (SCI = 1.02), <italic>RFM<sub>SmelC</sub></italic><sub>500</sub> (SCI = 1.01), <italic>RFM<sub>SmelC</sub></italic><sub>775</sub> (SCI = 1.0), <italic>RFM<sub>SmelB</sub></italic><sub>075</sub> (SCI = 1.0), and <italic>RFM<sub>SmelB</sub></italic><sub>008</sub> (SCI = 1.0) (Supplement S1). The strongest nucleotide divergence is shown by <italic>RFM<sub>SmelC</sub></italic><sub>549</sub> with relatives in <italic>S. meliloti</italic> and <italic>S. fredii</italic> (SCI = 0.69) (Supplement S1).</p>
<p>In general, <italic>RFM<sub>SmelC</sub></italic><sub>549</sub> consists of four conserved stem loops that are interrupted by several internal loops and single-stranded domains. These single-stranded domains are the most divergent regions and are thus responsible for the depressed SCI value. The high conservation of the four stem loops implies a functional role for these domains (Supplement S1).</p>
<p>The RFMs of SmelC023 (SCI = 1.11), SmelA075 (SCI = 0.99), SmelC289 (SCI = 0.9), SmelC291 (SCI = 0.55), and SmelC671 (SCI = 0.55) are composed of transcripts with variable sequence and structure conservation, presumably derived from a common ancestor (Supplement S1). In the case of <italic>RFM<sub>SmelC</sub></italic><sub>023</sub> and <italic>RFM<sub>SmelC</sub></italic><sub>289</sub>, four and five hairpin loops, respectively, are the main components of these transcripts. The 5′-located domain of <italic>RFM<sub>SmelC</sub></italic><sub>023</sub> is highly variable both in length and nucleotide composition. The 3′-located stem loop of <italic>RFM<sub>SmelC</sub></italic><sub>023</sub> is a Rho-independent terminator-like structure with high GC content (Supplement S1). This is supported by the poly-T-sequence following the annotated sRNA gene (data not shown). Presumably a degradation event of the initially annotated sRNA SmelC023 occurred and resulted in a processed 3′-end. The middle domain consists of two hairpin loop structures with highly conserved sequences within the loops, while the structural maintenance of these sub-domains is provided by stems with varying nucleotide compositions and evolutionarily established base pair exchanges (<xref ref-type="fig" rid="f5-genes-02-00925">Figures 5a,b</xref>, Supplement S1). This strongly suggests that the function of this molecule depends on both the hairpin structures and the loop sequences. sRNAs with several functional domains are a common feature in bacteria. Obvious examples are the trans-encoded sRNAs OxyS and DsrA in <italic>E. coli</italic>. The former binds to the <italic>fhlA</italic> mRNA in an Hfq-dependent manner.
Two stem loops are presumed to be involved in mRNA binding, while the interaction site of Hfq differs from the mRNA binding site [<xref ref-type="bibr" rid="b3-genes-02-00925">3</xref>,<xref ref-type="bibr" rid="b69-genes-02-00925">69</xref>,<xref ref-type="bibr" rid="b70-genes-02-00925">70</xref>]. The latter exhibits different hairpins for different targets. In detail, DsrA consists of three stem loops, of which the second is able to interact with <italic>hns</italic> mRNA, blocks the ribosome binding site (RBS) and thus inhibits mRNA translation. The third stem loop interacts with the <italic>rpoS</italic> mRNA and activates translation via remodeling of an inhibitory mRNA sub-structure [<xref ref-type="bibr" rid="b70-genes-02-00925">70</xref>]. Similar features could be presumed for <italic>RFM<sub>SmelC</sub></italic><sub>291</sub> (sra33) [<xref ref-type="bibr" rid="b54-genes-02-00925">54</xref>]. The 5′-domain of <italic>RFM<sub>SmelC</sub></italic><sub>291</sub> has a structurally conserved stem loop with a strongly conserved loop motif, UCCGCCGCAUCU, while the second stem loop shows extremely variable stem sequences but also carries a dominant, different loop motif, UCCUCG (Supplement S1).</p>
<p><italic>RFM<sub>SmelC</sub></italic><sub>289</sub> shows a 5′-located stem loop characterized by an AU-rich loop, stabilized by a variable stem of organism-specific nucleotide content. Similar to <italic>RFM<sub>SmelC</sub></italic><sub>023</sub>, the 3′-region contains a Rho-independent terminator-like structure and the middle domain also consists of two conserved hairpin loops. The former hairpin shows complete conservation, while the latter reveals a highly conserved stem with a variable loop (<xref ref-type="fig" rid="f6-genes-02-00925">Figures 6a,b</xref>, Supplement S1). However, the functional patterns of noncoding transcripts are not restricted to their presumed single-stranded domains, e.g., the binding sites of RprA sRNA in <italic>E. coli</italic> are predominantly incorporated in stem structures [<xref ref-type="bibr" rid="b63-genes-02-00925">63</xref>]. Furthermore, hidden sRNA sequences that are essential for target interactions could be exposed by a structural rearrangement of the sRNA. For example, the RNA binding protein Hfq is able to alter the secondary structure of the RydC sRNA in <italic>Enterobacteriaceae</italic>, resulting in an active version of this transcript [<xref ref-type="bibr" rid="b71-genes-02-00925">71</xref>]. <italic>RFM<sub>SmelC</sub></italic><sub>671</sub> represents the most variable model of related sRNAs in this study. The transcript has a long single-stranded domain with a conserved CUCCCUGU motif, enclosed by down- and upstream located, highly variable hairpin loops, of which the 3′-located domain acts as terminator (Supplement S1). A similar pattern was observed for Qrr1 in <italic>Vibrio cholerae</italic>. This transcript reveals a long single-stranded domain between its first and second loop, which was verified as the mRNA interaction site.
This motif is highly conserved within Qrr2-4, which are paralogs of Qrr1 [<xref ref-type="bibr" rid="b4-genes-02-00925">4</xref>,<xref ref-type="bibr" rid="b63-genes-02-00925">63</xref>].</p></sec>
<sec>
<label>2.6.</label>
<title>sRNAs in Antisense</title>
<p>Large numbers of cis-encoded antisense RNAs were identified, e.g., via sequencing and tiling array studies in <italic>Synechocystis</italic> [<xref ref-type="bibr" rid="b28-genes-02-00925">28</xref>], <italic>H. pylori</italic> [<xref ref-type="bibr" rid="b26-genes-02-00925">26</xref>], <italic>S. meliloti</italic> [<xref ref-type="bibr" rid="b38-genes-02-00925">38</xref>], and <italic>R. etli</italic> [<xref ref-type="bibr" rid="b30-genes-02-00925">30</xref>]. Cis-encoded antisense sRNAs are located in antisense to their targets, act via perfect base pairing and mediate post-transcriptional regulation, e.g., stabilization or destabilization of target mRNAs [<xref ref-type="bibr" rid="b6-genes-02-00925">6</xref>]. Pairs of small noncoding transcripts that are located in antisense to each other have rarely been identified in bacteria. Georg and Hess [<xref ref-type="bibr" rid="b72-genes-02-00925">72</xref>] presumed that two small transcripts encoded by unlinked sRNA genes interact with each other via a complementary section in both transcripts. However, the functional relevance of this feature needs to be further elucidated.</p>
<p>Here, we observe that SmelC776 relatives are located in antisense to <italic>RFM<sub>SmelC</sub></italic><sub>775</sub> sRNAs and thus most likely allow a mutual interference of both transcripts. This is in good agreement with the strong sequence conservation of SmelC776 (SCI = 0.95) within the overlapping domain (Table S1, Supplement S1). Approximately 90% (17 out of 19) of the nucleotide exchanges of <italic>RFM<sub>SmelC</sub></italic><sub>776</sub> in the <italic>S. fredii</italic> strain occur outside the overlapping part and thus suggest the antisense part as the functional transcript domain.</p></sec>
<sec>
<label>2.7.</label>
<title>Focus on SmelA075 and SmelA099</title>
<p>A remarkable situation was found in case of <italic>RFM<sub>SmelA</sub></italic><sub>075</sub> and <italic>RFM<sub>SmelA</sub></italic><sub>099</sub>, which exhibit three and four hairpin loops, respectively. Each hairpin loop carries similar loop motifs, CCUCCUCCC, representing an anti Shine-Dalgarno (aSD) sequence, while the stems show more variability in their nucleotide content (<xref ref-type="fig" rid="f7-genes-02-00925">Figure 7</xref>, Supplement S1). In several sRNAs, e.g., RNA III in <italic>Staphylococcus aureus</italic>, CyaR in <italic>Enterobacteria</italic> and ABcR1 in <italic>A. tumefaciens</italic> C58, loop sequences with at least partial aSD motifs were observed. Functional analyses demonstrated that the aSD motif is indispensable for sRNA binding to the RBS of the particular mRNA targets, as well as the resulting translation inhibition [<xref ref-type="bibr" rid="b73-genes-02-00925">73</xref>–<xref ref-type="bibr" rid="b79-genes-02-00925">79</xref>]. The sequencing profile as well as Northern blot analyses of SmelA075, which is a member of <italic>RFM<sub>SmelA</sub></italic><sub>075</sub>, suggested SmelA075 as a stress-induced sRNA that occurs in several processed forms [<xref ref-type="bibr" rid="b38-genes-02-00925">38</xref>]. This is in good agreement with the trans-encoded sRNA RS0680a and its homologous transcripts identified in <italic>Rhodobacter sphaeroides</italic>. RS0680a represents a shorter version than the RFM members identified in this study. The transcript has two stem loops instead of three and four, and each comprises an aSD motif. It was suggested that RS0680a undergoes different maturation processes and is involved in the stress response in a more general pattern via binding to the RBS of several genes [<xref ref-type="bibr" rid="b80-genes-02-00925">80</xref>]. 
From a biological point of view, it appears reasonable to group all derivatives of <italic>RFM<sub>SmelA</sub></italic><sub>075</sub> and <italic>RFM<sub>SmelA</sub></italic><sub>099</sub> into a single sRNA family that consists of members with a varying number of hairpin modules. Taken together, this RNA family consists of several copies per genome. It appears to be involved in stress adaptation, presumably in post-transcriptional regulation via blocking the RBS of target mRNAs.</p></sec></sec>
<sec>
<label>3.</label>
<title>Experimental Section</title>
<p>In this section, we describe in detail the process of RNA family model construction from our <italic>S. meliloti</italic> transcripts, which is not a fully automated method. We report on the automatic steps and on the points where human judgement or design decisions are involved. A working knowledge of common bioinformatics tools is assumed. All tools used are strictly concerned with secondary structure; non-standard base pairs, possible pseudoknots, and other tertiary interactions are not considered. Although including such features would be desirable, present-day tools cannot achieve this.</p>
<sec sec-type="methods">
<label>3.1.</label>
<title>Sequence Data and Databases</title>
<p>Of approximately 180 trans-encoded sRNAs, 52 were selected and downloaded from GenDB [<xref ref-type="bibr" rid="b81-genes-02-00925">81</xref>], accessible via the RhizoGATE portal [<xref ref-type="bibr" rid="b1-genes-02-00925">1</xref>,<xref ref-type="bibr" rid="b2-genes-02-00925">2</xref>]. The choice of these 52 candidates was made on purely technical grounds: candidate transcripts should be well covered by sequence reads, have clearly defined ends, and be remote from any coding region. We plan to generate models for the remaining transcripts in the near future.</p>
<p>Complete genome sequences and annotations of all <italic>Rhizobiales</italic> available were obtained from the NCBI genomes FTP site [<xref ref-type="bibr" rid="b82-genes-02-00925">82</xref>]. For a complete list see Table S3. Additionally, whole genome and plasmid sequences were included that are not (yet) part of the above collection. Sequence data of <italic>A. sp. H13-3</italic>, <italic>B. melitensis M28</italic>, <italic>S. meliloti AK83</italic>, and <italic>S. meliloti BL225C</italic>, all members of the order of the Rhizobiales, were downloaded from the NCBI nucleotide database.</p></sec>
<sec>
<label>3.2.</label>
<title>Construction of RNA Family Models</title>
<p>There is no standard and fully automated way to construct an RNA family model. The general difficulty of this process has been discussed, e.g., in [<xref ref-type="bibr" rid="b83-genes-02-00925">83</xref>]. Family model construction is supported by a variety of tools, but is interspersed with modeling decisions and candidate screening by a human expert. In our case, we start with a trans-encoded sRNA from <italic>S. meliloti</italic>, say SmelXnnn, and construct an RNA family model <italic>RFM<sub>SmelXnnn</sub></italic>, which (1) comprises a set of orthologous RNA sequences from related organisms; and (2) provides a search function to find further family members in the Rhizobiales and beyond. In a few cases, we find that several sRNAs should be collected into the same family model, which is then named arbitrarily after one of them.</p>
<p>We constructed two types of RFMs: <italic>Covariance models</italic> and <italic>Thermodynamic matchers</italic>. The use of these two complementary methods has already been motivated above (Section 2.1). We now add some details about both methods, and about the assessment step used with both.</p></sec>
<sec>
<label>3.3.</label>
<title>Automated Candidate Generation and CM Construction Steps</title>
<p>Recall <xref ref-type="fig" rid="f1-genes-02-00925">Figure 1</xref>, which gives an overview of our RFM construction pipeline. Phase 1 identifies putative homologous RNAs by iterative searches focusing on sequence homology. Phase 2 constructs an initial family model based on sequence and conserved structure, and uses this model to search all Rhizobiales for further homologs.</p>
<sec>
<label>3.3.1.</label>
<title>Phase 1: Sequence Homology Search for SmelXnnn</title>
<p>In the first stage of the workflow, putative homologs are obtained by sequence homology searches using <sc>blastn</sc> [<xref ref-type="bibr" rid="b84-genes-02-00925">84</xref>] and G<sc>otoh</sc>S<sc>can</sc> [<xref ref-type="bibr" rid="b85-genes-02-00925">85</xref>]. We initialize the search for homologous RNA sequences of a reference sequence SmelXnnn by employing <sc>blastn</sc> with <italic>E</italic> &lt; 10<sup>−5</sup> on the complete set of alphaproteobacterial genomes. For word size and scoring function, we set the parameters -W 7 -q -3 -r 2 -G 2 -E 2.</p>
<p>As fragmentation of conserved regions is a common characteristic of RNA families, candidate homologs often do not cover the complete reference sequence. Therefore, <sc>blastn</sc> matches must be postprocessed. First, the candidate sequence is extended to cover the reference sRNA sequence plus an extra 10% of its length on either side. Next, the reference is semi-globally aligned to the candidate sequence, and unmatched leading and trailing bases of the candidate are trimmed.</p>
<p>The detection of RNA homologs is complemented by G<sc>otoh</sc>S<sc>can</sc> searches for SmelXnnn in three separate sequence databases representing the families <italic>Rhizobiaceae</italic>, <italic>Brucellaceae</italic>, and <italic>Phyllobacteriaceae</italic>, using default parameters. The latter two are the families most closely related to the <italic>Rhizobiaceae</italic>, which includes <italic>S. meliloti</italic>, within the order Rhizobiales.</p>
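<p>The extension step can be illustrated by a minimal sketch. It is not the implementation used in this study: the function name <monospace>extend_hit</monospace>, its parameters, the restriction to plus-strand hits, the rounding up of the 10% margin, and the clamping to a linear sequence are all assumptions made for illustration. The subsequent semi-global alignment and trimming are not shown.</p>
<preformat><![CDATA[
```python
def extend_hit(q_start, q_end, s_start, s_end, ref_len, genome_len, pad_percent=10):
    """Return the extended candidate window as 1-based inclusive coordinates.

    Hypothetical helper, assuming a plus-strand hit:
      q_start, q_end : matched region on the reference sRNA
      s_start, s_end : matched region on the genome
      ref_len        : length of the reference sRNA
      genome_len     : length of the (linear) genome sequence
    """
    # Margin of pad_percent of the reference length, rounded up (an assumption).
    pad = (ref_len * pad_percent + 99) // 100
    # Cover the part of the reference missing before and after the match.
    left = s_start - (q_start - 1) - pad
    right = s_end + (ref_len - q_end) + pad
    # Clamp to the sequence boundaries; circular replicons are ignored here.
    return max(1, left), min(genome_len, right)
```
]]></preformat>
<p>Under these assumptions, a 100 nt reference whose positions 21 to 80 match genome positions 1021 to 1080 yields the window 991 to 1110.</p>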
<p>Resulting candidates from both search methods are combined and undergo assessment in the same way.</p></sec>
<sec>
<label>3.3.2.</label>
<title>Iteration of Homology Search</title>
<p>As is common practice in the search for distant homologs [<xref ref-type="bibr" rid="b86-genes-02-00925">86</xref>], rather than searching with a relaxed threshold, we use a stringent threshold in each step. The hits that are assessed positively seed a new search, again with a stringent threshold. We use a maximum of three iterations.</p></sec>
<sec>
<label>3.3.3.</label>
<title>Phase 2: CM Creation and Search</title>
<p>We enter Phase 2 with a first set of candidates for <italic>RFM<sub>SmelXnnn</sub></italic>, given that at least two homologs of SmelXnnn were found. An initial covariance model <italic>CM<sub>SmelXnnn</sub></italic> is to be built. This requires a multiple sequence alignment which supports a consensus structure. L<sc>oc</sc>ARNA [<xref ref-type="bibr" rid="b87-genes-02-00925">87</xref>] is used for creating the structural alignment, RNA<sc>alifold</sc> [<xref ref-type="bibr" rid="b88-genes-02-00925">88</xref>,<xref ref-type="bibr" rid="b89-genes-02-00925">89</xref>] for prediction of consensus structure from this alignment, and I<sc>nfernal</sc> [<xref ref-type="bibr" rid="b24-genes-02-00925">24</xref>] for CM model construction and search. <xref ref-type="fig" rid="f8-genes-02-00925">Figure 8</xref> gives an example of the input for CM construction.</p>
<p>After constructing <italic>CM<sub>SmelXnnn</sub></italic> by I<sc>nfernal</sc>, a profile search is performed on all alphaproteobacterial genomes. The best 50 hits are analyzed and undergo assessment. The steps of model construction, search and hit assessment are repeated while new homologs are identified. A maximum of three cycles is allowed. The third iteration uses as a cut-off a CM-score of 25% of the highest CM-score of any present member of the model.</p></sec></sec>
<sec>
<label>3.4.</label>
<title>TDM Model Construction</title>
<p>For specific RNA families, we found that a CM was inadequate to express their peculiarities. For example, SmelA075 has three hairpins, all of which exhibit a perfect loop motif (CCUCCUCCC) (cf. Section 2.7). It is widely distributed among the Rhizobiales, with the stem sequences highly diverged. As a consequence, the discriminatory power of the CM decreases. A TDM can be designed such that it gives no weight to the stem sequences, but enforces the loop motif. SmelB053 is a similar case with a strongly conserved structure and low sequence similarity, except for a prominent loop motif (GAUGUA).</p>
<p><xref ref-type="fig" rid="f9-genes-02-00925">Figure 9</xref> shows a snapshot of TDM construction, where we draw a structure graphic with the help of the L<sc>ocomotif</sc> editor [<xref ref-type="bibr" rid="b47-genes-02-00925">47</xref>], annotate it with conserved sequence information, and compile it into a search program at the Bielefeld Bioinformatics Server [<xref ref-type="bibr" rid="b90-genes-02-00925">90</xref>].</p>
<p>TDMs tend to run a bit faster than CMs (when their HMM filter is turned off). We use them to scan all <italic>Rhizobiales</italic> genomes. Resulting candidates are assessed like other candidates.</p>
<p>Design of a TDM also requires some iteration, as the motif description can be made more or less specific. Generally, it is a good strategy to begin with a rather restrictive motif and check that the known sRNAs are actually found, which verifies that the design is correct. Then, the motif is gradually relaxed. Search results may suggest adjustment to the original TDM design, such as relaxing the number of paired bases in a stem, or increasing the allowed size for a loop. This interplay of human design, matcher compilation and search continues as long as candidates pass the assessment step.</p></sec>
<sec>
<label>3.5.</label>
<title>Assessment: Taming the Flood of Candidates</title>
<p>When an RNA family model is available, its search procedure can be used with more or less stringent cut-offs. In this study, however, we need to construct such a model in the first place. We start from a single family member, which may not even be a typical one, but has the virtue of being based on an experimental screen rather than being an <italic>in silico</italic> prediction. We want to collect a large number of homologs, and use further evidence to weed out implausible candidates.</p>
<p>Some sequences show strong sequence and structure conservation and thus can be unambiguously identified in an automatic fashion. No further effort, human or computational, is spent on them. Divergent sequences without global conservation have to be curated by integrating sequence and structure conservation with additional sources of information, such as genomic context and phylogenetic distribution. <xref ref-type="fig" rid="f10-genes-02-00925">Figure 10</xref> gives an overview of this assessment, which is described further below.</p>
<sec>
<label>3.5.1.</label>
<title>Filtering of Sequence Homologs Based on Pairwise L<sc>oc</sc>ARNA Scores</title>
<p>Let <italic>S</italic> = SmelXnnn for this section. Each putative homolog returned by sequence search is aligned (separately) to <italic>S</italic> using L<sc>oc</sc>ARNA. For further consideration, only those candidates <italic>c</italic> are retained whose L<sc>oc</sc>ARNA score <italic>loc</italic>(<italic>S</italic>, <italic>c</italic>) satisfies
<disp-formula id="FD1">
<mml:math id="mm1" display="block">
<mml:semantics id="sm1">
<mml:mrow>
<mml:mtext mathvariant="italic">loc</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>&gt;</mml:mo>
<mml:mn>0.33</mml:mn>
<mml:mo>⋅</mml:mo>
<mml:mtext mathvariant="italic">loc</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:semantics></mml:math></disp-formula></p>
<p>This restriction is due to practical considerations: it is a trade-off between the number of candidates that we have to inspect and the possible loss of true homologs. It leaves us with a set of candidates that align well to <italic>S</italic> individually, yet the structures underlying their alignments may have little in common.</p></sec>
<sec sec-type="methods">
<label>3.5.2.</label>
<title>Analysis of Structure Conservation</title>
<p>We next promote candidates with strong structural conservation, which pass the selection process without further inspection. From the previous step, we have two criteria that measure the individual structure conservation of each candidate <italic>c</italic> against <italic>S</italic>: the (pairwise) L<sc>oc</sc>ARNA score <italic>loc</italic>(<italic>S</italic>, <italic>c</italic>) and the (pairwise) structure conservation index <italic>sci</italic>(<italic>S</italic>, <italic>c</italic>), which is computed with the help of RNA<sc>fold</sc> and RNA<sc>alifold</sc>, both part of the Vienna RNA Package [<xref ref-type="bibr" rid="b91-genes-02-00925">91</xref>]. If an SCI value is low, the candidate tends to fold into a native structure different from the consensus found by L<sc>oc</sc>ARNA. Now the pieces of information gained from pairwise considerations have to be related.</p>
<p>For the pairwise alignment of <italic>S</italic> and <italic>c</italic>, we map the consensus structure back to <italic>S</italic> and compute its abstract shape representation with RNA<sc>shapes</sc> [<xref ref-type="bibr" rid="b92-genes-02-00925">92</xref>,<xref ref-type="bibr" rid="b93-genes-02-00925">93</xref>], extended by hairpin centers (akin to the “helix centers” of [<xref ref-type="bibr" rid="b94-genes-02-00925">94</xref>]). A hairpin center is calculated as (<italic>i</italic> + <italic>j</italic>)/2, where <italic>i</italic> and <italic>j</italic> are the positions of the hairpin closing base pair.</p>
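<p>The hairpin center computation can be sketched as follows. This is an illustrative example only, assuming that the structure is given in dot-bracket notation without pseudoknots and that positions are counted from 1; the function name <monospace>hairpin_centers</monospace> is hypothetical. The abstract shape itself is not computed here.</p>
<preformat><![CDATA[
```python
def hairpin_centers(structure):
    """Return (i + j) / 2 for each hairpin closing base pair (i, j), 1-based.

    `structure` is a dot-bracket string such as "((...))..(((....)))".
    A pair closes a hairpin if it encloses unpaired bases only.
    """
    centers = []
    stack = []          # positions of currently open '('
    last_open = None    # latest '(' with no other bracket seen since
    for pos, symbol in enumerate(structure, start=1):
        if symbol == "(":
            stack.append(pos)
            last_open = pos
        elif symbol == ")":
            i = stack.pop()
            if i == last_open:
                # Only dots lie between i and pos: a hairpin closing pair.
                centers.append((i + pos) / 2)
            last_open = None
    return centers
```
]]></preformat>
<p>For the structure <monospace>((...))..(((....)))</monospace>, the hairpin closing pairs are (2, 6) and (12, 17), giving the centers 4.0 and 14.5.</p>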
<p>We refer to the combination of shape and hairpin center as a <italic>pointed shape</italic>. This yields (potentially different) pointed shapes <italic>p<sub>S</sub></italic><sub>,</sub><italic><sub>c</sub></italic> for <italic>S</italic>, one for each candidate <italic>c</italic>. We also compute <italic>p<sub>S</sub></italic><sub>,</sub><italic><sub>S</sub></italic>, which is the pointed shape of the minimum free energy structure for <italic>S</italic>. Candidate <italic>c</italic> qualifies as a family member if</p>
<disp-formula id="FD2">
<label>(1)</label>
<mml:math id="mm2" display="block">
<mml:semantics id="sm2">
<mml:mrow>
<mml:mtext mathvariant="italic">loc</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>&gt;</mml:mo>
<mml:mn>0.75</mml:mn>
<mml:mo>⋅</mml:mo>
<mml:mtext mathvariant="italic">loc</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mspace width="0.2em"/>
<mml:mtext>and</mml:mtext></mml:mrow></mml:semantics></mml:math></disp-formula>
<disp-formula id="FD3">
<label>(2)</label>
<mml:math id="mm3" display="block">
<mml:semantics id="sm3">
<mml:mrow>
<mml:mtext mathvariant="italic">sci</mml:mtext>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>&gt;</mml:mo>
<mml:mn>0.9</mml:mn>
<mml:mspace width="0.2em"/>
<mml:mtext>and</mml:mtext></mml:mrow></mml:semantics></mml:math></disp-formula>
<disp-formula id="FD4">
<label>(3)</label>
<mml:math id="mm4" display="block">
<mml:semantics id="sm4">
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>c</mml:mi></mml:mrow></mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mi>S</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:semantics></mml:math></disp-formula>
<p>The last equation borrows the idea of consensus <italic>shapes</italic> from [<xref ref-type="bibr" rid="b95-genes-02-00925">95</xref>] for a fast way to select candidates where structure conservation is obvious.</p>
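<p>The score filter of Section 3.5.1 and the three criteria above can be combined into a single decision, sketched below under stated assumptions. The function name, the return labels, and the representation of a pointed shape as a comparable Python value (for example, a pair of shape string and tuple of hairpin centers) are hypothetical; the thresholds 0.33, 0.75, and 0.9 are those given in the text, and <italic>loc</italic>(<italic>S</italic>, <italic>S</italic>) is assumed to be positive.</p>
<preformat><![CDATA[
```python
def assess_candidate(loc_sc, loc_ss, sci_sc, pointed_shape_sc, pointed_shape_ss):
    """Classify candidate c relative to the reference S.

    loc_sc, loc_ss : LocARNA scores loc(S, c) and loc(S, S)
    sci_sc         : pairwise structure conservation index sci(S, c)
    pointed_shape_*: pointed shapes p_{S,c} and p_{S,S} (any comparable values)

    Returns "discard", "accept", or "inspect" (hypothetical labels).
    """
    # Pre-filter of Section 3.5.1: keep only loc(S, c) > 0.33 * loc(S, S).
    if not loc_sc > 0.33 * loc_ss:
        return "discard"
    # Criteria (1) to (3): all must hold for automatic acceptance.
    if (loc_sc > 0.75 * loc_ss
            and sci_sc > 0.9
            and pointed_shape_sc == pointed_shape_ss):
        return "accept"
    # Otherwise the candidate is passed on to the next assessment step.
    return "inspect"
```
]]></preformat>
<p>In this sketch, a candidate that fails only one of the three criteria is not rejected but returned as <monospace>"inspect"</monospace>, mirroring the rule that such candidates proceed to the synteny and phylogeny checks.</p>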
<p>Candidates which do not pass the above test are not (yet) discarded, but subjected to the next step.</p></sec>
<sec>
<label>3.5.3.</label>
<title>Synteny, Phylogeny, and Multiple Alignment</title>
<p>We perform BLAST comparisons of the protein-coding genes flanking SmelXnnn against the Rhizobiales database, with a maximum E-value of 10<sup>−6</sup> to indicate gene synteny. If one or both flanking genes are conserved with respect to at least one of the family members as accepted at this point, the candidate is accepted.</p>
<p>Additionally, we examine the distribution of homologs related to phylogeny, replicon type, and copy number. Candidates are discarded, for example, when located on a different replicon in a closely related strain, or when there is only a single hit in a remote phylogenetic group.</p>
<p>Finally (this must be the last assessment step for computational reasons), candidates that passed the previous criteria are cast into a multiple structural alignment with L<sc>oc</sc>ARNA. Such an alignment is easily derailed by outliers that do not fit a common structure. Obvious outliers, exhibiting a scattered alignment throughout the sequence, are removed by human inspection. Each time candidates are removed, the alignment is recalculated for the remaining family members.</p></sec></sec>
<sec>
<label>3.6.</label>
<title>Towards a More Automated RNA Family Model Construction Process</title>
<p>So far, we have completed 39 RFMs comprising 52 of the 173 trans-encoded sRNAs published in [<xref ref-type="bibr" rid="b38-genes-02-00925">38</xref>]; 121 remain. Further automation of the model construction process is highly desirable, but not easy to achieve. While some parts of our assessment step can be integrated into an automated workflow, there are also serious challenges arising from technical limitations of the available software. We discuss two such aspects below.</p>
<sec>
<label>3.6.1.</label>
<title>Modular Architecture of RNA Families</title>
<p>An important characteristic of RNA is its modular architecture. Similar substructures, shared among different RNA families, or multiple copies of modules within an RNA family may indicate related functionality. In RNA family reconstruction, shared modules complicate the identification of homologs by making it difficult to distinguish whether a candidate belongs to an already existing family or constitutes a completely new RNA family, merely sharing a similar sub-structure.</p>
<p>The trans-encoded sRNAs SmelC201 (not included in this study), SmelA075, and SmelA099 exemplify different manifestations of a repetitive module that consists of a single hairpin. Their structures are composed of two, three, and four consecutive copies of similar hairpins, respectively. Neither CMs nor TDMs (at least not those created with L<sc>ocomotif</sc>) can model a variable number of modules. In the case of SmelA075 and SmelA099, our workaround was to build separate models for each module number. An extra classification step was required for matches obtained from the alternative models, because shorter models produced multiple overlapping hits to homologs that were members of RFMs with a higher number of module copies. From the algorithmic point of view, it should be simple to extend modeling techniques in this direction.</p></sec>
<sec>
<label>3.6.2.</label>
<title>Conserved Terminators in Short sRNAs</title>
<p>Finding homologs for sRNAs with a length below 80 nucleotides is generally difficult if their sequence has diverged while retaining a conserved structure. In particular, when the structure is a GC-rich hairpin, it often matches terminator hairpins in multiple locations. A source of confusion is, for example, <italic>RFM<sub>SmelB</sub></italic><sub>053</sub>, whose structure consists of three adjacent hairpins. The nucleotide sequence of the central hairpin is conserved, whereas the last one constitutes a <italic>bona fide</italic> terminator.</p>
<p>Situations like this ask for a generalization of models towards avoidance of specified motifs. This is probably more difficult than allowing for optional modules.</p></sec></sec></sec>
<sec sec-type="conclusions">
<label>4.</label>
<title>Conclusions</title>
<p>In this study, we aimed at the identification of homologous sRNAs in the Rhizobiales, starting with a set of well-defined trans-encoded sRNAs from <italic>S. meliloti</italic> 1021. This is the first comprehensive, comparative <italic>in silico</italic> approach in this group of bacteria. Definition of RNA family models and grouping of sRNAs into these families is complicated by the poor knowledge about relationships between sequence, structure, and functions of sRNA domains. Whereas strong sequence and structural conservation is a good indication for assignment to the same family, the process becomes more difficult if only short sequence motifs and some structural features show similarities. This also includes ambiguous situations of sRNAs showing limited similarities to different family models. Therefore, full automation of RFM construction has not yet been achieved in this study. It is also not clear how far up in the taxonomy an approach like ours can reach. Hence, numbers of false positives and negatives are likely to increase with the evolutionary distance of the organisms.</p>
<p>Several independent pieces of evidence suggest that the 39 family models delivered here are a trustworthy basis for further experimental and bioinformatics analyses. Apart from the criteria of sequence, structure and synteny conservation, such independent evidence is the following:</p>
<list list-type="bullet">
<list-item>
<p>Our initial, experimental screen [<xref ref-type="bibr" rid="b38-genes-02-00925">38</xref>] was able to recover transcripts of the majority of sRNAs known in <italic>S. meliloti</italic> at that time;</p></list-item>
<list-item>
<p>Generally, our family models exhibit a plausible distribution of their members with respect to phylogeny;</p></list-item>
<list-item>
<p>In particular, the specific distribution of observed transcripts on replicons is in agreement with the accepted view that the symbiotic plasmid is a late acquisition in <italic>Sinorhizobium</italic>;</p></list-item>
<list-item>
<p>Members of five of our family models, found in <italic>A. tumefaciens</italic>, were validated experimentally by deep sequencing and Northern blots in independent studies (cf. Section 2.2).</p></list-item></list>
<p>Our approach provides valuable insights into the distribution of conserved sRNAs and the presence of species-, family-, and genus-specific sRNAs. Most of the RNA families are restricted to the <italic>Rhizobiaceae</italic>, but a few show a broader distribution, implying a more general conserved function. While functional studies of sRNAs may build on our predictions, the future bioinformatics tasks are to construct models for the remaining transcripts from <italic>S. meliloti</italic> and to extend the comparative analysis to the alphaproteobacteria, and possibly beyond.</p></sec></body>
<back>
<sec sec-type="display-objects">
<title>Figures</title>
<fig id="f1-genes-02-00925" position="float">
<label>Figure 1.</label>
<caption>
<p>Workflow of covariance model construction.</p></caption>
<graphic xlink:href="genes-02-00925f1.gif"/></fig>
<fig id="f2-genes-02-00925" position="float">
<label>Figure 2.</label>
<caption>
<p>Workflow of thermodynamic matcher construction.</p></caption>
<graphic xlink:href="genes-02-00925f2.gif"/></fig>
<fig id="f3-genes-02-00925" position="float">
<label>Figure 3.</label>
<caption>
<p>Distribution pattern of trans-encoded sRNAs in the Rhizobiales. The simplified phylogenetic tree includes sequenced strains and was adopted from the Pathosystems Resource Integration Center (PATRIC) [<xref ref-type="bibr" rid="b48-genes-02-00925">48</xref>]. Analyzed bacterial strains that have no relatives in any RFM were removed from this scheme. A complete summary of all genomes used in this study is given in Table S2. sRNA occurrence of particular RFMs is given in each line. Chromosome (<bold>C</bold>), pSymA (<bold>A</bold>), and pSymB (<bold>B</bold>) of <italic>S. meliloti</italic> 1021 carry the initial set of trans-encoded sRNAs used in this comparative study and are indicated as different blocks separated by black, horizontal lines. The upper block (<bold>M</bold>) summarizes RFMs of sRNAs with several gene copies in particular genomes. * indicates RFMs that contain several sRNA gene copies in the <italic>S. meliloti</italic> 1021 strain (Table S1). The color code indicates the number of related sRNAs in each strain: 1 = grey, 2 = blue, 3 = red, and ≥4 = green. Complete (type I), extensive (type II), partial (type III), and fragmented (type IV) microsynteny is represented by black boxes.</p></caption>
<graphic xlink:href="genes-02-00925f3.gif"/></fig>
<fig id="f4-genes-02-00925" position="float">
<label>Figure 4.</label>
<caption>
<p>Structural comparison between <italic>RFM<sub>SmelB</sub></italic><sub>053</sub> and related 3′-UTRs. (<bold>a</bold>) Consensus secondary structure of <italic>RFM<sub>SmelB</sub></italic><sub>053</sub> and (<bold>b</bold>) related 3′-UTRs; Base pairs in (<bold>a</bold>) and (<bold>b</bold>) are colored according to the Vienna RNA conservation coloring scheme [<xref ref-type="bibr" rid="b65-genes-02-00925">65</xref>]. Colors indicate the number of nucleotide combinations, out of the six possible base pairs, in the underlying alignment that are involved in forming predicted base-pairs (red = 1, yellow = 2, green = 3, cyan = 4, blue = 5, purple = 6). Pale colors are used for the case that some sRNAs do not form a base-pair; (<bold>c</bold>) Associated proteins of 3′-UTRs. Arrows indicate the orientation of each gene, identical colors indicate homologous genes. Non-colored arrows denote non-homologous genes (genes are not shown to scale).</p></caption>
<graphic xlink:href="genes-02-00925f4.gif"/></fig>
<fig id="f5-genes-02-00925" position="float">
<label>Figure 5.</label>
<caption>
<p>Structural, functional and genomic features of <italic>RFM<sub>SmelC</sub></italic><sub>023</sub>. (<bold>a</bold>) Alignment of presumed functional hairpin loops; and (<bold>b</bold>) consensus secondary structure of identified relatives of SmelC023. Base pairs are colored as in <xref ref-type="fig" rid="f4-genes-02-00925">Figure 4a</xref>; (<bold>c</bold>) Microsynteny pattern of <italic>RFM<sub>SmelC</sub></italic><sub>023</sub>. Illustration as in <xref ref-type="fig" rid="f4-genes-02-00925">Figure 4c</xref>.</p></caption>
<graphic xlink:href="genes-02-00925f5.gif"/></fig>
<fig id="f6-genes-02-00925" position="float">
<label>Figure 6.</label>
<caption>
<p>Structural, functional and genomic features of <italic>RFM<sub>SmelC</sub></italic><sub>289</sub>. (<bold>a</bold>) Alignment of presumed functional hairpin loops and (<bold>b</bold>) consensus secondary structure of identified relatives of SmelC289. Base pairs are colored as in <xref ref-type="fig" rid="f4-genes-02-00925">Figure 4a</xref>; (<bold>c</bold>) Microsynteny pattern of <italic>RFM<sub>SmelC</sub></italic><sub>289</sub>. Illustration as in <xref ref-type="fig" rid="f4-genes-02-00925">Figure 4c</xref>.</p></caption>
<graphic xlink:href="genes-02-00925f6.gif"/></fig>
<fig id="f7-genes-02-00925" position="float">
<label>Figure 7.</label>
<caption>
<p>Hairpin loop structures of <italic>RFM<sub>SmelA</sub></italic><sub>075</sub> and <italic>RFM<sub>SmelA</sub></italic><sub>099</sub>. (<bold>a</bold>) Consensus secondary structure of <italic>RFM<sub>SmelA</sub></italic><sub>075</sub>; and (<bold>b</bold>) <italic>RFM<sub>SmelA</sub></italic><sub>099</sub>. Base pairs are colored as in <xref ref-type="fig" rid="f4-genes-02-00925">Figure 4a</xref>.</p></caption>
<graphic xlink:href="genes-02-00925f7.gif"/></fig>
<fig id="f8-genes-02-00925" position="float">
<label>Figure 8.</label>
<caption>
<p>Multiple sequence alignment and consensus structure, as required for CM construction. Shown is the multiple sequence alignment of <italic>RFM<sub>SmelC</sub></italic><sub>055</sub> in Stockholm format.</p></caption>
<graphic xlink:href="genes-02-00925f8.gif"/></fig>
<fig id="f9-genes-02-00925" position="float">
<label>Figure 9.</label>
<caption>
<p>A snapshot from the construction of a thermodynamic matcher using the L<sc>ocomotif</sc> editor.</p></caption>
<graphic xlink:href="genes-02-00925f9.gif"/></fig>
<fig id="f10-genes-02-00925" position="float">
<label>Figure 10.</label>
<caption>
<p>Candidate selection process.</p></caption>
<graphic xlink:href="genes-02-00925f10.gif"/></fig></sec>
<ref-list>
<title>References</title>
<ref id="b1-genes-02-00925"><label>1.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Becker</surname><given-names>A.</given-names></name><name><surname>Barnett</surname><given-names>M.J.</given-names></name><name><surname>Capela</surname><given-names>D.</given-names></name><name><surname>Dondrup</surname><given-names>M.</given-names></name><name><surname>Kamp</surname><given-names>P.B.</given-names></name><name><surname>Krol</surname><given-names>E.</given-names></name><name><surname>Linke</surname><given-names>B.</given-names></name><name><surname>Rüberg</surname><given-names>S.</given-names></name><name><surname>Runte</surname><given-names>K.</given-names></name><name><surname>Schroeder</surname><given-names>B.K.</given-names></name><etal/></person-group><article-title>A portal for rhizobial genomes: RhizoGATE integrates a <italic>Sinorhizobium meliloti</italic> genome annotation update with postgenome data</article-title><source>J. Biotechnol.</source><year>2009</year><volume>140</volume><fpage>45</fpage><lpage>50</lpage><pub-id pub-id-type="doi">10.1016/j.jbiotec.2008.11.006</pub-id><pub-id pub-id-type="pmid">19103235</pub-id></citation></ref>
<ref id="b2-genes-02-00925"><label>2.</label><citation citation-type="web"><article-title>RhizoGATE - The gateway to rhizobial genomes</article-title><comment>Available online: <ext-link xlink:href="http://www.rhizogate.de/" ext-link-type="uri">http://www.rhizogate.de/</ext-link> (accessed on 25 February 2011)</comment></citation></ref>
<ref id="b3-genes-02-00925"><label>3.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wassarman</surname><given-names>K.M.</given-names></name></person-group><article-title>Small RNAs in bacteria: Diverse regulators of gene expression in response to environmental changes</article-title><source>Cell</source><year>2002</year><volume>109</volume><fpage>141</fpage><lpage>144</lpage><pub-id pub-id-type="doi">10.1016/S0092-8674(02)00717-1</pub-id><pub-id pub-id-type="pmid">12007399</pub-id></citation></ref>
<ref id="b4-genes-02-00925"><label>4.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lenz</surname><given-names>D.H.</given-names></name><name><surname>Mok</surname><given-names>K.C.</given-names></name><name><surname>Lilley</surname><given-names>B.N.</given-names></name><name><surname>Kulkarni</surname><given-names>R.V.</given-names></name><name><surname>Wingreen</surname><given-names>N.S.</given-names></name><name><surname>Bassler</surname><given-names>B.L.</given-names></name></person-group><article-title>The small RNA chaperone Hfq and multiple small RNAs control quorum sensing in <italic>Vibrio harveyi</italic> and <italic>Vibrio cholerae</italic></article-title><source>Cell</source><year>2004</year><volume>118</volume><fpage>69</fpage><lpage>82</lpage><pub-id pub-id-type="doi">10.1016/j.cell.2004.06.009</pub-id><pub-id pub-id-type="pmid">15242645</pub-id></citation></ref>
<ref id="b5-genes-02-00925"><label>5.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fozo</surname><given-names>E.M.</given-names></name><name><surname>Hemm</surname><given-names>M.R.</given-names></name><name><surname>Storz</surname><given-names>G.</given-names></name></person-group><article-title>Small toxic proteins and the antisense RNAs that repress them</article-title><source>Microbiol. Mol. Biol. Rev.</source><year>2008</year><volume>72</volume><fpage>579</fpage><lpage>589</lpage><pub-id pub-id-type="doi">10.1128/MMBR.00025-08</pub-id><pub-id pub-id-type="pmid">19052321</pub-id></citation></ref>
<ref id="b6-genes-02-00925"><label>6.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Storz</surname><given-names>G.</given-names></name><name><surname>Altuvia</surname><given-names>S.</given-names></name><name><surname>Wassarman</surname><given-names>K.M.</given-names></name></person-group><article-title>An abundance of RNA regulators</article-title><source>Annu. Rev. Biochem.</source><year>2005</year><volume>74</volume><fpage>199</fpage><lpage>217</lpage><pub-id pub-id-type="doi">10.1146/annurev.biochem.74.082803.133136</pub-id><pub-id pub-id-type="pmid">15952886</pub-id></citation></ref>
<ref id="b7-genes-02-00925"><label>7.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Altuvia</surname><given-names>S.</given-names></name><name><surname>Zhang</surname><given-names>A.</given-names></name><name><surname>Argaman</surname><given-names>L.</given-names></name><name><surname>Tiwari</surname><given-names>A.</given-names></name><name><surname>Storz</surname><given-names>G.</given-names></name></person-group><article-title>The <italic>Escherichia coli</italic> OxyS regulatory RNA represses fhlA translation by blocking ribosome binding</article-title><source>EMBO J.</source><year>1998</year><volume>17</volume><fpage>6069</fpage><lpage>6075</lpage><pub-id pub-id-type="doi">10.1093/emboj/17.20.6069</pub-id><pub-id pub-id-type="pmid">9774350</pub-id></citation></ref>
<ref id="b8-genes-02-00925"><label>8.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Andersen</surname><given-names>J.</given-names></name><name><surname>Forst</surname><given-names>S.A.</given-names></name><name><surname>Zhao</surname><given-names>K.</given-names></name><name><surname>Inouye</surname><given-names>M.</given-names></name><name><surname>Delihas</surname><given-names>N.</given-names></name></person-group><article-title>The function of micF RNA. micF RNA is a major factor in the thermal regulation of OmpF protein in <italic>Escherichia coli</italic></article-title><source>J. Biol. Chem.</source><year>1989</year><volume>264</volume><fpage>17961</fpage><lpage>17970</lpage><pub-id pub-id-type="pmid">2478539</pub-id></citation></ref>
<ref id="b9-genes-02-00925"><label>9.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname><given-names>S.</given-names></name><name><surname>Zhang</surname><given-names>A.</given-names></name><name><surname>Blyn</surname><given-names>L.B.</given-names></name><name><surname>Storz</surname><given-names>G.</given-names></name></person-group><article-title>MicC, a second small-RNA regulator of Omp protein expression in <italic>Escherichia coli</italic></article-title><source>J. Bacteriol.</source><year>2004</year><volume>186</volume><fpage>6689</fpage><lpage>6697</lpage><pub-id pub-id-type="doi">10.1128/JB.186.20.6689-6697.2004</pub-id><pub-id pub-id-type="pmid">15466019</pub-id></citation></ref>
<ref id="b10-genes-02-00925"><label>10.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Majdalani</surname><given-names>N.</given-names></name><name><surname>Cunning</surname><given-names>C.</given-names></name><name><surname>Sledjeski</surname><given-names>D.</given-names></name><name><surname>Elliott</surname><given-names>T.</given-names></name><name><surname>Gottesman</surname><given-names>S.</given-names></name></person-group><article-title>DsrA RNA regulates translation of RpoS message by an anti-antisense mechanism, independent of its action as an antisilencer of transcription</article-title><source>Proc. Natl. Acad. Sci. USA</source><year>1998</year><volume>95</volume><fpage>12462</fpage><lpage>12467</lpage><pub-id pub-id-type="doi">10.1073/pnas.95.21.12462</pub-id><pub-id pub-id-type="pmid">9770508</pub-id></citation></ref>
<ref id="b11-genes-02-00925"><label>11.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Massé</surname><given-names>E.</given-names></name><name><surname>Escorcia</surname><given-names>F.E.</given-names></name><name><surname>Gottesman</surname><given-names>S.</given-names></name></person-group><article-title>Coupled degradation of a small regulatory RNA and its mRNA targets in <italic>Escherichia coli</italic></article-title><source>Genes Dev.</source><year>2003</year><volume>17</volume><fpage>2374</fpage><lpage>2383</lpage><pub-id pub-id-type="doi">10.1101/gad.1127103</pub-id><pub-id pub-id-type="pmid">12975324</pub-id></citation></ref>
<ref id="b12-genes-02-00925"><label>12.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Massé</surname><given-names>E.</given-names></name><name><surname>Salvail</surname><given-names>H.</given-names></name><name><surname>Desnoyers</surname><given-names>G.</given-names></name><name><surname>Arguin</surname><given-names>M.</given-names></name></person-group><article-title>Small RNAs controlling iron metabolism</article-title><source>Curr. Opin. Microbiol.</source><year>2007</year><volume>10</volume><fpage>140</fpage><lpage>145</lpage><pub-id pub-id-type="doi">10.1016/j.mib.2007.03.013</pub-id><pub-id pub-id-type="pmid">17383226</pub-id></citation></ref>
<ref id="b13-genes-02-00925"><label>13.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Opdyke</surname><given-names>J.A.</given-names></name><name><surname>Kang</surname><given-names>J.G.</given-names></name><name><surname>Storz</surname><given-names>G.</given-names></name></person-group><article-title>GadY, a small-RNA regulator of acid response genes in <italic>Escherichia coli</italic></article-title><source>J. Bacteriol.</source><year>2004</year><volume>186</volume><fpage>6698</fpage><lpage>6705</lpage><pub-id pub-id-type="doi">10.1128/JB.186.20.6698-6705.2004</pub-id><pub-id pub-id-type="pmid">15466020</pub-id></citation></ref>
<ref id="b14-genes-02-00925"><label>14.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Babitzke</surname><given-names>P.</given-names></name><name><surname>Romeo</surname><given-names>T.</given-names></name></person-group><article-title>CsrB sRNA family: Sequestration of RNA-binding regulatory proteins</article-title><source>Curr. Opin. Microbiol.</source><year>2007</year><volume>10</volume><fpage>156</fpage><lpage>163</lpage><pub-id pub-id-type="doi">10.1016/j.mib.2007.03.007</pub-id><pub-id pub-id-type="pmid">17383221</pub-id></citation></ref>
<ref id="b15-genes-02-00925"><label>15.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wassarman</surname><given-names>K.M.</given-names></name><name><surname>Repoila</surname><given-names>F.</given-names></name><name><surname>Rosenow</surname><given-names>C.</given-names></name><name><surname>Storz</surname><given-names>G.</given-names></name><name><surname>Gottesman</surname><given-names>S.</given-names></name></person-group><article-title>Identification of novel small RNAs using comparative genomics and microarrays</article-title><source>Genes Dev.</source><year>2001</year><volume>15</volume><fpage>1637</fpage><lpage>1651</lpage><pub-id pub-id-type="doi">10.1101/gad.901001</pub-id><pub-id pub-id-type="pmid">11445539</pub-id></citation></ref>
<ref id="b16-genes-02-00925"><label>16.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Argaman</surname><given-names>L.</given-names></name><name><surname>Hershberg</surname><given-names>R.</given-names></name><name><surname>Vogel</surname><given-names>J.</given-names></name><name><surname>Bejerano</surname><given-names>G.</given-names></name><name><surname>Wagner</surname><given-names>E.G.</given-names></name><name><surname>Margalit</surname><given-names>H.</given-names></name><name><surname>Altuvia</surname><given-names>S.</given-names></name></person-group><article-title>Novel small RNA-encoding genes in the intergenic regions of <italic>Escherichia coli</italic></article-title><source>Curr. Biol.</source><year>2001</year><volume>11</volume><fpage>941</fpage><lpage>950</lpage><pub-id pub-id-type="doi">10.1016/S0960-9822(01)00270-6</pub-id><pub-id pub-id-type="pmid">11448770</pub-id></citation></ref>
<ref id="b17-genes-02-00925"><label>17.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rivas</surname><given-names>E.</given-names></name><name><surname>Klein</surname><given-names>R.J.</given-names></name><name><surname>Jones</surname><given-names>T.A.</given-names></name><name><surname>Eddy</surname><given-names>S.R.</given-names></name></person-group><article-title>Computational identification of noncoding RNAs in <italic>E. coli</italic> by comparative genomics</article-title><source>Curr. Biol.</source><year>2001</year><volume>11</volume><fpage>1369</fpage><lpage>1373</lpage><pub-id pub-id-type="doi">10.1016/S0960-9822(01)00401-8</pub-id><pub-id pub-id-type="pmid">11553332</pub-id></citation></ref>
<ref id="b18-genes-02-00925"><label>18.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname><given-names>S.</given-names></name><name><surname>Lesnik</surname><given-names>E.A.</given-names></name><name><surname>Hall</surname><given-names>T.A.</given-names></name><name><surname>Sampath</surname><given-names>R.</given-names></name><name><surname>Griffey</surname><given-names>R.H.</given-names></name><name><surname>Ecker</surname><given-names>D.J.</given-names></name><name><surname>Blyn</surname><given-names>L.B.</given-names></name></person-group><article-title>A bioinformatics based approach to discover small RNA genes in the <italic>Escherichia coli</italic> genome</article-title><source>Biosystems</source><year>2002</year><volume>65</volume><fpage>157</fpage><lpage>177</lpage><pub-id pub-id-type="doi">10.1016/S0303-2647(02)00013-8</pub-id><pub-id pub-id-type="pmid">12069726</pub-id></citation></ref>
<ref id="b19-genes-02-00925"><label>19.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xiao</surname><given-names>B.</given-names></name><name><surname>Li</surname><given-names>W.</given-names></name><name><surname>Guo</surname><given-names>G.</given-names></name><name><surname>Li</surname><given-names>B.</given-names></name><name><surname>Liu</surname><given-names>Z.</given-names></name><name><surname>Jia</surname><given-names>K.</given-names></name><name><surname>Guo</surname><given-names>Y.</given-names></name><name><surname>Mao</surname><given-names>X.</given-names></name><name><surname>Zou</surname><given-names>Q.</given-names></name></person-group><article-title>Identification of small noncoding RNAs in <italic>Helicobacter pylori</italic> by a bioinformatics-based approach</article-title><source>Curr. Microbiol.</source><year>2009</year><volume>58</volume><fpage>258</fpage><lpage>263</lpage><pub-id pub-id-type="doi">10.1007/s00284-008-9318-2</pub-id><pub-id pub-id-type="pmid">19123032</pub-id></citation></ref>
<ref id="b20-genes-02-00925"><label>20.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Livny</surname><given-names>J.</given-names></name><name><surname>Brencic</surname><given-names>A.</given-names></name><name><surname>Lory</surname><given-names>S.</given-names></name><name><surname>Waldor</surname><given-names>M.K.</given-names></name></person-group><article-title>Identification of 17 <italic>Pseudomonas aeruginosa</italic> sRNAs and prediction of sRNA-encoding genes in 10 diverse pathogens using the bioinformatic tool sRNAPredict2</article-title><source>Nucleic Acids Res.</source><year>2006</year><volume>34</volume><fpage>3484</fpage><lpage>3493</lpage><pub-id pub-id-type="doi">10.1093/nar/gkl453</pub-id><pub-id pub-id-type="pmid">16870723</pub-id></citation></ref>
<ref id="b21-genes-02-00925"><label>21.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gvakharia</surname><given-names>B.O.</given-names></name><name><surname>Tjaden</surname><given-names>B.</given-names></name><name><surname>Vajrala</surname><given-names>N.</given-names></name><name><surname>Sayavedra-Soto</surname><given-names>L.A.</given-names></name><name><surname>Arp</surname><given-names>D.J.</given-names></name></person-group><article-title>Computational prediction and transcriptional analysis of sRNAs in <italic>Nitrosomonas europaea</italic></article-title><source>FEMS Microbiol. Lett.</source><year>2010</year><volume>312</volume><fpage>46</fpage><lpage>54</lpage><pub-id pub-id-type="doi">10.1111/j.1574-6968.2010.02095.x</pub-id><pub-id pub-id-type="pmid">20840601</pub-id></citation></ref>
<ref id="b22-genes-02-00925"><label>22.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Washietl</surname><given-names>S.</given-names></name><name><surname>Hofacker</surname><given-names>I.L.</given-names></name><name><surname>Stadler</surname><given-names>P.F.</given-names></name></person-group><article-title>Fast and reliable prediction of noncoding RNAs</article-title><source>Proc. Natl. Acad. Sci. USA</source><year>2005</year><volume>102</volume><fpage>2454</fpage><lpage>2459</lpage><pub-id pub-id-type="doi">10.1073/pnas.0409169102</pub-id><pub-id pub-id-type="pmid">15665081</pub-id></citation></ref>
<ref id="b23-genes-02-00925"><label>23.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pedersen</surname><given-names>J.S.</given-names></name><name><surname>Bejerano</surname><given-names>G.</given-names></name><name><surname>Siepel</surname><given-names>A.</given-names></name><name><surname>Rosenbloom</surname><given-names>K.</given-names></name><name><surname>Lindblad-Toh</surname><given-names>K.</given-names></name><name><surname>Lander</surname><given-names>E.S.</given-names></name><name><surname>Kent</surname><given-names>J.</given-names></name><name><surname>Miller</surname><given-names>W.</given-names></name><name><surname>Haussler</surname><given-names>D.</given-names></name></person-group><article-title>Identification and classification of conserved RNA secondary structures in the human genome</article-title><source>PLoS Comput. Biol.</source><year>2006</year><volume>2</volume><fpage>e33:0251</fpage><lpage>e33:0262</lpage></citation></ref>
<ref id="b24-genes-02-00925"><label>24.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nawrocki</surname><given-names>E.P.</given-names></name><name><surname>Kolbe</surname><given-names>D.L.</given-names></name><name><surname>Eddy</surname><given-names>S.R.</given-names></name></person-group><article-title>Infernal 1.0: Inference of RNA alignments</article-title><source>Bioinformatics</source><year>2009</year><volume>25</volume><fpage>1335</fpage><lpage>1337</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btp157</pub-id><pub-id pub-id-type="pmid">19307242</pub-id></citation></ref>
<ref id="b25-genes-02-00925"><label>25.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Washietl</surname><given-names>S.</given-names></name><name><surname>Pedersen</surname><given-names>J.S.</given-names></name><name><surname>Korbel</surname><given-names>J.O.</given-names></name><name><surname>Stocsits</surname><given-names>C.</given-names></name><name><surname>Gruber</surname><given-names>A.R.</given-names></name><name><surname>Hackermüller</surname><given-names>J.</given-names></name><name><surname>Hertel</surname><given-names>J.</given-names></name><name><surname>Lindemeyer</surname><given-names>M.</given-names></name><name><surname>Reiche</surname><given-names>K.</given-names></name><name><surname>Tanzer</surname><given-names>A.</given-names></name><etal/></person-group><article-title>Structured RNAs in the ENCODE selected regions of the human genome</article-title><source>Genome Res.</source><year>2007</year><volume>17</volume><fpage>852</fpage><lpage>864</lpage><pub-id pub-id-type="doi">10.1101/gr.5650707</pub-id><pub-id pub-id-type="pmid">17568003</pub-id></citation></ref>
<ref id="b26-genes-02-00925"><label>26.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sharma</surname><given-names>C.M.</given-names></name><name><surname>Hoffmann</surname><given-names>S.</given-names></name><name><surname>Darfeuille</surname><given-names>F.</given-names></name><name><surname>Reignier</surname><given-names>J.</given-names></name><name><surname>Findeiss</surname><given-names>S.</given-names></name><name><surname>Sittka</surname><given-names>A.</given-names></name><name><surname>Chabas</surname><given-names>S.</given-names></name><name><surname>Reiche</surname><given-names>K.</given-names></name><name><surname>Hackermüller</surname><given-names>J.</given-names></name><name><surname>Reinhardt</surname><given-names>R.</given-names></name><etal/></person-group><article-title>The primary transcriptome of the major human pathogen <italic>Helicobacter pylori</italic></article-title><source>Nature</source><year>2010</year><volume>464</volume><fpage>250</fpage><lpage>255</lpage><pub-id pub-id-type="doi">10.1038/nature08756</pub-id><pub-id pub-id-type="pmid">20164839</pub-id></citation></ref>
<ref id="b27-genes-02-00925"><label>27.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Landt</surname><given-names>S.G.</given-names></name><name><surname>Abeliuk</surname><given-names>E.</given-names></name><name><surname>McGrath</surname><given-names>P.T.</given-names></name><name><surname>Lesley</surname><given-names>J.A.</given-names></name><name><surname>McAdams</surname><given-names>H.H.</given-names></name><name><surname>Shapiro</surname><given-names>L.</given-names></name></person-group><article-title>Small non-coding RNAs in <italic>Caulobacter crescentus</italic></article-title><source>Mol. Microbiol.</source><year>2008</year><volume>68</volume><fpage>600</fpage><lpage>614</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2958.2008.06172.x</pub-id><pub-id pub-id-type="pmid">18373523</pub-id></citation></ref>
<ref id="b28-genes-02-00925"><label>28.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mitschke</surname><given-names>J.</given-names></name><name><surname>Georg</surname><given-names>J.</given-names></name><name><surname>Scholz</surname><given-names>I.</given-names></name><name><surname>Sharma</surname><given-names>C.M.</given-names></name><name><surname>Dienst</surname><given-names>D.</given-names></name><name><surname>Bantscheff</surname><given-names>J.</given-names></name><name><surname>Voss</surname><given-names>B.</given-names></name><name><surname>Steglich</surname><given-names>C.</given-names></name><name><surname>Wilde</surname><given-names>A.</given-names></name><name><surname>Vogel</surname><given-names>J.</given-names></name><etal/></person-group><article-title>An experimentally anchored map of transcriptional start sites in the model cyanobacterium <italic>Synechocystis</italic> sp. PCC6803</article-title><source>Proc. Natl. Acad. Sci. USA</source><year>2011</year><volume>108</volume><fpage>2124</fpage><lpage>2129</lpage><pub-id pub-id-type="doi">10.1073/pnas.1015154108</pub-id><pub-id pub-id-type="pmid">21245330</pub-id></citation></ref>
<ref id="b29-genes-02-00925"><label>29.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Livny</surname><given-names>J.</given-names></name><name><surname>Teonadi</surname><given-names>H.</given-names></name><name><surname>Livny</surname><given-names>M.</given-names></name><name><surname>Waldor</surname><given-names>M.K.</given-names></name></person-group><article-title>High-throughput, kingdom-wide prediction and annotation of bacterial non-coding RNAs</article-title><source>PLoS One</source><year>2008</year><volume>3</volume><fpage>e3197:1</fpage><lpage>e3197:12</lpage></citation></ref>
<ref id="b30-genes-02-00925"><label>30.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vercruysse</surname><given-names>M.</given-names></name><name><surname>Fauvart</surname><given-names>M.</given-names></name><name><surname>Cloots</surname><given-names>L.</given-names></name><name><surname>Engelen</surname><given-names>K.</given-names></name><name><surname>Thijs</surname><given-names>I.M.</given-names></name><name><surname>Marchal</surname><given-names>K.</given-names></name><name><surname>Michiels</surname><given-names>J.</given-names></name></person-group><article-title>Genome-wide detection of predicted non-coding RNAs in <italic>Rhizobium etli</italic> expressed during free-living and host-associated growth using a high-resolution tiling array</article-title><source>BMC Genomics</source><year>2010</year><volume>11</volume><fpage>53:1</fpage><lpage>53:12</lpage></citation></ref>
<ref id="b31-genes-02-00925"><label>31.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wilms</surname><given-names>I.</given-names></name><name><surname>Overlöper</surname><given-names>A.</given-names></name><name><surname>Nowrousian</surname><given-names>M.</given-names></name><name><surname>Sharma</surname><given-names>C.M.</given-names></name><name><surname>Narberhaus</surname><given-names>F.</given-names></name></person-group><article-title>Deep sequencing uncovers numerous small RNAs on all four replicons of the plant pathogen <italic>Agrobacterium tumefaciens</italic></article-title><source>RNA Biol.</source><year>2012</year><volume>9</volume><comment>in press</comment></citation></ref>
<ref id="b32-genes-02-00925"><label>32.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Venkova-Canova</surname><given-names>T.</given-names></name><name><surname>Soberón</surname><given-names>N.E.</given-names></name><name><surname>Ramírez-Romero</surname><given-names>M.A.</given-names></name><name><surname>Cevallos</surname><given-names>M.A.</given-names></name></person-group><article-title>Two discrete elements are required for the replication of a repABC plasmid: An antisense RNA and a stem-loop structure</article-title><source>Mol. Microbiol.</source><year>2004</year><volume>54</volume><fpage>1431</fpage><lpage>1444</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2958.2004.04366.x</pub-id><pub-id pub-id-type="pmid">15554980</pub-id></citation></ref>
<ref id="b33-genes-02-00925"><label>33.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>MacLellan</surname><given-names>S.R.</given-names></name><name><surname>Smallbone</surname><given-names>L.A.</given-names></name><name><surname>Sibley</surname><given-names>C.D.</given-names></name><name><surname>Finan</surname><given-names>T.M.</given-names></name></person-group><article-title>The expression of a novel antisense gene mediates incompatibility within the large repABC family of alpha-proteobacterial plasmids</article-title><source>Mol. Microbiol.</source><year>2005</year><volume>55</volume><fpage>611</fpage><lpage>623</lpage><pub-id pub-id-type="pmid">15659174</pub-id></citation></ref>
<ref id="b34-genes-02-00925"><label>34.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ulvé</surname><given-names>V.M.</given-names></name><name><surname>Sevin</surname><given-names>E.W.</given-names></name><name><surname>Chéron</surname><given-names>A.</given-names></name><name><surname>Barloy-Hubler</surname><given-names>F.</given-names></name></person-group><article-title>Identification of chromosomal alpha-proteobacterial small RNAs by comparative genome analysis and detection in <italic>Sinorhizobium meliloti</italic> strain 1021</article-title><source>BMC Genomics</source><year>2007</year><volume>8</volume><fpage>467:1</fpage><lpage>467:16</lpage></citation></ref>
<ref id="b35-genes-02-00925"><label>35.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ulvé</surname><given-names>V.M.</given-names></name><name><surname>Chéron</surname><given-names>A.</given-names></name><name><surname>Trautwetter</surname><given-names>A.</given-names></name><name><surname>Fontenelle</surname><given-names>C.</given-names></name><name><surname>Barloy-Hubler</surname><given-names>F.</given-names></name></person-group><article-title>Characterization and expression patterns of <italic>Sinorhizobium meliloti</italic> tmRNA (ssrA)</article-title><source>FEMS Microbiol. Lett.</source><year>2007</year><volume>269</volume><fpage>117</fpage><lpage>123</lpage><pub-id pub-id-type="doi">10.1111/j.1574-6968.2006.00616.x</pub-id><pub-id pub-id-type="pmid">17241239</pub-id></citation></ref>
<ref id="b36-genes-02-00925"><label>36.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>del Val</surname><given-names>C.</given-names></name><name><surname>Rivas</surname><given-names>E.</given-names></name><name><surname>Torres-Quesada</surname><given-names>O.</given-names></name><name><surname>Toro</surname><given-names>N.</given-names></name><name><surname>Jiménez-Zurdo</surname><given-names>J.I.</given-names></name></person-group><article-title>Identification of differentially expressed small non-coding RNAs in the legume endosymbiont <italic>Sinorhizobium meliloti</italic> by comparative genomics</article-title><source>Mol. Microbiol.</source><year>2007</year><volume>66</volume><fpage>1080</fpage><lpage>1091</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2958.2007.05978.x</pub-id><pub-id pub-id-type="pmid">17971083</pub-id></citation></ref>
<ref id="b37-genes-02-00925"><label>37.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Valverde</surname><given-names>C.</given-names></name><name><surname>Livny</surname><given-names>J.</given-names></name><name><surname>Schlüter</surname><given-names>J.P.</given-names></name><name><surname>Reinkensmeier</surname><given-names>J.</given-names></name><name><surname>Becker</surname><given-names>A.</given-names></name><name><surname>Parisi</surname><given-names>G.</given-names></name></person-group><article-title>Prediction of <italic>Sinorhizobium meliloti</italic> sRNA genes and experimental detection in strain 2011</article-title><source>BMC Genomics</source><year>2008</year><volume>9</volume><fpage>416:1</fpage><lpage>416:24</lpage></citation></ref>
<ref id="b38-genes-02-00925"><label>38.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schlüter</surname><given-names>J.P.</given-names></name><name><surname>Reinkensmeier</surname><given-names>J.</given-names></name><name><surname>Daschkey</surname><given-names>S.</given-names></name><name><surname>Evguenieva-Hackenberg</surname><given-names>E.</given-names></name><name><surname>Janssen</surname><given-names>S.</given-names></name><name><surname>Jänicke</surname><given-names>S.</given-names></name><name><surname>Becker</surname><given-names>J.D.</given-names></name><name><surname>Giegerich</surname><given-names>R.</given-names></name><name><surname>Becker</surname><given-names>A.</given-names></name></person-group><article-title>A genome-wide survey of sRNAs in the symbiotic nitrogen-fixing alpha-proteobacterium <italic>Sinorhizobium meliloti</italic></article-title><source>BMC Genomics</source><year>2010</year><volume>11</volume><fpage>245:1</fpage><lpage>245:35</lpage></citation></ref>
<ref id="b39-genes-02-00925"><label>39.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jones</surname><given-names>K.M.</given-names></name><name><surname>Kobayashi</surname><given-names>H.</given-names></name><name><surname>Davies</surname><given-names>B.W.</given-names></name><name><surname>Taga</surname><given-names>M.E.</given-names></name><name><surname>Walker</surname><given-names>G.C.</given-names></name></person-group><article-title>How rhizobial symbionts invade plants: The <italic>Sinorhizobium-Medicago</italic> model</article-title><source>Nat. Rev. Microbiol.</source><year>2007</year><volume>5</volume><fpage>619</fpage><lpage>633</lpage><pub-id pub-id-type="doi">10.1038/nrmicro1705</pub-id><pub-id pub-id-type="pmid">17632573</pub-id></citation></ref>
<ref id="b40-genes-02-00925"><label>40.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Galibert</surname><given-names>F.</given-names></name><name><surname>Finan</surname><given-names>T.M.</given-names></name><name><surname>Long</surname><given-names>S.R.</given-names></name><name><surname>Pühler</surname><given-names>A.</given-names></name><name><surname>Abola</surname><given-names>P.</given-names></name><name><surname>Ampe</surname><given-names>F.</given-names></name><name><surname>Barloy-Hubler</surname><given-names>F.</given-names></name><name><surname>Barnett</surname><given-names>M.J.</given-names></name><name><surname>Becker</surname><given-names>A.</given-names></name><name><surname>Boistard</surname><given-names>P.</given-names></name><etal/></person-group><article-title>The composite genome of the legume symbiont <italic>Sinorhizobium meliloti</italic></article-title><source>Science</source><year>2001</year><volume>293</volume><fpage>668</fpage><lpage>672</lpage><pub-id pub-id-type="doi">10.1126/science.1060966</pub-id><pub-id pub-id-type="pmid">11474104</pub-id></citation></ref>
<ref id="b41-genes-02-00925"><label>41.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barnett</surname><given-names>M.J.</given-names></name><name><surname>Fisher</surname><given-names>R.F.</given-names></name><name><surname>Jones</surname><given-names>T.</given-names></name><name><surname>Komp</surname><given-names>C.</given-names></name><name><surname>Abola</surname><given-names>A.P.</given-names></name><name><surname>Barloy-Hubler</surname><given-names>F.</given-names></name><name><surname>Bowser</surname><given-names>L.</given-names></name><name><surname>Capela</surname><given-names>D.</given-names></name><name><surname>Galibert</surname><given-names>F.</given-names></name><name><surname>Gouzy</surname><given-names>J.</given-names></name><etal/></person-group><article-title>Nucleotide sequence and predicted functions of the entire <italic>Sinorhizobium meliloti</italic> pSymA megaplasmid</article-title><source>Proc. Natl. Acad. Sci. USA</source><year>2001</year><volume>98</volume><fpage>9883</fpage><lpage>9888</lpage><pub-id pub-id-type="doi">10.1073/pnas.161294798</pub-id><pub-id pub-id-type="pmid">11481432</pub-id></citation></ref>
<ref id="b42-genes-02-00925"><label>42.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Finan</surname><given-names>T.M.</given-names></name><name><surname>Weidner</surname><given-names>S.</given-names></name><name><surname>Wong</surname><given-names>K.</given-names></name><name><surname>Buhrmester</surname><given-names>J.</given-names></name><name><surname>Chain</surname><given-names>P.</given-names></name><name><surname>Vorhölter</surname><given-names>F.J.</given-names></name><name><surname>Hernandez-Lucas</surname><given-names>I.</given-names></name><name><surname>Becker</surname><given-names>A.</given-names></name><name><surname>Cowie</surname><given-names>A.</given-names></name><name><surname>Gouzy</surname><given-names>J.</given-names></name><etal/></person-group><article-title>The complete sequence of the 1,683-kb pSymB megaplasmid from the N<sub>2</sub>-fixing endosymbiont <italic>Sinorhizobium meliloti</italic></article-title><source>Proc. Natl. Acad. Sci. USA</source><year>2001</year><volume>98</volume><fpage>9889</fpage><lpage>9894</lpage><pub-id pub-id-type="doi">10.1073/pnas.161294698</pub-id><pub-id pub-id-type="pmid">11481431</pub-id></citation></ref>
<ref id="b43-genes-02-00925"><label>43.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pappas</surname><given-names>G.</given-names></name><name><surname>Akritidis</surname><given-names>N.</given-names></name><name><surname>Bosilkovski</surname><given-names>M.</given-names></name><name><surname>Tsianos</surname><given-names>E.</given-names></name></person-group><article-title>Brucellosis</article-title><source>N. Engl. J. Med.</source><year>2005</year><volume>352</volume><fpage>2325</fpage><lpage>2336</lpage><pub-id pub-id-type="doi">10.1056/NEJMra050570</pub-id><pub-id pub-id-type="pmid">15930423</pub-id></citation></ref>
<ref id="b44-genes-02-00925"><label>44.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Florin</surname><given-names>T.A.</given-names></name><name><surname>Zaoutis</surname><given-names>T.E.</given-names></name><name><surname>Zaoutis</surname><given-names>L.B.</given-names></name></person-group><article-title>Beyond cat scratch disease: Widening spectrum of <italic>Bartonella henselae</italic> infection</article-title><source>Pediatrics</source><year>2008</year><volume>121</volume><fpage>e1413</fpage><lpage>e1425</lpage><pub-id pub-id-type="doi">10.1542/peds.2007-1897</pub-id><pub-id pub-id-type="pmid">18443019</pub-id></citation></ref>
<ref id="b45-genes-02-00925"><label>45.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>McCullen</surname><given-names>C.A.</given-names></name><name><surname>Binns</surname><given-names>A.N.</given-names></name></person-group><article-title><italic>Agrobacterium tumefaciens</italic> and plant cell interactions and activities required for interkingdom macromolecular transfer</article-title><source>Annu. Rev. Cell Dev. Biol.</source><year>2006</year><volume>22</volume><fpage>101</fpage><lpage>127</lpage><pub-id pub-id-type="doi">10.1146/annurev.cellbio.22.011105.102022</pub-id><pub-id pub-id-type="pmid">16709150</pub-id></citation></ref>
<ref id="b46-genes-02-00925"><label>46.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Höchsmann</surname><given-names>T.</given-names></name><name><surname>Höchsmann</surname><given-names>M.</given-names></name><name><surname>Giegerich</surname><given-names>R.</given-names></name></person-group><article-title>Thermodynamic Matchers: Strengthening the Significance of RNA Folding Energies</article-title><conf-name>Proceedings of the Computational Systems Bioinformatics Conference</conf-name><conf-loc>Houston, TX, USA</conf-loc><conf-date>August 2006</conf-date><fpage>111</fpage><lpage>121</lpage></citation></ref>
<ref id="b47-genes-02-00925"><label>47.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Reeder</surname><given-names>J.</given-names></name><name><surname>Reeder</surname><given-names>J.</given-names></name><name><surname>Giegerich</surname><given-names>R.</given-names></name></person-group><article-title>Locomotif: From graphical motif description to RNA motif search</article-title><source>Bioinformatics</source><year>2007</year><volume>23</volume><fpage>i392</fpage><lpage>i400</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btm179</pub-id><pub-id pub-id-type="pmid">17646322</pub-id></citation></ref>
<ref id="b48-genes-02-00925"><label>48.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gillespie</surname><given-names>J.J.</given-names></name><name><surname>Wattam</surname><given-names>A.R.</given-names></name><name><surname>Cammer</surname><given-names>S.A.</given-names></name><name><surname>Gabbard</surname><given-names>J.</given-names></name><name><surname>Shukla</surname><given-names>M.P.</given-names></name><name><surname>Dalay</surname><given-names>O.</given-names></name><name><surname>Driscoll</surname><given-names>T.</given-names></name><name><surname>Hix</surname><given-names>D.</given-names></name><name><surname>Mane</surname><given-names>S.P.</given-names></name><name><surname>Mao</surname><given-names>C.</given-names></name><etal/></person-group><article-title>PATRIC: The comprehensive bacterial bioinformatics resource with a focus on human pathogenic species</article-title><source>Infect. Immun.</source><year>2011</year><volume>79</volume><fpage>4286</fpage><lpage>4298</lpage><pub-id pub-id-type="doi">10.1128/IAI.00207-11</pub-id><pub-id pub-id-type="pmid">21896772</pub-id></citation></ref>
<ref id="b49-genes-02-00925"><label>49.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Galardini</surname><given-names>M.</given-names></name><name><surname>Mengoni</surname><given-names>A.</given-names></name><name><surname>Brilli</surname><given-names>M.</given-names></name><name><surname>Pini</surname><given-names>F.</given-names></name><name><surname>Fioravanti</surname><given-names>A.</given-names></name><name><surname>Lucas</surname><given-names>S.</given-names></name><name><surname>Lapidus</surname><given-names>A.</given-names></name><name><surname>Cheng</surname><given-names>J.F.</given-names></name><name><surname>Goodwin</surname><given-names>L.</given-names></name><name><surname>Pitluck</surname><given-names>S.</given-names></name><etal/></person-group><article-title>Exploring the symbiotic pangenome of the nitrogen-fixing bacterium <italic>Sinorhizobium meliloti</italic></article-title><source>BMC Genomics</source><year>2011</year><volume>12</volume><fpage>235:1</fpage><lpage>235:15</lpage></citation></ref>
<ref id="b50-genes-02-00925"><label>50.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>González</surname><given-names>V.</given-names></name><name><surname>Acosta</surname><given-names>J.L.</given-names></name><name><surname>Santamaría</surname><given-names>R.I.</given-names></name><name><surname>Bustos</surname><given-names>P.</given-names></name><name><surname>Fernández</surname><given-names>J.L.</given-names></name><name><surname>González</surname><given-names>I.L.H.</given-names></name><name><surname>Díaz</surname><given-names>R.</given-names></name><name><surname>Flores</surname><given-names>M.</given-names></name><name><surname>Palacios</surname><given-names>R.</given-names></name><name><surname>Mora</surname><given-names>J.</given-names></name><etal/></person-group><article-title>Conserved symbiotic plasmid DNA sequences in the multireplicon pangenomic structure of <italic>Rhizobium etli</italic></article-title><source>Appl. Environ. Microbiol.</source><year>2010</year><volume>76</volume><fpage>1604</fpage><lpage>1614</lpage><pub-id pub-id-type="doi">10.1128/AEM.02039-09</pub-id><pub-id pub-id-type="pmid">20048063</pub-id></citation></ref>
<ref id="b51-genes-02-00925"><label>51.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schmeisser</surname><given-names>C.</given-names></name><name><surname>Liesegang</surname><given-names>H.</given-names></name><name><surname>Krysciak</surname><given-names>D.</given-names></name><name><surname>Bakkou</surname><given-names>N.</given-names></name><name><surname>Le Quéré</surname><given-names>A.</given-names></name><name><surname>Wollherr</surname><given-names>A.</given-names></name><name><surname>Heinemeyer</surname><given-names>I.</given-names></name><name><surname>Morgenstern</surname><given-names>B.</given-names></name><name><surname>Pommerening-Röser</surname><given-names>A.</given-names></name><name><surname>Flores</surname><given-names>M.</given-names></name><etal/></person-group><article-title><italic>Rhizobium</italic> sp. strain NGR234 possesses a remarkable number of secretion systems</article-title><source>Appl. Environ. Microbiol.</source><year>2009</year><volume>75</volume><fpage>4035</fpage><lpage>4045</lpage><pub-id pub-id-type="doi">10.1128/AEM.00515-09</pub-id><pub-id pub-id-type="pmid">19376903</pub-id></citation></ref>
<ref id="b52-genes-02-00925"><label>52.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Reeve</surname><given-names>W.</given-names></name><name><surname>Chain</surname><given-names>P.</given-names></name><name><surname>O'Hara</surname><given-names>G.</given-names></name><name><surname>Ardley</surname><given-names>J.</given-names></name><name><surname>Nandesena</surname><given-names>K.</given-names></name><name><surname>Bräu</surname><given-names>L.</given-names></name><name><surname>Tiwari</surname><given-names>R.</given-names></name><name><surname>Malfatti</surname><given-names>S.</given-names></name><name><surname>Kiss</surname><given-names>H.</given-names></name><name><surname>Lapidus</surname><given-names>A.</given-names></name><etal/></person-group><article-title>Complete genome sequence of the <italic>Medicago</italic> microsymbiont <italic>Ensifer</italic> (<italic>Sinorhizobium</italic>) <italic>medicae</italic> strain WSM419</article-title><source>Stand. Genomic Sci.</source><year>2010</year><volume>2</volume><fpage>77</fpage><lpage>86</lpage><pub-id pub-id-type="doi">10.4056/sigs.43526</pub-id><pub-id pub-id-type="pmid">21304680</pub-id></citation></ref>
<ref id="b53-genes-02-00925"><label>53.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Slater</surname><given-names>S.C.</given-names></name><name><surname>Goldman</surname><given-names>B.S.</given-names></name><name><surname>Goodner</surname><given-names>B.</given-names></name><name><surname>Setubal</surname><given-names>J.C.</given-names></name><name><surname>Farrand</surname><given-names>S.K.</given-names></name><name><surname>Nester</surname><given-names>E.W.</given-names></name><name><surname>Burr</surname><given-names>T.J.</given-names></name><name><surname>Banta</surname><given-names>L.</given-names></name><name><surname>Dickerman</surname><given-names>A.W.</given-names></name><name><surname>Paulsen</surname><given-names>I.</given-names></name><etal/></person-group><article-title>Genome sequences of three <italic>Agrobacterium</italic> biovars help elucidate the evolution of multichromosome genomes in bacteria</article-title><source>J. Bacteriol.</source><year>2009</year><volume>191</volume><fpage>2501</fpage><lpage>2511</lpage><pub-id pub-id-type="doi">10.1128/JB.01779-08</pub-id><pub-id pub-id-type="pmid">19251847</pub-id></citation></ref>
<ref id="b54-genes-02-00925"><label>54.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Voss</surname><given-names>B.</given-names></name><name><surname>Georg</surname><given-names>J.</given-names></name><name><surname>Schön</surname><given-names>V.</given-names></name><name><surname>Ude</surname><given-names>S.</given-names></name><name><surname>Hess</surname><given-names>W.R.</given-names></name></person-group><article-title>Biocomputational prediction of non-coding RNAs in model cyanobacteria</article-title><source>BMC Genomics</source><year>2009</year><volume>10</volume><fpage>123:1</fpage><lpage>123:15</lpage></citation></ref>
<ref id="b55-genes-02-00925"><label>55.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Duan</surname><given-names>Y.</given-names></name><name><surname>Zhou</surname><given-names>L.</given-names></name><name><surname>Hall</surname><given-names>D.G.</given-names></name><name><surname>Li</surname><given-names>W.</given-names></name><name><surname>Doddapaneni</surname><given-names>H.</given-names></name><name><surname>Lin</surname><given-names>H.</given-names></name><name><surname>Liu</surname><given-names>L.</given-names></name><name><surname>Vahling</surname><given-names>C.M.</given-names></name><name><surname>Gabriel</surname><given-names>D.W.</given-names></name><name><surname>Williams</surname><given-names>K.P.</given-names></name><etal/></person-group><article-title>Complete genome sequence of citrus huanglongbing bacterium, ‘<italic>Candidatus</italic> Liberibacter asiaticus’ obtained through metagenomics</article-title><source>Mol. Plant Microbe Interact.</source><year>2009</year><volume>22</volume><fpage>1011</fpage><lpage>1020</lpage><pub-id pub-id-type="doi">10.1094/MPMI-22-8-1011</pub-id><pub-id pub-id-type="pmid">19589076</pub-id></citation></ref>
<ref id="b56-genes-02-00925"><label>56.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bentley</surname><given-names>S.D.</given-names></name><name><surname>Parkhill</surname><given-names>J.</given-names></name></person-group><article-title>Comparative genomic structure of prokaryotes</article-title><source>Annu. Rev. Genet.</source><year>2004</year><volume>38</volume><fpage>771</fpage><lpage>792</lpage><pub-id pub-id-type="doi">10.1146/annurev.genet.38.072902.094318</pub-id><pub-id pub-id-type="pmid">15568993</pub-id></citation></ref>
<ref id="b57-genes-02-00925"><label>57.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mushegian</surname><given-names>A.R.</given-names></name><name><surname>Koonin</surname><given-names>E.V.</given-names></name></person-group><article-title>Gene order is not conserved in bacterial evolution</article-title><source>Trends Genet.</source><year>1996</year><volume>12</volume><fpage>289</fpage><lpage>290</lpage><pub-id pub-id-type="doi">10.1016/0168-9525(96)20006-X</pub-id><pub-id pub-id-type="pmid">8783936</pub-id></citation></ref>
<ref id="b58-genes-02-00925"><label>58.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Suyama</surname><given-names>M.</given-names></name><name><surname>Bork</surname><given-names>P.</given-names></name></person-group><article-title>Evolution of prokaryotic gene order: Genome rearrangements in closely related species</article-title><source>Trends Genet.</source><year>2001</year><volume>17</volume><fpage>10</fpage><lpage>13</lpage><pub-id pub-id-type="doi">10.1016/S0168-9525(00)02159-4</pub-id><pub-id pub-id-type="pmid">11163906</pub-id></citation></ref>
<ref id="b59-genes-02-00925"><label>59.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Biondi</surname><given-names>E.G.</given-names></name><name><surname>Femia</surname><given-names>A.P.</given-names></name><name><surname>Favilli</surname><given-names>F.</given-names></name><name><surname>Bazzicalupo</surname><given-names>M.</given-names></name></person-group><article-title>ISRm31, a new insertion sequence of the IS66 family in <italic>Sinorhizobium meliloti</italic></article-title><source>Arch. Microbiol.</source><year>2003</year><volume>180</volume><fpage>118</fpage><lpage>126</lpage><pub-id pub-id-type="doi">10.1007/s00203-003-0568-x</pub-id><pub-id pub-id-type="pmid">12819859</pub-id></citation></ref>
<ref id="b60-genes-02-00925"><label>60.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname><given-names>Q.</given-names></name><name><surname>Chen</surname><given-names>Y.P.P.</given-names></name></person-group><article-title>Modeling conserved structure patterns for functional noncoding RNA</article-title><source>IEEE Trans. Biomed. Eng.</source><year>2011</year><volume>58</volume><fpage>1528</fpage><lpage>1533</lpage><pub-id pub-id-type="doi">10.1109/TBME.2010.2090043</pub-id><pub-id pub-id-type="pmid">21041153</pub-id></citation></ref>
<ref id="b61-genes-02-00925"><label>61.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Voss</surname><given-names>B.</given-names></name><name><surname>Gierga</surname><given-names>G.</given-names></name><name><surname>Axmann</surname><given-names>I.M.</given-names></name><name><surname>Hess</surname><given-names>W.R.</given-names></name></person-group><article-title>A motif-based search in bacterial genomes identifies the ortholog of the small RNA Yfr1 in all lineages of cyanobacteria</article-title><source>BMC Genomics</source><year>2007</year><volume>8</volume><fpage>375:1</fpage><lpage>375:11</lpage></citation></ref>
<ref id="b62-genes-02-00925"><label>62.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kingsford</surname><given-names>C.L.</given-names></name><name><surname>Ayanbule</surname><given-names>K.</given-names></name><name><surname>Salzberg</surname><given-names>S.L.</given-names></name></person-group><article-title>Rapid, accurate, computational discovery of Rho-independent transcription terminators illuminates their relationship to DNA uptake</article-title><source>Genome Biol.</source><year>2007</year><volume>8</volume><fpage>R22:1</fpage><lpage>R22:12</lpage></citation></ref>
<ref id="b63-genes-02-00925"><label>63.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fröhlich</surname><given-names>K.S.</given-names></name><name><surname>Vogel</surname><given-names>J.</given-names></name></person-group><article-title>Activation of gene expression by small RNA</article-title><source>Curr. Opin. Microbiol.</source><year>2009</year><volume>12</volume><fpage>674</fpage><lpage>682</lpage><pub-id pub-id-type="doi">10.1016/j.mib.2009.09.009</pub-id><pub-id pub-id-type="pmid">19880344</pub-id></citation></ref>
<ref id="b64-genes-02-00925"><label>64.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Otaka</surname><given-names>H.</given-names></name><name><surname>Ishikawa</surname><given-names>H.</given-names></name><name><surname>Morita</surname><given-names>T.</given-names></name><name><surname>Aiba</surname><given-names>H.</given-names></name></person-group><article-title>PolyU tail of rho-independent terminator of bacterial small RNAs is essential for Hfq action</article-title><source>Proc. Natl. Acad. Sci. USA</source><year>2011</year><volume>108</volume><fpage>13059</fpage><lpage>13064</lpage><pub-id pub-id-type="doi">10.1073/pnas.1107050108</pub-id><pub-id pub-id-type="pmid">21788484</pub-id></citation></ref>
<ref id="b65-genes-02-00925"><label>65.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gruber</surname><given-names>A.R.</given-names></name><name><surname>Lorenz</surname><given-names>R.</given-names></name><name><surname>Bernhart</surname><given-names>S.H.</given-names></name><name><surname>Neuböck</surname><given-names>R.</given-names></name><name><surname>Hofacker</surname><given-names>I.L.</given-names></name></person-group><article-title>The Vienna RNA websuite</article-title><source>Nucleic Acids Res.</source><year>2008</year><volume>36</volume><fpage>W70</fpage><lpage>W74</lpage><pub-id pub-id-type="doi">10.1093/nar/gkn188</pub-id><pub-id pub-id-type="pmid">18424795</pub-id></citation></ref>
<ref id="b66-genes-02-00925"><label>66.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hankins</surname><given-names>J.S.</given-names></name><name><surname>Denroche</surname><given-names>H.</given-names></name><name><surname>Mackie</surname><given-names>G.A.</given-names></name></person-group><article-title>Interactions of the RNA-binding protein Hfq with cspA mRNA, encoding the major cold shock protein</article-title><source>J. Bacteriol.</source><year>2010</year><volume>192</volume><fpage>2482</fpage><lpage>2490</lpage><pub-id pub-id-type="doi">10.1128/JB.01619-09</pub-id><pub-id pub-id-type="pmid">20233932</pub-id></citation></ref>
<ref id="b67-genes-02-00925"><label>67.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brennan</surname><given-names>R.G.</given-names></name><name><surname>Link</surname><given-names>T.M.</given-names></name></person-group><article-title>Hfq structure, function and ligand binding</article-title><source>Curr. Opin. Microbiol.</source><year>2007</year><volume>10</volume><fpage>125</fpage><lpage>133</lpage><pub-id pub-id-type="doi">10.1016/j.mib.2007.03.015</pub-id><pub-id pub-id-type="pmid">17395525</pub-id></citation></ref>
<ref id="b68-genes-02-00925"><label>68.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Göpel</surname><given-names>Y.</given-names></name><name><surname>Lüttmann</surname><given-names>D.</given-names></name><name><surname>Heroven</surname><given-names>A.K.</given-names></name><name><surname>Reichenbach</surname><given-names>B.</given-names></name><name><surname>Dersch</surname><given-names>P.</given-names></name><name><surname>Görke</surname><given-names>B.</given-names></name></person-group><article-title>Common and divergent features in transcriptional control of the homologous small RNAs GlmY and GlmZ in Enterobacteriaceae</article-title><source>Nucleic Acids Res.</source><year>2011</year><volume>39</volume><fpage>1294</fpage><lpage>1309</lpage><pub-id pub-id-type="doi">10.1093/nar/gkq986</pub-id><pub-id pub-id-type="pmid">20965974</pub-id></citation></ref>
<ref id="b69-genes-02-00925"><label>69.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salim</surname><given-names>N.N.</given-names></name><name><surname>Feig</surname><given-names>A.L.</given-names></name></person-group><article-title>An upstream Hfq binding site in the fhlA mRNA leader region facilitates the OxyS-fhlA interaction</article-title><source>PLoS One</source><year>2010</year><volume>5</volume><fpage>e13028:1</fpage><lpage>e13028:11</lpage></citation></ref>
<ref id="b70-genes-02-00925"><label>70.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brantl</surname><given-names>S.</given-names></name></person-group><article-title>Antisense-RNA regulation and RNA interference</article-title><source>Biochim. Biophys. Acta</source><year>2002</year><volume>1575</volume><fpage>15</fpage><lpage>25</lpage><pub-id pub-id-type="doi">10.1016/S0167-4781(02)00280-4</pub-id><pub-id pub-id-type="pmid">12020814</pub-id></citation></ref>
<ref id="b71-genes-02-00925"><label>71.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Antal</surname><given-names>M.</given-names></name><name><surname>Bordeau</surname><given-names>V.</given-names></name><name><surname>Douchin</surname><given-names>V.</given-names></name><name><surname>Felden</surname><given-names>B.</given-names></name></person-group><article-title>A small bacterial RNA regulates a putative ABC transporter</article-title><source>J. Biol. Chem.</source><year>2005</year><volume>280</volume><fpage>7901</fpage><lpage>7908</lpage><pub-id pub-id-type="pmid">15618228</pub-id></citation></ref>
<ref id="b72-genes-02-00925"><label>72.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Georg</surname><given-names>J.</given-names></name><name><surname>Hess</surname><given-names>W.R.</given-names></name></person-group><article-title>Regulatory RNAs in cyanobacteria: Developmental decisions, stress responses and a plethora of chromosomally encoded cis-antisense RNAs</article-title><source>Biol. Chem.</source><year>2011</year><volume>392</volume><fpage>291</fpage><lpage>297</lpage><pub-id pub-id-type="doi">10.1515/bc.2011.046</pub-id><pub-id pub-id-type="pmid">21294678</pub-id></citation></ref>
<ref id="b73-genes-02-00925"><label>73.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wilms</surname><given-names>I.</given-names></name><name><surname>Voss</surname><given-names>B.</given-names></name><name><surname>Hess</surname><given-names>W.R.</given-names></name><name><surname>Leichert</surname><given-names>L.I.</given-names></name><name><surname>Narberhaus</surname><given-names>F.</given-names></name></person-group><article-title>Small RNA-mediated control of the <italic>Agrobacterium tumefaciens</italic> GABA binding protein</article-title><source>Mol. Microbiol.</source><year>2011</year><volume>80</volume><fpage>492</fpage><lpage>506</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2958.2011.07589.x</pub-id><pub-id pub-id-type="pmid">21320185</pub-id></citation></ref>
<ref id="b74-genes-02-00925"><label>74.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Benito</surname><given-names>Y.</given-names></name><name><surname>Kolb</surname><given-names>F.A.</given-names></name><name><surname>Romby</surname><given-names>P.</given-names></name><name><surname>Lina</surname><given-names>G.</given-names></name><name><surname>Etienne</surname><given-names>J.</given-names></name><name><surname>Vandenesch</surname><given-names>F.</given-names></name></person-group><article-title>Probing the structure of RNAIII, the <italic>Staphylococcus aureus</italic> agr regulatory RNA, and identification of the RNA domain involved in repression of protein A expression</article-title><source>RNA</source><year>2000</year><volume>6</volume><fpage>668</fpage><lpage>679</lpage><pub-id pub-id-type="doi">10.1017/S1355838200992550</pub-id><pub-id pub-id-type="pmid">10836788</pub-id></citation></ref>
<ref id="b75-genes-02-00925"><label>75.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huntzinger</surname><given-names>E.</given-names></name><name><surname>Boisset</surname><given-names>S.</given-names></name><name><surname>Saveanu</surname><given-names>C.</given-names></name><name><surname>Benito</surname><given-names>Y.</given-names></name><name><surname>Geissmann</surname><given-names>T.</given-names></name><name><surname>Namane</surname><given-names>A.</given-names></name><name><surname>Lina</surname><given-names>G.</given-names></name><name><surname>Etienne</surname><given-names>J.</given-names></name><name><surname>Ehresmann</surname><given-names>B.</given-names></name><name><surname>Ehresmann</surname><given-names>C.</given-names></name><etal/></person-group><article-title><italic>Staphylococcus aureus</italic> RNAIII and the endoribonuclease III coordinately regulate spa gene expression</article-title><source>EMBO J.</source><year>2005</year><volume>24</volume><fpage>824</fpage><lpage>835</lpage><pub-id pub-id-type="doi">10.1038/sj.emboj.7600572</pub-id><pub-id pub-id-type="pmid">15678100</pub-id></citation></ref>
<ref id="b76-genes-02-00925"><label>76.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Boisset</surname><given-names>S.</given-names></name><name><surname>Geissmann</surname><given-names>T.</given-names></name><name><surname>Huntzinger</surname><given-names>E.</given-names></name><name><surname>Fechter</surname><given-names>P.</given-names></name><name><surname>Bendridi</surname><given-names>N.</given-names></name><name><surname>Possedko</surname><given-names>M.</given-names></name><name><surname>Chevalier</surname><given-names>C.</given-names></name><name><surname>Helfer</surname><given-names>A.C.</given-names></name><name><surname>Benito</surname><given-names>Y.</given-names></name><name><surname>Jacquier</surname><given-names>A.</given-names></name><etal/></person-group><article-title><italic>Staphylococcus aureus</italic> RNAIII coordinately represses the synthesis of virulence factors and the transcription regulator Rot by an antisense mechanism</article-title><source>Genes Dev.</source><year>2007</year><volume>21</volume><fpage>1353</fpage><lpage>1366</lpage><pub-id pub-id-type="doi">10.1101/gad.423507</pub-id><pub-id pub-id-type="pmid">17545468</pub-id></citation></ref>
<ref id="b77-genes-02-00925"><label>77.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johansen</surname><given-names>J.</given-names></name><name><surname>Eriksen</surname><given-names>M.</given-names></name><name><surname>Kallipolitis</surname><given-names>B.</given-names></name><name><surname>Valentin-Hansen</surname><given-names>P.</given-names></name></person-group><article-title>Down-regulation of outer membrane proteins by noncoding RNAs: Unraveling the cAMP-CRP- and sigmaE-dependent CyaR-ompX regulatory case</article-title><source>J. Mol. Biol.</source><year>2008</year><volume>383</volume><fpage>1</fpage><lpage>9</lpage><pub-id pub-id-type="doi">10.1016/j.jmb.2008.06.058</pub-id><pub-id pub-id-type="pmid">18619465</pub-id></citation></ref>
<ref id="b78-genes-02-00925"><label>78.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Papenfort</surname><given-names>K.</given-names></name><name><surname>Pfeiffer</surname><given-names>V.</given-names></name><name><surname>Lucchini</surname><given-names>S.</given-names></name><name><surname>Sonawane</surname><given-names>A.</given-names></name><name><surname>Hinton</surname><given-names>J.C.D.</given-names></name><name><surname>Vogel</surname><given-names>J.</given-names></name></person-group><article-title>Systematic deletion of <italic>Salmonella</italic> small RNA genes identifies CyaR, a conserved CRP-dependent riboregulator of OmpX synthesis</article-title><source>Mol. Microbiol.</source><year>2008</year><volume>68</volume><fpage>890</fpage><lpage>906</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2958.2008.06189.x</pub-id><pub-id pub-id-type="pmid">18399940</pub-id></citation></ref>
<ref id="b79-genes-02-00925"><label>79.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>De Lay</surname><given-names>N.</given-names></name><name><surname>Gottesman</surname><given-names>S.</given-names></name></person-group><article-title>The Crp-activated small noncoding regulatory RNA CyaR (RyeE) links nutritional status to group behavior</article-title><source>J. Bacteriol.</source><year>2009</year><volume>191</volume><fpage>461</fpage><lpage>476</lpage><pub-id pub-id-type="doi">10.1128/JB.01157-08</pub-id><pub-id pub-id-type="pmid">18978044</pub-id></citation></ref>
<ref id="b80-genes-02-00925"><label>80.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Berghoff</surname><given-names>B.A.</given-names></name><name><surname>Glaeser</surname><given-names>J.</given-names></name><name><surname>Sharma</surname><given-names>C.M.</given-names></name><name><surname>Vogel</surname><given-names>J.</given-names></name><name><surname>Klug</surname><given-names>G.</given-names></name></person-group><article-title>Photooxidative stress-induced and abundant small RNAs in <italic>Rhodobacter sphaeroides</italic></article-title><source>Mol. Microbiol.</source><year>2009</year><volume>74</volume><fpage>1497</fpage><lpage>1512</lpage><pub-id pub-id-type="doi">10.1111/j.1365-2958.2009.06949.x</pub-id><pub-id pub-id-type="pmid">19906181</pub-id></citation></ref>
<ref id="b81-genes-02-00925"><label>81.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meyer</surname><given-names>F.</given-names></name><name><surname>Goesmann</surname><given-names>A.</given-names></name><name><surname>McHardy</surname><given-names>A.C.</given-names></name><name><surname>Bartels</surname><given-names>D.</given-names></name><name><surname>Bekel</surname><given-names>T.</given-names></name><name><surname>Clausen</surname><given-names>J.</given-names></name><name><surname>Kalinowski</surname><given-names>J.</given-names></name><name><surname>Linke</surname><given-names>B.</given-names></name><name><surname>Rupp</surname><given-names>O.</given-names></name><name><surname>Giegerich</surname><given-names>R.</given-names></name><etal/></person-group><article-title>GenDB—An open source genome annotation system for prokaryote genomes</article-title><source>Nucleic Acids Res.</source><year>2003</year><volume>31</volume><fpage>2187</fpage><lpage>2195</lpage><pub-id pub-id-type="doi">10.1093/nar/gkg312</pub-id><pub-id pub-id-type="pmid">12682369</pub-id></citation></ref>
<ref id="b82-genes-02-00925"><label>82.</label><citation citation-type="web"><article-title>NCBI genomes FTP site</article-title><comment>Available online: <ext-link xlink:href="ftp://ftp.ncbi.nlm.nih.gov/genbank/genomes/Bacteria/" ext-link-type="ftp">ftp://ftp.ncbi.nlm.nih.gov/genbank/genomes/Bacteria/</ext-link> (accessed on 25 February 2011)</comment></citation></ref>
<ref id="b83-genes-02-00925"><label>83.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Menzel</surname><given-names>P.</given-names></name><name><surname>Gorodkin</surname><given-names>J.</given-names></name><name><surname>Stadler</surname><given-names>P.F.</given-names></name></person-group><article-title>The tedious task of finding homologous noncoding RNA genes</article-title><source>RNA</source><year>2009</year><volume>15</volume><fpage>2075</fpage><lpage>2082</lpage><pub-id pub-id-type="doi">10.1261/rna.1556009</pub-id><pub-id pub-id-type="pmid">19861422</pub-id></citation></ref>
<ref id="b84-genes-02-00925"><label>84.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Altschul</surname><given-names>S.F.</given-names></name><name><surname>Madden</surname><given-names>T.L.</given-names></name><name><surname>Schäffer</surname><given-names>A.A.</given-names></name><name><surname>Zhang</surname><given-names>J.</given-names></name><name><surname>Zhang</surname><given-names>Z.</given-names></name><name><surname>Miller</surname><given-names>W.</given-names></name><name><surname>Lipman</surname><given-names>D.J.</given-names></name></person-group><article-title>Gapped BLAST and PSI-BLAST: A new generation of protein database search programs</article-title><source>Nucleic Acids Res.</source><year>1997</year><volume>25</volume><fpage>3389</fpage><lpage>3402</lpage><pub-id pub-id-type="doi">10.1093/nar/25.17.3389</pub-id><pub-id pub-id-type="pmid">9254694</pub-id></citation></ref>
<ref id="b85-genes-02-00925"><label>85.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hertel</surname><given-names>J.</given-names></name><name><surname>de Jong</surname><given-names>D.</given-names></name><name><surname>Marz</surname><given-names>M.</given-names></name><name><surname>Rose</surname><given-names>D.</given-names></name><name><surname>Tafer</surname><given-names>H.</given-names></name><name><surname>Tanzer</surname><given-names>A.</given-names></name><name><surname>Schierwater</surname><given-names>B.</given-names></name><name><surname>Stadler</surname><given-names>P.F.</given-names></name></person-group><article-title>Non-coding RNA annotation of the genome of <italic>Trichoplax adhaerens</italic></article-title><source>Nucleic Acids Res.</source><year>2009</year><volume>37</volume><fpage>1602</fpage><lpage>1615</lpage><pub-id pub-id-type="doi">10.1093/nar/gkn1084</pub-id><pub-id pub-id-type="pmid">19151082</pub-id></citation></ref>
<ref id="b86-genes-02-00925"><label>86.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Krause</surname><given-names>A.</given-names></name><name><surname>Stoye</surname><given-names>J.</given-names></name><name><surname>Vingron</surname><given-names>M.</given-names></name></person-group><article-title>Large scale hierarchical clustering of protein sequences</article-title><source>BMC Bioinform.</source><year>2005</year><volume>6</volume><fpage>15:1</fpage><lpage>15:12</lpage></citation></ref>
<ref id="b87-genes-02-00925"><label>87.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Will</surname><given-names>S.</given-names></name><name><surname>Reiche</surname><given-names>K.</given-names></name><name><surname>Hofacker</surname><given-names>I.L.</given-names></name><name><surname>Stadler</surname><given-names>P.F.</given-names></name><name><surname>Backofen</surname><given-names>R.</given-names></name></person-group><article-title>Inferring noncoding RNA families and classes by means of genome-scale structure-based clustering</article-title><source>PLoS Comput. Biol.</source><year>2007</year><volume>3</volume><fpage>e65:1</fpage><lpage>e65:15</lpage></citation></ref>
<ref id="b88-genes-02-00925"><label>88.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hofacker</surname><given-names>I.L.</given-names></name><name><surname>Fekete</surname><given-names>M.</given-names></name><name><surname>Stadler</surname><given-names>P.F.</given-names></name></person-group><article-title>Secondary structure prediction for aligned RNA sequences</article-title><source>J. Mol. Biol.</source><year>2002</year><volume>319</volume><fpage>1059</fpage><lpage>1066</lpage><pub-id pub-id-type="doi">10.1016/S0022-2836(02)00308-X</pub-id><pub-id pub-id-type="pmid">12079347</pub-id></citation></ref>
<ref id="b89-genes-02-00925"><label>89.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bernhart</surname><given-names>S.H.</given-names></name><name><surname>Hofacker</surname><given-names>I.L.</given-names></name><name><surname>Will</surname><given-names>S.</given-names></name><name><surname>Gruber</surname><given-names>A.R.</given-names></name><name><surname>Stadler</surname><given-names>P.F.</given-names></name></person-group><article-title>RNAalifold: Improved consensus structure prediction for RNA alignments</article-title><source>BMC Bioinform.</source><year>2008</year><volume>9</volume><fpage>474:1</fpage><lpage>474:13</lpage></citation></ref>
<ref id="b90-genes-02-00925"><label>90.</label><citation citation-type="web"><person-group person-group-type="author"><collab>The Bielefeld Bioinformatics Server</collab></person-group><comment>Available online: <ext-link xlink:href="http://bibiserv.techfak.uni-bielefeld.de/" ext-link-type="uri">http://bibiserv.techfak.uni-bielefeld.de/</ext-link> (accessed on 10 May 2011)</comment></citation></ref>
<ref id="b91-genes-02-00925"><label>91.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hofacker</surname><given-names>I.L.</given-names></name><name><surname>Fontana</surname><given-names>W.</given-names></name><name><surname>Stadler</surname><given-names>P.F.</given-names></name><name><surname>Bonhoeffer</surname><given-names>L.S.</given-names></name><name><surname>Tacker</surname><given-names>M.</given-names></name><name><surname>Schuster</surname><given-names>P.</given-names></name></person-group><article-title>Fast folding and comparison of RNA secondary structures</article-title><source>Monatsh. Chem.</source><year>1994</year><volume>125</volume><fpage>167</fpage><lpage>188</lpage><pub-id pub-id-type="doi">10.1007/BF00818163</pub-id></citation></ref>
<ref id="b92-genes-02-00925"><label>92.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Giegerich</surname><given-names>R.</given-names></name><name><surname>Voss</surname><given-names>B.</given-names></name><name><surname>Rehmsmeier</surname><given-names>M.</given-names></name></person-group><article-title>Abstract shapes of RNA</article-title><source>Nucleic Acids Res.</source><year>2004</year><volume>32</volume><fpage>4843</fpage><lpage>4851</lpage><pub-id pub-id-type="doi">10.1093/nar/gkh779</pub-id><pub-id pub-id-type="pmid">15371549</pub-id></citation></ref>
<ref id="b93-genes-02-00925"><label>93.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steffen</surname><given-names>P.</given-names></name><name><surname>Voss</surname><given-names>B.</given-names></name><name><surname>Rehmsmeier</surname><given-names>M.</given-names></name><name><surname>Reeder</surname><given-names>J.</given-names></name><name><surname>Giegerich</surname><given-names>R.</given-names></name></person-group><article-title>RNAshapes: An integrated RNA analysis package based on abstract shapes</article-title><source>Bioinformatics</source><year>2006</year><volume>22</volume><fpage>500</fpage><lpage>503</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btk010</pub-id><pub-id pub-id-type="pmid">16357029</pub-id></citation></ref>
<ref id="b94-genes-02-00925"><label>94.</label><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Huang</surname><given-names>J.</given-names></name><name><surname>Voss</surname><given-names>B.</given-names></name></person-group><article-title>RNAHeliCes—Folding Space Analysis Based on Position Aware Structure Abstraction</article-title><conf-name>German Conference on Bioinformatics</conf-name><conf-loc>Weihenstephan, Germany</conf-loc><conf-date>7–9 September 2011</conf-date></citation></ref>
<ref id="b95-genes-02-00925"><label>95.</label><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Reeder</surname><given-names>J.</given-names></name><name><surname>Giegerich</surname><given-names>R.</given-names></name></person-group><article-title>Consensus shapes: An alternative to the Sankoff algorithm for RNA consensus structure prediction</article-title><source>Bioinformatics</source><year>2005</year><volume>21</volume><fpage>3516</fpage><lpage>3523</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/bti577</pub-id><pub-id pub-id-type="pmid">16020472</pub-id></citation></ref></ref-list></back></article>
