Looked at Life from Both Sides Now

As the molecular top–down causality emerging through comparative genomics is combined with the bottom–up dynamic chemical networks of biochemistry, the molecular symbiotic relationships driving growth of the tree of life become strikingly apparent. These symbioses can be mutualistic or parasitic across many levels, but most foundational is the complex and intricate mutualism of nucleic acids and proteins known as the central dogma of biological information flow. This unification of digital and analog molecular information within a common chemical network enables processing of the vast amounts of information necessary for cellular life. Here we consider the molecular information pathways of these dynamic biopolymer networks from the perspective of their evolution and use that perspective to inform and constrain pathways for the construction of mutualistic polymers.

Figure 1. The architectural core of prions and amyloids. Fibers are defined both by the length of the H-bonded β-strands and β-sheet stacking, or lamination. The distance between two H-bonded peptides in a β-sheet is 4.7 Å and the approximate distance between laminated β-sheets is 10 Å. The amino (N) terminus is colored blue and the carboxy (C) terminus is colored red. The vertical grey arrow indicates the H-bonding direction and the typical direction of the amyloid fiber long axis.

Biopolymer Diversity
Discrete geometries of hydrogen-bonding donor and acceptor pairs direct genomic information, achieving "digital-like" coding fidelity. In this case, functional diversity is accumulated through mutations within the nucleic acid polymer sequence that is expressed in phenotypic variations mapped to the tree of life. The iconic right-handed B-DNA double helix [26,27] can adopt other antiparallel double helix conformations, including A-DNA, Z-DNA and A-RNA, and their biological roles are still being explored [28][29][30]. Bulges, hairpins, and cruciform conformations further contribute to the structural diversity and extend the possible functions of this scaffold [31,32]. The ability of nucleic acids to fold into polymorphic non-canonical structures in response to metal coordination (G-quadruplexes) and pH (i-motif) [33][34][35], and the growing evidence for these structures existing in vivo [36,37], have continued to expand the functional possibilities [38,39]. The extension to new nucleic acid catalysts [5][6][7] through in vitro directed evolution is reviewed in this issue of Life (Muller) and elsewhere [40][41][42][43]. Clearly the nucleic acid scaffold has remarkable functional potential for both information storage and processing.
Proteins, with an even greater diversity in side chain functionality, are acutely sensitive to amino acid substitutions and are environmentally responsive in their folding dynamics. The polyamide backbone can access α-helices, β-sheets, and turns directed by non-covalent interactions ranging from van der Waals and hydrophobic effects to aromatic stacking, π-cation interactions, hydrogen-bonding and electrostatic interactions. All these factors influence secondary and tertiary peptide conformations in context-dependent ways [44,45] and provide remarkable diversity that still defies 3° and 4° structural predictions [44,46].
The growing recognition of the prevalence of protein misfolding diseases and prion infections has focused greater attention on defining higher order 4° structures and protein phases [47][48][49]. A central feature of all known prion and amyloid fibers is the range of accessible paracrystalline cross-β architectures [50,51]. Amyloid fiber cross-section is defined both by the length of the H-bonded β-strand and side-chain interactions that stabilize sheet stacking, or lamination (Figure 1). These fiber ends template the addition of individual peptides, reducing the diverse conformational space any peptide can sample to a single state. Unlike information storage in nucleic acid duplexes, the template cross-β protein strand transmits conformational information to the incoming strand, which then serves as the template for further propagation to the next incoming strand. In prions, specific morphological forms of the peptide cross-β architecture are selected for and propagated from clonal ensembles of diverse amyloid templates [52][53][54][55].
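As a rough worked example of the cross-β geometry, the two spacings quoted in the Figure 1 caption (4.7 Å between H-bonded strands, approximately 10 Å between laminated sheets) can be turned into approximate assembly dimensions. The function name and the idealized assumption of perfectly regular spacing are ours, introduced only for illustration; real fibers twist and vary.

```python
# Spacings quoted in the Figure 1 caption, in angstroms.
H_BOND_SPACING = 4.7       # between adjacent H-bonded strands in a sheet
LAMINATION_SPACING = 10.0  # approximate, between stacked (laminated) sheets


def cross_beta_dimensions(n_strands, n_sheets):
    """Return (length along the H-bonding axis, lamination thickness) in angstroms.

    Hypothetical helper: distances are measured from the first to the last
    strand (or sheet), so n items span n - 1 spacings.
    """
    if n_strands < 1 or n_sheets < 1:
        raise ValueError("need at least one strand and one sheet")
    length = (n_strands - 1) * H_BOND_SPACING
    thickness = (n_sheets - 1) * LAMINATION_SPACING
    return length, thickness


if __name__ == "__main__":
    length, thickness = cross_beta_dimensions(n_strands=1000, n_sheets=6)
    print(f"length ~ {length / 10:.1f} nm, lamination ~ {thickness / 10:.1f} nm")
```

Under this idealization the fiber long axis grows by one spacing per templated strand, which is the sense in which each added peptide extends the information-bearing template.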
The diversity of conformational forms accessible to these peptide templates may be most easily explored in studies with simple peptides. The nucleating core of the Aβ peptide associated with Alzheimer's disease, ¹⁷LVFF²¹A, can access a range of morphological phases, each responsive to pH, media dielectric, surfaces, solvent composition, and various salts [56][57][58][59]. Despite this diversity, specific conditions have been found that allow for the growth of phases sufficiently homogeneous for structural characterization. At neutral pH, anti-parallel β-sheet fibrils predominate with Aβ(16–22), Ac-¹⁶KLVFFA²²E-NH2 (Figure 2). Cross-strand pairing between the positively charged lysine and negatively charged glutamate residues of neighboring H-bonded strands defines an in-register arrangement [59,60]. Protonation of the glutamate at low pH favors a shift of the strands to out-of-register, creating more complementary β-sheet faces that allow the number of sheets to grow (laminate) into ribbons and nanotubes [59,61,62,63]. The Aβ(16–22) E22L congener, Ac-KLVFFAL-NH2, removes this ionic pairing constraint completely and assembles independent of pH as nanotubes with the same out-of-register β-strands [57,59,63,64].

Figure 2. Aβ(16–22) peptide assembles into distinct morphologies depending on sequence and conditions. In the cartoons above, blue represents the positively charged lysine (K), red the negatively charged glutamate (E), and orange the uncharged glutamine (Q). Two of the four faces of the fiber in the anti-parallel β-sheets present lysine and glutamate residues on the surface (purple), whereas the fibers with parallel β-sheets have a lysine surface and a glutamine surface. (Unpublished EM images from members of Lynn Lab.)
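The role of strand registry in the text can be illustrated with a deliberately simplified sketch. It treats two antiparallel copies of a peptide as facing each other residue by residue and counts lysine/glutamate cross-strand pairs for a given registry shift. The function names, the one-to-one facing model, and the choice of a one-residue shift are assumptions made for illustration; the sketch ignores real backbone H-bond geometry, termini, and protonation state.

```python
def antiparallel_pairs(seq, shift=0):
    """Pair residues of two antiparallel copies of `seq`.

    The partner strand runs in the opposite direction; `shift` slides it by
    that many residue positions (0 = in-register). Returns (residue, partner)
    tuples for positions where both strands have a residue.
    """
    n = len(seq)
    pairs = []
    for i in range(n):
        j = n - 1 - i + shift  # index on the partner strand facing position i
        if 0 <= j < n:
            pairs.append((seq[i], seq[j]))
    return pairs


def count_salt_bridges(seq, shift=0):
    """Count K/E cross-strand pairs (each contact is seen from both strands)."""
    return sum(1 for a, b in antiparallel_pairs(seq, shift) if {a, b} == {"K", "E"})


if __name__ == "__main__":
    for name, seq in [("KLVFFAE", "KLVFFAE"), ("E22L congener", "KLVFFAL")]:
        for shift in (0, 1):
            print(name, "shift", shift, "K/E pairs:", count_salt_bridges(seq, shift))
```

In this toy picture only the in-register arrangement of KLVFFAE places K opposite E, while KLVFFAL has no such pairing at any shift, which mirrors the statement that the E22L congener removes the ionic pairing constraint.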
Single molecule experiments have been used to map the phase transitions [65,66] and have begun to reveal intermediate nucleation events during conformational progression [67][68][69]. With the simple change of one O for an NH, Ac-KLVFFAQ-NH2 initially assembles as antiparallel β-strands, which arise as a kinetic intermediate controlled by charge repulsion in the initial particle phase. A secondary conformational mutation stabilized by glutamine side chain cross-strand pairing directs the nucleation and propagation of the thermodynamically more stable parallel β-strand registry [68]. This clear demonstration of kinetic intermediates expands the diversity of accessible informational forms that may propagate under different environmental conditions, and highlights the progressive nature of peptide nucleation and propagation as a mechanism for the selection of functional information.
The diversity of these peptide assemblies is derived not only from primary sequence, but also from the varied arrangements accessible to the ordered phases. In the simplest forms, even single aromatic amino acids [70] and dipeptides, containing aromatic [71][72][73][74] and aliphatic residues [75,76], can assemble as distinct phases. Aliphatic dipeptides (AI, IA, VV, VI, AV, VA, and IV) assemble into hexagonal prisms to form microporous crystalline materials, zeolites [75,76], and various other complex assemblies [77,78]. Diphenylalanine nanotubes (FF) can react with 2-iminothiolane, in which the subsequent thiolation of the N-terminal primary amine induces a morphological shift from nanotubes to spherical closed cages [74]. The opportunity to exploit even the simplest of peptide-assembly surfaces as templates for further chemistry with subsequent feedback control of the assembly certainly expands the potential functional diversity. As the structural understanding of these assemblies continues to grow and new surfaces are designed as templates for post-assembly modification, the functional diversity of these forms will continue to extend their informational potential.

Functional Assemblies
Genomic information present in the one-dimensional nucleic acid templates, as managed by proteins, produces the functional biopolymers of living dynamic cellular networks. Quite possibly the simplest functional form of this molecular information processing is found in viroids. These single stranded RNAs of only a few hundred nucleotides direct pathogenesis in many plant diseases [79]. Viroids are non-protein coding circular structures that assemble as long rod-like forms with a central conserved region and loops containing complex hydrogen-bonding patterns [80][81][82]. One class of viroids contains a hammerhead ribozyme for self-cleavage during replication cycles [80,83,84], allowing many of their functions to be self-contained and hailed as vestiges of an RNA World [85]. However, viroids are informational templates and depend on the host's cellular network for functional replication.
These activities may represent only a small fraction of the functions that provide the basis for the various amyloid diseases, and these capabilities have led to speculation about an amyloid world built on the potential of these surfaces to store and transfer information [109][110][111]. As impressive as these activities are, neither RNA nor amyloid alone achieves the functional diversity necessary for even the simplest cellular networks. The mutualism so evident in every step of information transmission in a cellular network suggests that life required both sides.

From Both Sides Now
Proteins and nucleic acids both store and process chemical information in cellular networks, but they do so interdependently; proteins are made from nucleic acid templates, and proteins read the templates. To the extent that there is a code or set of rules for such a mutualism, it may be revealed in the specific recognition of nucleic acids by RNA- and DNA-binding proteins and their elaborate structure-function relationships [112][113][114]. As mentioned above, the pinnacle may be the nucleic acid/peptide (NA/P) associations in the ribosome [9,115], but small nuclear ribonucleoprotein (snRNP) complexes and nucleases, including Ribonuclease P (RNaseP), provide simpler examples [5]. The woven intricacies of such co-assemblies are increasingly being defined [116,117] and have been reconstituted [9,118]. As rules for co-assembly emerge, co-evolutionary strategies become possible. Figure 3 outlines minimal functional capabilities that might be achieved with the co-assemblies. The nucleation of polymers (red and blue) might propagate as a template (purple) able to catalyze the independent production of more polymers (red and blue) in Figure 3A. Mutations in such a feedback system could allow for a minimal system capable of chemical evolution. This functional capability is of course dependent on the ability of these co-assemblies to transition to some unique functional form (mixture of squares and triangles, purple), as outlined in Figure 3B. While extant biology is a sophisticated network of efficiently functioning molecular partners, the many varied functions necessary for the existence of life rely on the emergence of such co-assembling information networks capable of sustained growth in molecular order.
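The feedback loop described for Figure 3A can be sketched as a toy kinetic model. Two polymers (R and B) form from a shared monomer pool, co-assemble into a template T, and T in turn accelerates polymer formation. Everything quantitative here is an assumption made for illustration: the rate law, the closed mass-conserving pool, the simple Euler integration, and all rate constants (`k0`, `kcat`, `ka`, `d`) are hypothetical and are not taken from any measured system.

```python
def simulate(kcat, k0=0.01, ka=1.0, d=0.1, dt=0.01, steps=20000):
    """Euler-integrate a toy closed system with template feedback.

    M: monomer pool, R and B: the two polymers, T: co-assembled template.
    Concentrations are in monomer-equivalent units (T counts as two units),
    so M + R + B + 2*T is conserved. All rate constants are hypothetical.
    """
    M, R, B, T = 1.0, 0.0, 0.0, 0.0
    for _ in range(steps):
        make = (k0 + kcat * T) * M  # formation rate of each polymer (background + templated)
        pair = ka * R * B           # co-assembly of R and B into template T
        dM = -2.0 * make + d * (R + B) + 2.0 * d * T  # decay returns material to the pool
        dR = make - pair - d * R
        dB = make - pair - d * B
        dT = pair - d * T
        M += dt * dM
        R += dt * dR
        B += dt * dB
        T += dt * dT
    return {"M": M, "R": R, "B": B, "T": T}


if __name__ == "__main__":
    without_feedback = simulate(kcat=0.0)
    with_feedback = simulate(kcat=1.0)
    print("template level without feedback:", round(without_feedback["T"], 4))
    print("template level with feedback:   ", round(with_feedback["T"], 4))
```

Comparing `kcat=0.0` with `kcat=1.0` shows how, in this sketch, templated production raises the template level until the finite monomer pool limits further growth. Heritable variation (mutation) is not modeled.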
Data now exists for prion infectivity being altered by oligonucleotides [119][120][121][122][123], but little structural information is available. Similarly, in vitro guanine (dG)16 and cytosine (dC)16 hexadecamers, as well as their duplexes, associate with amphiphilic self-assembling peptides. The binding is pH dependent, occurs on a much faster timescale than for assembly of the neat peptide, and indeed gives rise to novel nucleic acid/peptide co-assemblies [124]. Hybridization of oligonucleotides containing sticky-ends can be catalyzed in the presence of self-assembling peptides, mimicking at least one integral step of replication [125]. And a five-nucleotide ribozyme produces multiple translational products, including the dipeptide FF [126]. In principle [127,128], when FF reaches the critical concentration for peptide self-assembly, the resulting peptide nanotube could selectively bind the ribozyme to create a dynamic system under feedback control.
Progress in this growing understanding of peptide and nucleic acid co-assembly may well depend on careful selection of systems, and the rich diversity of functional co-assemblies with existing biopolymers provides an exciting opportunity to define the basic mutualistic codes. In the simple assemblies outlined in Figure 3, the thermodynamic process of assembly would be physically coupled to polymerization of additional polymers from monomer building blocks, with the thermodynamic/kinetic tension manifested as feedback control. Elucidating the mechanisms of association, and the structural and functional diversity of the resulting complexes, will be necessary for a more general understanding of the chemical thresholds for early evolution of molecular mutualism. Simpler synthetic [129,130] and altered biopolymer [11,131,132] networks could then be extended to the dynamic processes of minimal native biopolymer co-assemblies [133][134][135][136].

Conclusions: Towards Molecular Mutualisms
The cooperative functioning of nucleic acids and peptides was recognized early with the designation of a central dogma decades ago, and it may well be that our attempts to simplify the system into metabolism-first, RNA-first, or amyloid-first have limited a more extensive exploration of mutualistic networks. Single biopolymer networks have contributed significantly to systems chemistry, but have sidestepped the interdependence of metabolism and replication, analog and digital information processing, and the mutualism, so apparent in living systems, that is essential for the emergence of new functions [137]. The recognition of the importance of diverse, structurally complex, non-covalent assemblies in early evolution is certainly highlighted by their prevalence in biology [138][139][140][141], and routes to creating an ecology of biomolecules with sufficient functional dynamics to evolve chemically are now emerging. We may have reached the point where the creation of a mutualistic chemical ecology [142,143] can be used to inform the progressive growth of molecular information on Earth and understand the limits on chemical evolution broadly in our Universe [144][145][146]. As technology and methodologies to analyze complex networks improve, the next scientific discoveries may come from understanding harsh and extreme environments or even new "worlds" with fundamentally different systems of molecular networks [11,147,148]. Certainly we should now be able to define the limits on building to the complex networks we might call living.