Viroids-First—A Model for Life on Earth, Mars and Exoplanets

The search for extraterrestrial life, recently fueled by the discovery of exoplanets, requires defined biosignatures. Current biomarkers include those of extremophilic organisms, typically archaea. Yet these cellular organisms are highly complex, which makes it unlikely that similar life forms evolved on other planets. Earlier forms of life on Earth may serve as better models for extraterrestrial life. On modern Earth, the simplest and most abundant biological entities are viroids and viruses that exert many properties of life, such as the abilities to replicate and undergo Darwinian evolution. Viroids have virus-like features, and are related to ribozymes, consisting solely of non-coding RNA, and may serve as more universal models for early life than do cellular life forms. Among the various proposed concepts, such as “proteins-first” or “metabolism-first”, we think that “viruses-first” can be specified to “viroids-first” as the most likely scenario for the emergence of life on Earth, and possibly elsewhere. With this article we intend to inspire the integration of virus research and the biosignatures of viroids and viruses into the search for extraterrestrial life.


Potential for Life in the Universe
Exoplanets orbit central stars other than our Sun. The number of detected exoplanets is increasing rapidly; 3800 had been discovered by 2018, 2300 of them by the Kepler telescope, which recently stopped transmitting signals. The first exoplanet orbiting a Sun-like star was detected by Mayor and Queloz in 1995 [1]. Exoplanets are identified, for example, by transit photometry, which detects the regular, minute dimming of a star by the passing planet. There are an estimated 10^10 exoplanets in our galaxy, and 10^25 in the entire Universe [2]. This enormous number makes it likely that some form of life exists somewhere else. Those exoplanets located in habitable zones are candidates for hosting life, if conditions are "neither too hot, nor too cold, but just right", like the third bowl of porridge in the fairy tale "Goldilocks and the Three Bears" [3]. The habitable zone is the range of distances from the central star at which liquid water can exist. Large amounts of liquid water were recently identified on Mars under the southern polar ice cap, and water-ice was also detected on the Moon [4]. Furthermore, a source of energy is required, which might be the central star itself, but also geothermal energy or gravitational forces that can generate energy. The large number of exoplanets suggests a high probability for extraterrestrial life.
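As a rough illustration of why the dimming measured by transit photometry is "minute", the fractional drop in brightness is approximately the ratio of the planet's disk area to the star's disk area, (R_planet / R_star)^2. The sketch below is a simplification that assumes a uniformly bright stellar disk (no limb darkening) and uses rounded radii for Earth, Jupiter and the Sun; the function name and constants are illustrative and not taken from the cited literature.

```python
# Approximate radii in kilometres (rounded reference values, assumed here).
R_SUN_KM = 696_000.0
R_EARTH_KM = 6_371.0
R_JUPITER_KM = 69_911.0


def transit_depth(planet_radius_km: float, star_radius_km: float) -> float:
    """Fractional dimming of a star while a planet crosses its disk.

    Assumes a uniformly bright stellar disk and an opaque planet that
    fully overlaps it: depth = (R_planet / R_star) ** 2.
    """
    if planet_radius_km <= 0 or star_radius_km <= 0:
        raise ValueError("radii must be positive")
    return (planet_radius_km / star_radius_km) ** 2


if __name__ == "__main__":
    for name, radius in [("Earth-sized", R_EARTH_KM), ("Jupiter-sized", R_JUPITER_KM)]:
        depth = transit_depth(radius, R_SUN_KM)
        print(f"{name} planet, Sun-like star: {depth * 100:.4f}% dimming")
```

Under these assumptions an Earth-sized planet dims a Sun-like star by less than one part in ten thousand, whereas a Jupiter-sized planet dims it by about one percent.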
The detection of life on distant exoplanets relies on indirect, mainly spectroscopic methods. No samples can be collected and sent back to Earth with current technology. Robots may collect samples on planets within our galaxy, perform experiments and transmit data. Sequencing instruments such as Oxford Nanopore may be used to detect and sequence nucleic acids. Current theories on the origin of life on Earth are speculative with numerous assumptions, but may provide models for the possible evolution of life elsewhere. The likely requirements for any evolution of life on Earth are liquid water, carbon-based biomolecules, one or more sources of energy, and relatively stable temperatures. Energy could originate from chemicals instead of sunlight, which does not reach deeper than ~200 m below sea level for photosynthesis to occur. An atmosphere protecting against radioactive and UV radiation may have supported the development of life.
A "warm little pond", as proposed by Charles Darwin, or a niche or pore in which compounds are concentrated for chemical reactions to occur, may have been at the origin of prebiotic molecules. The primordial soup likely contained so many different molecules that the astronomical number of possible sequence arrangements raises the question of how information could arise. The random assembly of a genome of ~1 million base pairs (a small bacterial genome) with a set of four nucleotides would realistically require about 16 million attempts if each "correct" nucleotide is selected for and fixed, and the next nucleotide is randomized until it also gets fixed [16].
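The contrast between unselected assembly and stepwise fixation can be made concrete with a small calculation. The sketch below is a deliberately simplified model, not the model of [16]: it assumes every draw is uniform over four nucleotides, so each position needs four draws on average. The absolute total therefore differs from the estimate quoted above, which presumably rests on further assumptions in [16]; the point illustrated is only that the number of attempts grows linearly with genome length when positions are fixed one at a time, rather than exponentially. All function names are illustrative.

```python
import math
import random


def log10_unselected_sequences(length: int, alphabet: int = 4) -> float:
    """log10 of the number of distinct sequences of the given length."""
    return length * math.log10(alphabet)


def expected_draws_with_fixation(length: int, alphabet: int = 4) -> int:
    """Expected uniform draws if each position is retried until correct.

    Each position is a geometric trial with success probability
    1/alphabet, so it needs `alphabet` draws on average.
    """
    return length * alphabet


def simulate_fixation(target: str, rng: random.Random, alphabet: str = "ACGU") -> int:
    """Count the draws needed to build `target` one fixed position at a time."""
    draws = 0
    for wanted in target:
        while True:
            draws += 1
            if rng.choice(alphabet) == wanted:
                break  # this position is "fixed"; move on to the next one
    return draws


if __name__ == "__main__":
    genome_length = 1_000_000
    exponent = log10_unselected_sequences(genome_length)
    print(f"possible sequences without selection: about 10^{exponent:.0f}")
    print(f"expected draws with fixation: {expected_draws_with_fixation(genome_length):,}")

    rng = random.Random(0)
    toy_target = "".join(rng.choice("ACGU") for _ in range(1_000))
    print(f"simulated draws for a 1,000-nt toy target: {simulate_fixation(toy_target, rng)}")
```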
Nucleotide-like compounds such as purines and pyrimidines may have arrived on Earth with meteorites [12] (Figure 1). Abiotic synthesis pathways for both purine and pyrimidine nucleosides have been reported recently [17]. Short RNA molecules can form polymers chemically [18]. RNA was likely among the earliest biomolecules on Earth. RNA can replicate without protein-based polymerases, yet it is a rather unstable molecule. Alternatives were proposed, including PNA (peptide nucleic acid) with more stable peptide bonds instead of phosphodiester linkages. Other possible structures may have been TNA (threose nucleic acid), GNA (glycerol-derived nucleic acid analogue), PLA (pyranosyl nucleic acid), or XNA (xeno nucleic acid) [12,19]. Amino acids can help stabilize RNA or ribozymes, and even increase their catalytic activity, as discussed below.
Figure 1. Building blocks of life. Some were identified in meteorite samples, such as amino acids, dipeptides, and nucleotide-like molecules. The most abundant elements in present-day biomolecules are C, H, O, P, N and S. Life may have started close to hydrothermal vents that provided the thermal energy required to accelerate chemical reactions. There, compartments such as pores may have helped to concentrate substrates for more complex compounds to form, and catalysts may have included metal cations that today are crucial cofactors in many biological processes. PNA, peptide nucleic acid.

The RNA world
Early forms of life must have had the ability for autonomous replication and Darwinian evolution in response to environmental conditions. Thus, they must have followed NASA's definition of life as "A self-sustaining chemical system capable of Darwinian evolution" [20]. Known molecules with these characteristics are self-replicating RNAs such as ribozymes, viroids or related polymers such as PNA.

In the test tube, RNA has been shown to be able to evolve from an enzymatically inactive to an active form. Large pools such as 10^15 random 220-mer RNA molecules give rise to species with various three-dimensional conformations. About one in 10^10 is catalytically active; others may become catalytically active in response to a provided substrate [21]. The in vitro selection for functional nucleic acids gave rise to catalytically active RNA molecules: the ribozymes. Other conformations formed by random RNA sequences include structures such as cloverleaves that resemble today's transfer RNAs (tRNAs) [21,22].
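The pool sizes quoted above imply a simple expectation: if about one molecule in 10^10 is active, a pool of 10^15 random molecules should contain on the order of 10^5 active ones. The sketch below works through this arithmetic and adds the chance of finding at least one active molecule in smaller pools. It assumes that molecules are independent and uses a Poisson approximation for rare events; the function names are illustrative.

```python
import math


def expected_active(pool_size: float, active_fraction: float) -> float:
    """Expected number of catalytically active molecules in a random pool."""
    return pool_size * active_fraction


def prob_at_least_one(pool_size: float, active_fraction: float) -> float:
    """P(at least one active molecule), Poisson approximation: 1 - exp(-N * p)."""
    return -math.expm1(-pool_size * active_fraction)


if __name__ == "__main__":
    fraction = 1e-10  # about one active molecule in 10^10, as quoted from [21]
    for pool in (1e9, 1e10, 1e15):
        print(
            f"pool of {pool:.0e}: expected active = {expected_active(pool, fraction):.3g}, "
            f"P(at least one) = {prob_at_least_one(pool, fraction):.3f}"
        )
```

Under these assumptions, pools much smaller than 10^10 molecules would usually contain no active sequence at all, which is one way to see why very large random pools are used for in vitro selection.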
Ribozymes are multifunctional molecules, a property that makes them likely to have been among the earliest biomolecules on Earth. RNA is the only naturally occurring molecule that is both "hardware and software": a physical entity with a structure-mediated function that is also a carrier of information [23]. Ribozymes have been shown to be able to cleave, join and replicate other ribozymes, thus generating progeny, and evolve depending on environmental conditions. Even RNA polymerization can be achieved by ribozymes [21,24]. Ribozymes can also form peptide bonds and generate oligopeptides from the various amino acids available. Individual amino acids can be attached to tRNAs, especially leucine or phenylalanine, a step towards protein synthesis [19]. There are viruses of plants and fungi including Turnip Yellow Mosaic Virus, Brome Mosaic Virus or Narnaviruses that contain a tRNA-like structure linked to an amino acid such as valine, histidine, or tyrosine. These viral structures were described as "mimicry" of translation [25,26] and may represent an evolutionary step towards protein synthesis.
In today's cells, protein synthesis is achieved by ribosomes. How may this complex machinery have evolved? First, tRNA-like RNAs may have evolved to bind amino acids and found an RNA to bind to while carrying the amino acid cargo. The genetic code evolved [27]. Then, the ribozymatically active RNA formed peptide bonds, leading to a pre-ribosome. A ribozyme is the enzymatic center of ribosomes, the protein-synthesizing machinery of all cellular organisms. "The ribosome is a ribozyme" was the phrase coined by Thomas Cech [28]. Additional non-coding RNAs (ncRNAs) are part of today's ribosomes as scaffolds, as well as ~80 proteins (in the eukaryotic ribosome) [29]. Phylogenetic methods suggest that ribosomes have evolved from simpler ribonucleoproteins of an ancient ribonucleoprotein world [30].
Ribonucleotides (the precursors of RNA) could have been reduced to deoxyribonucleotides (the precursors of DNA) chemically, without an enzyme such as a ribonucleotide reductase [19]. DNA can also form catalytic deoxyribozymes (similar to ribozymes) in the laboratory that are, however, not known to occur naturally [31]. In support of a ribonucleoprotein world, Szostak pointed out peptide-like compounds that preserved an RNA tail, including acetyl-CoA or vitamin B12. They appear like intermediates from the RNA to the protein world with "forgotten" RNA tails, which may be stabilizing appendages [32,33]. However, even before the advent of protein synthesis and ribosomes, RNA likely exerted multifaceted roles. In the proposed RNA world, before the evolution of the genetic code, all replicating RNAs were ncRNAs. Ribozymes are among the simplest known biomolecules with many characteristics of life, such as replication and evolvability.
Life may have started in the oceans at hydrothermal vents, at the sites of collision of tectonic plates that form the fire belt [34]. Niches promoted molecular interactions, and minerals were available as catalysts. Broad ranges of temperatures and pressures and chemical energy were available. Also, ice may have been an important initial milieu for the evolution of life, as water displacement during ice crystal formation can concentrate reacting substrates [35]. Yet, RNA is a rather labile molecule that likely needed shielding from UV radiation, radioactivity and cosmic rays, possibly by rocks or other light-absorbing layers. One possible consequence is that early self-replicating RNAs were restricted in their maximum size, and required fast synthesis rates to outpace degradation [19]. Also, today's RNA viruses have considerably smaller genomes than DNA viruses, possibly due to the instability of RNA [37]. Furthermore, it has been suggested that borate minerals present in primitive oceans may have stabilized RNA against decomposition [38]. Also, amino acids can bind to and protect RNA from degradation (see below). Later during evolution, when proteins emerged, RNAs were likely protected by RNA-binding proteins, as evidenced by the fact that virtually all known RNA viruses, including HIV and influenza, express RNA-binding proteins [39]. In addition, a more stable form of nucleic acid, DNA, also evolved.
When did the DNA arise? The building blocks, deoxyribonucleotides, may have been produced by a chemical deoxygenation of ribonucleotides [21,40]. Then, DNA copies of RNA templates could have been synthesized by the action of ribozymes [41,42]. All this could have been possible before a reverse transcriptase protein enzyme, a hallmark of retroviruses, accelerated the reaction from RNA to DNA.
Retroviruses could have originated from a simple RNA genome, combined with a tRNA linked to an amino acid, that may have evolved into the reverse transcriptase. This enzyme was likely a major step towards DNA synthesis, as protein polymerases accelerated replication, and thereby evolution. For example, a highly optimized RNA polymerase ribozyme was reported to have an extension rate of ~1.2 nucleotides per minute [43]. In contrast, protein polymerases typically have extension rates of 1000 nucleotides per minute, exert a higher processivity and incorporate fewer errors [44]. Thus, ribozyme reverse transcriptases may have been restricted to the conversion of relatively short RNAs, whereas the protein reverse transcriptase could convert much longer RNA molecules. It has been suggested that non-coding group II introns, mobile RNA elements with ribozyme activity, acquired a reverse transcriptase gene, and then further evolved into retroviruses [45][46][47]. This example of a gene uptake suggests that ribozymes can increase their information content, grow and evolve into more complex genetic elements.
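The effect of the two extension rates quoted above on the size of a copyable genome can be sketched with a simple division of length by rate. The example below uses the rates from [43] and [44] as given in the text and, for illustration, two RNA lengths mentioned elsewhere in this article (275 and 4217 nucleotides). It deliberately ignores processivity, error rates and degradation, so the numbers are lower bounds on copying time rather than measured values; the names are illustrative.

```python
RIBOZYME_RATE_NT_PER_MIN = 1.2    # optimized RNA polymerase ribozyme, as quoted from [43]
PROTEIN_RATE_NT_PER_MIN = 1000.0  # typical protein polymerase, as quoted from [44]


def copy_time_minutes(length_nt: int, rate_nt_per_min: float) -> float:
    """Minimum time to copy a template once, ignoring stalls and errors."""
    if length_nt < 0 or rate_nt_per_min <= 0:
        raise ValueError("length must be non-negative and rate positive")
    return length_nt / rate_nt_per_min


if __name__ == "__main__":
    for label, length in [("275-nt viroid-like RNA", 275), ("4217-nt phage RNA", 4217)]:
        slow = copy_time_minutes(length, RIBOZYME_RATE_NT_PER_MIN)
        fast = copy_time_minutes(length, PROTEIN_RATE_NT_PER_MIN)
        print(f"{label}: ribozyme {slow / 60:.1f} h, protein polymerase {fast:.2f} min")
```

With these rates, a template that a protein polymerase copies in minutes would occupy a ribozyme polymerase for days, which is consistent with the argument that ribozyme-based copying was limited to short RNAs.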
It should be noted that, before the advent of reverse transcriptase, the RNA world likely did not have a genetic code. Instead, information of ncRNAs resides in their structure, typically partially double-stranded regions with single-stranded hairpin loops. These ncRNAs are still relevant today as regulators of gene expression in cellular organisms. It is quite surprising that possible remnants of the RNA world still exert key functions in the present DNA-protein world. The majority of the eukaryotic genomes encodes ncRNAs that mainly serve as regulators of gene expression.
In human cells, a large proportion of the 98% ncDNA is transcribed into ncRNAs that regulate expression of the 2% coding DNA [48]. One class of regulatory RNAs is the circular RNAs (circRNAs), which structurally resemble ribozymes and were recently described to serve as "sponges" for other regulatory RNAs in mammalian cells, thus serving as chief gene regulators [49]. The ncRNAs that are mainly based on structural information are also important components of RNA viruses, providing regulatory elements, such as sites for ribosomal entry, primer binding and dimerization [39] (Figure 3).

Viroids
Ribozymes are closely related to viroids, virus-like elements with non-coding hairpin-loop structured circular RNA, sometimes with catalytic activity, and without the coat that is typical for viruses (Figure 3). Many viroids are plant pathogens that cause significant economic damage to potatoes, papayas, flowers, and others. They were designated as "living fossils" at the "frontier of life" [50][51][52][53][54]. Passage through plants can lead to loss of their catalytic activity, likely as a result of adaptation to the presence of host enzymes, a reductive evolution in a rich environment (see below). Carnation flowers are infected with a viroid-like agent, the Carnation Small Viroid-like (CarSV) RNA, only 275 nucleotides in size [55]. It is unique in that it is a retroviroid with a DNA form whose formation requires the reverse transcriptase of a plant pararetrovirus [55]. Thus, ncRNA can give rise to a DNA viroid, indicating that viroids can be modified in ways we may be unaware of. In addition, viroids have been shown to be able to increase in complexity by acquiring a coding capacity [56,57]. These capabilities support the idea that simple viroids may have evolved to more complex entities.
The human pathogenic hepatitis delta virus (HDV) originated from a catalytically active plant viroid that acquired coding information, a rare event for a ribozyme or viroid. The gene originated from liver cells that express proteins supporting processing and transport of the viroid. Coinfection with the hepatitis B virus (HBV) allows HDV to gain the ability to infect other cells by using the HBV envelope protein, thereby becoming a "real" virus [56]. The uptake of a cellular gene by the precursor of HDV further supports that ribozymes can gain additional information (Figure 3).
Figure 3. … [50][51][52][53][54], contribute as HDV to liver cancer in humans [56], can perform peptide synthesis in ribosomes [28], act as regulatory circRNA [49], have been used for sequence-specific cleavage in gene therapy approaches [58], and ribozymes have evolved gradually to today's retroviruses by acquiring coding information [47]. RT, reverse transcriptase; RBP, RNA-binding protein; APE, apurinic endonuclease; RH, ribonuclease H; Gag, group-specific antigen; PR, protease; Int, integrase; Env, envelope.
Adding positively charged amino acids or peptides has been shown to enhance the catalytic activity of ribozymes by a hundredfold or more in vitro. Furthermore, positively charged proteins help to disentangle RNA secondary structures, acting as chaperones, which allows for faster replication or transcription rates, and protects the RNA from degradation [59]. RNA-binding proteins are present as nucleocapsid proteins in RNA viruses [60]. It is therefore conceivable that replicating RNAs in the RNA world also associated with amino acids or peptides.
Many different amino acids and other biomolecules may have arrived on Earth from space, as evidenced by the Murchison Meteorite [12]. The synthesis of some of these building blocks has also been achieved in the laboratory. The famous original Miller-Urey Experiment, first reported in 1953, allowed for the synthesis of organic compounds from simple inorganic precursors such as water (H2O), methane (CH4), ammonia (NH3), and hydrogen (H2). These molecules were likely abundant on the Early Earth, and lightning may have been a source of energy for chemical reactions to occur. Among the many biomolecules were five different amino acids as racemic mixtures of right- and left-handed enantiomers [61,62]. The single-handed preference of life forms on Earth arose by mechanisms not yet fully understood. Recently, experiments that extended the Miller-Urey Experiment were conducted by Sutherland and colleagues, who performed a "one-pot" synthesis of lipids, amino acids and nucleic acids from simple precursors containing the elements C, H, O, P, N and S [14].
The last universal common ancestor (LUCA) is thought to be the ancient organism that preceded all cellular life on Earth. However, LUCA was not the beginning of life, since it must have already been a complex organism capable of protein synthesis, DNA replication, and transcription. Simpler replicating elements that preceded LUCA may have originated from structures resembling ribozyme-like viroids or viruses, which were even required for protein synthesis in the form of ribosomes. This view of a viral origin of cells is supported by the recent discovery of giant viruses such as mimiviruses, that possess most of the genes required for autonomous replication, and are as complex as some living species [63]. The giant viruses support the notion that viruses can acquire genetic information via horizontal gene transfer, and thereby become complex cell-like entities [64]. The giant viruses can even be hosts to viruses termed virophages, a characteristic previously assumed to be restricted to cellular organisms [65].
Present viruses are parasites that require cells to replicate. How then can viruses have evolved before cells? Perhaps the ancestors of present-day viruses were originally free-living entities that later during evolution lost their autonomy and became intracellular parasites. By the classical definition, a virus is an infectious agent that can replicate only within a cell. However, even some of today's viruses, such as mammalian poliovirus, can replicate outside of cells, in a rich environment provided by cell lysates [66]. Loss of genes in response to a rich environment is a general evolutionary process that may explain how parasites can originate from previously autonomous entities [47,67].

Viruses-first
Virus-like entities have evolved from complex molecules like nucleic acids and proteins [68][69][70][71]. Viruses contributed to the evolution of cellular life. The concept is compatible with the proposed ancient RNA world and the replication-first approach [50,77], and with the "viroids-first" view described here.

Proteins-first
Life emerged from a self-reproducing system of interacting proteins [75]. Concentrated peptide interactors surrounded by lipid membranes formed protocells [78]. Nucleic acids evolved later and stored the information of protein/peptide interactions.

Metabolism-first
Metabolic networks arose before nucleic acids [72,79,80]. Complex homeostatic metabolic reactions occurred in micellar structures that divided by fission and were capable of Darwinian evolution [72,81]. The combined catalytic reactions of such micelles can be regarded as "compositional genomes".

Spiegelman's Monster
We consider ncRNA to have been among the first biomolecules. Surprisingly, ncRNA is still implicated in many key aspects of modern cellular biology. Are these ncRNAs remnants of an ancient world?
This ncRNA can arise even today under experimental conditions. This was shown with the coding RNA of phage Qβ, 4217 nucleotides in length, in the presence of an RNA polymerase, free nucleotides and salts, that is, a rich milieu. Serial transfers and regrowth in new test tubes gave rise to RNA that replicated faster and became smaller. After 74 generations, the RNA had evolved to a fast-replicating ncRNA of only 218 nucleotides designated as "Spiegelman's Monster" [82][83][84]. Thus, ancient-appearing RNA can evolve today under the "right", rich conditions. This experiment demonstrated that complex structures can undergo reduction in complexity, size and information content. All 'ballast' has been lost (Figure 4).
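Why a shorter RNA takes over under serial transfer can be sketched with a toy competition model. The model below is an assumption made for illustration and not a description of the original experiment: it supposes that a short variant is already present at a tiny fraction, that copying time is proportional to length, and that each transfer dilutes both variants equally. The function and parameter names are hypothetical.

```python
import math


def short_variant_fraction(
    rounds: int,
    initial_fraction: float = 1e-6,
    long_nt: int = 4217,
    short_nt: int = 218,
    long_doublings_per_round: float = 1.0,
) -> float:
    """Fraction of the short RNA in the pool after a number of serial transfers.

    Assumption: in the time the long RNA doubles d times, the short RNA
    doubles d * long_nt / short_nt times. Dilution at each transfer scales
    both variants equally, so only their ratio needs to be tracked.
    """
    if not 0.0 < initial_fraction < 1.0:
        raise ValueError("initial_fraction must lie strictly between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    extra_doublings = long_doublings_per_round * (long_nt / short_nt - 1.0)
    log2_ratio = math.log2(initial_fraction / (1.0 - initial_fraction))
    log2_ratio += rounds * extra_doublings
    # Guard against floating-point overflow for very large or small ratios.
    if log2_ratio > 1000:
        return 1.0
    if log2_ratio < -1000:
        return 0.0
    ratio = 2.0 ** log2_ratio
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    for n in (0, 1, 2, 5, 74):
        print(f"after {n:2d} transfers: short RNA fraction = {short_variant_fraction(n):.6f}")
```

In this simplified picture the short RNA dominates within a few transfers once it exists; the many transfers of the actual experiment would also include the time needed for shorter variants to arise in the first place, which the sketch does not model.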

Apparently, a rich environment can lead to a reduction of information that can be accompanied by the loss of autonomous replication. Examples are the bacteria that gave rise to mitochondria and chloroplasts as described below [85,86]. Gene reduction is also characteristic of the microbial population in the intestine, the microbiota. A sugar-rich diet reduces its complexity, possibly by activating phages that lyse bacteria. The selected microbiota has growth advantages, and the loss of genetic complexity is difficult to revert. Consequently, obese people cannot easily lose weight [87]. On the contrary, hunter-gatherers from Africa with a diverse diet have highly complex microbiota [88].

Endosymbiosis and Giant Viruses
Endosymbiosis has been critical for the evolution of life on Earth. Endosymbiotic events are considered rare, and may not have happened on other planets. However, they occurred at least twice during evolution, when formerly free-living bacteria became intracellular entities as mitochondria and chloroplasts [85,86,89]. The majority of their genes were eliminated or delegated to the nucleus, whereby mitochondria, for example, retained only ~37 genes. The endosymbionts specialized into becoming the powerhouses of the cell [90]. Although such events are considered extremely rare, analogous processes may occur frequently even today when microbes or viruses infect host cells. Viruses can even integrate their DNA copies into host genomes, and are thereby inherited for many generations, as discussed below. Bacteria of the Rickettsia genus provide another recent example of endosymbiosis. Rickettsia, formerly extracellular bacteria, now have an obligate intracellular lifestyle, accompanied by the loss of ~24% of their genome [91].
Giant viruses underwent extensive gene "loss and gain" events during their evolutionary history [92]. Although they likely evolved from smaller viruses, the fact that they can undergo extensive gene loss lends support to the notion that viruses may have been autonomously replicating entities at some stage of their development.
Giant viruses harbor many components that are considered as hallmarks of autonomous life, such as components of ribosomes and protein synthesis. Many of these genes have likely been acquired recently through horizontal gene transfer. Yet no known virus has the complete set required for independence. Giant viruses can be considered as intermediate forms between the living and the non-living worlds, a transition that is not a sharp border, but a continuum [93].
Based on the concept of endosymbiosis it is considered here that viruses may have undergone a reductive evolution. They may have become intracellular parasites by adapting to the environmentally luxurious conditions provided by the host cell. In the prebiotic world, pre-cells and pre-viruses may have looked similar, and may have evolved into present-day cells and viruses. Present-day viruses may have originated from autonomous viroids that first became more complex, and then evolved into host-dependent parasites.

Retroviruses
With the advent of protein polymerases, most importantly the reverse transcriptase, DNA synthesis accelerated. This enzyme emerged as an important supplier of DNA to cellular genomes, because it converts the retroviral RNA to DNA, which is integrated into the host genome during replication. Double-stranded DNA is more stable than single-stranded RNA. The reverse transcriptase, first discovered in retroviruses, is surprisingly abundant. Reverse transcriptases are involved in the replication of retrotransposable elements that amplify in cellular genomes by a copy-and-paste mechanism. Transposable elements modify cellular genomes by causing gene duplications, insertional mutagenesis, and transcriptional activation or repression of genes. These processes accelerate the adjustment of cellular organisms to environmental changes. Genes for retroviral enzymes, such as the reverse transcriptase and the associated ribonuclease H (RNase H) [94], constitute about 13.5% of the metagenomes of marine plankton, suggesting a contribution to actively ongoing evolutionary events and strong selection pressures [47,95]. The RNase H fold similarly turned out to be highly abundant, and is one of the five most ancient protein folds found in nature [96,97]. RNase H-like proteins are required, for instance, to counteract viral infections and are integral to many cellular immune systems [47,54]. An RNase H enzyme, Piwi, is essential for retrotransposon silencing in the germline. Lack of Piwi results in male infertility due to too many retrotransposition events, described by Manfred Eigen as "error catastrophe" [98,99].
Retroviruses integrate not only into somatic cells but, rarely, also into germline cells. Retroviruses may have originated from reverse transcriptase-encoding group II introns that are found frequently in prokaryotic genomes [46,47]. Similar to newly integrated group II introns in prokaryotes, the germline-infecting retroviruses are passed on to future generations and become endogenous viruses. Retroviruses significantly influenced the evolution of eukaryotic genomes. Around 50% of the human genome, for example, consists of retrovirus-like elements [100]. Among other functions, they may protect the cell against superinfection by related retroviruses. Most endogenous retroviruses are millions of years old. Their ancestry from infectious viruses was proven by an infectious retrovirus reconstructed from a consensus sequence of human endogenous retroviruses, designated as "Phoenix" [101]. Envelope genes of endogenous retroviruses have been captured by mammalian species and, as syncytins, are critical for placental development [102]. The process of endogenization has been actively ongoing in koalas for about ten generations. Starting in the 1920s, koala populations were infected with a retrovirus related to the gibbon ape leukemia virus. Animals that survived the infection had endogenized the retrovirus, which may have protected against superinfection by the exogenous form [47,54,103]. Another example of antiviral protection mediated by endogenous viruses is simian immunodeficiency virus, which infects some monkey species without causing disease [104,105]. Similarly, bats carry various viruses including Ebola and SARS coronavirus without signs of disease [106].
Can HIV eventually become an endogenous virus that renders humans resistant against new infections? Human germline cells can be infected, which is a prerequisite for endogenization to occur [107].
Endogenization of retroviruses demonstrates that viruses can supply new genes, and can be regarded as drivers of evolution. The endogenous retroviruses protect the host cell against superinfection by related viruses, a mechanism originally described for phage-infected bacteria as "superinfection exclusion". Prokaryotes evolved the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) immune system, whereby inserted phage-specific sequences mediate superinfection exclusion against other phages. Viruses protect against viruses, and phages protect against phages [54]. The similarities between the immune systems of prokaryotes and eukaryotes are striking (Figure 5).

Habitable zones
Where can life be expected? There are four celestial bodies in our solar system located within habitable zones: the planet Mars, Europa (a moon of Jupiter), and Enceladus and Titan (both moons of Saturn) [108]. They may serve as model systems for possible early life on Earth, and have water, carbon-containing molecules and energy sources required for primitive forms of life [109][110][111][112].
The climate on Mars about 4 Bya may have supported liquid water, and resembled the climate on the early Earth, as suggested by landscape structures that likely originate from former rivers [113]. But the water on Mars evaporated, and today's atmosphere is too thin to support life as we know it. Life forms may, however, have survived below the surface. Future Mars missions will aim at collecting samples from below the surface, where possible biological samples would have been protected from destructive cosmic rays.
The moons of Jupiter and Saturn named above are covered with thick layers of ice. They lack the plate tectonics typical of Earth, and heat generated by radioactivity is not strong enough to melt the ice crusts. In the case of Europa, however, gravitational forces exerted by Jupiter could create friction that generates heat, which likely results in liquid water beneath the ice crust [114]. Similarly, Enceladus likely has a subsurface ocean.
Titan is an ice moon with an atmosphere composed mainly of nitrogen and small amounts of methane and hydrogen. Surface temperatures are about −179 °C. Titan is rich in organic compounds possibly similar to those on primordial Earth.
One theory even proposes that life on Titan may originate from Earth [115]. The possibility of such contamination raises concerns, as it is difficult to keep tools completely sterile. We need to guarantee planetary protection.
The origin and early development of life is investigated on Earth in terrestrial analog sites. Five "planetary field analog sites" were selected by the Europlanet 2020 Research Infrastructure [116]. They are considered the closest analogs of the surfaces of Mars, Europa and Titan, to which missions have either recently been conducted or are being planned: (1) Rio Tinto, Spain; (2) Ibn Battuta Centre, Marrakech, Morocco; (3) glacial and volcanically active areas of Iceland; (4) Danakil Depression, Ethiopia; (5) Tirez Lake, Spain. From Rio Tinto, ten mineral samples were selected containing natural iron oxides, red rust, and sulfates. They will be tested for spectral properties by UV and infrared spectroscopy. The Ibn Battuta Centre allows the testing of instruments for the exploration of Mars and of the Moon.
A combined effort of 20 nations, designated as BIOMEX, was started in 2014 with a Soyuz rocket launch to the International Space Station (ISS) [117]. Samples from more than 20 experiments were analyzed. Some biological samples were attached to the outside of the ISS for several months. The samples were returned to Earth in 2016 for laboratory analyses. Specimens included cyanobacteria, archaea, and lichens (composite organisms of fungi and cyanobacteria or algae). Plants were also exposed to UV and cosmic rays, and are still being investigated for survival, mutations and genome integrity by DNA sequencing. A recent conclusion from the BIOMEX study was that life on Mars appears to be possible.
Another approach is taken by mimicking space- or Mars-like conditions in the laboratory [118][119][120]. Silicified sediments are analyzed for early signatures of life [121]. A recently developed miniature DNA sequencing machine, the Oxford Nanopore, may offer new opportunities for producing sequencing data in space or on other celestial bodies [122].

Archaea and Archaeal Viruses
Many archaea can tolerate extreme conditions, surviving extreme temperatures, high or low pressure, acidic or basic pH, or high salt concentrations [123]. Archaea prove that life under extreme conditions is possible, which stimulated the search for life outside of Earth. Yet, archaea typically have large genomes of over a million base pairs, and require protein synthesis.
Archaea can use various energy sources, including sugars, ammonia, hydrogen gas, and sunlight. They populate almost all habitats on Earth, and are part of the microbiota of humans and other species. They are particularly abundant in the oceans and, as part of plankton, may be the most abundant group of organisms on our planet. Archaea may be evolutionarily older than bacteria. Like all other cellular life forms, archaea also harbor viruses. The archaeal viruses are unique, in that most of the encoded proteins do not share significant sequence similarities with known proteins, and more than a quarter of the proteins whose structure has been determined display unique folds with no structural homologs in public databases [124,125].

Spores and Survival under Extreme Conditions
Spores form as part of the life cycle of many bacteria, plants, algae, fungi, and protozoa. Bacterial spores are highly resistant structures that allow for survival under unfavorable conditions. Fungi commonly produce spores for reproductive purposes.
How long can spores survive? Bacterial spores were discovered in, and revived from, 25-40 million-year-old amber [126]. Spores of another bacterium, Bacillus permians, were reported to have been recovered from a salt crystal, and were estimated to be 250 million years old, but their antiquity has been doubted because of potential contamination [127]. Some spores from Streptococcus mitis were sent to the Moon by accident with the Surveyor 3 mission, returned to Earth after 31 months, and were brought back to life. Thus, there is a potential danger of seeding species from Earth to other planets.
The idea that life on Earth originated from other planets, the "panspermia hypothesis", is old. The Greek philosopher Anaxagoras considered this in the 5th century BC, and later scientists, including Francis Crick and Leslie Orgel [128], even hypothesized that life on Earth may have been deliberately seeded by extraterrestrial civilizations. Recently the idea was repeated by Steele et al., who considered an extraterrestrial origin of retroviruses during the Cambrian explosion about 541 Mya, when life diversified at an enormous speed [129]. The authors also discussed the SARS coronavirus as originating from space. It is difficult to present convincing scientific information that supports this speculation. Yet, all hypotheses on the origin of life have this problem to some extent. The voyage of virus particles through extraterrestrial space would destroy their genetic material, RNA or DNA, as previously discussed [130]. Meteorites could protect against degradation, but so far there is contradictory evidence for the presence of cellular or viral structures on meteorites.
A model organism to investigate survival under extreme conditions is the tardigrade, which was sent to space and survived [131]. Due to highly effective DNA repair systems, tardigrades can resist strong doses of radiation. They can endure extreme environmental conditions and hibernate for several years without water. At −20 °C tardigrades can survive for 30 years [132]. Also, the nematode Caenorhabditis elegans was reported to have been revitalized from 42,000-year-old Siberian permafrost [133].
Samples from accretion ice of Lake Vostok, a subglacial lake in the Antarctic located below an ice layer of 3700 m, revealed the presence of over 3500 different species, mainly bacteria, many of which were previously unknown. These species lived there in the dark for millions of years. Tardigrades were detected there as well [134].
A highly robust bacterium, Deinococcus radiodurans, colonizes the water basins of nuclear reactors in spite of exposure to strong radioactivity [135]. Its genome encodes highly efficient DNA repair mechanisms and homologs of plant desiccation resistance genes. More survival strategies may be discovered in the future.
Heat-resistant RNA viruses and related elements have also been reported. Different RNA viruses have been shown to acquire heat resistance when exposed to high temperatures up to 50 °C [136,137], and hyperthermophilic archaea that live at temperatures above 80 °C host RNA viruses [138]. In addition, a thermostable reverse transcriptase of a group II intron (the putative evolutionary precursor of retroviruses) has been reported [139].

Outlook: Including Viral Signatures May Help in the Search for Extraterrestrial Life
All known species on Earth are infected by viruses. Viruses are important drivers of evolution. Their large, rapidly changing gene repertoire has helped them and their hosts to adjust to the environment. Viruses have contributed substantially to the evolution of cellular genomes. Many viruses are not known to be pathogens, but instead mutualists or commensals [140]. This might reflect their evolutionary age and long coevolution with all forms of life. Indeed, a recent definition of life on Earth integrates the important contribution of viruses and RNA networks [71].
Archaea may be the most ancient species existing on Earth. Their chemical traces have been identified, for example, in 3.8-billion-year-old sediments in Greenland [141]. Traces of biotic life were discovered in a 4.1-billion-year-old zircon in Western Australia [142]. Yet, prokaryotes are highly complex, with double-stranded DNA genomes typically of more than 1 million base pairs, a protein synthesis machinery, and the ability to read the genetic code. Life forms similar to prokaryotes with such complex and specialized molecular machineries could only be expected on exoplanets if environmental conditions there are highly similar to those on Earth. However, even if all building blocks were the same in other habitats, the same kind of life forms would be unlikely to evolve [16]. We think that it is more likely to find entities on other planets that resemble in principle the relatively simple ribozymes or viroids, which can be assembled from fewer molecular building blocks than cellular life. Even the more complex "real" viruses, the most abundant biological entities on Earth, typically have much simpler structures than cells (with the notable exception of the highly complex giant viruses). Therefore, we would like to inspire the integration of biosignatures of viroids/ribozymes and viruses into the search for extraterrestrial life.
Virus particles may be directly detected by electron microscopy, as many virions have distinct shapes, often icosahedral or helical [39,143]. Viroids can also be visualized by electron microscopy, showing distinct hairpin-loop structures [144]. Enveloped viruses may provide a distinct detectable chemical signature [145]. Indirect effects of viruses on the populations of their hosts may also be detectable. For example, viruses strongly contribute to global nutrient cycling, and significantly affect the carbon content in the oceans [146]. An evolutionary stage comparable to the proposed RNA world may well exist on other planets, with a characteristic proto-metabolism, as described previously [147,148].
The message we convey here is that the beginning of life must have been simple and that viruses or virus-like entities may be signatures of early life. Biomolecules such as ribozymes or viroids may serve as models for early forms of life on Earth, other planets or moons. They are relatively small (compared to viruses and cells) and their main properties (versatility, mutability and autonomous replication) are based on structural information in the absence of the genetic code and proteins. Other polymers distinct from RNA, but with similar structural features, may exist outside of Earth that have life-like characteristics of ribozymes or viroids. Thus, viroids/ribozymes should be considered models in the search for extraterrestrial life.
Author Contributions: Both authors contributed equally to the concept, the writing, the design, and the figures.
Funding: This research received no external funding.