1. Introduction
Phosphorus is an essential element of life, and presumably played a critical role in its origins. One hypothesis on the origins of phosphorus on Earth is that it was delivered, together with other inorganic and organic matter, by asteroids or meteorites [1,2]. It has been suggested that at the very early stages, phosphorus (III) served as a precursor for the formation of organophosphorus compounds [3,4]. As the young Earth cooled, phosphorus (III) might have been present as a phosphite species (HPO₃²⁻), which has been identified on marine sediments of the Archean era [5]. In turn, phosphite ions might have been generated by oxidation from phosphides [6], which have been found in the form of various minerals in accessory phases in meteorites, lunar rocks, interplanetary dust particles, and comets [7]. Under current geochemical conditions, phosphorus exists mostly in a pentavalent state as orthophosphate ion (PO₄³⁻). To consider what role phosphorus could have played in prebiotic peptide formation, it is useful to briefly review its current role in the highly evolved process of protein synthesis.
In vivo protein synthesis involves the O-phosphorylation of the C-termini of amino acids, converting them to energy-rich aminoacyl adenylates (Figure 1). Such activation of amino acids requires the enzyme aminoacyl tRNA synthetase, as well as adenosine triphosphate (ATP), a relatively complex organic polyphosphate. Mechanisms for the synthesis of ATP on the primitive Earth are an active topic of current debate [8]. In the absence of enzymes and ATP, amino acids can be activated using simpler, more elemental condensing agents, such as inorganic polyphosphates [9]. Such activation typically proceeds at the amino termini, where N-phosphorylation can release more free energy than the O-phosphorylation of catalyzed protein synthesis [10]. In nature, N-phosphoryl amino acids might represent a “living fossil” of prebiotic chemistries that have persisted; they are likely intermediates for many highly evolved enzymatic processes [11,12]. On the primitive Earth, inorganic polyphosphates such as trimetaphosphate might have served as a source of free energy. Trimetaphosphate (P3) could have been made directly from an acidic phosphate as a byproduct of volcanic processing [3,13]. More recently, the structural characterization of N-phosphoryl amino acids and the formation of the P–N bond have been elucidated by NMR [14] and IR [15] analysis, which have provided significant insight into the reaction mechanism [16]. Despite quite extensive studies on the activation of amino acids by polyphosphates, little effort has been devoted to studying the subsequent dehydration (or condensation) reactions resulting in polypeptide formation.
Here, we tested the ability of trimetaphosphate (P3), a simple inorganic polyphosphate, to promote the synthesis of polypeptides during drying–heating processes from aqueous solutions at different temperatures and pH. Diglycine was selected as a reaction substrate because its α-carbon contributes minimal effects of steric and side-group chemistry. Further, the susceptibility of its single peptide bond to hydrolysis might be readily detected from the distribution of reaction products. Our studies indicate strong correlations between environmental conditions for peptide bond formation, its cleavage, and the final composition of the reaction mixture. We also demonstrate the stabilizing effect of polyphosphate against hydrolysis on the peptide bond, a characteristic that might have been important during prebiotic protein evolution.
2. Materials and Methods
2.1. Materials
All chemicals were of analytical grade, purchased from Sigma Aldrich Chemical Company (St. Louis, MO, USA), and used without further purification. Reactions were carried out in 1.5 mL low-retention Eppendorf tubes. A standard modular heater with temperature control (VWR) was used as the heat source.
2.2. General Procedure for Drying-Induced Condensation of Diglycine in the Presence of Base
A solution of diglycine (6 mg; 50 μmol, 50 μL; 1 M aq. sol.) and NaOH (1.2 mg; 30 μmol, 30 μL, 1 M aq. sol.) in dd H2O (120 μL) was heated for 24 h at 70 °C open to the atmosphere. The dried pellet was then redissolved in 1 mL dd H2O and subjected to analysis by high performance liquid chromatography (HPLC).
2.3. General Procedure for Drying-Induced Condensation of Diglycine in the Presence of Trimetaphosphate
A solution of diglycine (6 mg; 50 μmol, 50 μL; 1 M aq. sol.) and sodium trimetaphosphate (4.6 mg; 15 μmol, 30 μL; 0.5 M aq. sol.) in dd H2O (120 μL) was heated for 24 h at 70 °C open to the atmosphere. The dried pellet was then redissolved in 1 mL dd H2O and subjected to analysis by HPLC.
2.4. General Procedure for Drying-Induced Condensation of Diglycine in the Presence of Trimetaphosphate and Base
A solution of diglycine (6 mg; 50 μmol, 50 μL, 1 M aq. sol.), NaOH (1.2 mg; 30 μmol, 30 μL, 1 M aq. sol.), and sodium trimetaphosphate (4.6 mg; 15 μmol, 30 μL, 0.5 M aq. sol.) in dd H2O (90 μL) was heated for 24 h at 70 °C open to the atmosphere. The dried pellet was then redissolved in 1 mL dd H2O and subjected to analysis by HPLC.
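The starting concentrations implied by the volumes in Section 2.4 can be checked with a short calculation. The following is a minimal sketch that uses only the listed volumes and stock molarities; the function and variable names are illustrative, and the values are nominal ones before any evaporation, so they may differ from rounded concentrations quoted elsewhere in the text.

```python
# Volumes (uL) and stock concentrations (M) as listed in Section 2.4.
components = {
    "diglycine": (50.0, 1.0),
    "NaOH": (30.0, 1.0),
    "trimetaphosphate": (30.0, 0.5),
}
water_ul = 90.0  # dd H2O added to the mixture


def final_concentrations_mM(components, water_ul):
    """Return (dict of nominal concentrations in mM, total volume in uL)."""
    total_ul = water_ul + sum(vol for vol, _ in components.values())
    conc = {
        name: 1000.0 * vol * molar / total_ul
        for name, (vol, molar) in components.items()
    }
    return conc, total_ul


conc, total_ul = final_concentrations_mM(components, water_ul)
print(f"total volume: {total_ul:.0f} uL")
for name, value in conc.items():
    print(f"{name}: {value:.0f} mM")

# Molar ratio of trimetaphosphate to diglycine (15 umol : 50 umol).
ratio = conc["trimetaphosphate"] / conc["diglycine"]
print(f"P3 : diglycine = {ratio:.2f}")
```

The same function applies to Sections 2.2 and 2.3 by dropping the unused component and setting the water volume to 120 μL.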
2.5. IP-HPLC Analysis and Product Identification
Sample analysis was run using a Shimadzu Nexera XR IP-HPLC system fitted with a reversed-phase C18 column (Phenomenex Aeris XB-C18, 150 mm × 4.6 mm, 3.6 μm, Phenomenex, Torrance, CA, USA). Samples were auto-injected in 10 μL aliquots (Shimadzu Nexera X2 Autosampler, Shimadzu, Nakagyo-ku, Kyoto, Japan), and analysis was done in isocratic mode with a flow rate of 1 mL/min. The mobile phase was 50 mM KH2PO4 and 7.5 mM C6H13SO3Na solution adjusted to pH 2.5 with H3PO4. The oligomeric products were detected at 195 nm, and the retention times were confirmed by comparison with pre-made standards containing glycine monomer and its oligomers.
2.6. Product Quantification
The concentration of oligomer products was determined by the integration of absorbance values (195 nm) and calibration with commercially available standards (r² ≥ 0.999).
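As an illustration of this kind of external-standard calibration, the sketch below fits a straight line to peak area versus standard concentration, reports the coefficient of determination, and inverts the fit to estimate an unknown concentration. The calibration points are hypothetical placeholders, not measured data, and the linear model itself is an assumption about how the calibration was done.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to get concentration from an integrated peak area."""
    return (area - intercept) / slope


# HYPOTHETICAL calibration points for one oligomer standard
# (concentration in mM, integrated peak area at 195 nm in arbitrary units).
std_conc_mM = [0.5, 1.0, 2.0, 5.0, 10.0]
peak_area = [10.2, 20.1, 40.3, 100.4, 199.8]

slope, intercept, r2 = fit_line(std_conc_mM, peak_area)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r2={r2:.5f}")

# Accept the calibration only if it meets the stated criterion.
if r2 >= 0.999:
    print(f"unknown sample: {concentration_from_area(60.0, slope, intercept):.2f} mM")
```

A separate line would be fitted for each oligomer, since absorbance at 195 nm scales with the number of peptide bonds.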
2.7. Mass Spectrometry
High resolution mass spectrometry (HRMS) analyses were performed on a MALDI-LTQ-Orbitrap XL (MALDI, matrix-assisted laser desorption/ionization) mass spectrometer and an ESI Q Exactive HF Hybrid Quadrupole-Orbitrap (ESI, electrospray ionization) mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) in positive ion mode. Prior to MS analyses, samples were mixed with equal volumes of acetic acid. For MALDI MS analysis, 1 μL of 2,5-dihydroxybenzoic acid (DHB) matrix (150 mg/mL in MeOH:H2O:formic acid, 49.95:49.95:0.1) was mixed with 1 μL of analyte and spotted onto a MALDI target plate. Full MS was acquired at m/z 100–800 with a mass resolution of 30,000 (at m/z 400). MS/MS analysis of each peak was performed using higher-energy collisional dissociation (HCD). Normalized collisional energy was optimized for each peak for optimum fragmentation efficiency. For ESI MS analysis, analyte was introduced into the mass spectrometer by direct infusion with a spray voltage of 3.5 kV. Full MS was acquired at m/z 80–1200 with a mass resolution of 30,000 (at m/z 200). MS/MS analysis was also performed using HCD with normalized collisional energy set to 35.
2.8. Diglycine Reaction Mixture Species Mass Identification
Triglycine: HRMS exact mass calculated for C6H12N3O4 [M + H]+ 190.0828, found 190.0837; HRMS exact mass calculated for C6H11N3O4Na [M + Na]+ 212.0647, found 212.0660;
Tetraglycine: HRMS exact mass calculated for C8H15N4O5 [M + H]+ 247.1043, found 247.1060, HRMS exact mass calculated for C8H14N4O5Na [M + Na]+ 269.0862, found 269.0877;
Pentaglycine: HRMS exact mass calculated for C10H18N5O6 [M + H]+ 304.1257, found 304.1295, HRMS exact mass calculated for C10H17N5O6Na [M + Na]+ 326.1077, found 326.1099;
Hexaglycine: HRMS exact mass calculated for C12H21N6O7 [M + H]+ 361.1472, found 361.1367, HRMS exact mass calculated for C12H20N6O7Na [M + Na]+ 383.1291, found 383.1314;
Heptaglycine: HRMS exact mass calculated for C14H24N7O8 [M + H]+ 418.1686, found 418.1682, HRMS exact mass calculated for C14H23N7O8Na [M + Na]+ 440.3688, found 440.3692;
Octaglycine: HRMS exact mass calculated for C16H27N8O9 [M + H]+ 475.1901, found 475.1901, HRMS exact mass calculated for C16H26N8O9Na [M + Na]+ 497.1270, found 497.1268.
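Calculated masses of this kind can be recomputed from the elemental composition of a linear glycine oligomer, Gly_n = C(2n)H(3n+2)N(n)O(n+1). The sketch below is one way to do so, using standard monoisotopic atomic masses; the function names are illustrative. By default it sums the atomic masses of the adduct formula without subtracting the electron mass, a convention assumed here because it reproduces the listed triglycine values; setting `subtract_electron=True` gives the strict ion mass instead.

```python
# Monoisotopic atomic masses in unified atomic mass units.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "Na": 22.9897693}
ELECTRON = 0.00054858


def glycine_oligomer_formula(n):
    """Neutral linear Gly_n: n residues of C2H3NO plus one H2O."""
    return {"C": 2 * n, "H": 3 * n + 2, "N": n, "O": n + 1}


def adduct_mass(n, adduct="H", subtract_electron=False):
    """Mass of [Gly_n + adduct]+ for adduct 'H' or 'Na'."""
    formula = glycine_oligomer_formula(n)
    formula[adduct] = formula.get(adduct, 0) + 1
    mass = sum(MASS[element] * count for element, count in formula.items())
    return mass - ELECTRON if subtract_electron else mass


def ppm_error(found, calculated):
    """Relative deviation of a measured mass from the calculated one, in ppm."""
    return (found - calculated) / calculated * 1e6


for n in range(3, 9):
    print(n, f"{adduct_mass(n, 'H'):.4f}", f"{adduct_mass(n, 'Na'):.4f}")

# Example: deviation of the reported triglycine [M + H]+ signal.
print(f"{ppm_error(190.0837, adduct_mass(3, 'H')):.1f} ppm")
```

Comparing each found value against the recomputed mass in ppm gives a quick consistency check on the assignments.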
3. Results
A typical reaction mixture consisted of 200 mM diglycine solution and sodium trimetaphosphate adjusted to pH ~ 9.5 with NaOH. The mixture was left open to the atmosphere while maintained at constant temperature for a specific time. Ambient humidity levels were not controlled during the course of the experiment. The resulting dry pellet was then redissolved in ultrapure water (18.2 MΩ·cm), and the mixture was analyzed by ion pair high-performance liquid chromatography (IP-HPLC). Analysis of the products after rehydration revealed several peaks, indicating the formation of different oligomers under different conditions (Figure 2). The identity of detected compounds was confirmed by comparison of their retention times with commercially available standards as well as mass spectrometric analysis.
First, we tested the influence of temperature and incubation time on polymerization rates. Aqueous solutions of diglycine were incubated at 70 °C, 80 °C, or 95 °C, open to the atmosphere, for 24 h or 48 h to allow evaporation and dry-state reactions. HPLC analysis of the redissolved solids showed that the conversion of diglycine into longer peptides increased with both higher temperature and longer incubation time (Figure 3), in agreement with previous observations made by Borsook [17,18] on activated amino acids. In addition, we observed that at mild temperatures (70 °C and 80 °C) the highest conversion of substrate into longer oligomers was achieved from alkalized solutions containing trimetaphosphate. When the reaction was carried out at the elevated temperature of 95 °C, the highest conversion of substrate (close to 50%) was observed in the mixtures of alkalized diglycine.
Next, the ability of diglycine to polymerize was tested in a mixture of the dimer and trimetaphosphate in neutral solution (pH ~ 6.5), followed by the control reaction, which was an alkalized solution of the dipeptide. Here, HPLC profiles of the analyzed mixtures revealed notably different results. Whereas the alkalized solution at pH 9.5 permitted peptide hydrolysis, as reflected by the presence of the odd-numbered species Glyn, where n = 3, 5, 7, etc. (Figure 2 Curve B), the neutral reaction of diglycine with P3 yielded only even-numbered oligomers, indicating a suppression of amide bond cleavage (Figure 2 Curve A) by a stabilizing effect of the polyphosphate.
4. Discussion
The dipeptide glycylglycine (Gly2) can undergo several reactions in aqueous solution: a linear condensation reaction, cyclization to diketopiperazine (DKP), peptide bond hydrolysis, and decomposition via decarboxylation and deamination (Figure 4). The rates of these reactions will depend on the pH of the aqueous media and the temperature and pressure of the reaction mixture, which will influence the distribution of the different species in solution over time [19]. Although the rate of the amide bond cleavage exponentially increases with increasing temperature, estimated standard molal thermodynamic properties of aqueous dipeptides and their constituent amino acids indicate that temperature increases correspond to the increased stability of peptide bonds relative to hydrolysis, resulting in a bias toward polymerization processes [19].
As expected, the conversion of diglycine to its longer oligomers increased with temperature (Figure 3), in agreement with the observation that for temperatures above 60 °C, the free energy barrier of peptide bond formation decreases, and therefore, bond formation between unprotected amino acids or peptides can be exergonic [1]. The extent of polymerization was likewise enhanced by incubation time (Figure 3), where samples typically appear visually dry by 6 h. This result indicates that the majority of the oligomer synthesis might take place in the absence of a bulk solvent in a dry solid state.
In addition to the temperature, the solution pH would also be expected to affect the relative rates of polymerization and hydrolysis. An increase in pH from neutral to alkaline changes the ionic state of diglycine (Figure 5), which not only shifts hydrolysis from the acid-promoted and water-dependent (neutral) regimes to base-promoted hydrolysis assisted by OH− ions, but more importantly affects nucleophilic and electrophilic reactivities, which will be of vital importance for predicting rates of individual steps in the absence of enzymes. Peptides are formed from amino acids through an SN2 (Scheme 1) reaction that is initiated by the nucleophilic attack of an amino nitrogen on a carbonyl carbon, upon which electrons of the carboxyl double bond migrate to the oxygen atom in order to maintain the octet of the carbonyl carbon atom. This step results in the formation of a tetrahedral intermediate. At the same time, the positive charge on the nitrogen is neutralized by proton migration to the negatively charged oxygen. Condensation is followed by the removal of the hydroxyl group and reformation of the double bond. This step starts with electrons moving back from the oxygen atom into the double bond, expelling the –OH group from the carbonyl carbon. Although the hydroxy moiety is not a good leaving group, this process is facilitated by the energetically favorable formation of the carbon–oxygen double bond and by water removal during the drying process [20]. The enhancement of the reaction at alkaline pH can be explained mainly by the electrostatic properties of the substrates, as well as an increase in the nucleophilicity of the amine resulting from its deprotonation [21]. At pH ~ 10, diglycine is present mostly in an anionic state, Gly2−, with only a small fraction of the zwitterionic Gly2± form (Figure 5). Since Gly2− has a deprotonated amino group with higher nucleophilicity, it has a great advantage for attacking the carbonyl carbon on the negatively charged C-terminus of a neighboring zwitterionic Gly2± or anionic Gly2−, promoting peptide formation. This scenario assumes that the reaction medium is an ideal electrolyte solution, where only H+ transfers between the substrate and dissolved species are considered. However, our reaction medium is neither ideal nor a solution, so other plausible mechanisms will need to be elucidated by in situ experimental and computational studies.
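Under the ideal-solution assumption just described, the fractions of the cationic, zwitterionic, and anionic forms of diglycine follow from its two acid dissociation constants. The sketch below evaluates these fractions at a given pH. The pKa values (about 3.1 for the carboxyl group and about 8.2 for the ammonium group) are assumed approximate room-temperature literature values, not numbers taken from this work, and they will shift at 70–95 °C and in a drying mixture.

```python
# ASSUMED approximate pKa values for glycylglycine near 25 degC.
PKA_COOH = 3.1  # carboxyl group
PKA_NH3 = 8.2   # terminal ammonium group


def diglycine_fractions(pH, pka1=PKA_COOH, pka2=PKA_NH3):
    """Mole fractions of cation (H2A+), zwitterion (HA+-) and anion (A-) at a given pH."""
    h = 10.0 ** (-pH)
    k1 = 10.0 ** (-pka1)
    k2 = 10.0 ** (-pka2)
    denom = h * h + k1 * h + k1 * k2
    return {
        "cation": h * h / denom,
        "zwitterion": k1 * h / denom,
        "anion": k1 * k2 / denom,
    }


for pH in (6.5, 9.5, 10.0):
    f = diglycine_fractions(pH)
    print(pH, {name: round(value, 3) for name, value in f.items()})
```

With these assumed constants the zwitterion dominates near pH 6.5 and the anion dominates near pH 10, which matches the qualitative picture in the text.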
The introduction of trimetaphosphate (P3) into the reaction mixture allows the formation of a highly active intermediate, a phosphoryl–carboxyl mixed anhydride (Scheme 2) [22]. Formation of the N–P bond is now possible, and it starts with the deprotonation of the α-amino group, followed by the nucleophilic attack of –NH2 at phosphorus in P3, and fragmentation of the resulting adduct with the elimination of pyrophosphate [14,23]. The fragmentation may proceed through a cyclic intermediate, during which migration of the phosphate residue from N– to O– is possible. However, if the formation of a five-membered ring is restricted by substrate conformations or the poor proximity of reactive sites, then an intermolecular transfer becomes more probable (Scheme 3) [24]. Here, the nucleophilic attack of a free amine group from one amino acid or peptide on the phosphate-activated carboxyl carbon yields a new peptide product.
The propensity of polymerization reactions to be favored at high pH can be explained by electrostatic interactions influencing the ionic character of the reacting molecules. Moreover, the activating character of phosphate is clearly visible if we compare diglycine conversion to higher oligomers in alkalized solutions in the presence vs. absence of trimetaphosphate (Figure 3). Although the amide bond is relatively unreactive, one issue connected to the condensation is the parallel and inevitable progression of hydrolysis (Scheme 4). In alkalized diglycine mixtures (lacking phosphate), the presence of odd-numbered species Glyn (where n = 3, 5, 7, etc.), along with the expected series of even-numbered oligomers (for n = 2, 4, 6, etc.), is consistent with previous observations suggesting dynamic combinatorial processes that form and break the C–N bond [25]. In this way, any odd-numbered species could be a product of at least two different pathways when condensation and hydrolysis proceed as two parallel events. For example, triglycine could be a product of either a diglycine condensation with glycine, or the hydrolysis of any of the higher oligomers, such as tetraglycine (Scheme 5, Figure 6). However, while analyzing products formed in the presence of the phosphate, we noted a suppression of hydrolysis. Drastically lower concentrations of tri- and pentaglycine indicate that phosphates must have a stabilizing effect on the peptide bond, most likely by increasing the hydrogen-bonding interaction of the amide proton with lone pair electrons localized on the phosphate oxygen. Since the reaction mixture has an alkaline environment, base-promoted hydrolysis is still possible, but it clearly proceeds with lower yields (Scheme 6). Now, the formation of an odd-numbered oligopeptide (Scheme 7) would involve a five-membered phosphoryl–carboxyl mixed anhydride intermediate, which can be attacked by any available free amino group (not protonated and not involved in a P–N bond). This step proceeds regardless of whether the C-terminus of the attacking nucleophile has its oxygen in a phosphate bond or carries a lone pair of electrons. Also, the product of this reaction could be either further elongated by condensation or cleaved by hydrolysis.
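The parity argument above can be made concrete with a small combinatorial sketch: starting from diglycine alone, condensation can only produce even chain lengths, whereas allowing hydrolysis of any peptide bond makes odd lengths (and glycine monomer) reachable. The code below enumerates reachable chain lengths only. It is a toy model with illustrative names and an arbitrary length cap, and it ignores kinetics, yields, cyclization to DKP, and any phosphorylated intermediates.

```python
def reachable_lengths(start, max_len, allow_hydrolysis):
    """Chain lengths reachable from `start` by condensation (and optional hydrolysis)."""
    seen = set(start)
    changed = True
    while changed:
        changed = False
        new = set()
        for a in seen:
            # Condensation: two chains join, a + b residues.
            for b in seen:
                if a + b <= max_len:
                    new.add(a + b)
            # Hydrolysis: any internal peptide bond is cleaved.
            if allow_hydrolysis:
                for k in range(1, a):
                    new.add(k)
                    new.add(a - k)
        if not new <= seen:
            seen |= new
            changed = True
    return sorted(seen)


# Condensation only, as in the neutral trimetaphosphate reaction.
print(reachable_lengths({2}, max_len=8, allow_hydrolysis=False))
# Condensation plus hydrolysis, as in the alkalized control.
print(reachable_lengths({2}, max_len=8, allow_hydrolysis=True))
```

In this model, the appearance of any odd-length oligomer is therefore a qualitative marker that peptide bond cleavage has occurred somewhere in the mixture.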
To check whether the peptide bond is stabilized against hydrolysis by phosphate, we performed a drying-induced polymerization of diglycine in neutral mixtures of trimetaphosphate, and observed complete inhibition of peptide bond hydrolysis (Figure 2A). However, the conversion of diglycine was slightly lower when compared with alkaline reaction mixtures, which might be attributed to the higher proportion of the zwitterionic form and the associated lower nucleophilicity of its amino group. Moreover, under neutral pH, the carbonyl group activation might be able to proceed only through a potentially rate-limiting intermolecular reaction of the migrating phosphate from N– to O– (Scheme 8). This stabilizing effect of the phosphate moiety has not been reported previously, either in the work of Yamagata [26] or Yamanaka [22].
Taken together, these results indicate that the environment can have a significant influence over the processes that determine the final composition of the reaction mixture. If the reaction environment allows the simultaneous formation and hydrolysis of the amide bond, starting from different monomer species, then a potentially much broader diversity of paths is made available for the production of each heteropolymer. The combinatorial possibilities, which have been worked out, indicate a key role for hydrolysis, in tandem with condensation, to enable the emergence and closure of sets of oligopeptides that collectively and cooperatively self-replicate [27]. This simple and artificially created environment, with control over temperature, pH, and the presence of an inorganic polyphosphate, will serve as a platform for creating more diversified chemical settings. If at the early stages of chemical evolution, one molecule could perform only one action, there might have been a need for many players present in the system, which could over time have combined and evolved into multitasking macromolecules.
How information-rich highly functional polymers could arise and persist in the absence of pre-existing templates or precursors is still a key gap in our understanding of prebiotic chemistry and the chemical origins of life. Here, we have added to our understanding of the chemical evolution of N-phosphoryl amino acids by showing how they can influence condensation and hydrolysis reactions during the drying-induced synthesis of polypeptides. Expansion of this work will employ more diverse starting materials under abiotically relevant defined conditions. Use of different monomers will enable the synthesis of heteropolymers that encode information about their physical and chemical environments. Such products may eventually catalyze reactions that contribute to their own synthesis and persistence.