Dissecting the Supramolecular Dispersion of Fullerenes by Proteins/Peptides: Amino Acid Ranking and Driving Forces for Binding to C60

Molecular dynamics simulations were used to quantitatively investigate the interactions between the twenty proteinogenic amino acids and C60. The conserved amino acid backbone gave a constant energetic interaction ~5.4 kcal mol−1, while the contribution to the binding due to the amino acid side chains was found to be up to ~5 kcal mol−1 for tryptophan but lower, to a point where it was slightly destabilizing, for glutamic acid. The effects of the interplay between van der Waals, hydrophobic, and polar solvation interactions on the various aspects of the binding of the amino acids, which were grouped as aromatic, charged, polar and hydrophobic, are discussed. Although π–π interactions were dominant, surfactant-like and hydrophobic effects were also observed. In the molecular dynamics simulations, the interacting residues displayed a tendency to visit configurations (i.e., regions of the Ramachandran plot) that were absent when C60 was not present. The amino acid backbone assumed a “tepee-like” geometrical structure to maximize interactions with the fullerene cage. Well-defined conformations of the most interactive amino acids (Trp, Arg, Met) side chains were identified upon C60 binding.

Pioneering approaches based on reverse docking were attempted to identify proteins characterized by one or more hydrophobic pockets suitable for accommodating a fullerene [30][31][32]. Proteins were also engineered into scaffolds for C 60 binding on the protein surface [33,34].
However, a full comprehension of the process governing the association of peptides/proteins with fullerenes is still lacking. Detailed knowledge about the formation of complexes of peptides/proteins with C 60 can advance bio-based applications of fullerenes [35]. For instance, inhibitors can be developed by increasing the peptide/fullerene binding [32,[36][37][38], the antioxidant properties of fullerenes can be enhanced by protein interaction [39], and fullerenes' ability to generate ROS under irradiation can be tuned [20,23,25,26,40].
Amino acids are the elementary building blocks of proteins and, therefore, they must be the starting point for investigating the complex phenomena that govern the interactions between peptides/proteins and fullerenes [41,42].
In this study, we determined the adsorption energies of all twenty natural amino acids with C 60 , ranked their propensity to interact with C 60 , and identified the thermodynamic contributions responsible for their interactions. The atomistic details of the interaction between amino acids and fullerene were obtained using molecular dynamics (MD) simulations. MD simulations are a powerful tool to unravel the mechanisms of interaction of proteins with carbon nanomaterials, providing detailed structural information [43]. An analysis of the binding components of the energy was performed by using Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) and Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) methodologies [44]. The analysis of the energy contributions to the binding between amino acids and C 60 can offer guidelines of general applicability to understand and design protein-fullerene interactions. This information is crucial to (i) understand in detail the driving forces in the interaction between peptides/proteins and fullerenes, (ii) enhance our ability to engineer such complexes, (iii) predict a priori the strength of adsorption and structure of a protein upon fullerene binding.

Results and Discussion
The geometries of the twenty C 60 -amino acid complexes were built. The amino acids were capped by acetyl (ACE) and n-methyl amide (NME) groups at the N-and C-termini (ACE-AA-NME), (Scheme 1) to reproduce the typical interactions between C 60 and an amino acid inserted into the peptide/protein sequence.
Subsequently, 100 ns of molecular dynamics simulations were carried out for each complex to sample the possible interaction geometries between the C 60 and the twenty amino acids. Post-processing of the trajectories allowed us to estimate the binding energy between the amino acids and the C 60  Subsequently, 100 ns of molecular dynamics simulations were carried out for each complex to sample the possible interaction geometries between the C60 and the twenty amino acids. Post-processing of the trajectories allowed us to estimate the binding energy between the amino acids and the C60 and the binding components of the energy by MM/GBSA and MM/PBSA.

Ranking the Binding Energies
The interaction energies between the capped amino acids and the C60 were calculated at MM/GBSA and MM/PBSA levels of theory ( Figure S1). The results of the two methodologies were very similar (MM/GBSA systematically provided a more stabilizing interaction energy of −0.5 kcal mol −1 compared to MM/PBSA, due to the different treatment of solvation). In the following, we describe in detail the MM/GBSA ranking, because such calculations are faster (a crucial issue for applications in virtual screening or MD simulations of large biosystems) and mostly because the GB terms are pair-wise decomposable (PB non-polar solvation energies are currently not decomposable) [44].
The ranking obtained by the binding energies of the capped amino acids ( Figure 1a) and of the amino acid side chains (Figure 1b) was the same. This implied a constant contribution from the conserved amino acid backbone (with a value of ~ 5.4 kcal mol −1 , similar to the interaction of Gly with C60). The contribution due to the amino acid side chains varied and was responsible for the different interactions of the twenty amino acids (from −5.0 kcal mol −1 of Trp to +0.1 kcal mol −1 of Glu).

Ranking the Binding Energies
The interaction energies between the capped amino acids and the C 60 were calculated at MM/GBSA and MM/PBSA levels of theory ( Figure S1). The results of the two methodologies were very similar (MM/GBSA systematically provided a more stabilizing interaction energy of −0.5 kcal mol −1 compared to MM/PBSA, due to the different treatment of solvation). In the following, we describe in detail the MM/GBSA ranking, because such calculations are faster (a crucial issue for applications in virtual screening or MD simulations of large biosystems) and mostly because the GB terms are pair-wise decomposable (PB non-polar solvation energies are currently not decomposable) [44].
The ranking obtained by the binding energies of the capped amino acids ( Figure 1a) and of the amino acid side chains (Figure 1b) was the same. This implied a constant contribution from the conserved amino acid backbone (with a value of E binding~5 .4 kcal mol −1 , similar to the interaction of Gly with C 60 ). The contribution due to the amino acid side chains varied and was responsible for the different interactions of the twenty amino acids (from E binding = −5.0 kcal mol −1 of Trp to E binding = +0.1 kcal mol −1 of Glu). Van der Waals interactions are the driving force in binding ( Figure 2) [35,45]. It is important to emphasize that in the present model, the van der Waals term includes π-π interactions.
Hydrophobic interactions, i.e., nonpolar solvation (ESURF), assist binding, even if the Van der Waals interactions are the driving force in binding ( Figure 2) [35,45]. It is important to emphasize that in the present model, the van der Waals term includes π-π interactions.

Aromatic Amino Acids
If we consider the extended π-system of the fullerene, it is not surprising that aromatic amino acids such as tryptophan, tyrosine, phenylalanine, and histidine interacted the most with it ( Figure 3) [20,24,[33][34][35]39,[46][47][48][49][50][51][52][53][54]. In particular, Trp exhibited the largest value of binding energy among the twenty proteinogenic amino acids. Trp is commonly recognized as the amino acid that most interacts with carbon nanoparticles, such as carbon nanotubes [55] or graphene [56]. The large indole group of tryptophan also demonstrated the largest interaction with the C60 ( −5.0 kcal mol −1 ), followed by tyrosine ( −2.5 kcal mol −1 ), histidine ( −2.5 kcal mol −1 ), and phenylalanine ( −2.4 kcal mol −1 ).  Hydrophobic interactions, i.e., nonpolar solvation (E SURF ), assist binding, even if the corresponding values are small. Contributions to the nonpolar solvation originate in the hydrophobic parts of some amino acids, such as aliphatic chains or aromatic rings, which, upon the formation of the complex with C 60 , come into contact with the hydrophobic surface of the C 60 cage instead of water molecules, which interact unfavorably with these regions.
Polar solvation terms (E GB ) are quite important and deserve an accurate analysis. These terms are detrimental to binding and their contribution generally features an energy level greater than zero, i.e., it is destabilizing. The hydrophilic part of many amino acids, upon interaction with C 60 , is forcedly desolvated, increasing the energy of the system. This is particularly evident in the cases of charged (Arg, Lys, Glu, Asp) and polar (Gln, Asn, Thr, Ser) amino acids: these residues, after interaction with C 60 , are screened from water and are no longer able to maintain a full solvation shell. Other amino acids commonly considered hydrophobic, such as Trp, Tyr, and His, are also characterized by positive desolvation energy, albeit to a lesser extent. This behavior is due to the presence of hydrophilic moieties, such as the N-H or O-H groups, which are desolvated upon C 60 binding. Of course, purely hydrophobic residues, such as Met, Leu, Val, Ile, Pro, Ala and Gly, do not suffer this energy penalty upon binding.

Figure 2.
Analysis of the components of the binding energy of the amino acid side chains by MM/GBSA.
After Trp, Arg was the second amino acid that most strongly interacted with the C60. The high affinity of Arg with carbon nanomaterials is well-known experimentally [57,58] and widely discussed computationally [57,[59][60][61]. The affinity of Lys with C60 is lower than that of Arg, although both are positively charged and their side chains are characterized by the same surface (the accessible surface area, ASA, is 2.8 nm 2 for Lys and 2.9 nm 2 for Arg). The presence of the guanidinium group plays a crucial role in the preferential adsorption of Arg onto carbon nanomaterials because of the presence of: (i) the π-π interactions that the planar guanidinium group establishes with C60 (the vdW interactions of the Arg side chain are 4.3 kcal mol −1 , while for Lys they are 3.3 kcal mol −1 ); (ii) the surfactant-like behavior of Arg when interacting with hydrophobic nanoparticles. The Arg residues orient themselves so that the guanidinium head anchors on the cage of the fullerene, while the aliphatic chain sticks to its hydrophobic surface. In this way, Arg effectively becomes a surfactant that creates an amphiphilic shield around the hydrophobic nanoparticle [61].  After Trp, Arg was the second amino acid that most strongly interacted with the C 60 . The high affinity of Arg with carbon nanomaterials is well-known experimentally [57,58] and widely discussed computationally [57,[59][60][61]. The affinity of Lys with C 60 is lower than that of Arg, although both are positively charged and their side chains are characterized by the same surface (the accessible surface area, ASA, is 2.8 nm 2 for Lys and 2.9 nm 2 for Arg). The presence of the guanidinium group plays a crucial role in the preferential adsorption of Arg onto carbon nanomaterials because of the presence of: (i) the π-π interactions that the planar guanidinium group establishes with C 60 (the vdW interactions of the Arg side chain are 4.3 kcal mol −1 , while for Lys they are 3.3 kcal mol −1 ); (ii) the surfactant-like behavior of Arg when interacting with hydrophobic nanoparticles. The Arg residues orient themselves so that the guanidinium head anchors on the cage of the fullerene, while the aliphatic chain sticks to its hydrophobic surface. In this way, Arg effectively becomes a surfactant that creates an amphiphilic shield around the hydrophobic nanoparticle [61].

Polar Amino Acids
Polar amino acids, such as glutamine, asparagine, cysteine, threonine, and serine, are amphiphilic residues that interact with C 60 , producing surfactant-like interactions ( Figure 5). In this case, the hydrophobic part of the residues interacted with the fullerene surface, while the hydrophilic part continued to interact with water. The affinity of the residues is directly proportional to the vdW interactions that the hydrophobic chain established with the C60 cage. For example, Gln ( −3.1 kcal mol −1 ) interacted more strongly than Asn ( −2.6 kcal mol −1 ) due to a longer aliphatic chain, and Thr ( −1.5 kcal mol −1 ) interacted more strongly than Ser ( −1.1 kcal mol −1 ) due to the presence of an additional methyl group. The affinity of the residues was also inversely proportional to the desolvation energy penalty. Cys interacted more strongly than Ser because the desolvation of the more polar -OH group (0.6 kcal mol −1 ) represented a higher energy penalty than the desolvation of the less polar -SH group (0.1 kcal mol −1 ).

Hydrophobic Amino Acids
In protein/peptide-fullerene complexes, several aliphatic residues are usually part of the C60 binding pocket and contribute to binding [33,35,49]. Hydrophobic interactions can be established with methionine, valine, isoleucine, leucine, proline, alanine, and glycine ( Figure 6). The hydrophobic effect can be defined as the absence of a desolvation penalty in the binding of a residue with the C60; indeed, even negative values in the EGB are found. Concurrently, the larger the contact area between the amino acid side chain and the C60, the larger the stabilizing effect due to vdW interactions (dispersion forces). Shape complementarity represents a quick way to estimate both the stabilizing van der Waals and hydrophobic interactions between proteins and carbon nanomaterials [21,45,62,63].
In this context, the behavior of Met deserves special attention. Met was the amino acid with the third strongest level of interaction in the ranking in this study. The presence of a sulfur atom strongly favors the interaction of the side chain with the C60. This stabilizing interaction can be interpreted as the soft sulfur atom that strongly interacts with the soft and electron-rich fullerene cage. In practice, a deep potential energy well is formed In this case, the hydrophobic part of the residues interacted with the fullerene surface, while the hydrophilic part continued to interact with water. The affinity of the residues is directly proportional to the vdW interactions that the hydrophobic chain established with the C 60 cage. For example, Gln (E binding = −3.1 kcal mol −1 ) interacted more strongly than Asn (E binding = −2.6 kcal mol −1 ) due to a longer aliphatic chain, and Thr (E binding = −1.5 kcal mol −1 ) interacted more strongly than Ser (E binding = −1.1 kcal mol −1 ) due to the presence of an additional methyl group. The affinity of the residues was also inversely proportional to the desolvation energy penalty. Cys interacted more strongly than Ser because the desolvation of the more polar -OH group (0.6 kcal mol −1 ) represented a higher energy penalty than the desolvation of the less polar -SH group (0.1 kcal mol −1 ).

Hydrophobic Amino Acids
In protein/peptide-fullerene complexes, several aliphatic residues are usually part of the C 60 binding pocket and contribute to binding [33,35,49]. Hydrophobic interactions can be established with methionine, valine, isoleucine, leucine, proline, alanine, and glycine ( Figure 6). The hydrophobic effect can be defined as the absence of a desolvation penalty in the binding of a residue with the C 60 ; indeed, even negative values in the E GB are found. Concurrently, the larger the contact area between the amino acid side chain and the C 60 , the larger the stabilizing effect due to vdW interactions (dispersion forces). Shape complementarity represents a quick way to estimate both the stabilizing van der Waals and hydrophobic interactions between proteins and carbon nanomaterials [21,45,62,63].

Conformations of Amino Acids in Water and Upon Interaction with C60
To investigate the effect of the absorption of amino acids on C60, for the three most interactive residues (Trp, Arg, and Met) we carried out a thorough conformational analysis of the amino acids in water and upon C60 binding.

Analysis of the Torsional Angles of the Amino Acid Backbone
We sampled the torsional angles phi (φ) and psi (ψ) of the amino acid backbone during the MD simulations in water and upon interaction with C60. The simulations of the amino acids in water showed that the "classic" and more stable regions of the righthanded α-helix and β-sheet were the most populated (Figure 7a-c). By contrast, the binding with C60 altered the φ-ψ distributions by affecting the conformations observed in water and the amino acids sampled a new φ-ψ region (indicated by a red circle in Figures 7d-f), which was rarely observed in bulk water. The amino acids in this region of the Ramachandran plot assumed a "tepee-like" geometrical structure to optimize the interactions with the C60 cage of the side chain of the amino acid and the N-terminal and C-terminal peptide bonds, as represented in Figures 7g-i. In this context, the behavior of Met deserves special attention. Met was the amino acid with the third strongest level of interaction in the ranking in this study. The presence of a sulfur atom strongly favors the interaction of the side chain with the C 60 . This stabilizing interaction can be interpreted as the soft sulfur atom that strongly interacts with the soft and electron-rich fullerene cage. In practice, a deep potential energy well is formed by the interaction of the S atom of methionine with the C atoms of the C 60 . Such strong stabilization is not present for pure aliphatic chains such as Val, Ile, Leu.

Conformations of Amino Acids in Water and Upon Interaction with C 60
To investigate the effect of the absorption of amino acids on C 60 , for the three most interactive residues (Trp, Arg, and Met) we carried out a thorough conformational analysis of the amino acids in water and upon C 60 binding.

Analysis of the Torsional Angles of the Amino Acid Backbone
We sampled the torsional angles phi (ϕ) and psi (ψ) of the amino acid backbone during the MD simulations in water and upon interaction with C 60 . The simulations of the amino acids in water showed that the "classic" and more stable regions of the right-handed α-helix and β-sheet were the most populated (Figure 7a-c). By contrast, the binding with C 60 altered the ϕ-ψ distributions by affecting the conformations observed in water and the amino acids sampled a new ϕ-ψ region (indicated by a red circle in Figure 7d-f), which was rarely observed in bulk water. The amino acids in this region of the Ramachandran plot assumed a "tepee-like" geometrical structure to optimize the interactions with the C 60 cage of the side chain of the amino acid and the N-terminal and C-terminal peptide bonds, as represented in Figure 7g

Analysis of the Side Chain Conformation of the Amino Acids upon C60 Binding
Orientational preferences of the side chains of the amino acids upon C60 binding clearly appeared in the Free Energy Surfaces, FES, of the complexes of Trp and Arg ( Figure  8 and Figure S2).
The FES of Trp@C60 displayed the existence of two minima. In the more populated/stable minimum, the conformation was sandwich-like. It was characterized by π-π interactions between the indole ring of Trp and the fullerene cage (conformation I in Figure 8a). In the less populated minimum, the conformation was T-shape-like; the indole group assumed a perpendicular configuration with respect to the fullerene cage (conformation II in Figure 8a). These conformations were experimentally observed and discussed in the C60@lysozyme complex, in which sandwich-like interactions were recognized for Trp62 and T-shape-like interactions for Trp63.

Analysis of the Side Chain Conformation of the Amino Acids upon C 60 Binding
Orientational preferences of the side chains of the amino acids upon C 60 binding clearly appeared in the Free Energy Surfaces, FES, of the complexes of Trp and Arg (Figure 8 and Figure S2) .
The FES of Trp@C 60 displayed the existence of two minima. In the more populated/stable minimum, the conformation was sandwich-like. It was characterized by π-π interactions between the indole ring of Trp and the fullerene cage (conformation I in Figure 8a). In the less populated minimum, the conformation was T-shape-like; the indole group assumed a perpendicular configuration with respect to the fullerene cage (conformation II in Figure 8a). These conformations were experimentally observed and discussed in the C 60 @lysozyme complex, in which sandwich-like interactions were recognized for Trp62 and T-shape-like interactions for Trp63. The FES of Arg@C60 displayed a single minimum, corresponding to the establishment of π-π interactions between the planar guanidinium group and the C60. This minimum was broader if compared with that of the Trp complex because the smaller guanidinium group adhered less to C60, if compared to the indole group, and may have rotated freely towards the C60 surface.
In the case of Met@C60 we analyzed the gluing of the Met side chain to the C60 cage. The CA atom was anchored to C60 by the interactions of the amino acid also with its backbone, and the mean distance during the MD simulation was small (1.7 Å). However, all the other atoms of the aliphatic side chain remained attached to the surface of the C60 during the MD simulation in a "hug" to the fullerene cage. The mean distances of all the atoms of the Met side chain remain close to the C60's surface, (ranging from 2.4 to 2.9 Å) and, unsurprisingly, the sulfur atom acted as the ligand. In fact, even if it was characterized by The FES of Arg@C 60 displayed a single minimum, corresponding to the establishment of π-π interactions between the planar guanidinium group and the C 60 . This minimum was broader if compared with that of the Trp complex because the smaller guanidinium group adhered less to C 60 , if compared to the indole group, and may have rotated freely towards the C 60 surface.
In the case of Met@C 60 we analyzed the gluing of the Met side chain to the C 60 cage. The CA atom was anchored to C 60 by the interactions of the amino acid also with its backbone, and the mean distance during the MD simulation was small (1.7 Å). However, all the other atoms of the aliphatic side chain remained attached to the surface of the C 60 during the MD simulation in a "hug" to the fullerene cage. The mean distances of all the atoms of the Met side chain remain close to the C 60 's surface, (ranging from 2.4 to 2.9 Å) and, unsurprisingly, the sulfur atom acted as the ligand. In fact, even if it was characterized by a larger vdW radius compared to C atoms, the S atom was, on average, the closest to the C 60 after CA (2.4 Å).

System Setup
Amber force fields ff14SB [64] was used to model the amino acids (AA). The twenty proteinogenic amino acids (Ala, Arg, Asn, Asp, Cys, Glu, Gln, Gly, Hie, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, and Val) were considered in their physiological protonation state. For the histidine, both the protomers HID (Nδ1-protonated histidine) and HIE (Nε1protonated histidine) were taken in account. The amino acids were capped by acetyl (ACE) and n-methyl amide (NMA) groups at the N-and C-termini (ACE-AA-NME), to reproduce the typical interactions between C 60 and a peptide/protein host. The C 60 carbon atoms were modelled as uncharged Lennard-Jones particles by using sp2 carbon parameters taken from the ff14SB force field [64].

MD Simulations
AMBER 16 was used to run the simulations [65]. A total of 1000 steps of steepest descent minimization were performed with SANDER on the generated ACE-AA-NME@C 60 complexes. The minimized structures (which were only cleared from severe steric clashes) were considered for a 1 ns equilibration step, and heated from 0 to 300 K (Langevin thermostat). Periodic boundary conditions (PBC) and Particle Mesh Ewald summation were used throughout (with a cut-off radius of 10 Å for the direct space sum). The MD simulations of the amino acids were performed with explicit solvent by using the TIP3P water model. Chloride or sodium counterions were included to exactly neutralize the charged amino acids. At the end of the equilibration process, a production MD of 100 ns was carried out for every system at 300 K.

Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) and Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) Analysis
A total of 4000 snapshots were used for the MM/GB(PB)SA analysis [44]. An infinite cut-off was used for all the interactions. The electrostatic contribution to the solvation free energy was calculated with the Poisson-Boltzmann (PB) model or the Generalized Born (GB) model, as implemented in MMPBSA.py [66]. The nonpolar contribution to the solvation free energy was determined with solvent-accessible surface-area dependent terms.

Conformational Analysis of the Amino Acids
The Ramachandran plots and FES were obtained using MDAnalysis software [63,64]. The analysis of distances and angles in the MD trajectories was carried out by CPPTRAJ [67], as implemented in AMBER 16. The energy landscape of the amino acids interacting with C 60 was visualized in terms of free-energy, which was projected as contour lines onto a two-dimensional space formed by two internal coordinates of the system. The free-energy change associated to the passage between two different states was ∆G = −RT (ln p 1 /p 2 ), where R is the ideal gas constant, T is the temperature, and p i is the probability of finding the system in state i. The two-dimensional space defined by the selected coordinates was divided into a grid and the free energy calculated for each bin of the grid. The p i values correspond to the number of times the molecular dynamics trajectory "visits" a given bin. The whole set of G values was normalized [68,69] in such a way that the lowest value (i.e., the most populated bin) corresponds to 1.

Conclusions
The potential of fullerenes in bio-medical applications is significant and can only partly be exploited because their extremely hydrophobic nature makes them rare in physiological environments. Of all the strategies that have been developed to make fullerenes bio-available, the "Trojan horse" approach, in which they are stashed away in water-soluble proteins, appears to be the greenest and most environmentally friendly [21]. Propensity rules able to identify where a fullerene can be accommodated in a protein were developed. Molecular dynamics simulations that properly include van der Waals (including π-stacking), hydrophobic, and solvation interactions, were used to determine quantitatively the binding affinity of all the proteinogenic amino acids with C 60 . Van der Waals interactions are the driving forces in binding. Hydrophobic interactions, such as nonpolar solvation, assist in binding, even if the corresponding values are small. Polar solvation terms are detrimental to binding in the cases of charged and polar amino acids. Only purely hydrophobic residues do not suffer this energy penalty upon binding. The complex interplay between these terms makes tryptophan, arginine, and methionine the residues with the strongest interactions. A thorough conformational analysis of these amino acids demonstrated that upon C 60 binding, the interacting residues show a tendency to visit configurations not "experienced" in water, and the backbone assumes a "tepee-like structure" to maximize the interaction with the C 60 cage. The binding of Trp is characterized by π-π interactions between the indole ring of Trp and the fullerene cage. These interactions may be sandwich-like (the most common) or T-shaped. Interactions of the π-π type also govern the binding between Arg and C 60 involving the planar guanidinium group and C 60 . The hydrophobic Met interacts with C 60 in a "hug" between its aliphatic side chain and the fullerene cage. The presence of a sulfur atom strongly favors the interaction of the aliphatic side chain with C 60 .

Conflicts of Interest:
The authors declare no conflict of interest.