Atomic Details of Carbon-Based Nanomolecules Interacting with Proteins

Since the discovery of fullerene, carbon-based nanomolecules have sparked a wealth of research across the biological, medical and materials sciences. Understanding the interactions of these materials with biological samples at the atomic level is crucial for improving the applications of nanomolecules and addressing safety aspects of their use in medicine. Protein crystallography provides a view of the interface between proteins and carbon-based nanomolecules. We review forefront structural studies of nanomolecules interacting with proteins and the mechanisms underlying these interactions. We provide a systematic analysis of the approaches used to select proteins that interact with carbon-based nanomolecules, drawn from the worldwide Protein Data Bank (wwPDB) and the scientific literature. The analysis of van der Waals interactions in the available data highlights important aspects of protein-nanomolecule interactions and their functional consequences. Carbon-based nanomolecules modulate protein surface electrostatics and, by forming ordered clusters, could modify protein quaternary structures. The lessons learned from structural studies will guide new projects on bioimaging tools, the tuning of intrinsically disordered proteins, and the designed assembly of precise hybrid materials.


Introduction
Buckminsterfullerene, discovered in 1985, is a discrete molecule made of 60 carbon atoms arranged to form a hollow sphere of icosahedral (Ih) symmetry with surprising properties, and it has since become a favorite subject in nanotechnology and related disciplines [1]. The truncated icosahedral fullerene sphere has a van der Waals diameter of about one nanometer, and several variants of smaller and larger diameters are known, including the elongated molecules recognized as nanotubes [2,3]. The chemical structure of these carbon-only molecules makes them better conductors of electricity than common metals on a much smaller scale [4]. Interest in these nanomolecules is also a consequence of their light weight, which makes them ideal for technological applications as well as for applications in biology and related fields [5,6]. By the 2000s, a number of studies had explored chemical strategies to link proteins and nucleic acids to nanomolecules, including metal clusters, with the aim of engineering devices for medical and biotechnological applications [7].
Among emerging carbon-made nanostructures, graphene, a single carbon sheet derived from graphite, and its related graphene oxide are very promising materials for tissue engineering, drug delivery, nerve tissue regeneration and biosensing [8][9][10][11].
Major obstacles to the use of carbon-only nanomolecules for biological and medical purposes include their very poor solubility in water and their poor affinity for a given protein target.

All-Carbon Nanomolecules Interacting with Proteins
A widespread method used to promote protein affinity towards fullerenes (or nanotubes) is the use of a pyrenyl group covalently anchored to the protein through surface lysines. The pyrenyl group behaves as a molecular "glue" that sticks to the nanotube wall via non-covalent π-stacking interactions [60].
Immunization of mice with fullerene derivatives is another method, producing in vivo IgG antibodies with high affinity towards fullerenes (or nanotubes) [61,62]. With this approach, papain-cleaved Fab-IgG chains were obtained and purified, and they showed high affinity for fullerene, measured at 22 nM [63]. The structure of the unbound Fab-IgG chains was solved by X-ray crystallography (pdb entry ID 1emt) [63]. Similarly, a recent anti-fullerene antibody, Fab-C60, was obtained from mouse immunization, and the structure of the complex of its heavy (H) and light (L) chains was solved by X-ray crystallography (Figure 1, pdb entry 6H3H) [64]. The structure of Fab-C60 shows a binding pocket consisting of a canonical CDR region that contains various aromatic residues (Tyr50 (H), Tyr101 (H), Tyr34 (L), Trp93 (L), and Trp98 (L)) and an aspartate residue (Asp100 (L)) [64]. The solvent-exposed segment Asp100-Tyr101 shows conformational disorder and is postulated to facilitate fullerene binding [64].
This approach was also used to select Fab chains with high affinity towards a nanotube [65]. Another in vivo approach is the phage display technique, which allows peptides to be selected from a library in the presence of a nanotube used as the target. During rounds of evolution in which bacteria are infected, a peptide gene of higher binding affinity is isolated [47].
Among the methods for selecting proteins with good affinity for nanotubes, de novo design offers an exemplary strategy. Researchers noticed that the geometry of an ideal alpha helix matches the honeycomb geometry of graphene. They therefore positioned alanine residues along an alpha helix to match the centers of the repeating hexagonal units of the graphene sheet. They then engineered interactions based on a previously designed four-helix bundle in order to wrap the helices around the nanotube. As expected, the designed peptide, composed of the thirty-amino-acid sequence AEAESALEYAQQALEKAQLALQAARQALKA, binds to nanotubes, and its structure, solved by X-ray crystallography, shows an Ala-rich surface in agreement with the design (named Hexcoil-Ala, pdb entry 3s0r) [66]. Serendipitously, the designed alpha helix, called COP (C60-organizing peptide), also forms a crystalline complex when mixed with buckminsterfullerene. The crystal structure of COP (Figure 2, Table 1, pdb entry ID 5et3) shows how the peptide, organized in a four-helix bundle motif (Figure 2), recognizes the fullerene through its tyrosine residues. Each nanomolecule is sandwiched between two four-helix bundles, forming a large superstructure (Figure 2b) [67]. Astonishingly, when tested, neither fullerenes nor COP proteins are conductive by themselves, but the hybrid material with this 3D lattice does conduct electricity.
Table 1. Crystal structures of proteins binding to nanomolecules or of protein-nanomolecule complexes discussed in this review and retrieved from the Protein Data Bank. The nanomolecule surface area buried upon complex formation (Å²), the surface area of the individual (uncomplexed) nanomolecule (Å²), and the protein and/or nanomolecule function are indicated. Ligand chemical structures and other ligand annotations can be retrieved using the indicated nanomolecule cif code from the following link: https://www.rcsb.org.
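The pdb entries and ligand cif codes quoted in this section can be turned into web addresses for lookup. The following is a minimal sketch, not part of the original review: the helper names are hypothetical, and the URL patterns are an assumption based on how the RCSB site currently organizes structure pages, coordinate downloads and ligand pages.

```python
"""Build RCSB PDB web addresses for entry IDs and ligand cif codes."""

# Entry IDs and the ligand code quoted in this section of the review.
ENTRIES = ["1emt", "6h3h", "3s0r", "5et3"]
LIGANDS = ["60C"]


def entry_page(pdb_id: str) -> str:
    """Summary page of a PDB entry (URL pattern is an assumption)."""
    return f"https://www.rcsb.org/structure/{pdb_id.upper()}"


def entry_mmcif(pdb_id: str) -> str:
    """Download address of the entry's mmCIF file (URL pattern is an assumption)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.cif"


def ligand_page(cif_code: str) -> str:
    """Ligand summary page for a chemical component code (URL pattern is an assumption)."""
    return f"https://www.rcsb.org/ligand/{cif_code.upper()}"


if __name__ == "__main__":
    for pdb_id in ENTRIES:
        print(entry_page(pdb_id), entry_mmcif(pdb_id))
    for code in LIGANDS:
        print(ligand_page(code))
```

The sketch only formats strings and performs no network access, so the addresses should be checked in a browser before being relied upon.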


Figure 2. Each fullerene molecule is bound to two four-helix bundles through the side chain of a Tyr residue (green). This figure is obtained from the Molecule of the Month column "Proteins and Nanoparticles" (pdb101.rcsb.org/motm/222). Inset: fullerene molecule from the crystal structure of the COP-fullerene complex (gray spheres, cif code 60C).
In summary, different strategies, from in vivo selection to de novo design, are available to produce artificial proteins and peptides with high affinity for fullerenes and nanotubes. Fullerene interacts with Tyr, with the methyl group of Ala, or with other aromatic residues when these are placed in the protein sequence so as to mirror the periodic geometrical features of the nanoparticle.

Carbon-Based Nanomolecules Interacting with Proteins
Typical carbon-based nanomolecules interacting with proteins are organic macrocycles widely used in supramolecular host-guest chemistry. Crown ethers, cyclodextrins, calixarenes, porphyrins, cryptophanes, molecular tweezers, cucurbiturils and organic foldamers resembling DNA are examples of host molecules for the recognition of guest counterparts such as organic and metal ions, peptides and other organic molecules [68][69][70]. Considering the size and chemical properties of nanomolecules interacting with proteins, one major functional implication is the possible modification of the protein quaternary structure. Therefore, the use of these nanomolecules is potentially important for modulating the many biochemical signals based on protein-protein interactions [71,72]. A recent perspective paper highlighted synthetic host molecules interacting with proteins, the available structural data, and the modulation of protein function typical of supramolecular chemistry [73]. Crystal structures are particularly important in this field because they are used as the starting point for "retrostructural" analysis aimed at improving the design of a specific nanomolecule interacting with a protein [74].

Molecular Cages
Pines and collaborators used cryptophanes and cyclodextrins, carbon-based macrocages, for xenon binding in order to develop protein-binding materials for biosensing techniques [75,76]. An isotope of xenon (129Xe) is used as a contrast agent for magnetic resonance imaging (MRI) in medical diagnostic testing, both as a gas to image airspaces in the lung and dissolved in body fluids to image the bloodstream and tissues [77].
Cryptophanes are molecular cages made by linking two cyclotriveratrylene cups through -CH2-CH2- or other aliphatic and ether linkers to form a hollow shell. The overall shape and size of the cage allow for the dynamic entry and exit of small gas molecules [78,79]. A specific cryptophane was designed for binding to carbonic anhydrase, an enzyme that interconverts carbon dioxide and bicarbonate. The studied cryptophane has two functions. On the one hand, its cage is of the right size to bind a single xenon atom with good affinity; on the other hand, it is functionalized with a known benzenesulfonamide inhibitor that binds the zinc ion in the carbonic anhydrase active site. This designed macrocage has good affinity for the enzyme, with KD = 100 nM as measured by ITC. Xenon, when trapped in the cryptophane, displays a distinctive MRI spectrum [16]. The cryptophane-xenon ligand in the structure of human carbonic anhydrase II was the largest ligand known in the Protein Data Bank by 2008 (ligand codes 1CR, 0CR). The structure displays xenon bound in the central core of the cage through non-covalent interactions, while the cryptophane is anchored to the zinc ion through the benzenesulfonamide group (Figure 3, Table 1, pdb entry 3cyu). The main interactions between the macrocage and the enzyme are van der Waals interactions, and about a third of the total surface area available to solvent (Table 1) is buried in the active site of the enzyme. The quaternary structure of the enzyme is also affected as a consequence of significant crystallographic contacts between symmetry-related macrocages buried within the enzyme [16]. In summary, macrocages can be used to target a specific enzyme in order to deliver noble gases for bioimaging applications, and they can function as promoters of large protein assemblies. An inhibitor or ligand with good protein affinity is combined with a macrocage, which in turn is selective for specific gas molecules.
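To put the reported KD of 100 nM on an energy scale, the standard relation ΔG° = RT ln(KD/c°) can be applied. The sketch below is an illustration only: the function name is hypothetical, and the temperature of 298.15 K and the 1 M standard state are assumptions, since the conditions of the ITC experiment are not stated in the text.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1


def binding_free_energy_kj(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kJ/mol from a dissociation constant.

    Uses dG = R * T * ln(KD / c0) with c0 = 1 M, so KD must be given in mol/L.
    The default temperature is an assumption, not a value from the review.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be a positive concentration in mol/L")
    return R_GAS * temperature_k * math.log(kd_molar) / 1000.0


if __name__ == "__main__":
    # KD = 100 nM for the cryptophane-carbonic anhydrase complex quoted above.
    print(f"{binding_free_energy_kj(100e-9):.1f} kJ/mol")
```

Under these assumptions a KD of 100 nM corresponds to a binding free energy of roughly -40 kJ/mol; a different experimental temperature would shift the value slightly.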
The first example of calixarene use for quaternary structure modulation involves a mutant (R337H) of the tumor suppressor protein p53. This mutant promotes tumor growth because the protein is unable to form the natural tetrameric state that binds to the genome [89]. A calix[4]arene functionalized with positively charged guanidiniomethyl groups at the upper rim and neutral hydrophobic loops at the lower rim rescues the functional activity of the protein by restoring its quaternary structure [90]. This study is therefore exemplary of protein-protein interactions assisted by calixarene molecules; however, no experimental crystal structure is available for this complex.
The first X-ray structure of a calixarene-protein crystal, reported by Crowley and colleagues, is the complex between a negatively charged calix[4]arene and cytochrome-c, an electron carrier protein whose surface contains a large number of positively charged residues [91] (pdb entry 3tyi, Table 1) [92]. The structure revealed the ability of the sulfonatocalix[4]arene (sclx4) molecule to explore and camouflage the positive charges of lysines [92]. Similarly, the structure of the complex between egg-white lysozyme and sclx4 revealed the calixarene molecule bound to the enzyme surface (pdb entry 4prq, Table 1, Figure 4) [93]. In this study, however, the calixarene molecules behave in two different ways. One molecule binds and "camouflages" the charge of an arginine residue on the lysozyme surface, and the other hosts a PEG molecule from the crystallization medium. These interactions allow the calixarene to promote the assembly of lysozyme into tetramers, which, in turn, further assemble into long repeating chains in the crystal.
The sclx4 molecule is also able to selectively recognize post-translational modifications of lysine residues, as observed in the crystal structure of sclx4 bound to dimethyllysine residues of lysozyme (pdb entry 4n0j, Table 1) [94].
The binding property of calixarenes towards proteins was then used to explore the ability of calixarene bioconjugates to promote non-covalent PEGylation, which can increase the half-life of therapeutic proteins. In this proof-of-concept exercise, mono- (pdb entry 6egy) or di- (pdb entry 6egz) PEGylated sulfonatocalix[4]arenes bind to cytochrome-c (Table 1) in a manner similar to the parent sclx4 [95].
In another example, protein recognition was explored by studying the interaction of cytochrome-c with a series of sclx4 derivatives in which one sulphonate group at the upper rim is replaced with a bromine (pdb entry 5lft, Table 1) or a phenyl group (pdb entry 5kpf, Table 1) [96]. The substituted calixarenes bind to different lysine residues depending on the specific chemical properties of the substituents: the phenyl derivative packs against the protein through a hydrophobic cluster, while the bromine-substituted calixarene interacts with the carbonyl group of its bound lysine [96]. Therefore, calixarenes can be used as programmable molecules to control specific protein assemblies or to guide their binding towards a selected surface region, either for design purposes or as a probe to "hide" undesired genetic mutations.
In the series of sulfonato-calix[n]arenes (sclxn), increasing the number of arene units (n) increases the net charge of the scaffold and, at the same time, the dimension and flexibility of the macrocycle. Calix[4]arenes are generally more pre-organized, rigid cones than calix[6]arenes, and even more so than calix[8]arenes. These negatively charged molecules function like a "molecular glue" interfacing two or more proteins. The X-ray structures of cytochrome-c crystallized with the series sclxn (n = 4, 6, 8) show that the porosity of the protein crystalline frameworks increases with the calix[n]arene dimension. In particular, both sclx6 (pdb entry 6rgi) and sclx8 (pdb entry 6gd6) induce highly porous assemblies of cytochrome-c (Table 1). While sclx4 shows a normal protein crystal packing (~45% solvent content), sclx6 yielded a honeycomb arrangement (~65% solvent content) [97] and sclx8 mediated a high-porosity framework (~85% solvent content) [98]. Owing to their 'floppiness', sclx6 and sclx8 can mold to the protein surface and form large interfaces. Recently, it was shown that calix[8]arene conformational changes (Figure 5) are mediated by an effector, a PEG molecule (pdb entry 6haj, Table 1) or spermine (pdb entry 6rsl, Table 1), which in turn modulates the porosity of cytc-sclx8 assemblies (~70% solvent content) [99].
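The solvent contents quoted above are usually estimated from the Matthews coefficient V_M (unit cell volume per dalton of protein), with the solvent fraction approximated as 1 - 1.23/V_M. The sketch below illustrates this standard relation; it is not taken from the cited studies, the function names are hypothetical, and the factor 1.23 assumes a typical protein partial specific volume of about 0.74 cm^3/g and ignores the mass of the bound calixarene.

```python
PROTEIN_VOLUME_FACTOR = 1.23  # A^3/Da, assumes ~0.74 cm^3/g partial specific volume


def matthews_coefficient(cell_volume_a3: float, copies_per_cell: int,
                         molecular_weight_da: float) -> float:
    """V_M in A^3/Da: unit cell volume divided by the total protein mass in the cell."""
    return cell_volume_a3 / (copies_per_cell * molecular_weight_da)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction of a protein crystal from its V_M."""
    return 1.0 - PROTEIN_VOLUME_FACTOR / vm


def vm_for_solvent_fraction(fraction: float) -> float:
    """Inverse relation: the V_M that corresponds to a given solvent fraction."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("solvent fraction must be in [0, 1)")
    return PROTEIN_VOLUME_FACTOR / (1.0 - fraction)


if __name__ == "__main__":
    # Approximate solvent contents quoted in the text for the cytochrome-c crystals.
    for label, fraction in [("sclx4", 0.45), ("sclx6", 0.65), ("sclx8", 0.85)]:
        vm = vm_for_solvent_fraction(fraction)
        print(f"{label}: ~{fraction:.0%} solvent corresponds to V_M of about {vm:.1f} A^3/Da")
```

Read this way, the step from ~45% to ~85% solvent content corresponds to a several-fold increase in the crystal volume available per unit of protein mass, which is one way to express the increased porosity.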
However, the observed trend in protein crystalline open frameworks can be ascribed not only to the flexibility of the supramolecular mediator, but possibly also to its increasing net charge. The effect of increasing these two properties has been reported for only a few sclxn examples. In order to probe the determinants that modulate the crystal architecture, it is necessary to uncouple flexibility and charge by using either a more rigid and, at the same time, more negatively charged calixarene, or a more flexible and less negatively charged calixarene. However, due to solubility problems, the latter is less compatible with protein co-crystallization experiments. Therefore, the former strategy was recently adopted, and the assembly-inducing behaviour of an octa-anionic calix[4]arene, sclx4mc, was investigated [54]. This compound is a sclx4 derivative with four carboxylate functionalities at the lower rim. In particular, the four chelating oxomethylcarboxylate (O-CH2-COO−) units at the lower rim give these podand-like calixarenes the ability to coordinate metal ions [100,101]. Metal complexation rigidifies the cone structure (it prevents the "breathing of the calix") [102] and enhances the binding of cationic guests, such as meso-tetrakis(4-N-methylpyridyl)porphyrin, in the calix[4]arene cavity [103,104]. Two crystal structures of sclx4mc in complex with yeast (pdb entry 6suy) or horse heart cytochrome-c (pdb entry 6suv) were obtained [54]. The calixarene binds to a similar site on each protein, but the crystal structures of these complexes revealed different assemblies: a honeycomb arrangement of yeast cytochrome-c (Figure 6) (~75% solvent content) and a tubular assembly of horse cytochrome-c (~55% solvent content) [54]. Interestingly, in the less porous structure, one carboxylate unit of the calixarene coordinates an arsenic atom derived from the cacodylate buffer.
The comparison of the buried surface areas (Table 1) of the complexed ligand in the two structures shows that the extra arsenic atom causes a tighter crystal packing and a greater enclosure by protein residues. The comparison with the crystal structures obtained with the series sclxn suggested that the ligand charge is a crucial contributing factor to porous architectures.
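Comparisons of this kind rest on the two quantities listed in Table 1 for each ligand: the surface area buried upon complex formation and the surface area of the free ligand. Their ratio gives the buried fraction. The sketch below is illustrative only; the function name is hypothetical and the numbers in the example are placeholders, not values from Table 1.

```python
def buried_fraction(buried_area_a2: float, free_area_a2: float) -> float:
    """Fraction of a ligand's solvent-accessible surface buried in the complex.

    Both areas are in A^2; the buried area cannot exceed the free-ligand area.
    """
    if free_area_a2 <= 0:
        raise ValueError("free-ligand surface area must be positive")
    if not 0 <= buried_area_a2 <= free_area_a2:
        raise ValueError("buried area must lie between 0 and the free-ligand area")
    return buried_area_a2 / free_area_a2


if __name__ == "__main__":
    # Placeholder values chosen for illustration, NOT taken from Table 1.
    examples = {"ligand in structure A": (300.0, 900.0),
                "ligand in structure B": (540.0, 900.0)}
    for name, (buried, free) in examples.items():
        print(f"{name}: {buried_fraction(buried, free):.0%} of the surface is buried")
```

With the actual Table 1 values substituted, a larger fraction for one of the two structures would indicate the tighter enclosure by protein residues described above.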
The ability of sclxn molecules (n = 4, 6, and 8) to facilitate protein crystallization was further investigated with a small antifungal protein, PAF. Calixarene-mediated assembly of this basic protein was confirmed in the PAF-sclx4 (pdb entry 6ha4), PAF-sclx6 (pdb entry 6hah) and PAF-sclx8 (pdb entry 6haj) co-crystals [105]. One of the PAF complex structures is shown in Figure 5.
Other water-soluble calixarene variants were used for the complexation of cytochrome-c: p-methylphosphonatocalix[4]arene (pdb entry 5ncv) [106] and phosphonato-calix[6]arene (pdb entry 5lyc) [107]. In summary, a variety of calixarene derivatives were explored to understand their effects on protein binding. Smaller, rigid calixarenes explore and interact with the long, positively charged side chains of surface Arg or Lys residues and can change the protein surface electrostatics. Larger calixarenes change their skeleton conformation to bind and adapt to the protein surface. Finally, calixarenes carrying multiple charges lead to different protein assemblies.

Cyclodextrins
Cyclodextrins are cyclic oligosaccharides formed by five or more glucose monomers linked by α-1,4 glycosidic bonds. Cyclodextrins have a variety of applications in the food and pharmaceutical industries. These molecules are produced biochemically by the enzymatic digestion of starch by hydrolytic enzymes with an α/β TIM-barrel fold. The best-known cyclodextrins contain six to eight glucose units: α-cyclodextrin (6 glucose units), β-cyclodextrin (7 glucose units) and γ-cyclodextrin (8 glucose units).
The cyclic repeat of glucose units gives the ring a characteristic regular conformation. A search in the Protein Data Bank retrieves a number of crystal structures of hydrolytic enzymes in complex with α- and β-cyclodextrins (ligand codes ACX and BCD, respectively), in which the bound oligosaccharides show a regular conformation described by their six- or sevenfold axis. For instance, the crystal structure of soybean β-amylase in complex with α-cyclodextrin revealed a leucine side chain (Leu 379) hosted in the hydrophobic cavity of the cyclic oligosaccharide (pdb entry 1btc) [53]. Notably, while all six glucose units of α-cyclodextrin lie essentially in the plane of the oligosaccharide, the molecule adopts a toroidal shape that resembles a calixarene, with a hydrophobic cavity.
The thermostable alpha-amylase in complex with α-cyclodextrin revealed a methionine side chain hosted in the cavity of the cyclic sugar and notable interactions with two Trp residues (Table 1, pdb entry 3bcd) [108]. Similarly, the crystal structure of cytochrome P450 in complex with vitamin D2 and β-cyclodextrin revealed a phenylalanine side chain from a surface loop (Phe 214) hosted within the oligosaccharide cavity (Table 1, pdb entry 3czh).
Because of these characteristic conformational properties, cyclodextrins present relevant host-guest properties and are often used to build supramolecular architectures. However, there are exceptions to the symmetrical conformation of an oligosaccharide. The conformation of the macrocycle can deviate from a regular arrangement of the sugar moieties, especially for larger oligosaccharides or to accommodate a specific hydrolytic mechanism. For instance, the crystal structures of maltodextrin binding protein MalE1 and cyclomaltodextrinase bound to γ-cyclodextrin show a bending of the oligosaccharide. In addition, different pairs of residues are hosted in the cavity of the γ-cyclodextrin (ligand code RCD) when the two crystal structures are compared (Asn/Ala, pdb entry 5mka, Figure 7; and Arg/Glu, pdb entry 3edk), and a variety of aromatic protein residues interact with the external surface of the cyclic sugar (Table 1) [109][110][111]. In summary, cyclic polysaccharides of different sizes form an inner cavity that adapts to interactions with hydrophobic residues Leu or Phe (smaller rings) and aromatic residues.

Cucurbituril Molecules
Cucurbituril molecules are formed by the condensation of five or more monomeric units of glycoluril and formaldehyde arranged symmetrically in ringed structures with an overall characteristic pumpkin shape [112][113][114]. For instance, the hexameric macrocyclic compound cucurbit[6]uril has a cavity of ~5.8 Å diameter and a portal (narrower entrance) of 3.9 Å diameter [115]. Cucurbiturils are used as joining bead molecules for building supramolecular architectures [113]. The chemical structure of cucurbiturils forms spatial dipoles that favor binding of a variety of pyridinium-based molecules, metal ions, cationic organic molecules, and gas molecules [115].
There are several examples illustrating interactions between cucurbiturils and proteins. Cucurbit[7]uril, composed of seven monomer units, was used as a synthetic receptor for human insulin. The crystal structure of human insulin in complex with cucurbit[7]uril revealed the N-terminal phenylalanine residue of the hormone hosted in the core of the cucurbituril, with the amino nitrogen of the residue interacting with the oxygen atoms of the host molecule (Table 1, pdb entry 3q6e) [116]. Cucurbiturils can regulate protein interactions, as demonstrated with cucurbit[8]uril, which is able to recognize an epitope of a signaling tetratricopeptide repeat (TPR) binding protein 14-3-3 (pdb entry 5n10, Table 1) [73,117].
Cucurbit[7]uril is able to regulate the quaternary structure of lectin binding protein. The structure of lectin binding protein in complex with cucurbit[7]uril (Figure 8, Table 1, pdb entry 6f7w) reveals the selective binding of the nanomolecule to post-translational modifications of lysine residues, similar to those observed for the structure of sclx4, with the side chain of the surface residue hosted within the capsule inner cavity (Figure 9) [118]. The crystal packing reveals the formation of an ordered cucurbituril cluster that promotes assembly of the protein. This study suggests the use of cucurbiturils as a strategy to engineer complex and specific protein architectures [118]. In summary, the rigid and symmetrical skeleton of cucurbiturils is well suited to interact with phenylalanine and methylated lysine side chains and to form clusters that direct protein assembly.
Figure 8. (a) Lectin binding protein in complex with cucurbit[7]uril (pdb entry 6f7w). Cucurbit[7]uril molecules show selective binding towards post-translational modifications of surface lysine residues. One of the bound cucurbit[7]uril molecules interacts with a sodium ion (Na, purple). (b) Cucurbit[7]uril molecule (cif code QQ7).
Figure 9. (a) Ribbon drawing of binding protein 14-3-3 protein zeta/delta in complex with phosphatase peptide (orange string) and molecular tweezer CLR01 (pdb entry 5m37). The cavity of each CLR01 molecule hosts the side chain of an arginine or a lysine residue. (b) CLR01 molecule (cif code 9SZ).

Molecular Tweezers
Among the nanomolecules able to host protein amino acid side chains, C-shaped "molecular tweezers" have revealed interesting properties for promoting protein-protein interactions. The best-known tweezer, CLR01, is composed of alternating norbornadiene and benzene groups forming a ring. A phosphate anion group is bound to the upper and lower rim of CLR01, similar to the charged groups on calixarene rims (Table 1) [119]. The phosphate groups on the CLR01 carbon skeleton improve water solubility and binding to the positively charged residues lysine and arginine [120]. CLR01, which has promising properties for developing Alzheimer's disease therapy, is able to disrupt hydrophobic and electrostatic interactions, as shown by its inhibition of nucleation and oligomerization of amyloidogenic proteins [121]. For similar reasons, CLR01 functions as an inhibitor of superoxide dismutase (SOD1) aggregation. This dimeric copper-zinc enzyme, responsible for clearing cells of toxic and reactive radicals (O₂⁻), contains eleven surface lysine residues and is therefore a good target for tweezer molecules [122,123]. However, no X-ray structures revealing details of the interactions in these examples are available as of this writing.
Crystal structure determination of the complex between the signaling tetratricopeptide repeat (TPR) binding protein 14-3-3 and CLR01 revealed the mechanism that modulates protein-protein interactions [73]. Binding of 14-3-3 proteins by inactive kinases (e.g., Raf kinase-1) is linked to a number of malignancies and to the developmental syndromes Noonan and LEOPARD and therefore represents a target for drug discovery [124,125]. Comparison of the crystal structure of the binary complex between 14-3-3 protein zeta/delta and the peptide binding region of Raf kinase-1 (pdb entry 3nkx) with that of the complex between 14-3-3 protein zeta/delta and CLR01 (pdb entry 5oeh) reveals the potential interface for the tweezer inhibition mechanism [73,126,127]. CLR01 hosts a single surface-exposed lysine near the binding site of a symmetrically equivalent 14-3-3 protein zeta/delta [73]. More recently, the structure of 14-3-3 protein zeta/delta in complex with the peptide binding region of M-phase inducer phosphatase 3 (another binding partner of 14-3-3 proteins) and soaked with CLR01 was determined (pdb entry 5m37, Table 1; Figure 9). This structure shows that CLR01 can tune protein-protein binding interactions beyond the simple inhibition mechanism. The binding of CLR01 in this ternary complex reveals a C-terminal peptide arginine residue hosted in the inner core of CLR01 and stabilized by van der Waals interactions with neighboring residues and by an electrostatic interaction with one of the phosphate groups [128]. Furthermore, this structure reveals the molecular basis for the higher binding affinity of the peptide measured in the presence of CLR01. Intrinsic disorder is a key feature of partners that bind 14-3-3 proteins and, therefore, tweezers can provide useful insight into how to stabilize these interactions [128,129].
In summary, CLR01 can be used to tune protein-protein interactions by affecting the binding affinity of specific proteins or peptides, or by changing the dynamic flexibility of intrinsically disordered proteins [125,130]. A rigid C-shaped skeleton combined with negatively charged phosphate groups is well suited to interact with the long, positively charged side chains of surface Arg or Lys residues.
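The host-guest contacts discussed in this and the preceding sections can, in principle, be recomputed from deposited coordinates. The following is a minimal sketch rather than the procedure used in the studies reviewed here. It assumes fixed-column PDB-format ATOM/HETATM records and a 4.0 Å distance cutoff; the cutoff is our assumption and the function names `make_record` and `residues_near_ligand` are hypothetical. For a given ligand code (for instance 9SZ for CLR01), it lists the protein residues that have at least one atom within the cutoff of any ligand atom.

```python
import math


def make_record(record, serial, atom, res_name, chain, res_seq, x, y, z):
    """Format one fixed-column PDB coordinate record (hypothetical helper for demo data)."""
    return "{:<6}{:>5} {:<4} {:>3} {}{:>4}    {:8.3f}{:8.3f}{:8.3f}".format(
        record, serial, atom, res_name, chain, res_seq, x, y, z
    )


def parse_atoms(pdb_text):
    """Yield (is_hetero, res_name, chain, res_seq, xyz) for each ATOM/HETATM line."""
    for line in pdb_text.splitlines():
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        yield record == "HETATM", line[17:20].strip(), line[21], int(line[22:26]), xyz


def residues_near_ligand(pdb_text, ligand_code, cutoff=4.0):
    """Map (chain, res_seq, res_name) to the shortest protein-ligand distance (in Å).

    Only residues with at least one atom within `cutoff` of a ligand atom are kept.
    Protein atoms are taken from ATOM records only, so modified residues stored as
    HETATM (e.g., methylated lysine) would need extra handling.
    """
    atoms = list(parse_atoms(pdb_text))
    ligand_xyz = [xyz for het, name, _, _, xyz in atoms if het and name == ligand_code]
    closest = {}
    for het, res_name, chain, res_seq, xyz in atoms:
        if het:
            continue
        for lig in ligand_xyz:
            d = math.dist(xyz, lig)
            if d <= cutoff:
                key = (chain, res_seq, res_name)
                closest[key] = min(d, closest.get(key, d))
    return dict(sorted(closest.items()))


if __name__ == "__main__":
    # Synthetic coordinates for illustration only; not taken from any PDB entry.
    demo = "\n".join([
        make_record("ATOM", 1, "CB", "ALA", "A", 10, 10.0, 0.0, 0.0),
        make_record("ATOM", 2, "CE", "LYS", "A", 50, 4.5, 0.0, 0.0),
        make_record("ATOM", 3, "NZ", "LYS", "A", 50, 3.0, 0.0, 0.0),
        make_record("HETATM", 4, "C1", "9SZ", "B", 301, 0.0, 0.0, 0.0),
    ])
    print(residues_near_ligand(demo, "9SZ"))  # {('A', 50, 'LYS'): 3.0}
```

A single distance cutoff does not distinguish van der Waals, π, and electrostatic contacts; it only flags candidate residues, which would then have to be inspected in the structure.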

Conclusions and Outlook
For each carbon-based nanoparticle discussed in this review, we have provided a brief summary. A recent nanomaterial database resource (PubVINAS) archives a total of 705 unique nanomaterials corresponding to twelve material types [25,131]. At the time of this writing (July 2020), eighty of these nanomaterials are carbon nanotubes, forty-eight are C60 fullerene derivatives, and twenty are carbon nanoparticles. Research on carbon-based nanomolecules is growing rapidly because of potential applications across biological, medical, and material sciences [132,133]. Applications involving multifunctional cyclodextrins, used for molecule delivery, have received widespread interest and are already in clinical use [134]. Combining carbon-based nanomolecules with biological samples has potential in trending areas of medicinal chemistry, including protein-protein interactions and the conformational flexibility of disordered proteins, for which metal-based nanomolecules have been explored [135].
In order to improve the properties of carbon-based nanomolecules and address their safety for medical use, it is crucial to have a clear understanding of their interactions with a target protein [18,132]. X-ray crystallography has proved instrumental in understanding the key interactions between proteins and carbon-based nanomolecules and inspired many of the studies we reviewed. Although these interactions are similar to those involving typical small molecules, the larger number of aromatic groups in carbon-based nanomolecules implies an important role for π-interactions (see Table 1). Despite their large size, bound carbon-based nanomolecules often have a negligible effect on overall protein shape [16]. Because of their significant radii and rigid skeletons, carbon-based nanomolecules have a propensity to cluster, with a resulting effect on protein quaternary structure [93]. Carbon-based nanomolecules coupled with charged or other chemical groups can change the electrostatic surface of a protein [90]. Carbon-based nanomolecules can also be used as a framework to tune crystalline porosity simply by using a common buffer molecule as an additive.
Therefore, new chemical modifications of carbon-based nanomolecules have potential as creative ways to address specific questions involving targeted proteins. Lessons learned from structural studies examined here are exemplary for the future use of carbon-based nanomolecules to stoichiometrically combine a number of protein entities to build functional hybrid materials [67,136,137].

Conflicts of Interest:
The authors declare no conflict of interest.