Insights into Network of Hot Spots of Aggregation in Nucleophosmin 1

In a protein, disease-associated point mutations can alter the native structure and cause loss or alteration of function, while an internal structural network defines the connectivity among domains as well as the equilibria between aggregated and soluble states. Nucleophosmin (NPM)1 is an abundant nucleolar protein that becomes mutated in acute myeloid leukemia (AML) patients. The mechanism of NPM1-dependent leukemogenesis, which leads to aggregation of the protein in the cytoplasm (NPMc+), is still obscure, but investigations have outlined a direct link between AML mutations and amyloid aggregation. Protein aggregation can be due to the cooperation among several hot spots located within aggregation-prone regions (APRs), which are often predictable with bioinformatic tools. In the present study, we searched the entire NPM1 sequence for potential APRs that had not yet been investigated. On the basis of bioinformatic predictions and experimental structures, we designed several protein fragments and analyzed them through typical aggregation assays, such as Thioflavin T (ThT) fluorescence and scanning electron microscopy (SEM), carried out at different times; in addition, their biocompatibility in SHSY5 cells was evaluated. The presented data demonstrate the existence of hot spots of aggregation located in different regions, mostly in the N-terminal domain (NTD), of the entire NPM1 protein and provide a more comprehensive view of the molecular details potentially at the basis of NPMc+-dependent AML.


Introduction
Amyloids can be divided into three main groups [1]: (1) pathological amyloids, which were the first to be discovered [2,3]; (2) artificial amyloids, often deriving from natural or de novo conceived sequences [4][5][6]; (3) naturally occurring functional amyloids, which perform a wide range of biological functions in diverse organisms (bacterial biofilms [7], scaffolding for melanin synthesis [8], storing peptide hormones [9]), including the formation of protein complexes in subcellular condensates [10][11][12]. In neurodegeneration [13], pathological aggregates can form amorphous assemblies [14] and/or highly ordered cross-β amyloid fibers [15]; the toxic species are often small, disordered oligomers that act as precursors of fibrils [16]. To unveil the basis of the toxicity of soluble aggregates and of proto- and mature fibrils, it is of fundamental interest to gain deeper insight into the mechanisms of fibrillogenesis. The successful prediction of the aggregation propensity of amino acid sequences helps investigate the amyloid process [17]. Hence, the identification of short protein stretches, called aggregation-prone regions (APRs), is a powerful reductionist approach that sidesteps the experimental complexity due to protein length, composition and concentration [18,19]. The self-assembly of APRs is modulated by both homo- and heterotypic interactions, as

• Structure prediction and aggregation propensity through bioinformatic tools
The amyloid propensity of a great number of proteins depends on the presence and location of APRs. Often, flexible segments belonging to intrinsically disordered regions (IDRs), even if located away from APRs, can act as conformational wings in the formation of amyloid assemblies. Aggregation prediction algorithms aim to "read" the aggregation propensities from the primary sequence, even though, during folding, the APRs can be protected by chaperones and self-chaperoning interactions [5].
Our recent results demonstrate, unexpectedly and unequivocally, that the CTD of NPM1 in its AML-mutated forms is prone to aggregation. In the present study, we aimed to identify new hot spots of aggregation along the whole primary sequence (residues 1-294) of NPM1.
For this purpose, we analyzed the protein sequence with ZipperDB (https://services.mbi.ucla.edu/zipperdb/intro). Figure 1 presents a plot with the primary sequence of NPM1 on the X-axis and a histogram bar proportional to the Rosetta energy of each residue. The orange-red segments, with energy values below the indicated threshold of −23 kcal/mol (gray line), are predicted to form fibrils. Hence, we designed, ad hoc, six peptides covering the protein fragments reported in Table 1. These regions were conceived to contain both residues with red histogram bars and defined secondary structures. The conformational knowledge was derived from known experimental structures of the separate domains, NTD (residues 1-117) [37] and CTD (residues 243-294) [43], and was further confirmed using the prediction server PSIPRED (http://bioinf.cs.ucl.ac.uk/psipred/) for the entire 1-294 protein (Figure S2). Thus, during the design process, two main factors were considered: (i) the restriction to the minimum fragment containing hot spots of aggregation and (ii) the presence of defined secondary structures, which resulted in fragments of very different lengths. To avoid the "extremity effects" of protein dissection, the predicted amyloid stretch was located at the center of each dissected region. Notably, the protein fragment 242-259, corresponding to the first helix of the three-helix bundle (H1), emerged from this analysis. This fragment had already been investigated by us for its helical features [58] but not for its amyloidogenic propensity, and thus we included it in the present study. Conversely, in the second helix of the bundle (H2 region), the fragment centered on the 269-277 stretch emerged from the amyloidogenic prediction (Figure 1). We did not include it in the present study, since this region has been the subject of many previous investigations [59].
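The thresholding step described above can be illustrated with a minimal sketch. It assumes that the per-position Rosetta energies have already been exported from the prediction server into a plain list; the function name, the input format and the example values are hypothetical and are not NPM1 data. Only the −23 kcal/mol threshold comes from the text.

```python
from typing import List, Tuple

THRESHOLD = -23.0  # kcal/mol, the threshold quoted in the text


def segments_below_threshold(energies: List[float],
                             threshold: float = THRESHOLD,
                             first_residue: int = 1) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) residue ranges whose energy is below threshold."""
    segments = []
    start = None
    for offset, energy in enumerate(energies):
        position = first_residue + offset
        if energy < threshold:
            if start is None:
                start = position  # open a new candidate segment
        elif start is not None:
            segments.append((start, position - 1))  # close the open segment
            start = None
    if start is not None:
        # the sequence ended while a segment was still open
        segments.append((start, first_residue + len(energies) - 1))
    return segments


# Hypothetical energies for ten consecutive positions (illustration only).
example = [-20.1, -23.5, -24.0, -22.9, -21.0, -25.2, -23.1, -23.0, -19.8, -26.0]
print(segments_below_threshold(example))  # [(2, 3), (6, 7), (10, 10)]
```

A value exactly equal to the threshold is not counted here, since the text speaks of energies below −23 kcal/mol; whether the server treats the boundary the same way is an assumption.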
In the NTD, an extended β-sheet structure is present (Figure S1) [37]. Herein, we selected several fragments centered on the hot spots of aggregation: 41-65 on Thr46 and Ala62; 69-83 on Val74, Thr75, Ala77 and Thr78; and 107-120 on Gly113, Gln114 and His115. On the other hand, fragment 84-93, with Thr86 as the amyloidogenic residue, was designed as a random fragment in accordance with both experimental evidence and bioinformatic predictions. Finally, fragment 127-139, with Val132, Lys133 and Leu134, was predicted to be endowed with a helical conformation, while 242-259, bearing Gln252, Ala253 and Ser254 as hot spots, covered the H1 helix [43] (Figures 1 and S2). The peptide sequences and conformations (predicted and experimental) are reported in Table 1. All sequences were synthesized, in the acetylated and amidated form, by SPPS with fair yields using Fmoc methodologies and purified as already reported [47].
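Because the peptides are acetylated and amidated, their termini carry no charge, so the net charge at neutral pH listed in Table 1 can be approximated from the side chains alone. The sketch below is a simplified estimate under stated assumptions: Lys and Arg count as +1, Asp and Glu as −1, and His is treated as neutral at pH 7.4. The function name and the example sequence are hypothetical; the sequence is not an NPM1 fragment.

```python
def net_charge_capped(sequence: str) -> int:
    """Approximate integer net charge at neutral pH of a capped peptide.

    Assumes an N-acetylated, C-amidated peptide (no terminal charges),
    Lys/Arg = +1, Asp/Glu = -1, and His counted as neutral.
    """
    seq = sequence.upper()
    positive = sum(seq.count(residue) for residue in "KR")
    negative = sum(seq.count(residue) for residue in "DE")
    return positive - negative


# Hypothetical sequence (illustration only): K, K, R = +3; D, E = -2.
print(net_charge_capped("GKVDEAKLR"))  # 1
```

A pKa-based calculation would give fractional charges; the integer count is only meant to reproduce the sign and rough magnitude used when comparing fragments.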
The peptides corresponding to regions 41-65 (Figure 2), 69-83 (Figure 3) and 107-120 (Figure 4) were first analyzed for their ability to bind the ThT dye. The NPM1 69-83 peptide appeared already aggregated at t = 0 (Figure 3A), with a very slow signal decrease; notably, this sequence is the only one with a positive net charge at neutral pH (Table 1). NPM1 41-65 also appeared partially aggregated at t = 0, suggestive of the presence of low-order aggregates [54] (Figure S3). Over time, both NPM1 41-65 and NPM1 107-120 exhibited increasing ThT fluorescence profiles, although with markedly different kinetics, as can be seen by comparing the t1/2 values, i.e., the times at which the fluorescence intensity reaches half of its maximum value. Indeed, while NPM1 41-65 aggregated quickly (Figure 2A), with t1/2 = 5 min and a subsequent net decrease in signal due to fibrillization within 3 h, NPM1 107-120 aggregated more slowly (Figure 4A), with t1/2 = 20 min and total loss of the fluorescence signal in ~20 h. Both sequences carry a negative net charge at neutral pH, that of fragment 41-65 being double that of 107-120 (Table 1).
The conformational preferences of these peptides were analyzed over time through CD spectroscopy, and the deconvolution data are reported in Table 2. As expected, all three peptides at t = 0 exhibited a mixture of conformations, with a prevalence of β-structure, but the evolution over time was peculiar to each sequence. Indeed, while NPM1 41-65 showed a stable profile over time (Figure 2B), NPM1 107-120 presented a slight increase in β-structure at the expense of helical content and a decrease in the Cotton effect, starting from 2 h (Figure 4B).
More markedly, NPM1 69-83 exhibited a transition toward the β-sheet within 2 h (up to ~50% β content) (Figure 3B, Table 2), forming a well-defined secondary structure despite the presence of Pro71, which is often reported to interrupt secondary structures [60].
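The t1/2 values used to compare the ThT kinetics are defined as the time at which the fluorescence reaches half of its maximum. A minimal sketch of that definition is given below, using linear interpolation between the two sampled points that bracket the half-maximum. The function name and the example trace are hypothetical, and no baseline subtraction or curve fitting is applied, which may differ from the procedure actually used for the figures.

```python
from typing import Sequence


def half_time(times: Sequence[float], signal: Sequence[float]) -> float:
    """Time at which the signal first reaches half of its maximum value.

    Uses linear interpolation between the two bracketing samples.
    If the first sample is already at or above half-maximum, the first
    time point is returned (a trace that starts out aggregated).
    """
    if len(times) != len(signal) or len(times) < 2:
        raise ValueError("times and signal need the same length (>= 2)")
    half = max(signal) / 2.0
    if signal[0] >= half:
        return times[0]
    for i in range(1, len(signal)):
        if signal[i] >= half:
            t0, t1 = times[i - 1], times[i]
            s0, s1 = signal[i - 1], signal[i]
            # s0 < half <= s1, so the denominator is strictly positive
            return t0 + (half - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("half-maximum not reached")  # not expected in practice


# Hypothetical ThT trace (time in minutes, fluorescence in arbitrary units).
minutes = [0, 5, 10, 20, 30]
fluorescence = [10, 40, 80, 100, 95]
print(half_time(minutes, fluorescence))  # 6.25
```

For a peptide that already binds ThT at t = 0, such as the behavior described for the 69-83 fragment, this definition simply returns the first time point, which is why a t1/2 comparison is only informative for traces that rise during the measurement.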
A few peptides were also analyzed by 1D [1H] and 2D [1H, 1H] NMR spectroscopy at different times. For the 69-83 fragment, the comparison of the 1D spectra of the freshly prepared sample (t = 0) and after 4 h and 2 days (Figure S4A) did not show chemical shift and/or intensity changes, as is also evident from the overlay of the 2D TOCSY spectra acquired at t = 0 and 4 days (Figure S4B). The 2D NOESY 300 spectrum (Figure S4C) contained almost solely diagonal peaks and pointed to extended/random conformations typical of a low-molecular-weight species.
The SEM analyses of all peptides were carried out at two different times of aggregation, 0 and 24 h. The SEM images of NPM1 41 Conversely, NPM1 69-83 displayed amyloid fibers "in formation" at t0 (Figure S5D-F) and mature at 24 h (Figure 3C-E), with an average length of (4.6 ± 0.5) × 10 µm and a diameter of (4.10 ± 0.16) µm (Figure 3D).
With similar assays and times of acquisition, the random fragment 84-93 was investigated. This sequence appeared completely unable to bind ThT (Figure 5A), since no signal variation was detected over 20 h (Figure S3). The CD analysis (Figure 5B) confirmed a prevalent random content mixed with α-helix, which did not change over time (Table 2). In agreement with the CD data, the comparison of 1D spectra indicated no conformational changes between t = 0 and t = 3 d (Figure S7A), and in the 2D NOESY 300 spectrum (Figure S7B), the lack of contacts outside the diagonal peaks confirmed the absence of peptide structuration. Consistently, the SEM analysis of NPM1 84-93 showed only a few unripe fibers without amyloid features (Figure S6D-F), which appeared mainly still in formation even at 24 h (Figure 5C-E).
The 127-139 and 242-259 fragments presented a similar behavior in the ThT fluorescence assay. Both were unable to bind the amyloid dye, even after long stirring times (Figures 6A, 7A and S3). As expected, the 242-259 fragment presented a good helical content, especially at t = 0 of aggregation (Figure 7B, Table 2), which was partially lost after 30 h. Conversely, NPM1 127-139, contrary to the prediction, exhibited a prevalent random state that persisted for the 28 h of analysis (Figure 6B, Table 2).
In the SEM analysis, NPM1 127-139 showed the formation of a dense network of fibers [61] already at t = 0. At higher magnification, it was possible to observe how the fibers tend to associate in the form of wide ribbons or bundles (Figure S8A-C). In detail, these bundles appeared thin in diameter (2.3 ± 1.0 µm) and in length (9.0 ± 4 × 10 µm) (Figure S8B), but after 24 h they disassembled to form insoluble aggregates (Figure 6C-E).
On the other hand, poor aggregation propensity was found for the peptide NPM1 242-259 (Figures S8D-F and 7C-E), whose aggregates were unable to evolve toward amyloid fibers.

• Cellular effects of NPM1 fragments
With the aim of evaluating the potential toxic effects of NPM1 fragments [62], several designed peptides were analyzed in a cell viability assay employing SHSY5 cells at different times of aggregation. In the MTT assay reported in Figure 8, none of the NPM1 fragments proved cytotoxic; a slight increase in cell viability was observed at t = 0 h only for NPM1 107-120, whereas no statistically significant effect was observed for NPM1 69-83.


Peptide Synthesis
The reagents for solid-phase peptide synthesis (SPPS) were purchased from Iris Biotech (Marktredwitz, Germany) and the solvents for HPLC analyses from Romil (Dublin, Ireland). All peptides were chemically synthesized following Fmoc solid-phase peptide synthesis protocols, purified by RP-HPLC and identified through LC-MS. The peptides were pre-treated overnight with hexafluoro-2-propanol (HFIP), lyophilized and stored at −20 °C until use.

Far-UV CD Spectroscopy
The samples were prepared by dilution of freshly prepared stock solutions (1 mM peptide, on average). CD spectra were recorded on a Jasco J-815 spectropolarimeter (JASCO, Tokyo, Japan) at 25 °C in the far-UV region from 190 to 260 nm in a 0.1 cm quartz

Figure 8. Cell viability effects of NPM1 fragments. The histogram reports the percentage of cell viability (100% viable cells represent the control, CTRL) after treatment with peptides pre-incubated for three different times: 0, 2 and 24 h. The histogram is representative of a single experiment performed in triplicate. Results are expressed as mean ± SD. The statistical analyses were performed with the GraphPad Prism 9 software using two-way ANOVA corrected for multiple comparisons by the Dunnett test (** p < 0.005).


SEM Analysis
NPM1 peptides were analyzed by SEM, as already reported [59]. All peptides, except NPM1 107-120 (200 µM), were dissolved at 800 µM in 50 mM phosphate buffer at pH 7.4 and analyzed at t0 and after 24 h under stirring. In detail, samples were dropped on a typical SEM stub, gold-sputtered at 20 nm thickness with the HR208 Cressington sputter coater and analyzed at 5-10 kV with an SE2 detector on an Ultra Plus FESEM scanning electron microscope (Zeiss, Oberkochen, Germany).

NMR Experiments
NMR spectra were recorded at a temperature of 25 °C on a Varian Unity Inova 600 MHz NMR spectrometer equipped with a cold probe. For NMR sample preparation, the peptides were dissolved in a total volume of 540 µL, including 500 µL of 10 mM NaP buffer and 40 µL of D2O (deuterium oxide, 98% D, Sigma-Aldrich, Milan, Italy). Both NPM1 69- H] spectra were recorded for the freshly prepared sample (t = 0), and 4 h (t = 4 h) and 2 days (t = 2 d) after sample preparation, whereas 2D [1H, 1H] TOCSY spectra were recorded at t = 0 and after 4 days (t = 4 d). For the NPM1 84-93 peptide, 1D [1H] spectra were recorded at t = 0, and after 4 h (t = 4 h) and 3 days (t = 3 d). The 2D [1H, 1H] NOESY spectra were also acquired for both peptides 4 days after sample preparation. The NMR samples were stored at 4 °C between the experiments recorded at different times. Water suppression was obtained by Excitation Sculpting. The software VNMRJ 1.1D (Varian, Italy) was used for spectra processing; NEASY [66], included in CARA (http://cara.nmr.ch/doku.php), and UCSF Sparky [67] were employed for spectra analyses. The water signal (4.75 ppm) was used for chemical shift referencing.
Cells were seeded in triplicate in 96-well plates at a density of 7500 cells/well. Cells were incubated with the peptides at the time points mentioned above for 24 h at 37 °C in a humidified atmosphere of 5% CO2. In the last 4 h of incubation, 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) was added to the cells. DMSO was then added to dissolve the formazan crystals produced by the reduction of MTT in living cells, as previously reported [68,69]. The absorbance was measured at 560 nm with a Glomax® Discover Microplate Reader (Promega, Madison, WI, USA).
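The conversion of the triplicate absorbance readings into the percent viability values plotted in Figure 8 (control set to 100%, results as mean ± SD) can be sketched as follows. This is an illustration under stated assumptions: the function name and the absorbance values are hypothetical, no blank subtraction is applied because the text does not describe one, and the statistical testing (two-way ANOVA with the Dunnett correction, done in GraphPad Prism) is not reproduced here.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def percent_viability(treated: Sequence[float],
                      control: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample SD) of viability as a percentage of the control.

    Each treated replicate is normalized to the mean control absorbance,
    so that untreated cells correspond to 100%.
    """
    control_mean = mean(control)
    percents = [100.0 * absorbance / control_mean for absorbance in treated]
    return mean(percents), stdev(percents)


# Hypothetical absorbance readings at 560 nm (illustration only).
ctrl_wells = [0.80, 0.82, 0.78]
peptide_wells = [0.88, 0.80, 0.84]
avg, sd = percent_viability(peptide_wells, ctrl_wells)
print(f"{avg:.1f} ± {sd:.1f} %")  # 105.0 ± 5.0 %
```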

Conclusions
NPM1 is at the center of a wide and crucial interactome, which becomes dysregulated in AML cells [70]. Hence, a deep analysis of the structural factors involved in leukemogenesis is of utmost importance. Our recent investigations directly correlate aggregation with the AML mutations occurring in the third helix of the CTD. Starting from the importance of the mutual influence among domains in NPM1 [45], herein, we analyzed the presence of APRs within the entire protein through a combination of theoretical and experimental procedures. On the basis of the predicted aggregation propensity, we designed several peptides covering different protein regions located almost completely outside the CTD. The fragments located in the β-structure regions showed the most evident conformational plasticity, which can eventually lead to aggregation. Indeed, the 41-65, 69-83 and 107-120 fragments were able to bind ThT, although with different kinetics of aggregation that depend on their amino acid composition and that can, in turn, explain the SEM results. In detail, sequence 69-83 showed the greatest conformational transitions in the CD analysis and appeared bound to ThT at t = 0 of the analysis. It provided fibers with a defined amyloid character even at t = 0. Conversely, the fast binding to ThT exhibited by the 41-65 fragment only led to amorphous aggregation, likely due to the short time available for the organization of the peptide chains. On the other hand, the slow aggregation exhibited by 107-120 led to the formation of fibers that were still unripe after 24 h of aggregation. Notably, none of the other investigated NPM1 regions exhibited ThT binding, and all presented poor conformational variations over time. In detail, the random 84-93 fragment provided a typical CD profile, constant over time, and immature fibers in the SEM analysis; the 242-259 fragment was also confirmed as a stable helix with no amyloid evolution.
The overall data unveil the presence of hot spots of aggregation, mostly in the NTD, but none of the identified APRs demonstrated a well-defined amyloid character causing cytotoxicity in SHSY5 cells, as was instead demonstrated for other NPM1 stretches [46]. In conclusion, even if further experiments on the potential synergy of aggregation within the entire AML-mutated protein are required, the presented data add new pieces to the puzzle of the molecular determinants of the cytoplasmic accumulation of NPMc+ and could inspire innovative therapeutic strategies targeting the NPM1-AML subtype.