Orientational Preferences of GPI-Anchored Ly6/uPAR Proteins

Ly6/uPAR proteins regulate many essential functions in the nervous and immune systems and epithelium. Most of these proteins contain single β-structural LU domains with three protruding loops and are glycosylphosphatidylinositol (GPI)-anchored to a membrane. The GPI-anchor role is currently poorly studied. Here, we investigated the positional and orientational preferences of six GPI-anchored proteins in the receptor-unbound state by molecular dynamics simulations. Regardless of the linker length between the LU domain and GPI-anchor, the proteins interacted with the membrane by polypeptide parts and N-/O-glycans. Lynx1, Lynx2, Lypd6B, and Ly6H contacted the membrane by the loop regions responsible for interactions with nicotinic acetylcholine receptors, while Lypd6 and CD59 demonstrated unique orientations with accessible receptor-binding sites. Thus, GPI-anchoring does not guarantee an optimal ‘pre-orientation’ of the LU domain for the receptor interaction.

Despite the potentially important role of the GPI-anchor and glycosylation in the interaction of Ly6/uPAR proteins with their targets, these post-translational modifications have been considered in very few works [24]. The GPI-anchor imposes significant configurational restraints on the behavior of Ly6 proteins and, thus, can be vital for their proper functioning. Here, without any pre-assigned hypothesis, we studied the intrinsic pre-orientation relative to the cell membrane and the possible positional restraints imposed by the GPI-anchor for a set of six GPI-anchored proteins having a single LU domain in the receptor-unbound state. Five of them (Lynx1 [21,25-27], Ly6H [28-30], Lynx2 [31-33], Lypd6 [34-37], and Lypd6B [35,38-40]) regulate nicotinic cholinergic signaling in the nervous system, while CD59 protects the body's own cells from complement-mediated lysis [19,20,41]. We built the most exhaustive models of Ly6 proteins to date based on the available 3D structures (Table S1); added post-translational modifications, including the GPI-anchors and glycans (Figure S1); immersed them into lipid membranes of different compositions; and calculated 2-3 µs of all-atom molecular dynamics (MD) for each system (Table S2). Then, we determined the following parameters: (1) the position of the center of mass (COM) of the protein relative to the membrane surface; (2) the orientation of the protein (in terms of tilt/rotation angles) relative to the membrane; and (3) the intermolecular contacts between the Ly6 protein and lipids. The data obtained will be useful in studying the interaction of receptors with their GPI-anchored ligands.

Results and Discussion
Representative positions and orientations of the GPI-anchored Ly6 proteins in the membrane are provided in Figure 1. Generally, both GPI lipid tails were anchored in the membrane, behaving similarly to the membrane lipid tails. At the same time, the polypeptide moieties of the Ly6 proteins floated at the bilayer surface, sometimes rising above it (Figure 2A). The C-terminal linkers of the proteins connecting the LU domains with the GPI-anchors, as well as the carbohydrate moieties of the GPI-anchors, underwent significant folding at the initial stage of MD. This process, probably related to an entropic coiling [42] of the linker chain, resulted in the protein association with the membrane surface, regardless of the linker length (provided in Table 1) (see Supplementary Movies S1-S6). The obtained results agree with the previously reported modeling of the GPI-anchored GFP, where protein contacts with the membrane were observed [43].
Table 1 notes. a The fragment of the protein forming the mature form, from the N-terminus to the site of the GPI-anchor attachment. Numbering is according to the UniProt database. b The length of the protein region that precedes the conserved LU domain (which starts two amino acids before the first cysteine). c The length of the protein linker connecting the conserved LU domain (after the terminal cysteine) with the GPI-anchor. d The center of mass (COM) position and α/β angles are depicted as insets in Figure 2 and explained in the SI Methods. Data are calculated over the production parts of the MD trajectories (after energy minimization, NVT-, and NPT-relaxation) and presented as median values ± standard deviations in the approximation of uni-, bi-, or trimodal normal distributions. Low-population modes are indicated in parentheses. e Wide base modes.
2D histograms of probability density in α/β-coordinates are represented in Figure 4.
Table 2 (fragment): Loop I: Q32; Loop III: Y84; C-terminus: G115, S116; N-glycan: GlcNAc-8.
Table 2 notes. The table includes amino acids, carbohydrates, and phosphatidylinositol residues interacting with membrane lipids for at least 10% of the MD time through ion-ion/ion-dipole interactions and hydrogen bonds; for at least 5% of the MD time through π-cation interactions; or forming at least two hydrophobic contacts with a total lifetime of more than 200% of the MD trajectory (100% corresponds to one contact during the whole trajectory). Residues that simultaneously form ionic contacts and hydrogen bonds are listed in the first column only. The column with hydrophobic contacts lists only the residues that do not participate in ionic and π-cation interactions or hydrogen bonds. Strong ionic interactions and hydrogen bonds with a total lifetime of ≥ 50% of the MD time are shown in bold; the most stable of them (with lifetime ≥ 75%) are also underlined. The Loop/Head I/II/III regions of the proteins are defined in Figure S4 in the SI. Designations: SAPI, stearoylarachidonoylphosphatidylinositol; DSPI, distearoylphosphatidylinositol; and PEtN, phosphatidylethanolamine.
To extract functionally relevant motions from the obtained trajectories of the Ly6 modulators, we performed principal component analysis (PCA). Protein backbone atoms were used for the covariance matrix determination and eigenvalue calculation. PCA revealed that the extremal protein configurations for the most significant principal components represent the cases when the proteins: (1) are located near or far from the membrane; (2) are inclined towards the membrane by either the first or third loop; and (3) are tilted to the membrane by the central loop or the 'head' (Figure S2). Thus, we introduced three parameters for the system state description: Z (the distance from the protein center of mass to the membrane), α (the tilt angle of the central loop to the membrane), and β (the rotation of the protein plane via the first or third loop to the membrane). These three principal components of the proteins' mobility seem to be enough for a coarse description of the anchored modulators' movements because the internal folding of their protein parts is negligible compared to the "rise", "pitch", and "roll" movements described by the Z, α, and β parameters, respectively.
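The covariance-and-eigenvector step of such a PCA can be illustrated with a minimal NumPy sketch. It is not the GROMACS implementation used in this work: the function name `pca_backbone`, the input array layout, and the (n − 1) normalization without mass weighting are assumptions made for illustration, and the coordinates are assumed to be already fitted as described in the Data Analysis section.

```python
import numpy as np

def pca_backbone(coords):
    """PCA of pre-fitted backbone coordinates.

    coords: array of shape (n_frames, n_atoms, 3); hypothetical input layout.
    Returns eigenvalues (descending), eigenvectors (columns), and projections.
    """
    n_frames = coords.shape[0]
    # Flatten each frame into one 3N-dimensional vector and remove the mean
    x = coords.reshape(n_frames, -1)
    x = x - x.mean(axis=0)
    # Covariance matrix of atomic displacements (assumption: no mass weighting)
    cov = x.T @ x / (n_frames - 1)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Projection of every frame onto every principal component
    proj = x @ evecs
    return evals, evecs, proj
```

In such a sketch, a dominant "rise" motion would appear as a first eigenvector whose weight sits almost entirely on the z-components of the atoms.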

Centers of Mass (COM) Positions
COM positions relative to the membrane surface are presented as distributions (Figure 2A) and the respective median values (Table 1). Three numbers are usually used to describe the position of the COM: the x-, y-, and z-components. However, due to the translational symmetry in the membrane plane (the x- and y-components), only the z-component is considered in the analysis.
We found that the distribution of the COM position may be either unimodal (CD59, Lynx1, Lynx2, and Lypd6B) or bimodal (Ly6H and Lypd6). Moreover, the median COM positions differ significantly for different proteins, indicating the adoption of different modes of interaction with the membrane. The largest difference can be observed for CD59 and Lypd6B (Figures 2A and 3A). CD59 'lies' on the membrane surface, while Lypd6B was lifted, resting on its N-glycan and loop III, which prevented tight association of the protein with the membrane. This mode of the Lypd6B/membrane interaction was stabilized by the sporadic interactions of the N-glycan with the GPI-anchor and by the interaction of the long N-terminal protein region (20 residues, which precede the LU domain, Table 1) with the membrane, the C-terminus of the protein, and the extramembrane part of the GPI-anchor (see Supplementary Movie S4).
The lifted position of Lypd6B relative to CD59 could result from the significantly longer C-terminal linker (24 vs. 8 residues, Table 1), but we did not observe a good correlation between the lengths of the C-terminal linkers and the median COM positions, which do not exceed 2.3 nm (Pearson's correlation coefficient R = 0.45, see Figure S7 in the SI). This emphasizes that the Ly6 proteins localize near the membrane, rather than remotely, as might be expected for the long-linker proteins (Lynx2, Lypd6, and Lypd6B). Nevertheless, Lynx2 rose significantly above the membrane (COM position ≈4.5-5.0 nm) approximately in the middle of the MD trajectory (1300-1400 ns of the total 2000 ns) and subsequently descended to the membrane (see Supplementary Movie S2). Similar rises probably also occur for Lypd6 and Lypd6B, but they were not observed in our trajectories, presumably because the MD was not long enough to sample the conformational space completely.
Bimodal COM distributions, observed only for Ly6H and Lypd6, indicate that the C-terminal linker length (5 and 20 residues, respectively) does not correlate with the conformational freedom of the protein. For Lypd6, we observed two unequally populated peaks: a major one at 1.7 nm and a weak one at ≈2.2 nm (Figure 2A). This may indicate the ability of GPI-anchored Lypd6 to interact with several membrane targets of different "heights": nAChRs [35] and the Wnt coreceptor LRP6 [23]. Two equally populated peaks were observed in the COM position distribution for Ly6H. The peaks at 1.8 and 2.3 nm form a relatively wide plateau (Figure 2A), indicating very high conformational flexibility of the Ly6H protein despite the very short C-terminal linker. We can speculate that Ly6H may also possess an additional (not yet identified) molecular target besides nAChRs.
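One way to quantify such a bimodal distribution is to fit a sum of two normal distributions to the COM heights. The sketch below uses a plain expectation-maximization loop in NumPy; it is an illustration only and not necessarily the fitting procedure behind Table 1. The function name `fit_two_gaussians`, the initialization from the quartiles, and the fixed iteration count are assumptions.

```python
import numpy as np

def fit_two_gaussians(z, n_iter=200):
    """Fit a two-component 1D Gaussian mixture to COM heights z (nm) by EM.

    Returns weights, means, and standard deviations, ordered by mean.
    """
    z = np.asarray(z, dtype=float)
    # Assumed initialization: means at the quartiles, equal weights
    mu = np.percentile(z, [25, 75])
    sigma = np.full(2, z.std() / 2 + 1e-9)
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample
        dens = w * np.exp(-0.5 * ((z[:, None] - mu) / sigma) ** 2)
        dens = dens / (sigma * np.sqrt(2.0 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and widths
        nk = resp.sum(axis=0)
        w = nk / z.size
        mu = (resp * z[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (z[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-9
    order = np.argsort(mu)
    return w[order], mu[order], sigma[order]
```

The returned weights give the relative populations of the two modes, which is how "major" and "weak" peaks could be distinguished numerically.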

Orientational Preferences
Three rotational angles are usually required to define the orientation of an object completely. However, GPI-anchored proteins are rotationally symmetric around the membrane normal (the Z axis); for this reason, two rotational angles are enough to identify the orientation: tilt (α) and rotation (β). They were calculated through the two mutually perpendicular [...] and loops I and III are equidistant from the membrane. β = 0° means that both N- and C-termini face the membrane (the 'ventral' side of the β-structure is down); β = ±180° corresponds to the case where the 'dorsal' side of the β-structure is down. The one- and two-dimensional histograms of the probability distribution of α/β angles are shown in Figure 2B,C and Figure 4, respectively. The extremal orientations of the GPI-anchored Ly6 proteins in the membrane are provided in Figure 3B,C.
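The exact angle definitions are given in the SI, so the following is only a hedged sketch of how a tilt and a rotation angle can be obtained from two perpendicular body-fixed axes. It assumes a long axis pointing from the protein 'head' to the loop II tip, a cross axis pointing from loop I to loop III, the membrane normal along +z, and an arbitrary sign convention for β; all names are hypothetical.

```python
import numpy as np

def unit(v):
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def tilt_and_rotation(long_axis, cross_axis):
    """Return (alpha, beta) in degrees for one frame.

    long_axis:  assumed vector from the 'head' to the loop II tip
    cross_axis: assumed vector from loop I to loop III
    """
    z = np.array([0.0, 0.0, 1.0])  # membrane normal, pointing away from the bilayer
    a = unit(long_axis)
    # Remove any component along a so that the two axes are exactly perpendicular
    b = np.asarray(cross_axis, dtype=float)
    b = unit(b - np.dot(b, a) * a)
    n = np.cross(a, b)  # normal to the protein plane, taken here as the 'dorsal' face
    # Tilt: elevation of the long axis; negative when the 'fingers' point down
    alpha = np.degrees(np.arcsin(np.clip(np.dot(a, z), -1.0, 1.0)))
    # Membrane normal projected onto the plane perpendicular to the long axis
    u = z - np.dot(z, a) * a
    if np.linalg.norm(u) < 1e-8:
        return alpha, float("nan")  # roll is undefined for a vertical long axis
    u = unit(u)
    w = np.cross(a, u)
    # Rotation about the long axis: 0 when the 'dorsal' face points up
    beta = np.degrees(np.arctan2(np.dot(n, w), np.dot(n, u)))
    return alpha, beta
```

With these conventions, β = 0° corresponds to the 'ventral' side facing the membrane and β = ±180° to the 'dorsal' side facing it, in line with the description above.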
Negative α values predominate for Lynx1, Lynx2, Lypd6B, and Ly6H (Figure 2B, Table 1), which suggests that most of the studied Ly6 proteins in the receptor-unbound state prefer touching the membrane with their 'fingers'. The largest negative tilt (α = −82.3°) was observed for Ly6H (Figure 3B, left), where it was stabilized by the intensive polar and hydrophobic interactions of the loop II tip and N-glycan with the membrane lipids (Table 2). Interestingly, bimodal distributions of the tilt angle α with positive and negative values were observed for Ly6H and Lynx2 (Figure 2B). This indicates that even proteins with short C-terminal linkers (5 and 11 residues, respectively) can switch their 'fingers' orientation from down to up.
In contrast, CD59 had a weak positive tilt (α = 17.8 ± 9.0°), and this configuration was quite stable during MD due to the specific conformation of the C-terminal linker (Figure 3A, left, shown in red), which ran approximately parallel to the membrane surface and prevented both the detachment of the protein's 'head' from the membrane and the tilting of the loop tips toward the membrane surface (Supplementary Movie S6).
The pronounced positive shift of the Lypd6 tilt angle α (Figure 2B) can be explained by the extensive interactions of the protein's 'head' and the highly mobile N-terminal region with the membrane, which lift the Lypd6 'fingers' up (Figure 3B, right; Table 2; Supplementary Movie S3). The significant difference between the Lypd6 and Lypd6B orientations is quite unexpected because their LU domains are ≈60% similar. The difference probably comes from the distinct distribution of the charged residues in the protein 'heads' [11], contributing to dissimilar pharmacology [23,35].
The CD59 orientation differed significantly from that of the other Ly6 proteins. It demonstrated narrow and distinctive distributions of the COM position and tilt/rotation angles (Figure 2). This may reflect the different target of CD59 [19,20,41] as compared to the nAChR ligands (Lynx1, Lynx2, Lypd6, Lypd6B, and Ly6H), despite the generally conserved three-finger fold.
The probability distribution histograms for different tilt and rotation angles for all the proteins studied are shown in Figure 4; these data represent the free energy landscape in two principal coordinates (α; β) and the general preference of different orientations. As one can see, although generally very similar and anchored in the same way, the studied Ly6 proteins exhibit individual behavior.
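The link between a probability histogram and a free energy landscape is commonly made through Boltzmann inversion, F(α, β) = −k_B T ln P(α, β). The sketch below assumes this relation, a temperature of 310 K as used for the production MD, and angle ranges of [−90°, 90°] for α and [−180°, 180°] for β; the function name and binning are illustrative choices, not the procedure used to draw Figure 4.

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def free_energy_surface(alpha, beta, temperature=310.0, bins=36):
    """Boltzmann inversion of a 2D (alpha, beta) histogram, in kJ/mol.

    alpha, beta: per-frame angles in degrees. Samples outside the assumed
    ranges are ignored; unsampled bins get an infinite free energy.
    """
    hist, a_edges, b_edges = np.histogram2d(
        alpha, beta, bins=bins, range=[[-90.0, 90.0], [-180.0, 180.0]]
    )
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):
        f = -K_B * temperature * np.log(prob)
    # Shift so that the most populated bin has zero free energy
    f -= f[np.isfinite(f)].min()
    return f, a_edges, b_edges
```

Because the trajectories are of finite length, rarely visited regions of such a surface are poorly converged and should be read qualitatively.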

Interaction of Ly6 Proteins with Membrane Lipids
Analysis of the MD trajectories revealed long-lived contacts with the membrane lipids not only for the residues of the GPI-anchor, but also for protein amino acid residues and glycans (Tables 2 and S4 in the SI). Various types of contacts were found: ion-ion and ion-dipole interactions, hydrogen bonds, π-cation, and hydrophobic interactions (see Figure S6). There was a large variation in the number of long-lived ionic contacts, from two for Lypd6B to seven (two of them from the long N-terminal region) for Lypd6. Interestingly, Lynx1, the only non-glycosylated protein in this study, formed a significantly greater number of stable polar contacts (ionic and hydrogen bonds) with the membrane than the other proteins, 15 vs. 8-11, respectively (see Table 2). Probably, the absence of glycans ensures a tighter interaction of the polypeptide part of Lynx1 with the lipids. Besides ionic and polar contacts, amino acid residues can simultaneously form up to 6-7 hydrophobic contacts with lipids (see Table S4 in the SI).
Glycans also interacted with the membrane, but not as persistently as amino acids, due to their high mobility. For Lynx2, Ly6H, Lypd6B, and CD59, stable hydrogen bonds and hydrophobic contacts between N- and O-glycans and the membrane lipids were observed. Only in the case of Lypd6 did the glycans not interact with the membrane. In addition, different residues of the GPI-anchors contacted the membrane in all studied systems. Contacts were observed both for the phosphatidylinositol residues (DSPI-1, SAPI-1) directly incorporated into the membrane and for carbohydrate residues, including Man-5 and Man-6, furthest from the membrane (Table 2). Membrane-embedded parts of the GPI-anchor can form up to 25-30 hydrophobic contacts due to the interaction with acyl lipid chains.
The largest number of hydrophobic contacts with lipids was observed for CD59. The contacts were formed by the residues of the protein 'head', loop III, C-terminal linker, and O-glycan (Table 2). Interestingly, while polar contacts predominated in the interaction of other Ly6 proteins with the membrane, hydrophobic contacts predominated for CD59.
According to the 'membrane catalysis' concept [44,45], the binding of a ligand to the membrane can optimize ligand-receptor interactions. The attachment of the GPI-anchor to a soluble three-finger domain could have the following consequences: (1) the partition of a ligand into the appropriate membrane compartment in the vicinity of the target receptor, and (2) the 'pre-orientation' of the three-finger domain that carries the receptor-binding site for optimal interaction with the receptor.
Among the studied Ly6 proteins, the positions of the receptor-binding sites were established only for Lynx1/nAChR [46], Lypd6/LRP6 [23], and the CD59/membrane attack complex [20]. Moreover, most of the Ly6 proteins studied to date (except CD59) interact with target receptors by the loop regions [12,13,17,23,46]. In our MD trajectory, loop II of Lypd6, containing the Asn88-Ser-Ile90 motif responsible for the interaction with LRP6 [23], was lifted high enough above the membrane surface to interact with the receptor (Figure 1D). Thus, the LU domain of Lypd6 is probably pre-oriented for effective receptor binding. A similar situation was observed for CD59: its receptor-binding site lies on the 'dorsal' side of the LU domain near Trp65 and is accessible in the membrane-bound protein. At the same time, some of the CD59 residues located on the edge of the receptor-binding site (e.g., Phe67 and Asn73) contact the lipids (Table 2). Nevertheless, we assume that CD59 is also 'pre-oriented' for receptor binding.
In contrast, the Lynx1 loop II residues participating in the nAChR binding [46] form strong contacts with the membrane lipids (Table 2, Figure 1A). The other Ly6 proteins acting on nAChRs (Lynx2, Lypd6B, and Ly6H) also tended to interact with the membrane by the tip of loop II and loop I (α < 0°, β < 0°). Thus, the LU domains of these proteins are not 'pre-oriented' for optimal receptor binding and should raise their 'fingers' to interact with the ligand-binding site at the receptor.

Data Relevance and Application to In Vivo
Our in silico study provides data on the spatial position of a range of three-finger proteins relative to membrane surfaces. Despite the existence of several powerful experimental methods, it is difficult to obtain such data in vivo or in model systems. For example, determining the position and orientation of a protein above the membrane by fluorescence or EPR spectroscopy requires introducing several fluorescent or paramagnetic labels into the protein and some labels into the membrane. These labels, which are usually large, can significantly disturb the position of the protein and the membrane properties. On the other hand, systems with Ly6/uPAR proteins in model membranes (e.g., in liposomes) are too large for solution NMR studies and are insufficiently 'solid' and ordered to be studied by solid-state NMR. Another method that has become popular in the last few years, cryo-electron microscopy, is also not applicable because Ly6/uPAR proteins are too small for EM studies. Thus, in silico simulations are practically the only method to obtain information about the behavior of such dynamic systems as GPI-anchored proteins.
We performed our calculations using explicit solvent and membrane models. However, GPI-anchored proteins in vivo are surrounded by a complex mixture of various components, including water-soluble and membrane proteins, which can influence their position and orientation, as well as their contacts with lipids. Our system can therefore be considered a model, like those used in experimental methods to simplify the study. For example, the interaction of GPI-anchored Ly6/uPAR proteins (e.g., Lynx1 and Lypd6) with nicotinic acetylcholine receptors (nAChRs) cannot be studied in vivo due to the very heterogeneous environment in the different tissues of the organism. On the other hand, in vitro studies such as electrophysiology measurements are usually done in a controlled environment, represented by a buffer applied to an individual cell or cell patch through a perfusion system. This means that the receptors and GPI-anchored proteins under study do not contact the various molecules present in biological fluids but are submerged in a controlled solution of water and salts. Nevertheless, the validity of in vitro electrophysiology studies is usually not questioned. At the same time, introducing additional interactions into the MD simulations is unlikely to add significant information to that obtained here, because of the wide variety of external conditions and locations of the GPI-anchored Ly6/uPAR proteins. For example, an in silico simulation of Ly6/uPAR proteins in a concentrated solution of acetylcholine or glutamate (conditions that sometimes occur in the synapse) does not give adequate information about the behavior of the proteins in the lung or skin, or even in the synaptic cleft in the absence of a neurotransmitter.
Additionally, although relatively long, our simulation lengths definitely cannot be considered exhaustive, so it should be noted that MD pictures are almost always just a glance through a keyhole at how molecules actually behave.

Systems Preparation
To perform all-atom MD simulations, we set up a series of systems containing Ly6 modulators, considering GPI-anchors, glycosylation, and an explicit membrane/water environment. Systems were built with the CHARMM-GUI software package [47] using the following tools: Membrane Builder, Glycolipid Modeler [48], and Glycan Modeler [49]. Models of full-size, human, mature Ly6 proteins, containing all N- and C-terminal amino acids, which are usually absent from experimental structures (although necessary for GPI-anchor attachment), were taken from the AlphaFold database [50,51] (Table S1). Their RMSD values from the respective experimental structures were all below 1.5 Å, which is in the range of normal RMSD values observed in the MD trajectories; the exception is Ly6H, for which no experimental structure has been determined yet.
To estimate the pKa values of amino acids within the studied Ly6 proteins, we used the PROPKA prediction program [52,53]. We obtained pKa values corresponding to a mainly deprotonated state for Asp, Glu, and N-termini; a mainly protonated state for Arg, Lys, Cys, and Tyr; and a mainly deprotonated state for most of the His residues. The maximal predicted histidine pKa values were 7.05 (for His-69 in CD59) and 7.00 (for His-37 in Lypd6). Accordingly, we uniformly set all histidines as deprotonated. In our case, all C-termini were deprotonated due to amide bond formation with the D-glucosamine residue of the GPI-anchor. Detailed output values from PROPKA predictions can be found in Table S3.
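How a predicted pKa translates into a dominant protonation state can be illustrated with the Henderson-Hasselbalch relation. The sketch below is a generic illustration, not part of the PROPKA workflow; the function name is hypothetical, and the pH of 7.4 in the usage lines is an assumed physiological value that is not stated in the text.

```python
def protonated_fraction(pka, ph):
    """Fraction of a titratable group in its protonated form.

    Follows from the Henderson-Hasselbalch equation:
    [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# Usage with the two highest histidine pKa values quoted above,
# at an assumed pH of 7.4 (assumption, not taken from the text)
for label, pka in [("His-69 (CD59)", 7.05), ("His-37 (Lypd6)", 7.00)]:
    frac = protonated_fraction(pka, ph=7.4)
    print(f"{label}: protonated fraction = {frac:.2f}")
```

A group is mostly deprotonated whenever the pH exceeds its pKa, so the outcome for residues with pKa close to 7 depends on the pH assumed.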
Many Ly6 proteins are glycosylated, which is frequently omitted in modeling studies. To model the N- and O-glycans and the GPI-anchor of CD59, we searched the experimental MALDI-MS and HPLC data and chose the most frequent isoforms [54-57]. Because the exact N-glycan structures of the other Ly6 proteins are currently unknown, we used an isoform widespread in the central nervous system (CNS), according to the N-glycome data in mice [58]. To model the GPI-anchors of Lynx1, Lynx2, Lypd6, and Lypd6B, the structure of the most common isoform in the CNS was used, although their exact structures are also unknown [59].
To build the CD59 system, we used a model bilayer (Table S2) resembling the erythrocyte outer layer, where this protein normally resides [60]. The other systems had a lipid composition characteristic of rafts in neuronal membranes [24,61] (Table S2).
Detailed structures of GPI-anchors and glycans are described in Figure S1. Standard CHARMM36m parametrization supplied by the CHARMM-GUI server was used to describe GPI-anchor [48] and glycan [62,63] behavior.
The production MD calculations for equilibrated systems were performed in an NPT ensemble at 310 K with a V-rescale thermostat [68] and a Parrinello-Rahman barostat [69] with a time step of 2 fs. No position restraints were applied to any molecules during the production MD phase, so the structures of the proteins and their GPI-anchors and glycans were allowed to relax and change.
The details on built systems (box sizes, number of lipid and water molecules) and the total lengths of the calculated trajectories are given in Table S2.

Data Analysis
To perform principal component analysis (PCA), we utilized the GROMACS utilities gmx trjconv, gmx covar, and gmx anaeig. Prior to analysis, the modulator trajectories with the protein backbone fitted in the x- and y- but not the z-components of the box were obtained (-fit rotxy+transxy option in gmx trjconv). Then, covariance matrices were constructed using gmx covar with the -nofit option. For eigenvalue and eigenvector determination, we used the gmx anaeig procedure; protein backbone atoms were used as input for the eigenvector calculation.
To analyze the center of mass (COM) position and the orientation of the proteins during MD, we used GROMACS utilities gmx trjconv, make_ndx, trjcat, and in-house Python scripts, which use NumPy and Matplotlib libraries.
To determine the protein COM position relative to the membrane, we used the gmx make_ndx procedure to define two index groups: amino acids and phosphorus atoms of the upper lipid monolayer. Using the gmx trjcat procedure, we extracted the COM coordinates for both groups. The difference in the Z coordinate provides the position of the protein COM relative to the membrane.
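The final subtraction can be sketched in NumPy as follows. This is an illustration rather than the in-house script: the function name and array layouts are assumptions, and periodic boundary effects are assumed to have been removed from the coordinates beforehand.

```python
import numpy as np

def com_height(protein_xyz, protein_mass, phosphorus_xyz):
    """Height of the protein COM above the phosphate plane for every frame.

    protein_xyz:    (n_frames, n_atoms, 3) coordinates of the amino acid atoms
    protein_mass:   (n_atoms,) atomic masses
    phosphorus_xyz: (n_frames, n_p, 3) phosphorus atoms of the upper monolayer
    """
    # Mass-weighted mean of the z coordinates gives the z-component of the COM
    com_z = np.average(protein_xyz[..., 2], axis=1, weights=protein_mass)
    # All phosphorus atoms have the same mass, so their COM is a plain mean
    p_z = phosphorus_xyz[..., 2].mean(axis=1)
    return com_z - p_z
```

The resulting per-frame heights are the quantity whose distribution and median are discussed in the COM analysis.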
To determine the protein orientation with respect to the membrane, we performed structure superposition using PyMOL and sequence alignment using Jalview [70] and Cys-Bar [71]. On this basis, we then selected four groups of amino acid residues to establish the orientation angles. The exact definitions of the orientational angles with respect to the membrane and the ways of calculating them are described in the SI.

Lipid Contacts
For analysis of intermolecular contacts between the Ly6 proteins and membrane lipids, we utilized our in-house IMPULSE software package (cont_stat.js procedure) [72]. Hydrophobic contacts were determined according to the concept of molecular hydrophobic potential (MHP) [73].
A complete list of all found contacts is available in Supplementary Table S4. The values in the table are the relative lifetimes of interactions for all types of lipids in total and separately (cholesterol, sphingomyelins, phosphatidylcholines, and phosphatidylethanolamines). Relative lifetime means the fraction of the MD trajectory where the corresponding contact with a lipid was observed: the value 0 corresponds to the complete absence of contact; the value 1 corresponds to the presence of one contact throughout the entire trajectory. If more than one contact of the corresponding protein group with the lipid was observed during MD, the relative lifetime can exceed 1.
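The definition of relative lifetime given above can be written as a short calculation. The sketch below is not the IMPULSE implementation; it assumes that contact detection has already produced a boolean table with one row per frame and one column per individual contact of a given protein group, and the function name is hypothetical.

```python
import numpy as np

def relative_lifetime(contact_present):
    """Relative lifetime of the contacts formed by one protein group.

    contact_present: boolean array (n_frames, n_contacts); True where an
    individual contact exists in a frame. The result is the mean number of
    simultaneous contacts per frame, so it can exceed 1.
    """
    present = np.asarray(contact_present, dtype=bool)
    return present.sum() / present.shape[0]
```

For example, a single contact present in every frame gives 1, while two contacts that are both present throughout the trajectory give 2, which corresponds to 200% on the scale used for the hydrophobic contacts.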

Conclusions
In summary, the behavior of GPI-anchored Ly6 proteins in the receptor-unbound state is quite complex and is determined not only by the anchoring to the membrane but also by the presence and position of N- and O-glycans and the ability of individual protein regions to interact with the membrane lipids. The relative position and dynamics of the Ly6 proteins depend only weakly on the length of the C-terminal linker connecting the LU domain with the GPI-anchor. GPI-anchoring does not guarantee the optimal pre-orientation of the LU domain required for the receptor interaction. The obtained results are valuable for the ongoing research on the regulatory proteins of the Ly6/uPAR family.