Selenourea for Experimental Phasing of Membrane Protein Crystals Grown in Lipid Cubic Phase

Abstract: Heavy-atom soaking has been a major method for experimental phasing, but it has been difficult for membrane proteins, partly owing to the lack of available sites in the scarce soluble domain for non-invasive heavy-metal binding. The lipid cubic phase (LCP) has proven to be a successful method for membrane protein crystallization, but experimental phasing with LCP-grown crystals remains difficult, and so far, only 68 such structures have been phased experimentally. Here, selenourea was tested as a soaking reagent for the single-wavelength anomalous dispersion (SAD) phasing of crystals grown in LCP. Using a single crystal, the structure of the glycerol 3-phosphate acyltransferase (PlsY, ~21 kDa), a very hydrophobic enzyme with 80% membrane-embedded residues, was solved. Remarkably, a total of 15 Se sites were found in the two monomers of PlsY, translating to one selenourea-binding site per six residues in the accessible extramembrane protein. Structure analysis reveals that surface-exposed selenourea sites are mostly contributed by mainchain amides and carbonyls. This low-specificity binding pattern may explain its high loading ratio. Importantly, both the crystal diffraction quality and the LCP integrity were unaffected by selenourea soaking. Taken together, selenourea is a promising and generally useful reagent for heavy-atom soaking of membrane protein crystals grown in LCP.


Introduction
In X-ray crystallography, experimental phasing has been a major method for the structural determination of macromolecules with novel folds. The recent advances in artificial intelligence-based structural predictions, represented by AlphaFold [1] and RosettaFold [2], make it possible to bypass experimental phasing even for unknown folds because of the ability to generate prediction models with enough accuracy for molecular replacement (MR) [3]. Because the quality of the predicted models depends on available training datasets, experimental phasing may still be required for the crystal structure determination of proteins that lack suitable learning models for structure prediction. Membrane proteins may represent such a class because of the relatively low number of unique structures. According to the Membrane Protein with Known 3D Structure database (https://blanco.biomol.uci.edu/mpstruc), there are only 1458 unique structures of membrane proteins as of 31 May 2022. Among these, some are homologs from different species, and nearly 10% belong to bacteriorhodopsins or structurally related rhodopsins/G-protein coupled receptors (GPCRs) [4]. Furthermore, some membrane proteins undergo conformational changes so drastic that molecular replacement with an alternative conformation can fail [5]. Therefore, experimental phasing, although not as in demand as before, may still be needed in the structural biology of membrane proteins.
[Figure caption, continued] […] [42][43][44] using the ARCIMBOLDO_LITE algorithm [45] and molecular replacement with models constructed by evolutionary coupling [46,47], as shown in (B). Brackets indicate the number of structures in each category. (B) Statistics of structures solved by non-MR methods. In the case of experimental phasing, the heavy atoms are listed. The total number (72) exceeds the 68 in (A) by four because four structures used two heavy atoms each for experimental phasing.
Here, we demonstrate that Se-urea is a compatible and effective heavy-atom compound for the experimental phasing of crystals grown in LCP. With a single crystal, the structure of PlsY was successfully determined by Se-SAD. The results should encourage the use of Se-urea for experimental phasing with LCP-grown membrane protein crystals.

Purification of PlsY
PlsY from Aquifex aeolicus (UniProt O66905) was overexpressed in Escherichia coli BL21 (DE3) cells with a C-terminal green fluorescent protein (GFP) fusion for easier monitoring of the expression and purification process [48], as described [41]. The expression was induced with 0.05 mM isopropyl β-D-1-thiogalactopyranoside using cells growing in M9 minimal medium at 20 °C for 16 h. Cells were lysed by passing through a cell disrupter at 20 kpsi three times at 4 °C. The cell lysate was clarified by centrifugation at 20,000× g for 30 min. The supernatant was further centrifuged at 150,000× g for 1.5 h to isolate the membrane fraction. The pellets were then solubilized with 1% (w/v) dodecyl maltoside at 4 °C for 1 h with mild agitation. The mixture was heated at 65 °C for 30 min, cooled to room temperature (22-25 °C) under tap water, and clarified by centrifugation at 20,000× g for 30 min at 4 °C. The supernatant containing solubilized PlsY was mixed with Ni-NTA beads for 1 h in the presence of 10 mM imidazole in Buffer A (150 mM NaCl, 10% (v/v) glycerol, 0.1 mM EDTA, 50 mM Tris-HCl pH 8.0). The beads were packed into a gravity column and washed successively with 10 and 45 mM imidazole in Buffer A. The fusion protein was eluted with 250 mM imidazole in Buffer A, desalted, and digested with 3C protease overnight at room temperature. The mixture was loaded onto a second Ni-NTA column to remove His-tagged GFP and His-tagged 3C protease. The flowthrough containing tag-free PlsY was concentrated to ~10 mg mL⁻¹ and loaded onto a Superdex 200 10/300 GL column connected to a Bio-Rad FPLC system for size exclusion chromatography. Peak fractions were pooled, concentrated to 20 mg mL⁻¹, flash-frozen in liquid nitrogen, and stored at −80 °C until use.

Crystallization of Lysozyme in Lipid Cubic Phase
Crystallization of lysozyme was performed manually as described [49]. Briefly, lysozyme was dissolved at 50 mg mL⁻¹ in Milli-Q H2O. LCP was made by mixing the lysozyme solution with a 1.5-fold volume of monoolein in two coupled syringes [50]. Optically clear LCP was transferred to a 10-µL microsyringe (Cat. 80330, Hamilton, Reno, NV, USA) and pre-loaded into a repeat dispenser [51] with an engineered bushing [52]. On a microscope slide with wells created by double-sticky tape [53], 200 nL of LCP was carefully deposited onto the surface of each well. The LCP bolus was immediately overlaid with 1 µL of precipitant solution containing 35% (v/v) PEG 400, 0.8 M NaCl, and 100 mM sodium citrate/citric acid pH 4.5. The wells were sealed using a cover slide 18 mm × 18 mm in dimension and 0.1 mm in thickness. Crystals generally grew to ~20 µm in length after 16 h.
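The mixing ratio and bolus volume above imply how much protein ends up in each well. The following is a minimal sketch of that arithmetic, assuming that the volumes of the lysozyme solution and monoolein are simply additive (an approximation not stated in the text); the function names are illustrative only.

```python
# Rough estimate of the protein content of one LCP bolus.
# Assumption: aqueous and lipid volumes are additive on mixing.

PROTEIN_CONC_MG_PER_ML = 50.0          # lysozyme stock concentration (from the text)
LIPID_TO_SOLUTION_VOLUME_RATIO = 1.5   # monoolein : protein solution (from the text)
BOLUS_VOLUME_NL = 200.0                # LCP volume dispensed per well (from the text)


def aqueous_volume_fraction(lipid_to_solution_ratio: float) -> float:
    """Fraction of the LCP volume that is protein solution."""
    return 1.0 / (1.0 + lipid_to_solution_ratio)


def protein_mass_ug(conc_mg_per_ml: float, solution_volume_nl: float) -> float:
    """Protein mass in micrograms; 1 mg/mL equals 0.001 ug/nL."""
    return conc_mg_per_ml * 1e-3 * solution_volume_nl


fraction = aqueous_volume_fraction(LIPID_TO_SOLUTION_VOLUME_RATIO)
solution_nl = BOLUS_VOLUME_NL * fraction
mass_ug = protein_mass_ug(PROTEIN_CONC_MG_PER_ML, solution_nl)
print(f"aqueous fraction {fraction:.0%}, {solution_nl:.0f} nL solution, "
      f"{mass_ug:.1f} ug lysozyme per bolus")
```

Under these assumptions each 200 nL bolus carries about 80 nL of protein solution, that is, roughly 4 µg of lysozyme.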

Soaking Lipid Cubic Phase Crystals with Se-Urea
To soak the crystals, a small window (Figure 2) was created for the desired wells using a pointed glass cutter following a previously published procedure [36]. A tiny speck of Se-urea was added to the drop through the small window under a microscope. The dimensions of the specks, as measured under a microscope, were typically 0.4 mm × 0.2 mm. Because the thickness of the spacer between the cover and the base plate was 0.12 mm, the volume of the specks was typically 0.008 mm³. Thus, the specks had an estimated mass of 10 µg, assuming that Se-urea has a density similar to that of urea (1.32 g/cm³). This translates to a final concentration of ~0.1 M in the precipitant solution (800 nL). The well was resealed using Scotch tape. After soaking the lysozyme crystals for 6 min and the PlsY crystals for 22 min, the wells were cut open using a glass cutter. Crystals were harvested with a loop (Cat. M2-L18SP-50, MiTeGen, Ithaca, NY, USA) and cryo-cooled directly in liquid nitrogen as described [56].
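The mass and concentration estimates above can be reproduced with a few lines of arithmetic. This is a sketch only: the molar mass of selenourea (CH4N2Se, about 123.02 g/mol) is a value supplied here rather than taken from the text, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the Se-urea soaking concentration.

SPECK_VOLUME_MM3 = 0.008        # typical speck volume (from the text)
DENSITY_G_PER_CM3 = 1.32        # density of urea, assumed equal for Se-urea (as in the text)
MOLAR_MASS_G_PER_MOL = 123.02   # assumption: molar mass of selenourea, CH4N2Se
DROP_VOLUME_NL = 800.0          # precipitant volume (from the text)


def speck_mass_ug(volume_mm3: float, density_g_per_cm3: float) -> float:
    """Mass in micrograms; 1 g/cm^3 equals 1000 ug/mm^3."""
    return volume_mm3 * density_g_per_cm3 * 1000.0


def molar_concentration(mass_ug: float, molar_mass: float, volume_nl: float) -> float:
    """Concentration in mol/L once the speck has fully dissolved."""
    moles = mass_ug * 1e-6 / molar_mass
    litres = volume_nl * 1e-9
    return moles / litres


mass = speck_mass_ug(SPECK_VOLUME_MM3, DENSITY_G_PER_CM3)
conc = molar_concentration(mass, MOLAR_MASS_G_PER_MOL, DROP_VOLUME_NL)
print(f"speck mass ~ {mass:.1f} ug, final concentration ~ {conc:.2f} M")
```

With these inputs the speck weighs about 10.6 µg and the final concentration is about 0.11 M, in line with the rounded values of 10 µg and ~0.1 M quoted above.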

X-ray Diffraction Data Collection
X-ray diffraction data were collected at the Shanghai Synchrotron Radiation Facility (SSRF): on beamline BL17U1 [57] for the lysozyme crystal, with an Eiger 16M detector (Dectris, Baden, Switzerland) at a wavelength of 0.97918 Å, and on beamline BL19U1 [58] for the PlsY crystal, with a Pilatus 6M detector (Dectris, Baden, Switzerland) at a wavelength of 0.97907 Å. Data were collected with a 0.5° oscillation, a total rotation range of 360°, and a beam size of 50 μm × 50 μm.
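As a quick sanity check on these collection parameters, the sketch below converts the two wavelengths to photon energies and counts the frames implied by the oscillation width. The conversion constant hc ≈ 12.3984 keV·Å is a physical constant; the position of the Se K absorption edge (about 12.658 keV) is a tabulated value supplied here for comparison and is not stated in the text.

```python
# Convert the data-collection wavelengths to photon energies and count frames.

HC_KEV_ANGSTROM = 12.398419843   # h*c in keV*Angstrom
SE_K_EDGE_KEV = 12.658           # assumption: tabulated Se K edge, for comparison only


def photon_energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for a wavelength given in Angstrom."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def number_of_frames(total_rotation_deg: float, oscillation_deg: float) -> int:
    """Number of images needed to cover the rotation range."""
    return round(total_rotation_deg / oscillation_deg)


for name, wavelength in (("lysozyme", 0.97918), ("PlsY", 0.97907)):
    energy = photon_energy_kev(wavelength)
    print(f"{name}: {energy:.4f} keV "
          f"({(energy - SE_K_EDGE_KEV) * 1000:.1f} eV above the assumed Se K edge)")

print("frames per data set:", number_of_frames(360.0, 0.5))
```

Both wavelengths correspond to energies a few electronvolts above the assumed Se K edge, which is where a strong anomalous signal from selenium is expected, and each data set comprises 720 frames.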

Data Processing and Structure Determination
Crystallographic data were processed with HKL2000 [59] using "auto-correction" during scaling (Table 1). The scaled data with separated Friedel pairs were subjected to SHELXC [60] to calculate the anomalous difference signal (<d"/sig>) and generate the input files for SHELXD [60]. Selenium atoms were located by SHELXD with 1000 trials and a resolution cut-off of 2.5 Å for lysozyme and 3.0 Å for PlsY. The substructure refinement, density modification, and initial chain tracing were carried out by SHELXE [60]. Model building was carried out by BUCCANEER [61] in the CCP4 package [62]. The Se-urea molecules were placed according to the anomalous difference map, and the lipids bound to PlsY were modeled according to the Fo-Fc and 2Fo-Fc maps in COOT [63]. The models were refined by REFMAC5 and manually adjusted in COOT iteratively. The structures were illustrated with PyMOL [64].


Both Lipid Cubic Phase and PlsY Tolerated Se-Urea Soaking
Heavy-atom derivatives are only useful if the crystals survive soaking. Thus, the stability of crystals upon soaking should first be visually checked under a microscope. For LCP crystals, there is another complication. Various chemicals, when at high concentrations, can cause a phase transition from the cubic mesophase to the lamellar phase [65][66][67][68]. The bulk lamellar phase has a characteristic strong birefringence [53] under polarized light. It may impose stress on crystals, and it certainly interferes with crystal observation and harvesting [56]. Furthermore, the lamellar phase can compromise crystal diffraction because it shows strong diffraction rings under X-rays [68]. Therefore, it is important to test the stability of both the LCP and the crystals, and to gauge conditions such as soaking concentration and time, before soaking the best-quality crystals, especially when such crystals are difficult to obtain.
The tolerance of LCP and PlsY crystals to Se-urea soaking was tested using crystals that were relatively small in all three dimensions (thin plates) (Figure 2). A well containing PlsY crystals was cut open. Se-urea specks were added to the precipitant solution that contained 32% (v/v) triethylene glycol (Figure 2A). Se-urea was completely dissolved after 22 min. Crystals were visually intact during this period (Figure 2B,C), suggesting that PlsY crystals were tolerant to the osmolarity changes caused by the dissolution of Se-urea. Like urea, Se-urea may have chaotropic characteristics. However, the concentration (estimated to be 0.1 M, see Methods) was probably not high enough to denature PlsY.
LCP is known to tolerate saturating concentrations of 'normal' urea [69]. By extension, it is expected to be compatible with high concentrations of Se-urea. Consistently, no bulk lamellar phase (which shows characteristic birefringence under polarized light) [53] was observed during the soaking (Figure 2D,E). Nevertheless, the insensitivity of LCP to Se-urea observed here should not be generalized. Crystallization conditions vary in type and concentration of precipitants, detergents, lipids, and membrane proteins, and the phase stability may depend on these conditions. Therefore, we recommend running the compatibility test for new projects.

Proof-of-Principle Experiment with Lysozyme
The idea of using Se-urea for LCP-grown crystals emerged in discussions between the authors during the ICCBM17 workshop and conference in Shanghai. As part of the workshop demonstration, lysozyme crystals were grown in LCP, and heavy-atom derivatives of lysozyme crystals were needed for the integrated phasing workshop. Driven by common interests, the authors performed a quick soaking of lysozyme crystals (Figure 3A) using Se-urea. The crystals diffracted beyond 1.71 Å at the synchrotron without further optimization of the detector distance during the workshop training session (Table 1). Although only weak anomalous signals were indicated by the Chi² versus resolution plot from HKL2000 data scaling (Figure 3B), the anomalous signal <d"/sig> estimated by SHELXC was very strong within 3.0 Å (Figure 3C). The sub-structures of Se atoms were determined using SHELXD, and six sites with occupancy higher than 0.3 were revealed (Figure 3D,E). After density modification and poly-Ala tracing using SHELXE, 99 residues were built in six chains (Figure 3F).

The initial model was subjected to BUCCANEER for rebuilding. The final structure of the lysozyme and Se-urea complex is presented with the Se-urea molecules close to the asymmetric unit of lysozyme (Figure 4A). The occupancy of the six sites ranged from 0.33 to 0.5, suggesting some level of disorder. In addition, although the positions of the Se atoms were supported by the anomalous map, the positions of the amine groups and the connecting carbon atoms were less certain based on the 2Fo-Fc density maps (Figure 4B-F), suggesting high degrees of flexibility. Nevertheless, we built the Se-urea molecules with 2Fo-Fc density maps at 0.6-0.7 σ levels, and we discuss the interactions between lysozyme and Se-urea based on the built model.
[Figure 4 caption, continued] Interacting residues are shown as sticks with yellow carbon atoms for residues within the monomer, or with grey carbon atoms for residues from crystallographic symmetry mates. Dashed lines indicate H-bonding with distances shown in Å. Distances are omitted for the amine groups that did not show clear densities. Dash coloring is explained in the box in (F): black, interactions involving mainchain groups; red, interactions involving sidechain groups; cyan, interactions between bridging waters and protein residues; green, interactions between Se-urea molecules. A prime symbol labels amino acids from symmetry mates. We would note that, although the position of the Se atom was supported by the anomalous map, the position of the two amine groups was less certain and they were built with 2Fo-Fc density maps at a contour level of 0.6-0.7 σ. In (B-F), occupancies of the sites are indicated in brackets. Water molecules are shown as red spheres. (G) Comparison of Se-urea binding sites in lysozyme crystals grown in LCP (white ribbon for protein and black sphere for Se) and those in solution (yellow ribbon for protein and orange sphere for Se) (PDB ID 5T3F) [39]. Blue, overlapping sites (SA/SB/SC/SF); green, symmetric equivalent sites (SE/Siii); black, the site unique to the LCP structure (SD); red, sites unique to the solution structure (Si/Sii/Siv/Sv).
None of the sites were found in the stacked helices. Instead, they were concentrated at the surface and the active site cleft. Three sites (SA, SB, SC) were contained in the crystallographic monomer and three were involved in crystal contact (SD, SE, SF, Figure 4A-F). The binding mode exhibited great diversity: Se-urea could bind to all secondary structures (loop, sheet, and helix). The binding mode of SC was particularly interesting. Instead of forming a pocket lined by surrounding residues, three residues along the same face of the helix form a 'hook'-like structure to host Se-urea (Figure 4D). Two mainchain carbonyl groups interacted with Se-urea indirectly via two water molecules, and Lys97, one turn away, provided a further hydrogen bond. This is very encouraging: such a configuration may be easily satisfied because α-helices are very abundant, and mainchain interactions are relatively insensitive to sequence variations.
Crystals 2022, 12, 976
The previous lysozyme structure (PDB 5T3F) soaked with Se-urea contains nine Se sites [39]. When overlaid, four of the sites identified in this study overlapped with those previously observed, and one was a symmetry-equivalent Se site with the C(NH2)2 group pointing in the opposite direction (Figure 4G), leaving one site unique to the LCP structure and four sites unique to the structure from crystals grown in solution. The differences are unlikely to be caused by crystal packing because the two protein structures are almost identical with the same packing pattern. Different diffusion rates were not likely the cause of the differences because unique sites were observed for both structures. The diffraction from the small Se-urea-soaked crystal grown in LCP (~20 µm × 20 µm) was weaker than that from the large crystals used for 5T3F (~300 µm × 100 µm). In addition, the lower resolution of the LCP data set at 1.71 Å provided fewer details of the NH2 groups compared to the electron density maps of 5T3F at 1.45 Å.

Experimental Phasing of PlsY
Next, we wanted to test how Se-urea behaves with membrane protein crystals grown in LCP. One of the major differences between membrane proteins and soluble proteins is the size of the exposed region available for binding water-soluble chemicals such as the Se-urea used in this study. To expand the application of Se-urea to membrane proteins, it would be ideal to use proteins with very few exposed regions. PlsY fits this purpose because 80% of its residues are embedded in the membrane [41].
Using the soaking procedure, the PlsY structure was solved with 360 degrees of data collected from a single crystal (Figure 5A). The crystal diffracted to 1.8 Å (Table 1). Similar to the lysozyme case, the anomalous signal from HKL2000 scaling was weak (Figure 5B) but could be extended to 3.5 Å as calculated by SHELXC (Figure 5C). The sub-structures of Se atoms were determined using SHELXD, and 11 sites with occupancy higher than 0.3 were revealed (Figure 5E). After density modification and poly-Ala chain tracing by SHELXE, 373 residues were built in nine chains (Figure 5F).

The final structure was built by BUCCANEER with 15 Se-urea molecules scattered in two non-crystallographic symmetry (NCS) monomers (Figure 6A). The sites were all located in the extra-membrane domain (Figure 6B), as expected. Among the 15 sites, six were involved in crystal packing (S1, S2, S7, and their NCS pairs, Figure 6C-I). The binding sites were almost identical between the two NCS monomers with two exceptions as follows. Site 8 (S8) was only in monomer A (Figure 6A), and Site 3 (S3 and S3′) displayed slight differences between the two monomers (Figure 6E).
[Figure 6 caption, continued] […] involving mainchain groups; red, interactions involving sidechain groups; cyan, interactions between bridging water molecules and protein residues; green, interactions between Se-urea and other ligands. Residues from monomer A are shown as sticks with yellow carbon and labeled with residue numbers. Residues from monomer B are shown as sticks with grey carbon and labeled with residue number plus the chain ID. Residues from adjacent crystal packing residues are shown as sticks with grey carbon atoms and labeled with a prime. We would note that, although the position of the Se atom was supported by the anomalous map, the positions of one or both amine groups for S2, S3, S4, S5, and S6 were less certain and they were built with 2Fo-Fc density maps at a contour level of 0.6-0.7 σ.
Among the 15 sites, eight had occupancies greater than 0.50 (S1, S7, S8, S1′, S2′, S3′, S6′, and S7′) and three had full occupancies (S1/S1′/S8) (Figure 6A). Compared with the lysozyme structure (Figure 4), the densities for the amine groups of Se-urea in the PlsY structure were more defined, suggesting that they are more ordered. Similar to the case of the lysozyme, the mainchain amide and carbonyl groups contributed significantly to the binding of Se-urea in PlsY (Figure 6C-I). The interaction pattern involving mainchain atoms for Se-urea has been observed before for the urea transporter [70] and the targets used in the previous Se-urea phasing report [39]. These results suggest that Se-urea can relatively easily bind to shallow grooves on the surface owing to its small size.
Given the small exposed region in PlsY, the loading ratio of Se-urea sites is very high. One PlsY monomer contains approximately 40 exposed residues, meaning that the PlsY crystal (15 sites over ~80 exposed residues in the two monomers) bound ~18.8 Se-urea molecules per 100 residues in the accessible extramembrane region. The lysozyme (129 residues) crystal bound six Se-urea molecules, corresponding to a coverage of 4.7%. Therefore, the density of Se-urea binding sites in the non-membrane-embedded region of PlsY was fourfold that of lysozyme.
One of the objectives of this project was to check whether Se-urea soaking can give enough signal for experimental phasing of membrane proteins, which generally contain limited hydrophilic surfaces available for heavy-atom labeling. Traditionally, the rule of thumb for Se-Met labeling has been that a protein should contain one Se-Met per 50-75 residues (1.3-2%). Technological advances over the years, including better detectors and data collection strategies, have pushed the limit to one Se site per 150-200 residues (0.5-0.7%) [71]. Taking the full-length protein into consideration, the coverage for PlsY was 3.8%. This is much higher than the abovementioned minimum coverage required for Se-Met labeling. Given that PlsY is very hydrophobic with 80% membrane-embedded residues, and that membrane proteins generally contain a larger hydrophilic proportion than this, coverage is unlikely to be an issue for Se-urea phasing of most membrane protein crystals.
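The coverage figures quoted in the two paragraphs above follow from simple ratios, sketched below. The site counts, the ~40 exposed residues per monomer, and the 129 residues of lysozyme come from the text; the chain length of 197 residues per PlsY monomer is an assumption made here (taken from the C-terminal residue Val197 mentioned in the text) to reproduce the 3.8% full-length figure.

```python
# Se-site coverage expressed as sites per 100 residues.

def sites_per_100_residues(n_sites: int, n_residues: int) -> float:
    """Number of heavy-atom sites per 100 residues."""
    return 100.0 * n_sites / n_residues


PLSY_SITES = 15                    # Se-urea sites in the two NCS monomers (from the text)
PLSY_MONOMERS = 2
PLSY_EXPOSED_PER_MONOMER = 40      # approximate, from the text
PLSY_RESIDUES_PER_MONOMER = 197    # assumption: inferred from C-terminal Val197
LYSOZYME_SITES = 6
LYSOZYME_RESIDUES = 129

plsy_exposed = sites_per_100_residues(PLSY_SITES, PLSY_MONOMERS * PLSY_EXPOSED_PER_MONOMER)
plsy_full = sites_per_100_residues(PLSY_SITES, PLSY_MONOMERS * PLSY_RESIDUES_PER_MONOMER)
lysozyme = sites_per_100_residues(LYSOZYME_SITES, LYSOZYME_RESIDUES)

# Se-Met rules of thumb quoted in the text, as sites per 100 residues.
semet_traditional = (sites_per_100_residues(1, 75), sites_per_100_residues(1, 50))
semet_modern = (sites_per_100_residues(1, 200), sites_per_100_residues(1, 150))

print(f"PlsY, exposed region: {plsy_exposed:.2f} per 100 residues")
print(f"PlsY, full length:    {plsy_full:.2f} per 100 residues")
print(f"Lysozyme:             {lysozyme:.2f} per 100 residues")
print(f"PlsY exposed / lysozyme ratio: {plsy_exposed / lysozyme:.1f}")
print(f"Se-Met traditional range: {semet_traditional[0]:.2f}-{semet_traditional[1]:.2f}")
print(f"Se-Met modern limit:      {semet_modern[0]:.2f}-{semet_modern[1]:.2f}")
```

Even on a full-length basis, the PlsY coverage under these assumptions sits above the upper end of the traditional Se-Met range.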
Because the success of experimental phasing depends on data resolution, a good soaking reagent should not compromise the diffraction quality of crystals. The two examples here showed that Se-urea was not invasive under the present conditions. Both crystals survived soaking at an estimated concentration of 0.1 M, and both diffracted to a quality similar to that of their native counterparts [26,41,54]. The generality of this will have to be tested with other membrane protein crystals in the future.
Initial LCP crystals of membrane proteins are generally small (5-30 µm) [72][73][74]. Optimization toward bigger crystals for routine synchrotron diffraction experiments may involve the fine-tuning of constructs, precipitant conditions, temperature, host lipids, and native lipid additives [72][73][74][75][76][77][78], a process that can be time- and resource-consuming. Recent advances in micro-beam synchrotron radiation and detectors [15,79], serial crystallography [80], and X-ray free-electron lasers [80,81] make it possible to obtain high-resolution diffraction data from multiple microcrystals or sub-micron crystals. Furthermore, microcrystals may be more tolerant to soaking compared with large crystals [82]. Therefore, the application of Se-urea soaking to LCP microcrystals for rapid phasing warrants future investigation.
Twinned crystals can be problematic for phasing [83]. Initially, the diffraction data of PlsY were processed in the C222₁ space group with an overall Rmerge of 0.101, which was slightly higher than that in the P2₁ space group (0.094). Still, at first, the C222₁ data set appeared to be justified. The anomalous signal <d"/sig> obtained from SHELXC (Figure 7A) was much higher than that of the data processed in the P2₁ space group (Figure 5B), probably owing to the higher redundancy in C222₁. In addition, the Se atoms were successfully located by SHELXD (Figure 7B), and the poly-Ala model was successfully traced by SHELXE and auto-built by BUCCANEER using the C222₁ data set. However, symmetry-related clashes were observed during structure refinement (Figure 7C) for Val197 at the C-terminus of PlsY. Therefore, the data were reprocessed in the P2₁ space group, which has lower symmetry. Although a later L-test for twinning indicated that the crystal was close to untwinned (Figure 7D), an H-test reported a twinning fraction of 0.40 (Figure 7E); such a discrepancy is unusual. To resolve the clashes in the model, we chose to refine against the P2₁ data with a twin fraction of 0.436 estimated by REFMAC5. Indeed, when refined using the P2₁ data set, the structure showed well-resolved densities in the otherwise problematic region. The clearer densities in the P2₁ space group showed no clashes at the C-terminus, and the two NCS monomers assumed different conformations in this region, more specifically at residue Val197 (Figure 7F), supporting the existence of twinning. Taken together, the twinning of the PlsY crystal did not affect phasing.
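For readers unfamiliar with the H-test, the sketch below illustrates the standard relation between the twin fraction α and the mean of |H| for acentric, twin-related reflection pairs, ⟨|H|⟩ = 1/2 − α, where H = |I1 − I2|/(I1 + I2). This is a generic illustration of that textbook relation, not the procedure or software output used by the authors; the three intensity pairs are hypothetical values chosen only to make the arithmetic visible.

```python
# Illustration of the H-test relation <|H|> = 1/2 - alpha (acentric reflections).

from statistics import mean


def h_value(i_1: float, i_2: float) -> float:
    """H for one pair of twin-related intensities."""
    return abs(i_1 - i_2) / (i_1 + i_2)


def twin_fraction_from_mean_abs_h(mean_abs_h: float) -> float:
    """Twin fraction alpha implied by the mean |H| of acentric data."""
    if not 0.0 <= mean_abs_h <= 0.5:
        raise ValueError("mean |H| must lie between 0 and 0.5")
    return 0.5 - mean_abs_h


def mean_abs_h_from_twin_fraction(alpha: float) -> float:
    """Inverse relation: expected mean |H| for a given twin fraction."""
    if not 0.0 <= alpha <= 0.5:
        raise ValueError("twin fraction must lie between 0 and 0.5")
    return 0.5 - alpha


# Hypothetical twin-related intensity pairs, for illustration only.
pairs = [(120.0, 80.0), (55.0, 45.0), (100.0, 100.0)]
observed_mean_h = mean(h_value(a, b) for a, b in pairs)

print(f"mean |H| = {observed_mean_h:.2f}")
print(f"implied twin fraction = {twin_fraction_from_mean_abs_h(observed_mean_h):.2f}")
print(f"mean |H| expected for alpha = 0.40: {mean_abs_h_from_twin_fraction(0.40):.2f}")
```

In this relation, a twin fraction of 0.40 corresponds to a mean |H| of about 0.10, whereas untwinned data give a mean |H| of 0.5.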
In summary, we described the methods and results of soaking LCP crystals with Se-urea for experimental phasing. The results showed that Se-urea was indeed versatile with regard to binding motifs and interaction mode. The successful application of Se-urea for the membrane enzyme PlsY should encourage its wide usage in experimental phasing.