Adding Size Exclusion Chromatography (SEC) and Light Scattering (LS) Devices to Obtain High-Quality Small Angle X-Ray Scattering (SAXS) Data

We describe the updated size-exclusion chromatography small angle X-ray scattering (SEC-SAXS) set-up used at the P12 bioSAXS beam line of the European Molecular Biology Laboratory (EMBL) at the PETRA III synchrotron, DESY Hamburg (Germany). The addition of size exclusion chromatography (SEC) directly on-line to the SAXS capillary has become a well-established approach to reduce the effects of the sample heterogeneity on the SAXS measurements. The additional use of multi-angle laser light scattering (MALLS), UV absorption spectroscopy, refractive index (RI), and quasi-elastic light scattering (QELS) in parallel to the SAXS measurements enables independent molecular weight validation and hydrodynamic radius estimates. This allows one to address sample monodispersity as well as conformational heterogeneity. The benefits of the current SEC-SAXS set-up are demonstrated on a set of selected standard proteins. The processed SEC-SAXS data and models are provided in the Small Angle Scattering Biological Data Bank (SASBDB) and are labeled as “bench-marked” datasets that include the unsubtracted data frames spanning the respective SEC elution profiles and corresponding MALLS-UV-RI-QELS data. These entries provide method developers with datasets suitable for testing purposes, in addition to an educational resource for SAS data analysis and modeling.


Introduction
Solution-state small angle X-ray scattering (SAXS) is employed to obtain information on the size, shape, structure, oligomerization state, interactions, conformational heterogeneity, flexibility, and intrinsic disorder of biological macromolecules. It is increasingly used as part of the modern integrative structural biology toolkit that synthesizes the results obtained from both high- and low-resolution techniques (SAXS combined with X-ray crystallography, electron microscopy, nuclear magnetic resonance spectroscopy, etc.) to build more complete molecular models and develop more adequate descriptions of the structural state(s) of biomolecules [1][2][3][4]. As is the case for many structural methods, the bottleneck of a SAXS experiment is often the preparation of high-quality samples [5,6]. SAXS profiles measured from solution samples are influenced by the summed X-ray contrast- and volume fraction-weighted contributions from each and every particle in the beam; i.e., the SAXS intensities are sensitive to the population of states in solution. Consequently, the (often unpreventable) formation of higher oligomeric species or even aggregation over time remains the major challenge for sample preparation. Even though larger trace contaminants may be present at low concentration, they can strongly affect the scattering pattern, as their contribution to the overall scattering is proportional to the square of the particle volume (V²). Thus, their presence can lead to severe alterations of the scattering curves, often making the data difficult, if not impossible, to interpret.
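The V² weighting described above can be illustrated with a short calculation. The sketch below assumes that all species share the same X-ray contrast, so that the forward scattering contribution of each species scales with its number fraction multiplied by V². The function name and the example mixture (0.1% by number of an aggregate with ten times the monomer volume) are hypothetical and serve only as an illustration.

```python
def forward_scattering_fractions(species):
    """Fraction of the forward scattering contributed by each species.

    species: list of (number_fraction, relative_volume) tuples.
    Assumes identical contrast, so each contribution scales as N * V**2.
    """
    weights = [n * v ** 2 for n, v in species]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical mixture: 99.9% monomer (V = 1) and 0.1% aggregate (V = 10)
fractions = forward_scattering_fractions([(0.999, 1.0), (0.001, 10.0)])
monomer_share, aggregate_share = fractions
```

Under these assumptions the aggregate, although only one particle in a thousand, accounts for roughly 9% of the forward scattering, which is why trace amounts of large species can distort a SAXS profile.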
A now widely adopted, although not always trivial, method for measuring quality SAXS data has been the coupling of size exclusion chromatography (SEC) set-ups directly to the scattering experiment. By using SEC-SAXS, the different species present in a sample are generally separated according to their size directly before they are passed through the SAXS measuring cell. After initial experiments performed at the APS beam line BioCAT (Chicago, IL, USA) in 2004 [7] and at the Photon Factory BL10C (Ibaraki, Japan) in 2008 [8], the bioSAXS SWING beam line at the Soleil synchrotron (Saint-Aubin, France) was the first station to offer such a set-up on a regular basis [9]. Building upon this success, the SEC-SAXS collection mode is now offered by all major SAXS beam lines focused on biological questions. In addition, the method has been adapted to small-angle neutron scattering (SANS), for example at the D22 SANS instrument at the ILL (Grenoble, France) [10], as well as less-intense X-ray lab sources [11,12]. In 2015, the advantages afforded by adding parallel light-scattering detectors coupled to the SEC-SAXS measurements were demonstrated at the EMBL bioSAXS P12 beam line [13]. The incorporation of light scattering (LS) combined with accurate concentration (c) determination was integrated in order to independently validate the molecular weight (MW) of the SEC-eluting species. The MW of a macromolecule or complex is a key parameter for SAXS, as it directly informs most subsequent data interpretation and modeling steps: knowledge of the oligomeric state or the stoichiometry of a complex is critical. SEC-SAXS/LS measurements are a powerful add-on biophysical characterization tool and have been implemented at a number of other beam lines.
In regard to both sample and data quality, another significant development in the small angle scattering (SAS) community, for both bioSAXS and bioSANS applications, has been the establishment of reporting guidelines and standards [14,15]. In 2012, a task force set out to define essential criteria for data reporting and interpretation of biomolecular SAS data. Publishing guidelines were proposed, and the community has increasingly adhered to the recommendations that have encompassed diverse aspects of bioSAS experiments. Within these efforts, the deposition of SAXS/SANS data and derived models was strongly encouraged, resulting in the development of a curated repository, the Small Angle Scattering Biological Data Bank (SASBDB, www.sasbdb.org [16,17]). One feature of this repository is the possibility (for anyone in the community) to deposit benchmark datasets and models. These are sets of scattering data from well-characterized samples that are intended for data validation, interrogation, education, and method development purposes. Here, we describe an improved SEC-SAXS set-up at P12 and the collection of such a set of benchmarked SAXS data ( Figure 1).

Samples
All samples were commercially obtained. Bovine carbonic anhydrase (CA), yeast alcohol dehydrogenase (ADH), and horse apoferritin (aFER) are all part of the Gel Filtration Markers Kit cat# MWGF1000 (Sigma, Darmstadt, Germany). Bovine Serum Albumin (BSA) was purchased from Sigma (cat# 05470, Darmstadt, Germany). To increase the sample quality, all samples were pre-purified at 4 °C prior to the SEC-SAXS/MALLS (multi-angle laser light scattering) analysis using the following protocol. The lyophilized CA, BSA, and ADH powders were dissolved in preparation buffer (25 mM HEPES, 50 mM NaCl, 5 mM urea, 1% v/v glycerol, pH 7) to obtain solutions of approximately 25 mg/mL. The aFER, which is provided as a stock glycerol solution, was diluted two-fold in preparation buffer. Approximately 200 µL (≈5 mg) of each individual sample were loaded onto equilibrated SEC columns (GE Healthcare, Germany): for aFER, a Superose 6 Increase 10/300 column; for BSA and ADH, a Superdex 200 Increase 10/300 column; and for CA, a Superdex 75 Increase 10/300 column were employed for the pre-purification step. Preparation buffer was used as the mobile phase, and the flow rate was set to 0.4 mL/min on an Akta Purifier (GE Healthcare, Germany). Fractionated aliquots corresponding to the highest absorbing peak (estimated from the UV absorption at 280 nm and 245 nm) were pooled. In addition, fractions corresponding to the BSA dimer peak were collected separately. All samples were concentrated with ultrafiltration devices (Millipore, Germany) with a 30 kDa cut-off, with the exception of CA, for which a cut-off of 3 kDa was selected. The final concentration was determined by averaging triplicate UV A280 measurements using the E0.1% values calculated from the amino acid sequence (ProtParam [18]). The extinction coefficients and final concentrations of the pre-purified samples are listed in Table 1.
Appropriately sized aliquots (50-100 µL) were snap-frozen in liquid nitrogen and stored at −80 °C until the SEC-SAXS/MALLS measurements.

Chromatography System
The EMBL P12 beam line (PETRA III, DESY Hamburg, Germany [20]) is equipped with an Agilent 1260 Infinity II Bio-inert liquid chromatography system (LC, Agilent, Waldbronn, Germany). Three SEC-SAXS modes are available for user operations: (i) Basic SEC-SAXS, where the column outlet is connected directly to the SAXS measuring capillary; (ii) Serial SEC-UV-vis-SAXS, where the eluent is passed through an in-series UV-vis absorption spectrometer prior to the SAXS measurement capillary; and (iii) SEC-SAXS/MALLS, where the post-column eluent stream is split and passed in equal amounts to the SAXS capillary and the UV-vis, MALLS, and refractive index (RI) detectors, enabling parallel SAXS/LS data collection from the column eluent (Figure 1). For the studies described here, the SEC-SAXS/MALLS configuration was selected. Of note, a valve system is now in place to switch between the SEC-SAXS modes of operation and conventional batch SAXS measurements using the customized automated sample changer at the P12 beam line (Arinax, Moirans, France [21]). The custom-made beam line operating software BECQUEREL [22] has been designed to make switching between batch and SEC-SAXS straightforward, including remote control of the valve, and it provides a common interface for both modes that includes the integrated control over the Agilent Chemstation LC software (Agilent Technologies, Santa Clara, CA, USA) and the coordinated communication of events between the LC system and SAXS data collection.
The LC system is compatible with a variety of both FPLC and HPLC SEC-columns routinely used for macromolecular separations, up to an operational pressure of 100 bar and higher, while also maintaining an appropriate flow rate to limit the effects of X-ray-induced radiation damage (at P12, flow rates >0.25 mL/min are required). In addition, the LC system is equipped with the Agilent 1260 Infinity Bio-Inert multisampler, which is designed for low sample carryover with an in-built sample-injection needle wash. It allows for the precise injection of volumes between 1 and 100 µL, and conventional (lidless) 1.5 mL reaction tubes (such as from Eppendorf, Germany) can be used (with an approximate dead volume of 5 µL). Care was taken to select connecting tubes leading to the SAXS capillary of the smallest feasible diameter, to avoid band broadening and a subsequent loss of resolution [23].
The injection volumes and concentrations of the CA, BSA, ADH, and aFER samples are listed in Table 1. For the SEC-SAXS/MALLS runs, a GE Superdex 200 Increase 10/300 column (GE Healthcare, Germany) was used, and the flow rate was set to 0.5 mL/min with the exception of CA, for which a GE Superdex 75 Increase 10/300 column was used. All columns were attached to a post-column 0.1 µm filter followed by a mobile-phase micro splitting valve with low dead volume (P-451, Upchurch Scientific®; Figure 1, #4) to divert the eluent into two equal streams for the parallel acquisition of SAXS and MALLS/RI data (as described in Graewert et al. 2015 [13]). Splitting of the column eluent, as opposed to an in-line serial detection array, limits band broadening and remixing effects and the subsequent loss of separation resolution.

UV, MALLS, and RI Data Collection and Analysis
The UV-vis/MALLS/QELS/RI system at P12 consists of an Agilent variable wavelength UV-vis detector (VWD) followed by a Wyatt (Wyatt, Germany) miniDAWN® TREOS® multi-angle laser light scattering detector (MALLS), with an in-built WyattQELS module and a Wyatt Optilab T-rEX (RI) refractometer. In all instances, UV absorption spectroscopy data were recorded at 280 nm. The MALLS system was calibrated relative to the light scattering of toluene for an absolute RI measurement of the mobile phase. The differential RI increment, dn/dc (mL/g), of each protein sample (Table 1) was calculated from the primary amino acid sequence using the method of Zhao et al. 2011 [24], which is integrated into the SEDFIT 'vbar and dn/dc calculator' [19], taking into account the experimentally determined RI of the solvent, measurement temperature (25 °C), and RI laser wavelength (658 nm). The molecular weight estimates, MW_MALLS, were determined from the three-angle MALLS scattering intensities combined with the protein concentration determined from RI through the SEC elution peak of each sample using the ASTRA 7 software package (Wyatt Technology Corporation, Santa Barbara, CA, USA). In addition, the integrated QELS detector was used to evaluate the hydrodynamic radius, R_H, of the protein samples in the mobile phase, with the incorporation of a correction for solvent viscosity due to the effect of 2% v/v glycerol in the running buffer. The viscosity of the 2% v/v glycerol running buffer was estimated at 0.9476 centipoise (cP) using the 'Calculate density and viscosity of glycerol/water mixtures' calculator [25] based on an assumption that glycerol, and not the other buffer components, is primarily responsible for affecting the translational diffusion coefficient of the proteins in solution compared to its value in water.
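The effect of the viscosity correction on R_H can be sketched with the Stokes-Einstein relation, R_H = k_B·T / (6π·η·D), which links the translational diffusion coefficient D measured by QELS to the hydrodynamic radius. This is only an illustration of the principle and not the ASTRA implementation. The diffusion coefficient and the viscosity of pure water at 25 °C (taken here as approximately 0.890 cP) are assumptions introduced for the example; the 0.9476 cP value is the one quoted above.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def hydrodynamic_radius_nm(diffusion_m2_s, viscosity_cp, temperature_k=298.15):
    """Stokes-Einstein hydrodynamic radius in nm.

    diffusion_m2_s: translational diffusion coefficient in m^2/s
    viscosity_cp:   solvent viscosity in centipoise (1 cP = 1e-3 Pa s)
    """
    viscosity_pa_s = viscosity_cp * 1e-3
    radius_m = BOLTZMANN * temperature_k / (
        6.0 * math.pi * viscosity_pa_s * diffusion_m2_s
    )
    return radius_m * 1e9


D_EXAMPLE = 5.0e-11  # m^2/s, hypothetical value for a mid-sized protein

r_h_glycerol = hydrodynamic_radius_nm(D_EXAMPLE, 0.9476)  # 2% v/v glycerol buffer
r_h_water = hydrodynamic_radius_nm(D_EXAMPLE, 0.890)      # assumed water viscosity
```

For the same measured diffusion coefficient, ignoring the glycerol contribution to the viscosity would overestimate R_H by the ratio of the two viscosities, about 6% in this example.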

SAXS Data Collection and Reduction
The portion of the SEC column eluent used for SAXS measurements, under continuous flow, was directed into a 1 mm diameter quartz capillary housed within the in-vacuum beam line sample exposure unit. A capillary with 0.9 mm inner diameter has become the standard measurement cell, compared to the previous 1.7 mm option, as the narrower diameter increases the linear speed of the SEC eluent through the X-ray beam, reducing the chances of radiation damage [26]. In addition, the band broadening effect of the sample is lessened in the 0.9 mm capillary, improving the correlation between the background-corrected SAXS intensity at zero angle, I(0), and concentration estimates from the parallel UV-vis or RI measurements. For the set of experiments described here, the SAXS data were collected using a Pilatus 6M detector at a sample-detector distance of 3 m and at an X-ray wavelength λ of 0.124 nm. The data were recorded as a sequential set of 2880 individual 1 s frames, corresponding to one column volume (CV) for each protein sample (48 min total). Each individual 2D image underwent data reduction (azimuthal averaging) and normalization to the intensity of the transmitted beam to generate 1D scattering profiles plotted as I(s) vs. s through the momentum transfer range of 0.05 < s < 6 nm⁻¹ (where s = 4πsinθ/λ and 2θ is the scattering angle). The s-axis was calibrated using silver behenate [27]. As a preventative measure, automated washing cycles of the capillary were performed between each SEC-SAXS column run to remove the potential and unintended build-up of non-specific debris on the capillary surface that may happen at the point of X-ray exposure ("capillary fouling") (Figure 2).
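The collection parameters above can be cross-checked with a few lines of arithmetic. The sketch below uses the definition s = 4πsinθ/λ, the wavelength of 0.124 nm, the 2880 frames of 1 s each, the 0.5 mL/min flow rate, and the 24 mL column volume quoted elsewhere in the text. The function names are illustrative only.

```python
import math


def momentum_transfer(two_theta_deg, wavelength_nm):
    """s = 4*pi*sin(theta)/lambda in nm^-1, with 2*theta the scattering angle."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_nm


def scattering_angle_deg(s_inv_nm, wavelength_nm):
    """Inverse relation: scattering angle 2*theta (degrees) for a given s."""
    return 2.0 * math.degrees(math.asin(s_inv_nm * wavelength_nm / (4.0 * math.pi)))


WAVELENGTH_NM = 0.124

# Scattering angle corresponding to the upper limit of the recorded s-range
two_theta_max = scattering_angle_deg(6.0, WAVELENGTH_NM)

# 2880 frames of 1 s each at 0.5 mL/min
run_minutes = 2880 * 1.0 / 60.0
eluted_volume_ml = run_minutes * 0.5
```

The run length evaluates to 48 min and the eluted volume to 24 mL, i.e., one column volume of a 10/300 column, and the upper s limit of 6 nm⁻¹ corresponds to a scattering angle 2θ of just under 7°.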

Figure 2.
Overview of the automated analysis procedure for SEC-SAXS data collected at P12.

Background Subtraction of the SEC-SAXS Data
At the P12 beam line, a quick assessment and preliminary result of a SEC-SAXS experiment is possible "on the fly". The program CHROMIXS [28] has been integrated into the automated data processing pipeline, SASFLOW [29]. The pipeline workflow can be described as follows: (i) integration of 2D images from the detector; (ii) separation of the integrated 1D frames according to their run number; (iii) launch of the corresponding run number data in CHROMIXS (see below [28]); (iv) determination of sample peaks and buffer region for each run number; and (v) generation of averaged and subtracted scattering profiles. Therefore, the pipeline generates a number of ready-for-analysis SAXS profiles for each run number and elution peak, so that the final steps of data processing are identical to the standard SASFLOW "Sample Changer mode" [29]. Finally, the pipeline passes each subtracted SAXS pattern to a set of programs, performing the determination of the overall SAXS parameters (R_G, MW, D_max), as well as the calculation of the real-space distance distribution function (p(r) profile). Then, this information is piped to the DAMMIF ab initio bead modeling routine, which generates a "first look" P1-symmetry low-resolution model [30]. The obtained parameters and models are gathered together in a single summary XML table, enabling fully automated and interactive on-site SAXS studies.
The program CHROMIXS [28] plots an s-range limited integrated intensity versus frame number to generate the corresponding "SAXS chromatogram" (Figure 3). The default s-range of 0.1 nm⁻¹ < s < 0.8 nm⁻¹ was used for the integration to aid the visualization of the major SEC-elution peak(s) on the chromatogram and to evaluate the stability of the scattering signal recorded from the SEC running buffer. The frames automatically selected by CHROMIXS before or after the sample peak, corresponding to the buffer, were averaged and used for the subtraction of background scattering contributions from the sample peak frames. The automated buffer scattering selection was additionally verified by visual inspection to rule out any background drift throughout the SEC run, which may be caused by capillary fouling. To assess the homogeneity of the sample, the stability of the radius of gyration, R_G, and concentration-independent MW estimates for each buffer-subtracted sample peak frame were assessed. For the former, the ATSAS program AUTORG [31] was used, running in the background of CHROMIXS. The MW estimates from CHROMIXS are by default based on the Porod volume [32], or, alternatively, the program may be configured to assess the concentration-independent MW from a combined scattering invariant approach utilizing Bayesian inference [33]. Those individual subtracted SAXS curves producing a consistent R_G through the elution peak were scaled to the one data frame with the highest integrated intensity and then averaged to generate the final SAXS profile. CHROMIXS performs this scaling operation by default, as opposed to normalizing each individual data frame to a sample concentration and then averaging, because the sample concentration is constantly changing through the SEC elution.
It is necessary to compare the SAXS chromatogram with the output from the UV/RI detectors first to obtain the concentration profile and then evaluate the I(0) (and not integrated SAXS intensity) to assess concentration-dependent MW estimates. The final buffer and sample profiles used for the generation of final averaged processed CHROMIXS data are recorded in the footer of the ASCII format (.dat) file, and for P12, additional metadata relating to the experiment (e.g., X-ray wavelength, sample to detector distance, column type, flow rate, injection volume, initial sample concentration, etc.) are listed in the footer of each unsubtracted data frame.
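The processing steps described in this section (building a SAXS chromatogram from an s-range limited integral, averaging buffer frames, subtracting them from the peak frames, then scaling and averaging) can be sketched on synthetic data as follows. This is a simplified illustration and not the CHROMIXS code: scaling by the ratio of integrated intensities is an assumption made for the example, all function names are hypothetical, and the frames are generated from an idealized Guinier-like particle term on top of a flat buffer signal.

```python
import math


def integrated_intensity(s, intensity, s_min=0.1, s_max=0.8):
    """Trapezoidal integral of I(s) over the grid intervals inside [s_min, s_max]."""
    total = 0.0
    for k in range(len(s) - 1):
        if s[k] >= s_min and s[k + 1] <= s_max:
            total += 0.5 * (intensity[k] + intensity[k + 1]) * (s[k + 1] - s[k])
    return total


def mean_profile(frames):
    """Point-by-point average of several 1D profiles on the same s-grid."""
    return [sum(column) / len(frames) for column in zip(*frames)]


def scale_and_average(subtracted_frames, s):
    """Scale every frame to the most intense one, then average the scaled frames."""
    weights = [integrated_intensity(s, f) for f in subtracted_frames]
    reference = max(weights)
    scaled = [[v * reference / w for v in f] for f, w in zip(subtracted_frames, weights)]
    return mean_profile(scaled)


# Synthetic run: flat buffer signal plus a particle term that follows the elution
s_grid = [k / 20 for k in range(1, 21)]                      # 0.05 ... 1.00 nm^-1
shape = [math.exp(-(3.0 * s) ** 2 / 3.0) for s in s_grid]    # R_G = 3 nm, I(0) = 1
concentration = [0.0, 0.0, 0.2, 0.6, 1.0, 0.6, 0.2, 0.0, 0.0]
frames = [[1.0 + c * p for p in shape] for c in concentration]

# SAXS chromatogram: integrated intensity versus frame number
chromatogram = [integrated_intensity(s_grid, f) for f in frames]
peak_index = chromatogram.index(max(chromatogram))

buffer_profile = mean_profile(frames[:2])                    # frames before the peak
peak_frames = frames[3:6]                                    # frames across the peak
subtracted = [[i - b for i, b in zip(f, buffer_profile)] for f in peak_frames]
final_profile = scale_and_average(subtracted, s_grid)
```

Because every subtracted frame in this synthetic run differs from the others only by its concentration, the scaled average reproduces the particle term at the peak concentration. With real data, frames with an inconsistent R_G would be excluded before averaging.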

SAXS Data Analysis
ATSAS 2.8 [30] was employed for further data analysis and modeling. The program PRIMUS [34] was used to perform Guinier analysis (ln I(s) versus s²) in the very low-angle regime, from which the radius of gyration, R_G, and I(0) were determined. The p(r) distributions were calculated using the indirect Fourier transform method implemented in GNOM [35], which provided additional estimates of R_G, I(0), and the maximum particle dimension, D_max. The concentration-independent MW estimates from the SAXS data were assessed with the Bayesian inference approach described by Hajizadeh et al. 2018 [33], which includes the estimation of a MW credibility interval for each sample. The MW estimates obtained from other concentration-independent methods based on scattering data invariants are also reported: MW from the volume of correlation, Vc [36]; SAXMoW [32]; Size-and-shape (S&S [37]); and Porod volume, Qp [30,32]. In addition, a concentration-dependent assessment of the MW was performed in a two-fold manner using the forward scattering I(0) of the final SAXS curve, which was normalized by the concentration estimate at the top of the sample elution peak as determined from RI measurements. The forward scattering of BSA of known concentration measured in batch mode was also used for calibration [38]. The ab initio modeling of the proteins was performed using either DAMMIN or GASBOR, and the fits to the experimental data from the high-resolution crystal structures were computed by CRYSOL [39].
The experimental SAXS data described here, as well as the models derived from them, were deposited to the SASBDB [17] with the accession codes listed in Table 2. The unsubtracted 1D SAXS data frames encompassing the entire SEC-SAXS run of each sample, and CHROMIXS R_G estimates through the SEC-SAXS chromatogram peaks combined with the UV-vis, MALLS, RI, and QELS data have also been deposited to the data bank.
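As an illustration of the Guinier analysis and of the concentration-dependent MW estimate, the sketch below fits ln I(s) = ln I(0) − (R_G²/3)·s² by ordinary least squares and then scales I(0)/c against a standard of known MW. It is not the PRIMUS or AUTORG algorithm (which also select the fitting range automatically); the function names, the synthetic data, and the approximate BSA monomer MW of 66.5 kDa (half of the 133 kDa dimer value quoted in the text) are assumptions for the example. The fit range is restricted to s·R_G < 1.3, the commonly used limit for globular particles.

```python
import math


def guinier_fit(s, intensity):
    """Least-squares fit of ln I(s) versus s^2; returns (R_G, I(0))."""
    x = [v * v for v in s]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                     # equals -R_G^2 / 3
    intercept = mean_y - slope * mean_x   # equals ln I(0)
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def mw_from_forward_scattering(i0, conc, i0_std, conc_std, mw_std):
    """MW from I(0)/c relative to a standard measured under the same conditions."""
    return mw_std * (i0 / conc) / (i0_std / conc_std)


# Synthetic Guinier region for R_G = 3 nm and I(0) = 100 (s*R_G <= 1.2)
s_values = [0.05 + 0.025 * k for k in range(15)]
intensities = [100.0 * math.exp(-(3.0 * s) ** 2 / 3.0) for s in s_values]
rg_fit, i0_fit = guinier_fit(s_values, intensities)

# Hypothetical numbers: the sample scatters twice as strongly per unit
# concentration as the BSA standard, giving twice the standard's MW
mw_estimate = mw_from_forward_scattering(2.0, 1.0, 1.0, 1.0, 66.5)
```

The second function assumes that the sample and the standard have the same contrast and partial specific volume, which is a reasonable first approximation for unconjugated proteins.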

Data Presentation
Python scripts were designed to use MatPlotLib and generate plots for quick visualization of the results. These quick-plot tools are accessible at the P12 beamline and can be used to evaluate measurement performance, allowing the possibility to adjust the collection strategy if required. Settings to produce high-quality figures for publications are also part of this tool kit.

Overall Assessment of SEC Performance
Data were collected in SEC-SAXS/MALLS mode from five different standard proteins. In Figure 3, the CHROMIXS SEC-SAXS chromatogram results obtained from the monomeric and dimeric BSA samples, ADH, and aFER runs using a S200 Increase SEC-column (10/300) are overlaid and compared to the respective Rayleigh ratio chromatograms measured using MALLS (Figure 3b). The analogous elution profiles corresponding to CA, separated on an S75 Increase column, are shown in Supplement Figure S1. All five runs show a symmetrical major elution peak, and there is a strong correspondence between the SEC-SAXS chromatograms and the light-scattering results, demonstrating that the majority of the protein volume fraction of each individual sample, after dilution through the SEC column, presents as one isolatable oligomeric species. Only very minor pre- or post-peaks are noted, corresponding to trace volume fractions of higher or lower oligomeric species in the injected samples. Notably, for the dimeric BSA (Figure 3, blue trace), a minor second peak indicates the presence of monomeric BSA, which cannot be avoided due to the dimer-monomer equilibrium of this protein. The absence of any traceable amount of aggregates that would otherwise flow through the SEC column void volume, or any multiple unresolved peaks in the SAXS chromatograms and LS traces, reflects the high quality of the samples prepared prior to the SEC-SAXS experiments. Although time-dependent aggregation may not be avoidable for other types of macromolecular samples (for which SEC-SAXS is an invaluable tool), it is important to remember that SEC-SAXS is as much an analytical technique as SEC-MALLS, requiring high-quality samples to obtain high-quality results. The SEC step at a beam line should not be used as a simple or quick substitute for the final step of a purification protocol.
The quality of the sample should be taken into consideration prior to SEC-SAXS, which is why the samples described here all underwent a pre-purification step prior to measurement.
In all five runs, the integrated SAXS intensities of the SAXS chromatograms recorded toward the end of the SEC elution returned to a constant baseline (Figure 3, Supplement Figure S1). This has three important implications: (i) the column was sufficiently equilibrated prior to the experiment, the mobile phase was well matched to the sample solvent composition, and the column was not overloaded; (ii) no evident X-ray-induced capillary fouling, arising from damage to either the buffer or the proteins, occurred during the course of the SEC-SAXS run; (iii) the X-ray beam remained stable throughout the run, and the data were correctly normalized to the transmitted beam to accurately take into account both buffer and sample X-ray absorption.
The choice of mobile phase for SEC-SAXS should be guided by both the physical/chemical stability of the sample in the supporting solvent and the final X-ray contrast, column separation resolution, and the susceptibility of the sample toward X-ray-induced damage that may result in sample aggregation and subsequent capillary fouling. Capillary fouling caused by deposit build-up during the course of a SEC-SAXS run is a complicated process involving free radical, solvated electron, and quartz glass surface chemistries with the mobile phase and/or the macromolecule(s) of a sample, and is difficult to predict a priori. The implementation of co-FLOW SEC-SAXS, for example at the Australian Synchrotron SAXS beam line [40], mitigates such an effect. However, for SEC-SAXS in general, what may be considered as a "default" SEC purification buffer, e.g., phosphate-buffered saline, may not always be compatible with the X-ray properties/behavior of the sample in the X-ray beam. For example, and for the samples described here, 5 mM urea combined with low-salt (50 mM NaCl) and 1% v/v glycerol were included in the purification steps to (qualitatively) minimize aggregate formation during the comparatively long time frame of sample handling, purification, snap freezing, −80 °C storage, and defrosting, leading up to the SEC-SAXS experiments. The selected mobile phase for SEC-SAXS was based on the purification buffer, but it was modified for the final experiments so as to maximize the X-ray contrast while at the same time taking into account the effects of radiation damage, SEC column separation efficiency, and interparticle interactions.
The final mobile phase included an increased percentage v/v of glycerol to help limit radiation damage [25,41] and an increased concentration of NaCl (150 mM) to help minimize the non-specific interactions of the proteins with the SEC column stationary phase (to improve separation resolution) and to decrease the possible contribution of Coulombic repulsive interparticle interference effects in the final SAXS profiles.
Slight discrepancies in the buffer composition of the injected sample (loaded in the "preparation buffer") compared to the SEC-SAXS mobile phase (in "SAXS buffer") are often tolerable because most samples undergo on-column buffer exchange during the course of the elution. Any small molecule differences become detectable at the end of a SEC-SAXS run as the small buffer components derived from the injected sample take the longest diffusional path through the column matrix and typically elute slightly before one entire column volume. The presence of these small molecule differences is often detected by wild fluctuations in the RI signal (and often not UV absorption; Supplement Figure S2) recorded toward the end of the total elution of the sample. Therefore, it is generally advised to avoid selecting the "end of column" data frames from the corresponding region of the SAXS chromatogram for background scattering subtraction purposes, as these may differ with respect to X-ray contrast and X-ray absorption properties compared to the bulk mobile phase of the SEC run.

Concentration Determination via UV Absorption and/or dRI
Knowing the eluent concentration (c) is essential for determining the MW_MALLS and, independently, the MW_I(0) from SAXS. The UV-vis/RI set-up allows for the protein concentration at every measurement point to be retrieved. For the UV-vis absorption-based concentration estimates, knowledge of the extinction coefficient at the employed wavelength is required as well as the optical path length of the UV-vis detector. For example, the extinction coefficient of proteins (E_280) is typically calculated for 280 nm from the amino acid sequence, which in most cases is sufficiently accurate to obtain reasonable concentration estimates. However, absorption-based methods necessitate that the macromolecule has sufficient chromophores to generate an absorption profile, which for proteins usually requires the presence of aromatic amino acids when measuring absorption at 280 nm. The RI-derived concentration determination is more widely applicable compared to UV-vis absorption for protein work, as it does not depend on the presence of absorbing species. In general, the differential refractive index increment, dn/dc, of proteins is more robust against changes in amino acid sequence compared to UV absorption [18], such that the dn/dc for most proteins (including those lacking aromatic amino acids) that are not conjugated to other molecules (as is the case for, e.g., glycoproteins, lipoproteins, etc.) can be first estimated at 0.185 mL/g at the laser wavelength of the Wyatt T-rEX (658 nm). The calculated dn/dc of a protein may be further adjusted as described by Zhao et al. 2011 [24] by taking the experimental absolute RI of the solvent (also obtained from the Wyatt T-rEX) and combining this value with the protein amino acid sequence, temperature, and laser wavelength information. The E_280 extinction coefficients and adapted dn/dc values for the protein samples are listed in Table 1. In Table 2, the averaged concentrations of the samples calculated for the final profiles are reported.
Due to the rather large column volume (24 mL), the samples undergo a marked dilution by a factor of 6-10 compared to the sample injection concentration. When choosing the column for SEC-SAXS, one has to consider the tradeoff between separation power, which is often improved by longer columns or larger column volumes, and the associated dilution effect that is amplified when using larger columns. The dilution of the sample impacts both the intensity and variance (lowers the signal-to-noise ratio) of the resulting SAXS signal (as I(s) ∝ c) and is especially relevant in cases where the oligomerization and/or the association of a fully formed protein complex is concentration dependent, as is demonstrated here by the dissociation of the BSA dimer sample on the column into a mixture of BSA monomers and dimers (Figure 3, blue curve).
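The two concentration routes can be written down compactly. The sketch below applies the Beer-Lambert law with a mass extinction coefficient E0.1% (the absorbance of a 1 mg/mL solution over a 1 cm path) and the relation c = Δn / (dn/dc) for the differential refractive index signal. The function names and all numerical inputs other than the generic dn/dc of 0.185 mL/g are hypothetical.

```python
def concentration_from_uv(absorbance, e_01_percent, path_cm):
    """Concentration in mg/mL from the Beer-Lambert law.

    e_01_percent: absorbance of a 1 mg/mL solution over a 1 cm path length.
    """
    return absorbance / (e_01_percent * path_cm)


def concentration_from_ri(delta_n, dn_dc_ml_per_g=0.185):
    """Concentration in mg/mL from the differential refractive index signal.

    delta_n: refractive index difference between eluent and mobile phase.
    """
    grams_per_ml = delta_n / dn_dc_ml_per_g
    return grams_per_ml * 1000.0


# Hypothetical detector readings at one elution time point
c_uv = concentration_from_uv(absorbance=0.0667, e_01_percent=0.667, path_cm=1.0)
c_ri = concentration_from_ri(delta_n=1.85e-5)
```

Both hypothetical readings correspond to about 0.1 mg/mL. In practice, agreement between the UV- and RI-derived values through an elution peak is a useful consistency check on the extinction coefficient and dn/dc used.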

Stable MW MALLS and R H Estimates Across the Elution Peaks Indicate Homogeneous Sample Populations
The individual SEC-MALLS and QELS datasets were analyzed in relation to assessing the sample population homogeneity across the major elution peaks. In Figure 4a,b, the derived chromatograms for the ADH sample are displayed, showing the MW correlation obtained from the MALLS and RI measurements as well as the R H from QELS. An example of the QELS autocorrelation function from ADH is shown in Supplement Figure S3. For these calculations, the estimated viscosity of the SEC mobile phase due to the addition of 2% v/v glycerol was taken into account. The chromatograms of the other samples are provided in Supplementary Figures S4-S7. In combination, the light-scattering and subsequent stable MW and R H correlations demonstrate the homogeneity of the protein samples within the selected elution range from the SEC column (Table 2, Figure 4a, and Supplement Figures S4-S7).
When using SEC-MALLS, the MW MALLS estimate is independent of the elution behavior of the sample. Throughout a SEC run, the Rayleigh ratio-defined by the ratio of intensities of incident and scattered laser light at a specified distance-is both calibrated (in this case relative to toluene) and measured along with the concentration of the protein sample in the eluent, as determined by UV absorption or RI. This allows the Rayleigh ratio to be normalized by concentration to obtain the MW estimate. Alternatively, and more simply, SEC estimations of the MW that are not coupled to a LS system are often based on comparing the elution volume of a sample relative to elution volume of a set of protein standards with a known MW. However, the "elution volume only" approach is based on the underlying assumption that the sample in question displays the same hydrodynamic behavior as the measured standards and that the sample does not interact with the column matrix. For example, it may be the case for proteins with a more loosely packed structure compared to the calibration standards that the elution from the column is delayed, leading to an underestimation of the MW, whereas densely packed structures may elute sooner, suggesting a higher MW. Here, for example, the dimeric BSA (expected MW = 133 kDa) elutes unexpectedly at an earlier time point than the slightly heavier ADH (expected MW = 147 kDa, Figure 3). MW estimates based on elution volume alone may lead to inaccurate conclusions. In effect, both the MW and overall hydrodynamics/conformational state of the protein (compact, intrinsically disordered, rod-shaped, etc.) as well as sample/matrix interactions may contribute to elution-volume behavior. Figure 5 shows the ratio of the experimentally determined average MW MALLS estimates compared to the expected MW, MW exp , calculated from the amino acid sequence and known oligomeric state of the proteins (full squares). 
For the five examined proteins, these ratios are between 0.95 and 1.0 (Table 2).
The current configuration of the SEC-SAXS/MALLS set-up also includes the integrated Wyatt QELS module that allows the collection of dynamic light scattering data while the sample is eluting through the same measurement cell as used for MALLS. The ASTRA software allows for solvent viscosity corrections, e.g., for the presence of glycerol, when modeling the R H from the measured autocorrelation functions. From the autocorrelation function recorded at each time point through the elution, the distribution of R H across the elution peak can be examined (Table 2, Figure 4b and Supplement Figures S4-S7). From the stable correlation of R H across the elution peaks observed here, one can conclude that the samples elute as homogeneous populations with respect to both MW and hydrodynamics. The presence of structural heterogeneity within a sample may manifest as a "tailing" of the elution peak where specific conformational states have a tendency to interact differently with the column matrix, leading to a delay in sample elution that may be identified by a change in R H . This additional biophysical characterization by both MALLS and QELS is an asset when continuing with the SAXS data analysis, i.e., for the selection of the appropriate data frames, and for informing the subsequent interpretation (e.g., anisometry; the choice of oligomeric state used for ab initio and rigid-body modeling; the choice of flexible ensemble approaches, etc.).
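The concentration normalization of the Rayleigh ratio that underlies the MW MALLS estimate can be sketched in the dilute, low-angle limit, where the excess Rayleigh ratio is approximately R = K c M. This simplified form ignores the second virial coefficient and the angular dependence that a full multi-angle analysis accounts for. The function names and the default parameters (solvent refractive index 1.333, dn/dc of 0.185 mL/g, and a 658 nm laser) are generic assumptions for illustration, not the instrument settings used in this work.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def optical_constant(n0, dn_dc_ml_per_g, wavelength_nm):
    """K = 4 pi^2 n0^2 (dn/dc)^2 / (N_A lambda^4), in mol cm^2 / g^2."""
    wavelength_cm = wavelength_nm * 1e-7
    return (
        4.0 * math.pi ** 2 * n0 ** 2 * dn_dc_ml_per_g ** 2
        / (AVOGADRO * wavelength_cm ** 4)
    )


def mw_from_rayleigh(excess_rayleigh_per_cm, conc_g_per_ml,
                     n0=1.333, dn_dc_ml_per_g=0.185, wavelength_nm=658.0):
    """Molecular weight (g/mol) from R = K * c * M (dilute, low-angle limit)."""
    k = optical_constant(n0, dn_dc_ml_per_g, wavelength_nm)
    return excess_rayleigh_per_cm / (k * conc_g_per_ml)


# Synthetic round trip: generate the Rayleigh ratio expected for a
# hypothetical 66 kDa particle at 1 mg/mL, then recover its MW.
k = optical_constant(1.333, 0.185, 658.0)
conc = 1.0e-3                      # g/mL
r_excess = k * conc * 66_000.0     # 1/cm, synthetic value
mw = mw_from_rayleigh(r_excess, conc)
print(f"K = {k:.3e} mol cm^2/g^2, recovered MW = {mw / 1000:.1f} kDa")
```

The sketch shows that the result depends on the measured concentration and on dn/dc, but not on the elution volume.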

SAXS Data Analysis: Stable R G and MW I(0) Estimates across the Elution Peaks
For all of the benchmark protein samples, the resulting CHROMIXS SAXS chromatograms show one major symmetrical peak with a return to a stable baseline and baseline separation of the major peak from any minor species present in the injected samples. As a result, CHROMIXS is able to effectively delineate the major sample and buffer regions of the chromatogram and automatically process the final SEC-SAXS profile (Supplement Figure S8). In instances where the elution peaks are not well separated, or if there is significant drift in the integrated baseline scattering intensity, CHROMIXS may struggle to identify sample or buffer frames, and it may even return a message that it is not possible to find an adequate buffer region for subtraction purposes. In the SEC-SAXS pipeline at P12, the inability of CHROMIXS to automatically select the appropriate regions of the SAXS chromatogram prevents further automated data analysis and the on-the-fly "first look" ab initio modeling routines; in that case, the user has to select and process the chromatogram manually through the CHROMIXS interface.
The CHROMIXS estimates of R G calculated for each individual frame through the selected sample range of the SAXS chromatogram are shown in Figure 4c,d and Supplement Figures S4-S7. All five runs showed a stable R G correlation across the major elution peak of each sample. CHROMIXS also includes a fast assessment of the MW Chromixs for each sample frame which, as reported here, is based on the Porod volume estimate. In combination, the stability of the R G and MW correlations from SAXS, together with the MW MALLS and the R H from QELS, goes toward confirming the homogeneity of the eluting proteins and, importantly, that the intense X-ray beam has not damaged the protein samples as they flow through the SAXS capillary (Figure 4). With both the R G from SAXS and R H from QELS at hand, it is also possible to quickly calculate the shape factor, or R G /R H ratio, which provides a "parameter insight" into the conformation of the proteins in solution. The R G /R H for compact globular proteins is ≈0.78 and trends to higher values when the molecules deviate from globular to elongated structures. The shape factors obtained for the five proteins span from 0.75 for CA, a very compact protein, to 0.86 for the BSA dimer, which may suggest a level of structural anisotropy. Without modeling the data, the shape factors of all of the proteins indicate that they are relatively globular species in solution (Table 2).
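The shape factor described above is a simple ratio, and its reference value for a compact particle can be illustrated with the homogeneous hard sphere, for which R G /R H = sqrt(3/5) ≈ 0.775. The following is a minimal sketch; the function name and the input radii are hypothetical and are not taken from Table 2.

```python
import math

# R_G / R_H of a homogeneous hard sphere: R_G = sqrt(3/5) * R, R_H = R.
SPHERE_RATIO = math.sqrt(3.0 / 5.0)


def shape_factor(rg_nm, rh_nm):
    """Return the dimensionless shape factor R_G / R_H."""
    if rh_nm <= 0:
        raise ValueError("R_H must be positive")
    return rg_nm / rh_nm


# Hypothetical radii for illustration only.
rho = shape_factor(3.0, 4.0)
print(f"R_G/R_H = {rho:.2f} (hard-sphere reference: {SPHERE_RATIO:.3f})")
print(f"deviation from sphere: {rho - SPHERE_RATIO:+.3f}")
```

Values clearly above the hard-sphere reference point toward more extended or anisometric particles, in line with the trend described in the text.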

MW MALLS Is the Most Robust Estimate for Determining the Molecular Weight for SEC-SAXS Applications
The determination of concentration-independent MW estimates using scattering invariant approaches, based on the integration of normalized Kratky plots (I(s)s² vs. s) or Vc plots (I(s)s vs. s), has become a routine procedure in the SAXS analysis of non-conjugated protein-only samples. Here, we used the Bayesian inference approach [33] that combines several of these methods to obtain the most probable protein MW estimate and an MW credibility interval calculated directly from the scattering profiles (Table 2). All values are close to the expected MW.
Alternatively, the concentration-dependent MW estimate based on the sample concentration and the forward scattering, MW I(0) , was also determined, and the values are likewise close to the expected values. The advantage of this concentration-dependent method (either calibrated relative to a known protein standard, as was employed here, or via absolute scaling of the data [38]) is that it may be applied to macromolecules in general or to macromolecular conjugates and is not dependent on the macromolecule shape. The disadvantage is that it requires an accurate assessment of the sample concentration, which for bio-macromolecules is not always trivial, especially for SEC where the concentration of the eluting species is ever-changing and potentially affected by band-broadening in the SAXS capillary. However, by utilizing a split-eluent flow approach in the SEC-SAXS/MALLS set-up, and reducing the SAXS sample capillary diameter, the concentration at the point of X-ray exposure during the course of a SEC-SAXS run may be estimated using the UV and RI detectors set up in parallel to the SAXS measurements. To visualize and quantify the different MW approaches, the ratios of the various MW estimates to the expected MW are plotted in Figure 5. From this, we conclude that for the simple globular protein samples described here, both concentration-dependent and concentration-independent methods are consistent, although MW MALLS (full squares) appears to be the most robust experimental method for validating the MW.
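The calibration of MW I(0) relative to a protein standard can be sketched as a ratio of concentration-normalized forward scattering intensities. This minimal example assumes that the sample and the standard share the same scattering contrast and partial specific volume and that both intensities are on the same relative scale; the function name and all numbers are hypothetical and are not values from this work.

```python
def mw_from_i0(i0_sample, conc_sample, i0_std, conc_std, mw_std_kda):
    """MW (kDa) from forward scattering relative to a standard.

    MW_sample = MW_std * (I0_sample / c_sample) / (I0_std / c_std)
    Concentrations must share the same units (e.g., mg/mL).
    """
    if conc_sample <= 0 or conc_std <= 0 or i0_std <= 0:
        raise ValueError("concentrations and standard I(0) must be positive")
    return mw_std_kda * (i0_sample / conc_sample) / (i0_std / conc_std)


# Hypothetical example: a 66 kDa standard measured at 2.0 mg/mL and a
# sample measured at 1.5 mg/mL (arbitrary intensity units).
mw_sample = mw_from_i0(i0_sample=150.0, conc_sample=1.5,
                       i0_std=100.0, conc_std=2.0, mw_std_kda=66.0)
print(f"MW from I(0): {mw_sample:.0f} kDa")

# A 10% error in the sample concentration propagates directly to the MW.
mw_biased = mw_from_i0(150.0, 1.5 * 1.10, 100.0, 2.0, 66.0)
print(f"MW with 10% concentration overestimate: {mw_biased:.0f} kDa")
```

The second call illustrates the sensitivity to concentration discussed above: any relative error in the concentration translates into the same relative error in the MW estimate.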

Structural Information Derived from the Buffer-Subtracted SAXS Scattering Curves
For all five proteins, atomistic models determined from X-ray crystallography are available. Theoretical scattering profiles were calculated from these models and compared to the experimental data with CRYSOL. The obtained χ 2 values are listed in Table 2, and the visual inspection of the fits in Figure 6 suggests that the crystal structures are generally in reasonable agreement with the solution data. The largest discrepancies, with elevated χ 2 values, are observed for Apoferritin and dimeric BSA. The divergence of the former can partially be explained by small shape variations of the Apoferritin assembly in solution compared to the highly symmetric crystal structure. For the dimeric BSA, the scattering data suggest that the dimer in solution has a somewhat more extended monomer arrangement compared to the one observed in the crystal packing. The other curves show very good fits with χ 2 values ranging from 1.1 to 1.4 (Table 2).
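To illustrate what a χ 2 value of this kind measures, the sketch below scales a calculated profile to experimental intensities by weighted least squares and evaluates a reduced χ 2 with N - 1 degrees of freedom. This is a generic form commonly used for such comparisons and is an assumption here; it is not the CRYSOL code, which additionally adjusts hydration-related parameters. The function name and the toy data are hypothetical.

```python
def scale_and_chi2(i_exp, sigma, i_calc):
    """Least-squares scale factor and reduced chi-square of a model fit.

    scale = sum(I_exp * I_calc / sigma^2) / sum(I_calc^2 / sigma^2)
    chi2  = 1 / (N - 1) * sum(((I_exp - scale * I_calc) / sigma)^2)
    """
    n = len(i_exp)
    if n < 2 or len(sigma) != n or len(i_calc) != n:
        raise ValueError("need three equally long lists with N >= 2")
    weights = [1.0 / (s * s) for s in sigma]
    num = sum(w * e * c for w, e, c in zip(weights, i_exp, i_calc))
    den = sum(w * c * c for w, c in zip(weights, i_calc))
    scale = num / den
    chi2 = sum(
        w * (e - scale * c) ** 2 for w, e, c in zip(weights, i_exp, i_calc)
    ) / (n - 1)
    return scale, chi2


# Toy data: the "experimental" curve is exactly twice the model,
# so the scale is 2 and chi2 is (numerically) zero.
model = [4.0, 2.0, 1.0]
scale_ok, chi2_ok = scale_and_chi2([8.0, 4.0, 2.0], [0.1, 0.1, 0.1], model)

# Perturbing one point beyond its error bar raises chi2.
scale_bad, chi2_bad = scale_and_chi2([8.0, 4.5, 2.0], [0.1, 0.1, 0.1], model)
print(f"exact fit:     scale = {scale_ok:.3f}, chi2 = {chi2_ok:.3g}")
print(f"perturbed fit: scale = {scale_bad:.3f}, chi2 = {chi2_bad:.3g}")
```

With correctly estimated errors, values near 1 indicate agreement within the experimental uncertainty, whereas elevated values point to systematic deviations between model and data.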

Discussion
Since the first installation of SEC-SAXS at the P12 beam line in 2012, demand for the technique has significantly increased. At present, 60% of P12 users make use of SEC-SAXS during their visits or request SEC-SAXS as part of mail-in operations. The current and ongoing improvements at the beam line have streamlined the processes of sample delivery, data acquisition, and analysis in step with this demand. The newest developments of the P12 system include:
(i) Upgraded hardware modules such as the automated injector and switching valve for better use of allocated beamtime (the need for operators to physically enter the experimental hutch is reduced, as several samples can be queued, and washing between samples can be initiated remotely).
(ii) Robust HPLC pumps that allow extensive use of columns (also at high pressures). Quaternary solvent blending offers increased flexibility in the remote preparation of buffers, e.g., varying in ionic strength or pH gradient.
(iii) The additional acquisition of QELS data, which allows for further assessment of sample homogeneity (stability of R H across the elution peak) as well as structural information through the R G /R H ratio.
(iv) Collection of MALLS data at three angles, which provides more precise MW MALLS estimates.
(v) A combination of software tools and a common beamline-control interface (BECQUEREL) for ease of use and the necessary communication between integrated devices during data collection [22], as well as tools for data analysis such as CHROMIXS [27] and the new QuickPlot tools described here.
The evaluation of data quality "on the spot" can optimize beam time usage, as corrective measures can be taken for subsequent runs.
As demonstrated here, the quality of SAXS data can be increased significantly by the in-line SEC system, which allows the immediate acquisition of data after the individual components of a sample elute from a size exclusion column. Figure 7 displays the difference between batch data (taken from SASBDB entry SASDA82) and the SEC data described here for aFER. At higher angles, the two curves overlap well, but the minima in the curve measured with SEC are more pronounced, pointing to a cleaner system. At lower angles, the batch data show a faster descent and overall larger SAXS parameters (R G = 6.8 nm, D max = 12.8 nm), suggesting the presence of higher oligomeric species in the sample measured in batch mode. Moreover, the analysis and interpretation of the SAXS data are greatly facilitated, and confidence in the subsequent conclusions is increased, if the sample is additionally characterized biophysically with the multi-detector LS system. At this point, we would like to strongly emphasize that the SEC-SAXS step does not replace the need for preparing a high-quality sample. In addition, the success of the experiment depends on an optimized buffer and column selection.
The current improvements to the SEC-SAXS/MALLS set-up have allowed us to collect SAXS data with additional static and dynamic light scattering information for the purposes of benchmarking. These datasets have been deposited into the SASBDB [17] and are made available to the community for testing, development, critique, and interrogation. Screenshots of these entries are shown in Supplement Figures S9-S13.
Finally, we do not expect SEC-SAXS to make the classical batch mode of data collection obsolete. Batch mode SAXS has the advantage of very fast collection times, minimal sample volumes, and high-throughput screening possibilities. It is by far the most appropriate technique for the structural analysis of transient oligomers or low-affinity macromolecular complexes that would otherwise dissociate on an SEC column. As biology is based on macromolecular association, even weak interactions may have functional relevance. Moreover, batch experiments allow for specific variations in concentration, pH, temperature, buffer additives, stoichiometric ratios, and many other biologically relevant parameters. Therefore, SEC-SAXS is not a "golden technique" to be routinely applied to all samples or experiments. Instead, it is a powerful analytical approach that is especially effective for systems that are notoriously difficult to keep in a purified state, and it is made even stronger by in-line biophysical characterization (absorption, light-scattering, and refractive index measurements).