Cue to Acid-Induced Long-Range Conformational Changes in an Antibody Preceding Aggregation: The Structural Origins of the Subpeaks in Kratky Plots of Small-Angle X-ray Scattering

Antibody aggregation following acid denaturation and subsequent neutralization of pH is one of the reasons why the production of therapeutic monoclonal antibodies (mAbs) is expensive. Determining the structural details of acid-denatured antibodies is important for understanding their aggregation mechanism and for antibody engineering. Recent research has shown that monoclonal antibodies of human/humanized immunoglobulin G1 (IgG1) become smaller globules at pH 2 compared to their native structure at pH 7. This acid-denatured species is unstable at pH 7 and prone to aggregation upon neutralization of pH. Small-angle X-ray scattering (SAXS) data have revealed an acid-induced reduction in the subpeaks in the Kratky plot, indicating conformational changes that can lead to aggregation. The subpeaks are well resolved at pH > 3 but less pronounced at pH ≤ 2. One of the weakened subpeaks indicates that the inter-region (Fab-Fab and Fab-Fc) correlations become loosely organized due to acid denaturation. However, the structural origin of the other subpeak (called the q3 peak in this study) has not been established because its q region could represent various inter-region, inter-domain, and intra-domain correlations in IgG1. In this study, we aimed to untangle the effects of domain–domain correlations on the Kratky q3 peak based on the computed SAXS of the crystal structure of IgG1. The q3 peak appeared in the static structure and was more prominent in the Fc region than in the Fab or isolated domains. Further brute-force analysis indicated that longer domain–domain correlations, including the inter-region correlations, also contribute positively to the Kratky q3 peak. Thus, distortion of the Fc region and of the longer inter-region correlations initiates acid denaturation and aggregation.


Introduction
Protein denaturation and aggregation are inherent and universal phenomena in vivo and ex vivo, and both are often undesirable factors for health. The subject of this study is the latter, ex vivo denaturation and aggregation of therapeutic monoclonal antibodies (mAbs), which are biopharmaceuticals. Because the aggregates of mAbs lead to reduced drug efficacy and potential side effects [1,2], they must be removed during the manufacturing process [3] and further evaluated prior to administration [4], which is time-consuming and costly.
mAbs are purified by affinity chromatography, in which an acidic solution is used as the eluent [5]. The acidic pH is maintained for the subsequent virus inactivation process. During these processes, mAbs are denatured at the acidic pH. The neutralization of pH (pH shift) transforms the denatured mAb molecules back into natively structured molecules, but a portion of them associate and evolve into large aggregates. The mechanism has been identified as "aggregate growth via condensation", described by the Smoluchowski coagulation equations [6]. Unlike amyloid aggregates, the mAb aggregates cannot incorporate native mAb monomers [7]. Although the aggregate evolution mechanism has been increasingly captured [8], the earlier processes preceding the aggregation remain unclear. Identifying the structural details of the acid-denatured structure is also a necessity for antibody engineering [9].
Int. J. Mol. Sci. 2023, 24, 12042
To address the acid-denatured structure, we have collected small-angle X-ray scattering (SAXS) data of mAbs of human/humanized immunoglobulin G1 (IgG1) in an acid solution [10]. IgG1 is the most popular subclass for therapeutics, which is composed of two identical heavy chains and two identical light chains and is Y-shaped when it is in its native state; the total number of the domains is twelve (cf. Figure 3). Our previous analysis has shown that the structure of the acid-denatured antibody is globular and smaller than the native structure at pH 7 [10]. This anomalous acid-denatured structure is the key structure for the aggregation by the pH shift.
The SAXS method is well suited for capturing domain-domain correlations (orientation or distance) in multidomain proteins [11], in addition to size information such as the radius of gyration (Rg). The SAXS signals of the domain-domain correlations often appear in the q region higher than the Guinier region [12] and should be pronounced as subpeaks in a Kratky plot. Thus, Kratky plot analysis is an important tool for SAXS studies. Our SAXS data showed an acid-induced reduction in the peaks in the Kratky plot of an IgG1, termed mAb-A in this manuscript (Figure 1a, taken from ref. [10]). However, the assignment of the Kratky plot's subpeaks is nontrivial owing to the various inter-region, inter-domain, and intra-domain correlations in IgG1. In this study, we computed SAXS based on the crystal structure of a representative IgG1 and analyzed the effects of the domain-domain correlations on the Kratky plot's subpeak.
Figure 1a shows the pH dependence of SAXS of mAb-A [10]. The Kratky plot (q²I(q) vs. q) of SAXS for the native structure (pH 7.1) showed three peaks, where I(q) is the SAXS intensity and q is the scattering parameter. These peaks were seen in human IgG1 [13], IgG1s (the origins were not described) [14], humanized IgG1 [15][16][17][18][19], and the other IgG subclasses (humanized IgG2 and humanized IgG4 (S241P mutant)) [15,16], and are thus believed to be shared by many types of mAbs [20]. The q positions of the peaks were designated as q1, q2, and q3, from the smallest angle to the highest angle. The q1 peak (~0.034 Å⁻¹) at pH 7.1 indicates Rg ≈ 51 Å for the scattering IgG1 according to the Guinier relationship Rg = √3/q1 [11,21]. The Kratky plot's q2 and q3 subpeaks were well resolved at pH > 3 but were less pronounced at pH ≤ 2.
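The relationship Rg = √3/q1 can be motivated by a short derivation. This is only a sketch: it assumes that the Guinier approximation still describes I(q) up to the first Kratky maximum, which real particles satisfy only approximately.

```latex
I(q) \approx I(0)\,\exp\!\left(-\frac{q^{2}R_g^{2}}{3}\right)
\quad\Longrightarrow\quad
\frac{d}{dq}\left[q^{2} I(q)\right]
  = 2q\,I(q)\left(1 - \frac{q^{2}R_g^{2}}{3}\right) = 0
\quad\Longrightarrow\quad
q_1 = \frac{\sqrt{3}}{R_g}
```

With q1 ≈ 0.034 Å⁻¹, this gives Rg ≈ 51 Å, matching the value quoted above.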

Results and Discussion
The q2 peak (~0.08 Å⁻¹) was rationalized [10,22] and acknowledged [20] as reflecting the inter-region distances, i.e., the Fab (antigen-binding fragment)-Fc (crystallizable fragment) distance and the Fab-Fab distance (~80 Å). The Bragg relation, in which the correlation length rc is 2π/q, helps in grasping this relationship between q2 and the inter-region distances. Accordingly, we interpreted the less pronounced q2 peak at pH ≤ 2 as indicating that these inter-region correlations were loosely organized. However, the structural origin of the q3 peak has not yet been established. The q3 peak is useful for identifying the initial conformational change in acid-denaturation-induced aggregation.
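As a quick numerical check of these relations, the following minimal Python sketch converts the approximate peak positions quoted in the text into length scales. The function names are illustrative and do not come from the original analysis.

```python
import math

def correlation_length(q):
    """Bragg-type correlation length r_c = 2*pi/q (q in 1/Angstrom)."""
    return 2.0 * math.pi / q

def rg_from_kratky_peak(q1):
    """Guinier-based estimate Rg = sqrt(3)/q1 from the main Kratky peak."""
    return math.sqrt(3.0) / q1

# Approximate peak positions quoted in the text (1/Angstrom).
peaks = {"q1": 0.034, "q2": 0.08, "q3": 0.16}

print(f"Rg from q1: {rg_from_kratky_peak(peaks['q1']):.1f} A")
for name in ("q2", "q3"):
    print(f"{name}: r_c = {correlation_length(peaks[name]):.1f} A")
```

For q2 ≈ 0.08 Å⁻¹, the relation gives rc ≈ 79 Å, consistent with the ~80 Å inter-region distance.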
Disentangling the origin of SAXS features becomes less straightforward in the higher q region because of the second peak of the form factor of the scatterer and the various correlations present [23]. Multidomain or oligomeric proteins possess various distances and orientations, and it is nontrivial to determine the key origins underlying the scattering. The computation of SAXS based on structural models is often required to elucidate these origins [24].
We computed the SAXS of a representative IgG1 based on its crystal structure (PDB ID: 1HZH), which indicated that the q3 peak appeared in the static structure (Figure 1b, red line). In this computation, called the Cα-based method, we used the Cα coordinates to approximate the electron coordinates and determined a distance distribution function, P(r). P(r) is converted to I(q) via a Fourier transformation. A similar peak appeared in the Fc region but was not prominent in the Fab regions, CH3 dimer, and CH2. CH2 is used as a representative of the isolated domains in IgG1 because all domains share an immunoglobulin fold. The lack of a q3 peak for CH2 suggests that the second peaks of the form factors of the isolated domains did not yield the q3 peak. In addition to the Cα-based method, we computed SAXS using CRYSOL [25], a standard program for the SAXS of proteins with atomic resolution.
The resulting Kratky plot also shows a q3 peak (Figure 1b, dashed line). The computed SAXS of aglycosylated IgG1 (black dashed line) is comparable to that of glycosylated IgG1 (red dashed line) in the neighborhood of q3. The minor effects of glycosylation allowed us to neglect the sugar components in the current SAXS calculation for convenience. A previous coarse-grained IgG1 model that consisted of 12 spheres reproduced the q3 character [18], which encouraged us to investigate the domain-domain correlations and allowed us to use the Cα-based model, which has a sufficiently high structural resolution for the discussion of the q3 peak. The hydration layer can modulate the q3 peak position and intensity; however, these effects are minor.
A previous SAXS study of Fc [26] showed that the q3 peak is sensitive to the distances and orientations between the CH2 domains and between the CH2 and CH3 domains in Fc. Fc has two identical chains, referred to as the H-chain and K-chain, according to the annotation in the Protein Data Bank file (PDB ID: 1HZH). Each chain has CH2 and CH3 domains (Figure 3), resulting in six domain-domain correlations. Let us consider the contribution of each domain-domain correlation to the q3 peak. One domain-domain correlation in real space is represented as the distance distribution component of P(r) between two domains. We calculated P(r) of the Fc and the P(r) components stemming from these domain-domain correlations (Figure 2a).
The order of the average distances between the domains is as follows: […] values of Fc and P(r)Fc. The reasons for this selection are shown in detail later (Figure 3a). The remaining P(r)Fc-Fc_3c was calculated as P(r)Fc-Fc_3c = P(r)Fc − P(r)Fc_3c, shown in Figure 2a. Consider the following relationship between P(r) and I(q):

$$I(q) = \mathrm{FT}[P(r)] = \mathrm{FT}\Big[\sum_i P(r)_i\Big] = \sum_i \mathrm{FT}[P(r)_i] = \sum_i I(q)_i \qquad (1)$$

where FT indicates the Fourier transformation and the subscript i denotes a component; with this relationship, we can analyze the effects of P(r)i or I(q)i on I(q). Figure 2b demonstrates that the q3 peak is diminished in the Kratky plot of I(q)Fc-Fc_3c. This indicates that the domain-domain correlations of P(r)Fc_3c (i.e., CH2(H)-CH3(K), CH3(H)-CH2(K), and CH2(H)-CH2(K)) were responsible for the q3 peak.
[…] (Figure 3a) were ~80-90 and ~125 Å. This is physically reasonable because Bragg's law is nλ = 2d sinθ (n = 1, 2, 3, …). However, assigning which domain-domain correlation is responsible is nontrivial.

In Figure 4, subtracting P(r)Fc_3c from the P(r) of IgG1 [P(r)IgG] yields P(r)IgG-Fc_3c, and the corresponding Kratky plots indicate that the q3 peak is attenuated but conserved. Subtracting the P(r) component of 10c [P(r)IgG_10c] from P(r)IgG, which gives P(r)IgG-IgG_10c, reduces the q3 peak intensity and shifts it to a higher q. We found that subtracting both [P(r)Fc_3c and P(r)IgG_10c] effectively diminished the q3 peak. Other domain-domain correlations and their combinations also contributed to the q3 peak.

In summary, the origin of the subpeaks in the Kratky plot of IgG1, especially q3 ≈ 0.16 Å⁻¹, was investigated based on the crystal structure. In addition to the shorter CH2-CH3 correlations in Fc, the longer correlations between the domains with distances of ~80-90 and ~125 Å also strongly contributed to this q3 peak. For example, CH3(H) in Fc and CL(L) in Fab1 are separated by an average distance of 85.4 Å, and this combination is a positive contributor. The contributions of the domain-domain correlations are shown in Figure 3. The acid-induced conformational change in IgG1 that triggers aggregation includes the loss of these longer domain-domain correlations that are stable in the native state.
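The Bragg's-law argument for the longer distances can be illustrated numerically. Combining nλ = 2d sinθ with q = 4π sinθ/λ gives q_n = 2πn/d for the n-th order of a spacing d. The sketch below is an idealized illustration rather than the authors' analysis: it treats a domain pair as a simple repeat distance and asks which order lands closest to q3. The helper names are hypothetical.

```python
import math

Q3 = 0.16  # approximate q3 position quoted in the text (1/Angstrom)

def bragg_q(d, n):
    """q position of the n-th order for a spacing d: q_n = 2*pi*n/d."""
    return 2.0 * math.pi * n / d

def closest_order(d, q_target, max_order=5):
    """Order n (1..max_order) whose q_n lies closest to q_target."""
    return min(range(1, max_order + 1),
               key=lambda n: abs(bragg_q(d, n) - q_target))

# Distances taken from the text: 85.4 A (CH3(H)-CL(L)) and ~125 A.
for d in (85.4, 125.0):
    n = closest_order(d, Q3)
    print(f"d = {d:6.1f} A: order n = {n}, q_n = {bragg_q(d, n):.3f} 1/A")
```

Under this simplification, the second order of ~85 Å and the third order of ~125 Å both fall near 0.15 Å⁻¹, close to q3.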
In this study, we used a single static structure of IgG1, whereas the domain-domain correlations of IgG1 can be stochastic; this is a limitation of this study. Thus, the effects of structural dynamics should be noted. A previous study [26] found that the position and intensity of the q3 peak are sensitive to conformational fluctuations (i.e., open-close motion or CH2 domain orientation) in the Fc region. A specific combination contributing positively (or negatively) to the q3 peak in this analysis (Figure 3a) could become a negative (or positive) contributor owing to changes in the distance distribution caused by conformational fluctuations in IgG1.
The subpeaks in the Kratky plots involve many correlations. To be self-consistent, their assignment should be made in conjunction with I(q) analysis from lower to higher q and P(r) analysis from shorter to longer r. Such an analysis would be more feasible for a monodisperse sample (impurity-free or aggregate-free) and for SAXS data with a higher S/N ratio. Even if the mixture contains unavoidable large impurities, such as aggregates, the SAXS of the large particles decays steeply and is less prominent in the higher q region; Kratky analysis is therefore preferred there and can extract information from such limited data. Even if the preliminary data are noisy, the Kratky peak is useful for deducing structural changes, which can serve as a working hypothesis for further research.
Interestingly, this q3 peak is shared by IgG1, IgG2, and IgG4 and is less pronounced upon heat-induced aggregation (IgG1) [17] or acid-induced aggregation (IgG2 and IgG4) [15]. The long-range conformational changes identified in this study are the initial common events that precede the aggregation of various mAbs under a wide variety of stresses. The q3 peak in the Kratky plot can be used as a good indicator of the native-like, long-range-ordered structure or the dynamics of mAbs.

SAXS Computation
The following SAXS calculation is called the "Cα-based" computation in this manuscript. Measuring the distances between selected pairs of IgG1 Cα coordinates (PDB ID: 1HZH) provides the number of pairs of points separated by a distance r, termed n(r). n(r) is proportional to the distance distribution function P(r) of the small-angle scattering (SAS) object and is therefore regarded as P(r). The scattering intensity I(q) was calculated using the Fourier transform of P(r), as shown in Equation (2):

$$I(q) = 4\pi \int_0^{\infty} P(r)\,\frac{\sin(qr)}{qr}\,dr \qquad (2)$$

where q is defined as $q = |\mathbf{q}| = 4\pi\sin\theta/\lambda$, $\mathbf{q}$ is the scattering vector, 2θ is the scattering angle, and λ is the wavelength of the X-ray or neutron. n(r), P(r), and I(q) were processed using an in-house program [22] in IGOR Pro version 6.22A (WaveMetrics, Portland, OR, USA). Each domain of IgG1 was assigned according to the "Family & Domains" information in UniProt (https://www.uniprot.org (accessed on 25 July 2023)) after sequence alignments. The UniProt IDs used were P01857 (CH1), P01857 (CH2), P01857 (CH3), P0CG04 (CL), P01825 (VH), and P01703 (VL). Linker regions were not included in the I(q) calculation.
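The pair-counting procedure described above can be sketched in a few lines. This is a minimal illustration in Python/NumPy, not the in-house IGOR Pro program used in the study. The coordinates are random stand-ins for Cα positions (an assumption made so that the block runs without a PDB file), and the function names, bin width, and q grid are likewise hypothetical choices.

```python
import numpy as np

def pair_distance_histogram(coords, r_max, dr=1.0):
    """Return bin centres r and pair counts n(r) over all unique point pairs."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(coords), k=1)  # each pair counted once
    edges = np.arange(0.0, r_max + dr, dr)
    counts, _ = np.histogram(dist[upper], bins=edges)
    r = 0.5 * (edges[:-1] + edges[1:])
    return r, counts.astype(float)

def intensity_from_pr(r, p_r, q):
    """Discretised I(q) = 4*pi * sum_r P(r) * sin(qr)/(qr) * dr."""
    dr = r[1] - r[0]
    qr = np.outer(q, r)
    kernel = np.sinc(qr / np.pi)  # np.sinc(x) = sin(pi*x)/(pi*x)
    return 4.0 * np.pi * (kernel * p_r).sum(axis=1) * dr

# Hypothetical stand-in for C-alpha coordinates: 200 random points in a 40 A cube.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 40.0, size=(200, 3))

r, n_r = pair_distance_histogram(coords, r_max=80.0)  # n(r) is regarded as P(r)
q = np.linspace(0.005, 0.3, 120)                      # 1/Angstrom
i_q = intensity_from_pr(r, n_r, q)
kratky = q ** 2 * i_q                                 # ordinate of the Kratky plot
```

Because n(r) is only proportional to P(r), the absolute scale of I(q) in this sketch is arbitrary; only the shape of the Kratky curve is meaningful. Restricting the pair list to selected residues, as the text describes, amounts to passing a subset of coordinates.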
This study focused on extracting the inter-domain correlations, for which the "Cα-based" calculation based on Equations (1) and (2) is straightforward. The CRYSOL program [25] outputs I(q), which involves both the intradomain and the inter-domain correlations, and is thus not intended for decomposing these contributions.
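The decomposition relies on the linearity of the transform: removing one P(r) component removes exactly its I(q) contribution. The following sketch illustrates this with two hypothetical point clouds standing in for two domains. The coordinates, sizes, and separation are invented for illustration, and the helper names are assumptions, not the authors' code.

```python
import numpy as np

def histogram_distances(dist, r_max=120.0, dr=1.0):
    """Bin a flat array of distances into a P(r)-like histogram."""
    edges = np.arange(0.0, r_max + dr, dr)
    counts, _ = np.histogram(dist, bins=edges)
    return 0.5 * (edges[:-1] + edges[1:]), counts.astype(float)

def intra_distances(a):
    """Unique pair distances within one point set (intra-domain)."""
    d = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)
    return d[np.triu_indices(len(a), k=1)]

def cross_distances(a, b):
    """All pair distances between two point sets (inter-domain)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1).ravel()

def sinc_transform(r, p_r, q):
    """Discretised I(q) = 4*pi * sum_r P(r) * sin(qr)/(qr) * dr."""
    dr = r[1] - r[0]
    kernel = np.sinc(np.outer(q, r) / np.pi)
    return 4.0 * np.pi * (kernel * p_r).sum(axis=1) * dr

# Two hypothetical "domains": Gaussian point clouds about 40 A apart.
rng = np.random.default_rng(1)
domain_a = rng.normal(0.0, 8.0, size=(100, 3))
domain_b = rng.normal(0.0, 8.0, size=(100, 3)) + np.array([40.0, 0.0, 0.0])

r, p_aa = histogram_distances(intra_distances(domain_a))
_, p_bb = histogram_distances(intra_distances(domain_b))
_, p_ab = histogram_distances(cross_distances(domain_a, domain_b))
p_total = p_aa + p_bb + p_ab

q = np.linspace(0.01, 0.3, 100)
i_total = sinc_transform(r, p_total, q)
i_ab = sinc_transform(r, p_ab, q)                     # inter-domain component
i_without_ab = sinc_transform(r, p_total - p_ab, q)   # after subtracting it
```

Comparing the Kratky curves of `i_total` and `i_without_ab` shows which features a given domain-domain correlation carries, which is the logic behind the subtraction analyses of Fc and IgG1 described in the Results.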

SAXS Data Collection
Experimental SAXS data for humanized immunoglobulin G1 (IgG1) (148 kDa), termed mAb-A, were collected using the BL-10C beamline at the Photon Factory of the High Energy Accelerator Research Organization (KEK) in Tsukuba, Japan [27]. The data of Figure 1a were obtained from Figure 1b in ref. [10]. The mAb-A sequence was almost identical to that of the representative IgG1 model (PDB ID: 1HZH), except for the VH and VL sequences.

Data Availability Statement: Data presented in this study are available in the manuscript. The data generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.