A New Alternative Tool to Analyse Glycosylation in Monoclonal Antibodies Based on Drop-Coating Deposition Raman imaging: A Proof of Concept

Glycosylation is considered a critical quality attribute of therapeutic proteins as it affects their stability, bioactivity, and safety. Hence, the development of analytical methods able to characterize the composition and structure of glycoproteins is crucial. Existing methods are time consuming, expensive, and require significant sample preparation, which can alter the robustness of the analyses. In this context, we developed a fast, direct, and simple drop-coating deposition Raman imaging (DCDR) method combined with multivariate curve resolution alternating least square (MCR-ALS) to analyze glycosylation in monoclonal antibodies (mAbs). A database of hyperspectral Raman imaging data of glycoproteins was built, and the glycoproteins were characterized by LC-FLR-MS as a reference method to determine the composition in glycans and monosaccharides. The DCDR method was used and allowed the separation of excipient and protein by forming a “coffee ring”. MCR-ALS analysis was performed to visualize the distribution of the compounds in the drop and to extract the pure spectral components. Further, the strategy of SVD-truncation was used to select the number of components to resolve by MCR-ALS. Raman spectra were processed by support vector regression (SVR). SVR models showed good predictive performance in terms of RMSECV, R2CV.


Introduction
The development of protein-based drugs has accelerated in recent years as proteins exhibit a variety of therapeutically beneficial properties, such as: higher target specificity and generally fewer side effects. Most drugs derived from biotechnology marketed to date are derived from recombinant proteins. Recombinant proteins represent the majority of biopharmaceuticals on the market. These include hormones, cytokines, enzymes, and monoclonal antibodies, which have rapidly grown thanks to their effectiveness in the treatment of severe diseases.
Glycosylation is present in almost 60% of commercial pharmaceutical proteins and is considered as the most common post-translational modification (PTM) of extracellular proteins and some intracellular proteins [1]. The characterization and quantification of Second, the study aims at quantifying the relative amount of each major monosaccharide (mannose, N-acetylglucosamine, galactose, fucose, and sialic) and glycans (M5, FA2G2, sialylated glycan, high mannose). The pure spectral component of protein resolved by MCR-ALS is used in the regression model as input spectral data. Several regression models were calibrated to determine the relative amount of each monosaccharide and glycan. A hyperspectral Raman dataset of 16 reference mAbs samples was collected. The composition in glycans and monosaccharides was determined using reference LC-FLR-MS methods. Support vector regression (SVR) was established to correlate the spectral information (pure component) and the quantitative reference values. This regression was chosen as it can cope with high degrees of nonlinearity. Moreover, it is a robust method, which is less sensitive to spectral noise and provides accurate predictions [1]. SVR models provide less diagnostics in terms of specificity, because the information about the original input variables is lost [12]. This is why the study on the specificity of SVR models was also explored.

Chemicals, Reagents, and Proteins for the Spectral Characterization of Commercial mAbs Solutions
16 monoclonal antibodies (mAbs) were analyzed (one batch per product). These glycoproteins were provided by the Saint-Pierre hospital (Brussels, Belgium) and the pharmacy of the University Hospital Center of Liège (CHU Liège, Belgium).

Solution Preparation
The stock solutions of the mAbs were analyzed at their initial concentration. Additionally, two aliquots of the stock solutions were prepared. A first aliquot was prepared in NaCl (0.9%) at 3 mg mL −1 . A second aliquot was prepared by taking 50 µL of the stock solution. This second aliquot was eventually purified by column filtration.

Detection Limits
To study the quantitative limits of detection (LoD) in DCDR analysis with the µ-RIM slide, 7 dilutions of each antibody were prepared and analyzed.
The therapeutic concentration ranges were chosen based on the pharmaceutical preparations (CHU Liege). The different concentrations were prepared from the stock formulations using serial dilutions in NaCl (0.9%). Table 1 illustrates the different dilutions prepared for each commercial mAb solution. All samples were prepared freshly on the day of the analysis.

Removal of Excipients in mAbs Formulations
The study first aimed to demonstrate that the DCDR analysis allows the separation of the buffer salts, excipients, and glycoprotein. To check the efficiency of this separation, the spectral profile of the glycoprotein after purification via filtration column and via the DCDR method were compared. Size-exclusion spin columns were used to remove residual salts and excipients present in the formulations. These excipients (mannitol, Tween 80, Tween 20, trehalose, etc. . . . ) interfere with Raman measurements in the spectral region between 600 and 1800 cm −1 as shown in Figure 1A,B. Furthermore, due to their polyol nature, they prevent the analysis of the glycan part of the mAbs.
As explained in [1], a buffer exchange step with NaCl (0.9%) was performed using Micro Bio-Spin ® P-6 Gel columns (TRIS buffer, sample volume 10-75 µL, 6000 Da MW limit). The size-exclusion spin columns are based on the principle of gel filtration (Polyacrylamide). Therefore, molecules larger than the pores of the stationary phase matrix will be first excluded and migrate rapidly through the column, while molecules smaller than the matrix pores migrate more slowly [13].
Thus, at the end of the filtration cycle, the solutions were recovered in 2 fractions: the concentrated proteins were first eluted and recovered in the filtrate, while the excipients were recovered from the filter after washing with the NaCl solution. To ensure efficient separation, three cycles were applied to each mAb solution [6]. limit). The size-exclusion spin columns are based on the principle of gel filtration (Polyacrylamide). Therefore, molecules larger than the pores of the stationary phase matrix will be first excluded and migrate rapidly through the column, while molecules smaller than the matrix pores migrate more slowly [13].
Thus, at the end of the filtration cycle, the solutions were recovered in 2 fractions: the concentrated proteins were first eluted and recovered in the filtrate, while the excipients were recovered from the filter after washing with the NaCl solution. To ensure efficient separation, three cycles were applied to each mAb solution [6].

Raman Analysis
Next, 2 µL of each sample was deposited on a µ-RIM (BioTools, Jupiter, JUP, United States) slide covered with a thin layer of polytetrafluoroethylene (PTFE) and dried at

Raman Analysis
Next, 2 µL of each sample was deposited on a µ-RIM (BioTools, Jupiter, JUP, United States) slide covered with a thin layer of polytetrafluoroethylene (PTFE) and dried at ambient temperature for 45 min. All Raman spectra and hyperspectral Raman imaging data were acquired with a Labram HR Evolution (Horiba scientific, Lyon, France) equipped with an EMCCD detector (1600 × 200 pixels, Andor Technology Ltd., Abingdon-on-Thames, UK.). A 600 gr/mm grating, a Leica 50× Fluotar LWD objective and a 532 nm laser with a power of 25 mW at the sample (Horiba Scientific) were used. Figure 2 illustrates the methodology followed throughout this study. ambient temperature for 45 min. All Raman spectra and hyperspectral Raman im data were acquired with a Labram HR Evolution (Horiba scientific, Lyon, F equipped with an EMCCD detector (1600 × 200 pixels, Andor Technology Ltd, A don-on-Thames, UK.). A 600 gr/mm grating, a Leica 50× Fluotar LWD objective 532 nm laser with a power of 25 mW at the sample (Horiba Scientific) were used. 2 illustrates the methodology followed throughout this study.

Raman Spectral Measurements
All spectra were collected with the LabSpec 6 software (Horiba Scientific, Jobin Yvon, Palaiseau, France) over a spectral range 600-1800 cm −1 with 10 s acquisition time and two accumulations. Three spectra were acquired at four distinct positions of the dried drop for each sample: concentrated protein, filter of Micro Bio-Spin ® (excipients), and excipients.
Regarding the limit of detection (LOD), a spectral range between 600 and 1800 cm −1 was used. For each concentration, three spectra were acquired at four distinct positions of the dried drop, on the edge and the center of the drop (Figure 3).

Raman Spectral Measurements
All spectra were collected with the LabSpec 6 software (Horiba Scientific, Jobin Yvon, Palaiseau, France) over a spectral range 600-1800 cm −1 with 10 s acquisition time and two accumulations. Three spectra were acquired at four distinct positions of the dried drop for each sample: concentrated protein, filter of Micro Bio-Spin ® (excipients), and excipients.
Regarding the limit of detection (LOD), a spectral range between 600 and 1800 cm −1 was used. For each concentration, three spectra were acquired at four distinct positions of the dried drop, on the edge and the center of the drop (Figure 3). Finally, for the interpretation of SVR models, the spectra of each of the five pure monosaccharides and the Man-5 glycan were measured. In this context, six spectra (six distinct deposits) were recorded for each sample.

Raman Hyperspectral Imaging
For the composition analysis of the mAbs, 16 hyperspectral images were acquired. For the quantitative analysis of the composition in monosaccharides and glycans, a total of 96 hyperspectral images at a low concentration (3 mg mL −1 ) of mAbs were acquired. Two independent samples with 3 distinct deposits for each sample were acquired.
A Leica 10× Fluotar LWD objective was used to map the sample's surface. The spectra were acquired from 600 to 1800 cm −1 . For the initial product concentrations of mAbs, a 10 s acquisition time and two accumulations were used, giving a total data acquisition time of 2 h 55 min. In order to reduce the analysis time, these parameters were reduced, without losing quality, because at low concentration, the mAbs are more concentrated at the edge of the drop. Therefore, low-concentration maps were acquired with 2 s acquisition time and two accumulations leading to 38 min.
Raman spectra were collected on the whole drop surface with 25 × 25 pixels mapping and a step size of 134 µm on the X and Y dimensions. The spectral grids were spherical in shape based on the dimensionality of the drop.

UPLC-FLR-MS Analysis
Labelled N-glycans were analyzed via HILIC separation combined with fluorescence (FLR) and mass spectrometry (MS) detection using a UPLC-MS system equipped with an ACQUITY UPLC BEH Amide (2.1 × 150 mm, 1.7 µm particle size, and 130 Å Finally, for the interpretation of SVR models, the spectra of each of the five pure monosaccharides and the Man-5 glycan were measured. In this context, six spectra (six distinct deposits) were recorded for each sample.

Raman Hyperspectral Imaging
For the composition analysis of the mAbs, 16 hyperspectral images were acquired. For the quantitative analysis of the composition in monosaccharides and glycans, a total of 96 hyperspectral images at a low concentration (3 mg mL −1 ) of mAbs were acquired. Two independent samples with 3 distinct deposits for each sample were acquired.
A Leica 10× Fluotar LWD objective was used to map the sample's surface. The spectra were acquired from 600 to 1800 cm −1 . For the initial product concentrations of mAbs, a 10 s acquisition time and two accumulations were used, giving a total data acquisition time of 2 h 55 min. In order to reduce the analysis time, these parameters were reduced, without losing quality, because at low concentration, the mAbs are more concentrated at the edge of the drop. Therefore, low-concentration maps were acquired with 2 s acquisition time and two accumulations leading to 38 min.
Raman spectra were collected on the whole drop surface with 25 × 25 pixels mapping and a step size of 134 µm on the X and Y dimensions. The spectral grids were spherical in shape based on the dimensionality of the drop.

UPLC-FLR-MS Analysis
Labelled N-glycans were analyzed via HILIC separation combined with fluorescence (FLR) and mass spectrometry (MS) detection using a UPLC-MS system equipped with an ACQUITY UPLC BEH Amide (2.1 mm × 150 mm, 1.7 µm particle size, and 130 Å pore size) column (Waters Milford, MA, USA). The details of this method are described in Derenne et al. [6]. MS data were obtained using a Single Quadrupole Detector 2, SQD2 (Waters, Milford, MA, USA) in ESI-positive mode and the data were acquired using Empower 3.1 software. Inc., Manson, WA, USA). Furthermore, SVD truncation strategy and P-vector were programmed under the MATLAB ® environment [12].

Data Preprocessing
All Raman spectra were preprocessed by Savitzky and Golay smoothing (window size: 7) for noise reduction and followed by a baseline correction by the Automatic Whittaker filter (λ = 100,000, p = 0.001).
The resolved Raman spectra used to build the monosaccharide and glycan prediction models were preprocessed first, by Savitzky and Golay 1st derivative (polynomial order: 3, window size: 15) and followed by standard normal variate (SNV).

Multivariate Curve Resolution Alternating Least Squares (MCR-ALS)
The hyperspectral Raman imaging data were analyzed using MCR-ALS to generate spatially resolved chemical images (scores) and corresponding resolved Raman spectra of the individual chemical species (pure components). This method is often used in the case of complex samples of unknown composition [14,15].
Each hyperspectral Raman imaging datum is a three-dimensional data cube where x, y are the spatial information and λ is the spectral information (cm −1 ). In order to be able to carry out the MCR-ALS, D is unfolded as a 2D data matrix D (n × m), where n = x × y and m = λ.
Mathematically, MCR-ALS decomposes the data matrix, D, as follows: where C is the matrix of the relative concentration profiles, S the matrix of pure spectra, and E the matrix of the residuals. In this work, MCR-ALS was initialized via exteriorpts algorithm and non-negativity constraints were applied on the concentration and on the spectra [9]. The full spectral range (600-1800 cm −1 ) of the given hyperspectral Raman imaging dataset was used in MCR-ALS.
In this study, the pure chemical components resolved for each mAb sample were used as input data for the regression models.
The lack of fit (LOF) was calculated to evaluate the quality of the fit of the MCR-ALS model. where: d i,j corresponds to the input original data.
d i,j − d i,j corresponds to the residual calculated from the difference between the input original data and the MCR-ALS reproduction.

Singular Value Decomposition (SVD Truncation Strategy)
The estimate for the number of chemical species (rank) is necessary in order to carry out the MCR-ALS analysis. Different analysis methods exist to estimate this rank, such as principal component analysis (PCA), Durbin-Watson (DW) criterion, and the singular value decomposition (SVD) method. In linear algebra, the rank corresponds to the number of pure spectra (eigenvectors) or concentration profiles necessary to explain the set of recorded data, the latter being only a linear combination of these eigenvectors. In the ideal case, in the absence of measurement noise, the chemical rank corresponds to the definition of the mathematical rank [16].
In this study, SVD (truncation strategy) was applied first to denoise the hyperspectral Raman imaging data and secondly to estimate the number of chemical species in the MCR-analysis. - The first step of the SVD is a factorization of the data matrix D (n × m) as where: U (n × n) and V (m × m) are left and right singular vectors matrices, respectively. Σ(n × m) is the diagonal matrix of the singular values σ i for i = 1, . . . , r with r being the rank of the matrix D.
-The second step consists of plotting the singular values according to their value. Thus, the first, second, and third segment are identified. Only the second (between two breaks) is considered since it represents a good compromise between the relevant information carried by the segment on the left and the noise carried by the segment on the right. Each σ i of this segment is considered as a rank and then as a threshold. - The last step consists of a reconstruction of the different truncated matrices (different In this study, 3 truncated matrices were considered.
For each of these matrices, an MCR model is built and analyzed. Only the interpretable model, related to the best estimate of the rank, was retained.

Performance Evaluation of the SVR Models
Different spectral ranges were tested to build the models. The best spectral range was selected based on its cross-validation performance. As a result, the spectral range between 800 and 1115 cm −1 was retained. It is also in this spectral range that the characteristic bands of sugars are present [17].
To calibrate and optimize SVR models, Venetian blinds was used as a cross-validation strategy with a data split of 10 and one sample per blind (thickness). In this context, the calibration models were evaluated using the Root Mean Square Error of cross-validation (RMSECV).
where y CV,i is the value predicted by the cross-validated model for sample i. y i is the measured value obtained for sample i and n is the number of samples. y corresponds to the average of all reference measurement values in the calibration set [18].
Thereby, an evaluation of performances for the SVR models was performed, by comparing the results obtained by the calibration performances: RMSECV and R 2 CV . Low value of RMSECV and a high value of R 2 CV are expected indicating that the model is able to accurately estimate the concentration.

Interpretation of SVR Models
Gaussian RBF kernel (Radial Basis Function) was chosen to perform the SVR models. This kernel expresses sample-to-sample similarities using the following equation: where γ > 0 and x i , x j are input feature values for i and j samples and γ is the kernel parameter. The nonlinear SVR model requires the determination of three meta-parameters: the cost C, and the variables ε and γ. To have a good generalization performance of SVR models, the simultaneous optimization of its three meta-parameters (C, ε and γ) was performed. This optimization is conducted through a grid search using 2-step crossvalidation. The first step was a coarse grid search in a goal to select approximately the best region. The second step was a finer grid search in order to obtain optimal values. Consider an input dataset X(N × M) with an output vector y i ∈ R. The objective of SVR is to find a multivariate regression function f (x) to predict the desired output property (amount of monosaccharides) of an unknown object (new spectrum).
where α i and α * i correspond to the support vectors, α i , α * i = 0. K x i , x j is an RBF kernel that transforms the nonlinear input space into a highdimensional feature space, where the problem can be modeled in a linear way. However, the information related to the original input variables is lost. Therefore, it is important to determine the contribution of each input variable to the final regression model to ensure the specificity of the model. Thus, being able to clearly determine which input variables in the original input data is explanatory for the modeled output property (amount of monosaccharides). Therefore, a P-vector is obtained by calculating the inner-product between the original input space X T and the α-vector of the SVR models ( Figure 4) [2].
Gaussian RBF kernel (Radial Basis Function) was chosen to perform the SVR models. This kernel expresses sample-to-sample similarities using the following equation: , = exp (− − ), where > 0 and , are input feature values for and samples and is the kernel parameter. The nonlinear SVR model requires the determination of three meta-parameters: the cost C, and the variables ε and γ. To have a good generalization performance of SVR models, the simultaneous optimization of its three meta-parameters (C, ε and γ) was performed. This optimization is conducted through a grid search using 2-step cross-validation. The first step was a coarse grid search in a goal to select approximately the best region. The second step was a finer grid search in order to obtain optimal values. Consider an input dataset ( × ) with an output vector ∈ . The objective of SVR is to find a multivariate regression function ( ) to predict the desired output property (amount of monosaccharides) of an unknown object (new spectrum).
( , ) is an RBF kernel that transforms the nonlinear input space into a highdimensional feature space, where the problem can be modeled in a linear way. However, the information related to the original input variables is lost. Therefore, it is important to determine the contribution of each input variable to the final regression model to ensure the specificity of the model. Thus, being able to clearly determine which input variables in the original input data is explanatory for the modeled output property (amount of monosaccharides). Therefore, a P-vector is obtained by calculating the innerproduct between the original input space and the α-vector of the SVR models ( Figure  4) [2].

Spectral Characterization of Commercial mAbs Solutions after Excipients Removal
Sixteen commercial mAbs listed in Table 2 were filtered by Microbiospin and the filtrate and filter were recovered and analyzed by Raman.

Spectral Characterization of Commercial mAbs Solutions after Excipients Removal
Sixteen commercial mAbs listed in Table 2 were filtered by Microbiospin and the filtrate and filter were recovered and analyzed by Raman.  Figure 5 shows the average Raman spectrum of 16 mAbs after filtration of the excipients, acquired by depositing drops after 45 min of drying. Therefore, the main feature bands of proteins are observed at~640 cm −1 (tyrosine), 755 cm −1 (tryptophan), 880 cm −1 (C-C backbone), 954 cm −1 (C-H stretching of α-helix structure), 1001 cm −1 (phenylalanine), 1237 cm −1 (amide III), 1335 cm −1 and 1445 cm −1 (CH2 deformation), 1551 cm −1 (amide II), and 1668 cm −1 (amide I). Notably, the main feature bands of sugar are observed at 880 cm −1 (C-O-C stretch) and 1350 cm −1 (NH2 twist), and the bands of glycosidic ring are observed at (830, 1000, 1450 cm −1 ) [17,19]. • Analysis filter: excipients Figure S1 reports the analysis on the filter retained during the filtration of trastuzumab. Table 2 shows the detected spectrum of each commercial mAbs solution, after superposition of the spectrum retained in the filter with each of the excipients present in the formulation of mAbs. It is interesting to note that the dominant spectrum corresponds to the excipient at the highest concentration in the sample (Table S1), whilst the other excipients present at very low concentration are not detected [20].  • Analysis filter: excipients Figure S1 reports the analysis on the filter retained during the filtration of trastuzumab. Table 2 shows the detected spectrum of each commercial mAbs solution, after superposition of the spectrum retained in the filter with each of the excipients present in the formulation of mAbs. It is interesting to note that the dominant spectrum corresponds to the excipient at the highest concentration in the sample (Table S1), whilst the other excipients present at very low concentration are not detected [20].

Spectral Characterization of Commercial mAbs Solutions by DCDR Imaging and MCR-ALS Analysis
In previous work, we proposed to use ATR FT-IR spectroscopy to qualitatively and quantitatively analyze the glycosylation of monoclonal antibodies [1]. However, this method has some limitations; for example, it requires protein deposits with uniform chemical composition, and to obtain a high-quality ATR spectrum, approximately 300 µg mL −1 of protein is required [21]. In addition, a preliminary step of elimination of excipients by specific filtration columns is necessary. Therefore, this study, using the DCDR method, presents some advantages, which overcome the limitations previously encountered with the FTIR-ATR method. Indeed, it requires a microdeposition of only 10 ng of protein to obtain high-quality spectra. Further, the excipient removal step, which was previously mandatory, is no longer necessary.
In this context, the first aim of this study was the identification of composition of mAbs in their pharmaceutical formulation. For this, the use of a hydrophobic substrate is beneficial for the formation of the coffee ring. Several hydrophobic substrates have already been used since the development of the DCDR method, including quartz, CaF 2 , polished steel with a thin coating of polytetrafluoroethylene (PTFE), etc. In this study, slides of PTFE-coated steel surfaces were used. PTFE-coated slides are more advantageous than slides coated with CaF 2 , which lose their hydrophobicity over time [22].
This method is based on physical segregation and requires only the deposition of a drop of solution on this specific substrate and, after a few moments, the solvent is evaporated. Thus, the separation of excipients and protein was achieved by forming the "coffee ring" [8]. Further, in this study, the use of hyperspectral imaging enabled the visualization of the distribution of the compounds in the drop.
Rank estimation (number of pure spectral signatures) for the MCR-ALS analysis was obtained based on the SVD truncation strategy. The rank is estimated when the singular values are zero. In our case (Figure 6), the singular values decrease but do not reach zero. It is, therefore, not possible to give an exact value of the rank. This is why the notion of threshold is used to determine the significant singular values with respect to the variance in the noise [23]. As a result, three segments are identified; the first segment contains only information, the third segment contains only noise, and the second segment contains a mixture of information and noise. Thus, the last step consists of the selection of three ranks of this second segment. Finally, the reconstruction of three truncated matrices is performed, then they are introduced into the MCR-ALS in order to choose the correct reconstruction. The phenomenon of the formation of coffee ring patterns in drying drops can also depend on drying conditions, such as temperature, air velocity, and relative humidity in the evaporating medium. Relative humidity directly influences the contact angle of the drop and, thus, the initial evaporation rate of the drop. Indeed, their experiments showed that the contact angle decreases with relative humidity, which influences the pattern of coffee ring formation at the end of the evaporation process. [26].   Figure 7 shows the application of the DCDR method combined with MCR-ALS analysis on the trastuzumab analysis. In this example, the selected number of components determined by SVD-truncation is three. The drying process of the drop allowed the separation of solutes in the solution due to the formation of a "coffee ring" [2]. Therefore, in Figure 7, component 1 represents the spectrum of the excipient with weak protein bands (amide I, II, III), located in the inside edge of the drop. Component 2 represents the spectrum of the purified protein ( Figure 8A), located in the outer edge of the drop. Component 3 is representative of the noise. In total, almost all the information was extracted (Q residuals of approximate %). In general, the coffee ring is observed where the buffer salts and excipients evaporated around the center of the drop, and the protein molecules moved outward to the edge and concentrated on the outer ring of the drop. This contact line pinning acts as a barrier, thus, limiting any further spread of the solvent at the edge. The capillary flow in the drop is from the center of the drop towards the line of contact because the evaporation flow at the periphery is greater than at the center of the drop. Thus, to compensate for the greater loss of solvent at the level of the line of contact, the solvent flow starting from the center is directed towards the line of contact with the non-volatile solutes. Thus, the local evaporation rate is considered to be an important factor in order to ensure the efficiency of the separation between the excipients and the protein [24,25]. Figure 6. Singular values reported according to their rank, which shows three segments. Three thresholds (green dots) were considered in the second segment and, thus, three reconstructions of the different truncated matrices (different ranks r) were performed.  The differential solubilities in the various components in the solution also play an important role in segregation. Proteins precipitate at the beginning of the evaporation process, whereas highly soluble compounds, such as buffer solutions or compounds present at very low initial concentration, may remain dissolved in the drop, which evaporates for much longer, and, therefore, tends, eventually, to settle at the center of the drop [10,24]. The phenomenon of the formation of coffee ring patterns in drying drops can also depend on drying conditions, such as temperature, air velocity, and relative humidity in the evaporating medium. Relative humidity directly influences the contact angle of the drop and, thus, the initial evaporation rate of the drop. Indeed, their experiments showed that the contact angle decreases with relative humidity, which influences the pattern of coffee ring formation at the end of the evaporation process [26].
The differential solubilities in the various components in the solution also play an important role in segregation. Proteins precipitate at the beginning of the evaporation process, whereas highly soluble compounds, such as buffer solutions or compounds present at very low initial concentration, may remain dissolved in the drop, which evaporates for much longer, and, therefore, tends, eventually, to settle at the center of the drop [10,24].
Further, a question is raised in this part: is it possible to identify the dominant component (Table S1) among the excipients for each mAbs? Therefore, superposition of each excipient was detected by the MCR-ALS in the center component for each mAbs, with each of the excipients present in the formulation of the mAbs. It is interesting to note that the spectrum of excipient found in the center of the drop corresponds to the dominant excipient (at the highest concentration) in the formulation of the mAbs (Table S1). These same excipients were retained in the filter of the Microbiospin ® column, during the purification of the mAbs in Table 2.
To evaluate the MCR-ALS model quality, the percentage of lack of fit (LOF) was calculated. Therefore, the values of lack of fit, after applying the SVD, were close to zero (Table  S2), indicating a good quality for the matrix reconstruction. It can be deduced that the analysis of the mAbs by DCDR imaging associated with the MCR-ALS is an efficient technique to eliminate the interference of the excipients. The separation of proteins and excipients by physical segregation plays the same role as purification based on Microbiospin ® column by concentrating the protein at the periphery and the excipients at the center.
The DCDR method can be used to track the progress of wash/ultrafiltration cycles used to remove impurities and additives, such as glycerol and buffer compounds. Indeed, despite an optimization in the number of washing cycles, it happens that certain excipients are not completely eliminated [6]. Additionally, as described in previous studies, this method can be used to detect protein impurities [27].

Limit of Detection of DCDR Analysis
The study on detection limit was performed, in order to evaluate the efficiency of the separation process of the DCDR method, because it turns out that this method is more efficient at low concentration. Figure S2 shows the images of the drops corresponding, respectively, to each dilution of trastuzumab and Figure S3 shows the evolution of the intensity of the bands for the excipients and the protein in the center and at the edge of the drop.
The presence of protein signals in the center of the drop at high concentration shows that the segregation of proteins is incomplete due to the high protein content in a small volume of solvent that evaporates quickly, leaving insufficient time for the protein to move completely to the periphery. In addition, according to Figure S3, as the concentration of mAbs decreases, agglomerates of excipients are more concentrated and form in the center and the protein concentrates at the edge of the drop. This is due to the fact that the more the formulation of mAb is diluted, the more the excipients have time to regroup and form agglomerates and the more the proteins have time to settle entirely at the edge. Thus, the separation between the excipients and the protein within the DCDR method is more efficient at low concentration. The detection limit for the 16 mAbs is around 0.5 mg mL −1 according to the S/N ratio.
The question of the repeatability of DCDR spectra can be raised. Studies on protein analysis have already shown that the repeatability of DCDR spectra can depend on the thickness of the protein cycle but also on the purity of the solvent. For example, the independent deposit of different volumes of the same solution (100 and 1 µM) generates a difference in thickness of the protein rings for the two deposits. Thus, the signal-to-noise ratios for the deposition spectrum in the low volume are significantly higher than these at larger volume. However, they clearly demonstrated that in the case of independent deposition of the same volume in the same solution, there is no significant spectral variation [27].

SVR Models to Determine the Composition of Monosaccharides and Glycans
In this study, the reference data were obtained via UPLC-FLR-MS. The mass spectrometry data were used to identify N-glycans and the fluorescence data were used for glycan quantification. Table S3 groups the composition of the main N-glycans for each mAbs and Table S4 presents the overall mass percentage of the five monosaccharides present in each mAb [1].
The SVR regression models were built to predict the amount of each monosaccharide (mannose, N-acetylglucosamine, galactose, fucose, sialic acid) and glycans, using the spectra of pure component of the glycoprotein (Figure S4) extracted from the MCR-ALS. Optimized SVR parameters for monosaccharide and glycan models at low concentration are presented in Tables S5 and S6.
Tables 3 and 4 demonstrate the performances of SVR models in predicting the amount of each monosaccharide and glycan and Figures 9 and 10 show the results of data modelling by the SVR model, respectively. These results show low error values in terms of calibration (RMSEC), cross-validation performance (RMSECV), and high R 2 cv values, indicating that the models captured most of the correlation between the spectral data and the quantitative values.     Figure 9. Measured versus predicted amounts of monosaccharides obtained by SVR regression between 800 and 1150 cm −1 for the analysis of the amount of each monosaccharide at low concentration (3 mg mL −1 ) of mAbs.

Interpretation SVR Models
Unlike the PLSR of which loadings can be interpreted, SVR provides less diagnostics on the model in terms of specificity. In an SVR model, the information related to the original input variables is lost. Therefore, to study the specificity of the SVR models, P-vectors of the different models were investigated. The P-vectors show the contribution of each input variable (original data) to the final regression model. Figure 11 shows the results of the monosaccharide models. In the mannose model ( Figure 11C), all mannose characteristic bands (1180, 1156, 1124, 1081, 1048, 992, 967, 945, 900, 870, 848, 821, 808 cm −1 ) are present in the P-vector (build in SVR). The same results were observed for the sialic acid, N-acetylglucosamine models ( Figure 11A,B). With regards to the P-vector of the fucose and galactose model ( Figure 11D,E), the main bands characteristic of the monosaccharides are also found; however, there are slight differences in some bands. This is partially explained by the existence of different environments surrounding the chemical bonds of monosaccharides in glycoproteins (different protein sequences implying different conformations), implying minor differences in the vibrations [1]. chemical bonds of monosaccharides in glycoproteins (different protein sequences implying different conformations), implying minor differences in the vibrations [1].

Conclusions
Glycosylation is one of the most critical attributes of biopharmaceuticals to be monitored from development to production. In previous work, we suggested using FT-IR spectroscopy in ATR mode to characterize and quantify glycosylation and, thus, deter-

Conclusions
Glycosylation is one of the most critical attributes of biopharmaceuticals to be monitored from development to production. In previous work, we suggested using FT-IR spectroscopy in ATR mode to characterize and quantify glycosylation and, thus, determine the composition of monosaccharides in glycoproteins. This approach has many advantages: reduced sample preparation (no cleavage, labelling, or separation step) and concise measurement time. However, this method has some limitations; for example, filtration columns were used to remove the excipients, which can interfere with the FT-IR measurements in the specific region of carbohydrate signals. Additionally, it requires protein deposits with a uniform chemical composition. This may be difficult or impossible to achieve due to the coffee ring effect, which concentrates the protein in a ring, as well as the phenomenon of segregation, which tends to cause different components in the solution to precipitate at different locations on the substrate. In this study, it is all to our advantage, due to the DCDR benefiting enormously from the effects of the coffee ring [27]. This is why, in order to simplify the sample preparation step and to increase the analysis throughput, we propose here using a fast and direct protocol based on a drop-coating deposition Raman imaging (DCDR) method combined with multivariate curve resolution alternating least square (MCR-ALS) analysis and SVR regression to identify and quantify glycosylation in monoclonal antibodies (mAbs).
First, DCDR imaging combined with MCR-ALS analysis were applied to 16 mAbs to demonstrate the ability of the method to extract pure Raman spectra of glycoproteins present in their original pharmaceutical formulation. The process of drying the drop allowed the separation of compounds in the solution by forming a "coffee ring" [2]. It was shown that, while the buffer salts and excipients evaporated around the center of the drop, the proteins moved outward to the edge and concentrated on the outer part of the drop. By evaluating the MCR-ALS model quality by LOF, the results show that the values of lack of fit, before applying the SVD, were go until 14%, while after applying the SVD, they were close to 0%. It can be deduced that the analysis of the mAbs by DCDR imaging associated with the MCR-ALS and SVD is an efficient technique to eliminate the interference of the excipients.
The second part of the study demonstrates the capacity of DCDR spectroscopy to quantify the relative amount of each monosaccharide and glycan. In this context, the SVR regression models were established on the spectral region of glycans, between 800 and 1150 cm −1 . The SVR models of each monosaccharide and glycan exhibited good analytical calibration performances. Indeed, low values of RMSEC and RMSECV indicate high accuracy and high value of R 2 CV close to 1, indicating that the model correctly handles the spectral variability and is, therefore, able to accurately estimate the concentration.
The specificity of SVR models was also evaluated. It can be concluded that this interpretation approach facilitates the understanding and in-depth analysis of SVR models.
Finally, the results presented so far prove the concept of this approach for quantification of monosaccharides and glycans in mAbs and, thus, pave the way to three potential applications: comparing the glycosylation of a biosimilar and the original molecule, monitoring batch-to-batch homogeneity, and in-process control of glycoprotein bioproduction.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/molecules27144405/s1, Figure S1. (A) Trastuzumab filtrate (trehalose) drop image acquired with the 10× objective (B) filter spectrum (C) superposition of the spectrum of the filter with the excipient trehalose; Figure S2. Image of the drops acquired at different concentrations (21, 10, 5, 3,1, 0.5, 0., 0.01 mg/mL) for trastuzumab analysis; Figure S3. Evolution of the intensity displayed in color code, according to the intensity of the information, from the least intense (blue color) to the most intense (yellow color). The bands of the excipient (849, 1358 cm −1 ) and of the protein (1004, 1673 cm −1 ), at the edge (A) and in the center (B) of the drop were evaluated; Figure S4: Raman spectra (principal component extracted in MCR-ALS) preprocessed by the baseline correction, in the spectral range between 800 and 1150 cm −1 for the analysis of the composition of monosaccharides and glycans; Table S1: Excipients list present in the analyzed commercial solution of mAbs; Table S2: Percentage of lack of fit (LOF); Table S3: Mass percentage of the majority of N-glycans for each of the glycoproteins. This percentage was calculated based on relative peak areas-%RPA using UPLC-FLR-MS analysis of N-glycans after they were released. labelled and purified by GlycoWorks RapiFluor-MS N-glycan kit from Waters; Table S4: Overall mass percentage of the 5 monosaccharides present in each glycoprotein. These results were obtained by calculating the relative peak areas %RPA; Table S5: Optimal parameters in SVR model for predicting the composition of monosaccharides in low concentration; Table S6: Optimal parameters in SVR model for predicting the composition of glycans in low concentration.