Effect of the Post-Harvest Processing on Protein Modification in Green Coffee Beans by Phenolic Compounds

The protein fraction, important for coffee cup quality, is modified during post-harvest treatment prior to roasting. Proteins may interact with phenolic compounds, which constitute the major metabolites of coffee, and processing affects these interactions. This allows the hypothesis that the proteins are denatured and modified via enzymatic and/or redox activation steps. The present study was initiated to encompass changes in the protein fraction. The investigations were limited to the major storage protein of green coffee beans. Fourteen Coffea arabica samples from various processing methods and countries were used. Different extraction protocols were compared to maintain the status quo of the protein modification. The extracts contained about 4–8 µg of chlorogenic acid derivatives per mg of extracted protein. High-resolution chromatography with multiple reaction monitoring was used to detect lysine modifications in the coffee protein. Marker peptides were allocated for the storage protein of the coffee beans. Among these, the modified peptides K.FFLANGPQQGGK.E and R.LGGK.T of the α-chain and R.ITTVNSQK.I and K.VFDDEVK.Q of the β-chain were detected. Results showed a significant increase (p < 0.05) of modified peptides in wet-processed green beans as compared to the dry-processed ones. The present study contributes to a better understanding of the influence of the different processing methods on protein quality and its role in the scope of coffee cup quality and aroma.


Introduction
Coffee belongs to the global mainstream drinks [1,2]. The demand for coffee beverages is growing, and only crude oil has a larger trade volume [3]. Coffee processing is the process of isolating the green coffee bean by removing all remaining layers of the coffee cherry [4]. In general, coffee cherries are treated in two well-established procedures: the dry and wet method. The more traditional method is dry processing. It is commonly practiced in sunny regions at lower altitudes [5] and is considered less technically demanding and less expensive. After harvesting, the ripe fruits are sun-dried, and depending on weather conditions, the drying process takes 12-15 days to reach the desired moisture content (10-12%) [6]. The dried husk composed of coffee pulp, mucilage, and parchment layer is removed subsequently by machine [6]. Coffee processed by the dry method is cherished for its fruity note and silky mouthfeel. Nevertheless, it is one of the most difficult methods to produce high-quality coffee [4]. In recent years, however, such natural processed coffees have been winning the highest prices in auctions (e.g., "Cup of Excellence") and thus are [...] catalyzed enzymatically and non-enzymatically. Both catalyzed reactions consume oxygen by oxidizing the phenolic compounds to quinone intermediates. Quinones are diketones that act as oxidants [27]. Their double bond leads to a positive partial charge inside the ring. In conjunction with nucleophilic groups of amino acid side chains, nucleophilic addition occurs [25,28]. In particular, the amine of lysine and the thiol group of cysteine are preferred reaction sites [29]. The reaction of a protein with a phenolic compound is mainly influenced by the nature of the two reaction partners [30]. Prigent et al. [26] showed that differences in hydrophobicity, isoelectric point, and amino acid sequence have a significant influence on the interaction of the protein.
Regarding the phenolic compound, position and number of hydroxyl groups, molecular weight, and structural flexibility play an important role [27,30,31]. Reaction parameters such as exposure time, temperature, pH, and concentration of phenolic compounds also have an influence [27,32,33]. In addition, enzymes such as polyphenol oxidases, which catalyze the oxidation of phenolic compounds, drive the modification of proteins. In this context, the enzyme-assisted extraction from coffee beans was shown to have antioxidant, antityrosinase, and antimicrobial activities resulting from the polyphenol and peptide composition [34]. Protein extracts and bioactive peptides from green coffee beans and spent coffee grounds were also shown to have high anti-hypertensive and antioxidant potentials [35][36][37].
Based on the evidence above, the protein fraction is likely to be subjected to modifications even before coffee roasting. In the same context, the type of post-harvest processing may also play an important role in modulating the reaction of the amino acid residues with the phenolic compounds. The aim of this study is to contribute to the understanding of the influence of different coffee processing methods on such protein modifications. For that purpose, a high-resolution mass spectrometry method to detect protein modifications in green coffee beans was developed. Samples of different processing methods and countries of origin were used to assess the modified peptides in the main storage protein of green coffee beans.

Materials
Samples used in this work were of the species Coffea arabica. De-pulped coffee beans, initially fermented coffee beans, final fermented coffee beans, and green coffee beans were obtained from Rio Colorado Company (Palencia, Guatemala) and from Santa Sofia Company (Santa Rosa, Guatemala). In the Santa Sofia Company, part of the water is returned to the beginning of the process and used again in the de-pulping and fermentation steps. In the case of Rio Colorado Company, water is not recirculated, and freshwater is applied in the processing steps. The details of the processing conditions were described previously [9].
Proteomics-grade trypsin (Sigma Aldrich, Steinheim, Germany) or pepsin (Promega, WI, USA) was used to perform the in-solution digestion of the proteins. A synthesized peptide with the sequence GWGG (Bachem AG, Bubendorf, Switzerland) with an m/z ratio of 376.2 was used as an internal standard (IS) to normalize the data from the tandem mass spectrometry analysis. The in-silico digestion was performed with Skyline Software (MacCoss Lab Software, University of Washington; https://skyline.gs.washington.edu, accessed on 1 September 2020) [38]. The solvents used for mass spectrometry analysis were of LC-MS grade, and all the other chemicals were of analytical grade.

Sample Preparation
To determine the composition of phenolic compounds, 10 mg of flour was mixed with 1 mL of methanol/water (80:20, v/v). Extraction of polyphenols was performed at room temperature under shaking conditions for 30 min. After centrifugation at 9300× g for 10 min, supernatants were collected and stored at −20 °C.
Samples from two coffee plantations located in Guatemala were extracted by three different methods for protein determination. Extraction with polyvinyl polypyrrolidone (PVPP; option I) was based on Laing et al. [39] and Ali et al. [18]. The extraction protocol with sodium dodecyl sulfate (SDS; option II) was developed as described by Want et al. [40] and Figueroa et al. [9]. Extraction protocol option II was then improved (option III), and this improved protocol was used for further analysis. A schematic flowchart of the three extraction protocols is provided in Figure S1 (see Supplementary Materials).
Briefly, for method I: 100 mg of powdered sample and 50 mg of PVPP were mixed with 1 mL of 0.04% ascorbic acid in water (v/v). Extraction was performed at room temperature under shaking conditions for 2 h. After centrifugation at 4000× g for 20 min, supernatants were collected and stored at −20 °C.
For method II: 100 mg of powdered sample was mixed with 1 mL of n-hexane under shaking conditions for 10 min. After centrifugation at 7000× g for 10 min, supernatants were discarded. Subsequently, 1 mL of n-hexane was added, the lipids were re-extracted and discarded again, and samples were then allowed to dry in opened microtubes for 10 min at room temperature. Finally, 750 µL of SDS buffer were added to each sample and heated for 20 min at 50 °C. After centrifugation at 7000× g at 4 °C for 5 min, supernatants were collected.
Method III consisted of mixing 20 mg of powdered sample with 1 mL of n-hexane under shaking conditions for 10 min. After centrifugation at 7000× g at 4 °C for 10 min, supernatants were discarded. Thereafter, the precipitates were washed again with 1 mL of n-hexane and left to dry in opened microtubes for 1 h at room temperature. Subsequently, 750 µL of SDS buffer (50 mM Tris-HCl with 2% SDS at pH 6.8) were added to each precipitate. Then 20 µL of 0.25 M tris-(2-carboxyethyl) phosphine (TCEP) solution were briefly mixed with the precipitates and heated at 50 °C for 20 min in the dark. Subsequently, 20 µL of 0.25 M iodoacetamide (IAA) solution were added to the precipitates and incubated at 50 °C for another 20 min in the dark. Mixtures were afterward centrifuged for 5 min. Finally, the supernatant was transferred to a new microtube, and precipitates were discarded.
Protein fractions extracted with the three methods were finally mixed with 1 mL of acetone at 4 °C and incubated at −20 °C for 20 min. After centrifugation at 7000× g for 5 min, supernatants were discarded. Right after, 1.5 mL of methanol at 4 °C was added, mixed for 20 s, and incubated for 20 min at −20 °C. After centrifugation at 7000× g for 5 min, supernatants were transferred to new microtubes.
After extraction with methods I and II, samples were mixed with 500 µL of 4 M urea buffer containing 0.1 M ammonium bicarbonate under shaking for 1 min followed by 10 min of ultrasonic treatment. Mixtures were centrifuged at 10,000× g for 5 min, and supernatants were stored at −20 °C.

Free Amino Nitrogen
Free amino nitrogen (FAN) was determined to measure the concentration of free amino groups according to the previously described procedure [41]. The samples were extracted using method III and subsequently dissolved in 4 M urea solution. Glycine stock solution (2 mg/L) was used as standard. Forty microliters of the mixed solution or standard was diluted in distilled water. Briefly, 400 µL of the diluted solution was mixed with 200 µL of ninhydrin staining reagent (containing 1 g Na₂HPO₄, 0.6 g KH₂PO₄, 50 mg ninhydrin, and 30 mg fructose in 10 mL distilled water). The resulting ninhydrin mixtures were heated at 100 °C for 16 min, the solutions were cooled down to 20 °C for 20 min, and 1 mL of 12 mM potassium iodide solution was added. The absorbance against distilled water was measured at 570 nm using a spectrophotometer (Jenway Genova, Staffordshire, UK). The content of free amino nitrogen was calculated using Equation (1) and the results expressed as µg/mg protein:
$$\mathrm{FAN} = \frac{A_S - A_B - A_C}{A_G - A_B} \times 2 \times F \tag{1}$$
where A_S is the absorbance of the sample, A_G is the absorbance of the glycine standard solution, A_B is the absorbance of the blank, A_C is the absorbance of the corrected blank, F is the dilution factor, and 2 is the concentration of the glycine standard solution (mg/L).
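The calculation described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the commonly used ninhydrin relation FAN = (A_S − A_B − A_C)/(A_G − A_B) × 2 × F, which is inferred from the symbol definitions given in the text. The function names, the example absorbances, and the conversion to µg/mg protein via a protein concentration in mg/mL are illustrative assumptions, not values from this study.

```python
def free_amino_nitrogen(a_sample, a_glycine, a_blank, a_corrected_blank,
                        dilution_factor, glycine_mg_per_l=2.0):
    """Return FAN in mg/L (assumed relation, see lead-in)."""
    denominator = a_glycine - a_blank
    if denominator <= 0:
        raise ValueError("glycine standard must absorb more than the blank")
    corrected = a_sample - a_blank - a_corrected_blank
    return corrected / denominator * glycine_mg_per_l * dilution_factor


def fan_per_mg_protein(fan_mg_per_l, protein_mg_per_ml):
    """Convert FAN in mg/L (equal to µg/mL) to µg FAN per mg protein."""
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return fan_mg_per_l / protein_mg_per_ml


# Hypothetical absorbance readings, for illustration only.
fan = free_amino_nitrogen(0.50, 0.40, 0.05, 0.02, dilution_factor=25)
print(round(fan, 2), round(fan_per_mg_protein(fan, 1.5), 2))
```

Keeping the glycine concentration as a parameter makes the dependence on the 2 mg/L standard explicit.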

Free Thiol Groups
Free thiol groups were measured to examine the modification of cysteine side chains that is induced by the oxidation reaction of phenolic compounds. Reduced glutathione and N-acetylcysteine were used for the calibration curves. For the determination of exposed free thiol groups, the extraction was performed as per method III, but without reduction and alkylation steps. The extracted pellet was dissolved in Tris buffer (0.2 M) after acetone precipitation and methanol purification. For the determination of total free thiol groups, on the other hand, the pellet was dissolved in 0.2 M Tris buffer containing 1% SDS. Briefly, 450 µL of the dissolved samples or standards were mixed with 30 µL of 5,5′-dithiobis-(2-nitrobenzoic acid) (DTNB) in cuvettes for the spectrophotometer (Jenway Genova, Staffordshire, UK). The mixtures were incubated at room temperature for 10 min and absorbance was measured at 421 nm. The results are expressed as nmol SH groups/mg protein.
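Converting an absorbance reading into nmol SH/mg protein through a calibration curve could look as follows. This is a sketch under assumptions: the text does not state how the calibration curves were fitted, so an ordinary least-squares line is used here, and the standard concentrations, absorbances, and protein concentration are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def thiol_nmol_per_mg(absorbance, slope, intercept, protein_mg_per_ml):
    """Invert the calibration line, then normalize to protein content."""
    thiol_nmol_per_ml = (absorbance - intercept) / slope
    return thiol_nmol_per_ml / protein_mg_per_ml


# Hypothetical glutathione standards (nmol/mL) and their absorbances.
standards = [0.0, 10.0, 20.0, 40.0]
readings = [0.02, 0.12, 0.22, 0.42]
slope, intercept = fit_line(standards, readings)
print(round(thiol_nmol_per_mg(0.17, slope, intercept, 2.0), 2))
```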

Determination of the Composition of Protein-Bound Phenolic Compounds
The composition of the phenolic compounds was determined using an HPLC system (Shimadzu HPLC system GmbH, Leonberg, Germany) [9]. Analyses were performed with a C8 column (250 × 3.0 mm, particle size 5 µm, pore size 300 Å, at 37 °C; MZ-Analysetechnik GmbH, Mainz, Germany). Undigested samples extracted by the improved SDS extraction method (Option III, Figure S1, Supplementary Material) were used. After extraction, the protein pellets were dissolved in 0.5 mL urea extraction buffer and separated at a flow rate of 0.6 mL/min for 20 min. The eluents were 0.1% trifluoroacetic acid in distilled water (eluent A) and acetonitrile (eluent B). The gradient was as follows: 0% eluent B from 0.01 to 3 min; 40% eluent B from 3 to 7 min; 40% eluent B from 7 to 10 min; 80% eluent B from 10 to 11 min; 80% eluent B from 11 to 13 min; and 0% eluent B from 14 to 20 min. Phenolic compounds were detected at 280/325 nm (UV-Vis SPD-10 AVP detector, Shimadzu, Kyoto, Japan). The results were expressed as µg phenolic compounds/mg protein.

Determination of Protein Content
The different extracted samples were compared for their protein content by using the method of Lowry et al. [42] with Bovine Serum Albumin (BSA) as standard.

Fluorescence Spectroscopy
Fluorescence spectroscopy was performed to evaluate browning reactions occurring during the extraction. Samples were dissolved in 0.5 mL of 0.4 M urea extraction buffer and diluted 1:1280 with the same extraction buffer. Emission in the wavelength range of 300-500 nm was recorded with a spectrofluorophotometer (Shimadzu, Duisburg, Germany) using an excitation wavelength of 280 nm. The signal intensity was determined as the area under the curve (AUC) of the emitted light. Pure urea extraction buffer was used for the blanks, and its AUC was subtracted from the AUC of the samples.
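The blank-corrected AUC can be illustrated with a short Python sketch. The integration rule is not stated in the text; the trapezoidal rule is assumed here, and the spectra are hypothetical three-point examples chosen so the arithmetic is easy to follow.

```python
def trapezoid_auc(wavelengths_nm, intensities):
    """Area under an emission spectrum using the trapezoidal rule."""
    if len(wavelengths_nm) != len(intensities) or len(wavelengths_nm) < 2:
        raise ValueError("need two or more paired points")
    area = 0.0
    for i in range(1, len(wavelengths_nm)):
        width = wavelengths_nm[i] - wavelengths_nm[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area


def blank_corrected_auc(wavelengths_nm, sample, blank):
    """Subtract the AUC of the buffer blank from the AUC of the sample."""
    return trapezoid_auc(wavelengths_nm, sample) - trapezoid_auc(wavelengths_nm, blank)


# Hypothetical spectra sampled at 300, 400 and 500 nm.
wavelengths = [300.0, 400.0, 500.0]
print(blank_corrected_auc(wavelengths, [0.0, 10.0, 0.0], [1.0, 1.0, 1.0]))
```

Both spectra must be sampled on the same wavelength grid for the subtraction to be meaningful.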

Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis
Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis (SDS-PAGE) was performed to determine the resulting molecular weight change as described previously [9]. Briefly, samples were first reduced using Novex™ NuPAGE™ LDS sample buffer (ratio of sample/buffer; 1:1, v/v) and then heated at 95 °C for 10 min. Ten microliters of the solution were then loaded into the wells of a Novex™ NuPAGE™ 4-12% Bis-Tris gel (Thermo Scientific™, Carlsbad, CA, USA). Spectra Multicolor Broad Range Protein-Markers (Thermo Fisher Scientific, Vilnius, Lithuania) were used for molecular weight calibration. The separation was performed for about 2 h at 30 mA. After staining overnight in Coomassie Blue G250 solution, the gels were destained using distilled water containing 10% acetic acid for 2 h. Finally, the gels were scanned (Bio-5000 Professional VIS Gel Scanner, SERVA Electrophoresis GmbH, Heidelberg, Germany) and analyzed with ImageLab software (Bio-Rad Laboratories GmbH, Feldkirchen, Germany).

In-Gel Digestion
SDS-PAGE was performed as described above with the proteins stained with a colloidal Coomassie brilliant blue G 250 solution [43]. Selected gel bands belonging to the 11S coffee storage proteins obtained from the SDS-PAGE were cut with a scalpel and placed in a 0.5 mL microtube. Thereafter, 200 µL of destaining solution (80 mg ammonium bicarbonate in 40 mL of acetonitrile/water, 50:50, v/v) was added and incubated at 37 °C for 30 min. Subsequently, 100 µL of tris-(2-carboxyethyl)-phosphine (TCEP) reduction buffer (25 mM) was added and incubated at 60 °C for 10 min. Next, 100 µL of iodoacetamide (IAA) buffer (25 mM) was added and incubated at room temperature for 1 h in the dark. IAA buffer was removed, and 200 µL of the destaining solution was added and incubated at 37 °C for 15 min. One hundred microliters of acetonitrile were added and incubated at room temperature for another 15 min. Acetonitrile was removed, and gel pieces were allowed to dry for 15 min. The digestion was performed with either trypsin or pepsin. Digestion using trypsin: The treated gel pieces were mixed with 15 µL of activated trypsin. After incubation at room temperature for 15 min, 25 µL of 25 mM ammonium bicarbonate buffer was added. Digestion using pepsin: The treated gel pieces were mixed with 15 µL of activated pepsin. After incubation at room temperature for 15 min, 25 µL of 0.09 M HCl was added. Finally, gel pieces were macerated manually and incubated in the dark for 20 h.

MALDI-TOF-MS
Samples digested for 20 h were placed in an ultrasonic bath for 10 s (BANDELIN electronic GmbH, Berlin, Germany) and subsequently centrifuged for 10 min. After centrifugation, the supernatant was removed and transferred to a fresh reaction vessel. The matrix solution was freshly prepared by mixing 20 mg of α-cyano-4-hydroxycinnamic acid with 300 µL of acetonitrile and 700 µL of 0.1% TFA. Two microliters of the samples were mixed with 2 µL of matrix solution, and 3 µL of the matrix-sample mixture was placed on a MALDI plate for each sample. Peptide calibration standard II (Bruker, MA, USA) was used as standard. The matrix-sample mixtures crystallized at room temperature within 20 min. The time-of-flight analyzer (Autoflex Speed, Bruker, MA, USA) was run with the software FlexControl 3.4 (Bruker). After calibration, 30-40% of laser intensity was applied to the crystallized matrix samples until clear peaks were recognized. The mass spectra obtained were analyzed using the software FlexAnalysis 3.3 (Bruker). The background noise was minimized by using the baseline subtraction function. Calibration was performed by using the spectra of the standard closest to the sample position. For each spectrum, a mass list of the peaks with high intensity was exported to the software BioTools 3.2 (Bruker). From there, the detected peptide masses were compared with the Swiss-Prot database using the Mascot interface. The parameters of the Mascot database search were set with the selected enzyme (pepsin or trypsin) at 0-2 partials, modification carbamidomethyl, and mass tolerance at 500 ppm. In addition, the FASTA sequence format of the 11S storage protein (P93079_COFAR, UniProt database) was used in the Sequence Editor of the BioTools software to simulate the in silico digestion with trypsin or pepsin, respectively [18]. FASTA is a program for searching and comparing protein and/or DNA sequences [44].
The fragment pattern obtained was in turn compared with the mass lists of the MALDI-TOF-MS spectra.
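The comparison of a measured mass list against theoretical peptide masses within a ppm tolerance can be sketched as follows. This is only an illustration of the matching principle, not a description of how Mascot or BioTools work internally. The function names and the example masses are hypothetical; the 500 ppm default mirrors the tolerance stated above.

```python
def ppm_error(observed, theoretical):
    """Signed mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_peaks(observed_masses, theoretical_masses, tolerance_ppm=500.0):
    """Return (observed, theoretical) pairs that agree within the tolerance."""
    matches = []
    for observed in observed_masses:
        for theoretical in theoretical_masses:
            if abs(ppm_error(observed, theoretical)) <= tolerance_ppm:
                matches.append((observed, theoretical))
    return matches


# Hypothetical mass lists: the first peak deviates by about 400 ppm,
# the second by about 667 ppm, and the third has no close partner.
theoretical = [1000.0, 1500.0]
observed = [1000.4, 1501.0, 1200.0]
print(match_peaks(observed, theoretical))
```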

In-Solution Digestion
For the tryptic digestion, the extracts after acetone precipitation were re-dissolved in 400 µL of digestion buffer containing 0.1 M ammonium bicarbonate. After the addition of 20 µL of the trypsin solution (4 mg/mL), incubation at 37 • C under shaking conditions for 20 h was performed. After the addition of 15 µL of 40% formic acid (to denature the enzyme and stop the reaction), samples were cleaned using the solid phase extraction process before performing the LC-MS/MS analysis.

Solid-Phase Extraction
Solid-phase extraction (SPE) was performed to remove impurities from the extracted samples. For that purpose, C18 material (300 mg, Chromabond C18 ec, Macherey-Nagel, Düren, Germany) was placed into a glass column. After activation (with 6 mL of 50% acetonitrile), the columns were equilibrated with distilled water containing 0.1% formic acid. Digested samples were then applied onto the column and washed with 6 mL of distilled water. The analytes were finally recovered using 1 mL of acetonitrile solution containing 0.1% of formic acid, filled to 5 mL using distilled water, and transferred into vials for the LC-MS/MS analysis.

Development of a Multiple Reaction Monitoring (MRM) Assay Using HPLC-MS/MS
For the analysis of protein modification in connection with phenolic compounds, an MRM method was developed using HPLC-MS/MS. The method focuses on the detection of unmodified and lysine-modified peptides of the α- and β-chain of the trypsin-digested 11S protein present in protein extracts of C. arabica. For the method development, a defined workflow was followed, and the corresponding steps are shown in Figure 1. The information of the 11S protein sequence was downloaded from the website of UniProt database (https://www.uniprot.org, accessed on 2 September 2020). A total of four unreviewed records with information obtained from literature and curator-evaluated computational analysis was compiled from the UniProt database. The comparison/alignment of the sequences is given (Supplementary Figure S2). Based on previous work, the choice among the listed options was made for P93079_COFAR [18].
The selected sequence in FASTA data file format was opened in the MacCoss Lab Skyline Software. An in-silico digestion was performed to generate a list of all possible peptides, considering the probability of their formation, while referring to their ionization capability, signal strength, and specificity [45]. The following settings were applied: Trypsin was selected as the proteolytic enzyme. The maximum number of missed cleavage sites (partials) was set to zero. No background proteome was set. Peptides with 5-25 amino acid residues were selected for the analysis, while the carbamidomethyl modification of the cysteine residues by iodoacetamide was considered. The transitions were filtered for doubly charged precursors and singly charged fragment ions. In addition, only y-type fragment ions were monitored. For each precursor peptide, transitions of up to six fragment ions were analyzed. The intensity was first determined using a preliminary collision energy for the individual transitions. Thereafter, the collision energy was optimized. For this purpose, different collision energies were tested for each intact mass, and the one that achieved the largest peak area after the MS/MS run was selected. The selected transitions of the α- and β-chain are listed (Supplementary Table S1). To measure lysine-modified masses, a structural modification of an additional mass of 352 m/z for the CQA monomer and a mass gain of 682 m/z for the DiCQA (dimer) was allocated for thiol and lysine groups in Skyline (Supplementary Figure S3) [18]. The modification by DiCQA is relevant only for the interaction with the amino side chains [18].
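The digestion and precursor settings listed above can be illustrated with a small Python sketch. It does not reproduce Skyline itself. The cleavage rule (after K or R, but not before P), the toy sequence, and the treatment of the CQA modification as a nominal mass shift of 352 Da on the neutral peptide are assumptions made for illustration. Residue masses are standard monoisotopic values, with cysteine entered as its carbamidomethylated form.

```python
# Monoisotopic residue masses in Da; "C" is carbamidomethyl-cysteine.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 160.03065, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(sequence, min_len=5, max_len=25):
    """Fully cleaved tryptic peptides (zero missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        cleaves = residue in "KR" and not is_last and sequence[i + 1] != "P"
        if cleaves or is_last:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return [p for p in peptides if min_len <= len(p) <= max_len]


def precursor_mz(peptide, charge=2, mass_shift=0.0):
    """m/z of the protonated precursor, optionally with a modification mass."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + mass_shift
    return (neutral + charge * PROTON) / charge


# Hypothetical toy sequence containing two peptides named in this study.
toy = "MAKPGRITTVNSQKVFDDEVKQPR"
for peptide in tryptic_peptides(toy):
    unmodified = precursor_mz(peptide)
    with_cqa = precursor_mz(peptide, mass_shift=352.0)  # assumed CQA shift
    print(peptide, round(unmodified, 3), round(with_cqa, 3))
```

For a doubly charged precursor, a modification of 352 Da shifts the observed m/z by 176, which is why the sketch applies the shift to the neutral mass before dividing by the charge.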
An HPLC-MS/MS Agilent 1260 system (Agilent Technologies Sales & Services GmbH & Co. KG, Waldbronn, Germany) equipped with an Agilent G6470A series triple Quad LC/MS (Agilent Technologies Sales & Services GmbH & Co. KG, Waldbronn, Germany), integrated with an electrospray (ESI) source operating in positive and negative ionization mode, was used to perform the analysis of the tryptic digested protein from coffee beans. One microliter of each analyte solution was separated on a Kinetex C8 analytical column (150 × 4.60 mm, 2.6 µm, 100 Å; Phenomenex, Torrance, CA, USA) using a flow rate of 0.5 mL/min. The column was thermostated at 30 °C. Water containing 0.1% formic acid and 100% acetonitrile were used as eluent A and eluent B, respectively, under the following conditions: 100% solvent A from 0 to 5 min, 50-5% solvent A from 20 to 24 min, and 100% solvent A from 25 to 28 min. The desolvation gas temperature in the ionization source was set to 275 °C, with a gas flow rate of 11 L/min, nebulizer pressure of 35 PSI, fragmentor voltage of 130 V, and dwell time of 20 s. Nitrogen was applied as collision gas. Multiple reaction monitoring (MRM) mode was selected as the method for the detection, where a specific transition was monitored at a specific retention time. MS data were collected over the separation time from 3 to 20 min, and the relative abundances of the targeted compounds were determined by using the total area of all the transitions.
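A relative abundance based on the total area of all transitions, normalized to the internal standard (IS), might be computed as in the following sketch. The exact normalization is not spelled out in the text; dividing the summed transition areas by the IS peak area is an assumption, and the peak areas are hypothetical.

```python
def relative_abundance(transition_areas, internal_standard_area):
    """Summed transition peak areas divided by the IS peak area (assumed)."""
    if internal_standard_area <= 0:
        raise ValueError("internal standard area must be positive")
    return sum(transition_areas) / internal_standard_area


# Hypothetical peak areas for three transitions of one peptide.
print(relative_abundance([1200.0, 800.0, 500.0], 5000.0))
```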

Molecular Modeling Experiments
The general methodology for the modeling of the coffee 11S storage protein, including template searching, 3D model building, validation, and model refinement, was as described by Ali et al. [18]. The sequence of P93079_COFAR 11S storage globulin (UniProt online database; https://www.uniprot.org, accessed on 1 December 2021) was used for homology modeling, and the accessibility of two main reaction sites was simulated by molecular modeling to illustrate the eligibility of the detected CQA modifications.

Data Analysis
All experiments were performed in triplicate, and data are expressed as mean ± standard deviation. The results were analyzed with GraphPad Prism 8® (GraphPad Software, Inc., San Diego, CA, USA) by applying two-way ANOVA and Tukey's test, with statistical significance set at p < 0.05.

Free Thiol Groups and Amino Nitrogen in the Protein Fraction
Free thiol groups and free protein-bound amino groups were monitored to determine their reactivity and change during processing. These were determined for coffee beans processed by two pilot wet-processing companies from Guatemala, and results are shown in Table 1. The presence of SDS denatures the proteins, but also improves their solubility. In the case of denaturation and unfolding of the protein, an increase of the thiol values occurs, whereas a higher solubility leads to lower thiol levels. This results in slight fluctuations as documented in Table 1. The values of 7.65 ± 0.13 and 4.57 ± 0.076 nmol/mg protein were obtained for green coffee beans from Santa Sofia and Rio Colorado Company, respectively. These results allow the assumption that cysteine residues are less affected by the post-harvest treatment of coffee beans from Rio Colorado Company, where water is not recirculated, and freshwater is applied in the processing steps. In the Santa Sofia Company, part of the water is returned to the beginning of the process and used again in the de-pulping and fermentation steps, eventually allowing an increase due to microbial load or a more pronounced denaturation of the coffee proteins via enzymatic and/or redox activation steps, which would allow a release of free thiol groups. The identification of the underlying mechanisms represents an element of future studies.
Table 1 (note). The proteins were extracted with option III without reduction and alkylation. For determination of exposed free thiol groups (A), the extracts were dissolved in 0.
The values of the measurements of free amino nitrogen (FAN) ranged from 38.15 ± 1.14 to 48.62 ± 3.86 µg/mg protein, which were obtained for the green coffee and final fermentation coffee beans processed in Rio Colorado Company.
The content of FAN for green coffee beans from Santa Sofia decreased significantly (p = 0.148) compared to de-pulped coffee beans, with values of 35.93 ± 4.14 and 46.97 ± 1.17 µg/mg protein for green and de-pulped coffee beans, respectively. High free amino nitrogen values indicate that the samples analyzed have more unmodified lysine side chains that could be available for the initial Maillard reaction during roasting.

Protein Content
To compare the different extracted samples, protein content was determined according to Lowry et al. [42]. Table 2 shows the results of the protein content determined after extraction with PVPP (option I), SDS (option II), and the improved SDS extraction protocol (option III). It is clearly observed that extraction using option II achieves significantly higher protein content for Rio Colorado Company samples compared to extraction using option I (p < 0.0001). Protein content was found to be 1.33 ± 0.21 and 2.08 ± 0.08 mg/100 mg DW for Rio Colorado samples extracted with option I and II, respectively. Values of 1.62 ± 0.04, and 1.73 ± 0.19 mg/100 g DW were obtained for Santa Sofia samples extracted with option I and II, respectively. Samples extracted with option III exhibited the highest content in protein, 3.75 ± 0.04 and 3.45 ± 0.13 mg/100 g DW for the Rio Colorado and Santa Sofia Company, respectively. The results indicate that protein content using SDS in the extraction protocol was higher compared to those using PVPP and ascorbic acid.
A fluorimeter was used to measure the visible fluorescence spectrum of the protein extracted with the three different protocol options. The dissolved extracts, especially after extraction with option I, show a darker green color compared to those of option II and III. Protein extract with option III was colorless compared to option I and II (Supplementary Figure S4). Bongartz et al. [46] reported a change of color when sunflower protein in an alkaline environment was incubated with CQA. An adduct, e.g., with lysine, results in a green benzacridine derivative as reported in [47,48] and confirmed with the aid of HPLC coupled with ESI-MSⁿ [49]. If the mechanism proposed by Namiki et al. is followed [47,48], a dimerization prior to the interaction with proteins seems to be a precedent as documented for chlorogenic acid [26], and validated for the adduct formation with the amino group in a model system [25,49]. The present data thus suggest that the extraction with PVPP induces oxidation of phenolic compounds during the extraction and gives unexpected results through the added protein modifications. The extraction with SDS showed less influence on the analyzed protein fraction; therefore, option III was selected as a method for further analysis. Transmission measurements using a spectrophotometer from 325 to 480 nm wavelength were performed to quantify the different color appearances. The analysis of variance showed that differences were significant (p < 0.05) for the two types of process. Option III, either for Santa Sofia or Rio Colorado processing, showed the lowest emission values as well as the highest protein content; therefore, it was chosen for further experiments. A clue to the loss of coloration could also lie in the structural changes occurring in the lysine adducts of CQA.
SDS-PAGE was used to compare the extraction protocols (options I and II), and the results are presented in Figure 2a. The main coffee 11S storage protein was allocated to α- and β-chain bands as indicated in Figure 2a. The band intensity after extraction with option II was significantly higher (p < 0.05) compared to the extraction with option I, indicating a higher efficiency of the protein extraction (Figure 2b). The 11S storage protein is composed of three to six monomers of masses of 150−400 kDa, which migrate into storage vacuoles and create, by hydrophobic interactions, the tri- and hexameric quaternary forms [50]. The removal of the disulfide bonds under reducing conditions in the 11S protein monomers releases the α (acidic) and β (basic) subunits [19]. Coffea arabica proteins, in the non-reduced state, showed subunits with 55 kDa, and in the reduced state (2-mercaptoethanol) two sub-fractions with 33 and 24 kDa [14,15]. The presented data confirm the role of 11S storage protein as the most abundant one in green coffee.

In-Gel Digestion
The gel bands of the α- and β-chains of the samples were excised and digested with either trypsin or pepsin. The sequence coverage of the peptic and tryptic in-gel digestions is given in Supplementary Figure S5. The fragment spectra obtained by MALDI-TOF-MS analysis were compared to the sequence of the in-silico digested 11S protein. With zero partials (missed cleavages) allowed, the fragment spectrum of the pepsin digestion covers 3.6 and 4.3% of the α-chain and β-chain, respectively. The sequence coverage increased significantly (p < 0.05) when one partial was allowed in the digestion: values of 19.5 ± 1.6 and 22.8 ± 1.6% were obtained for the α-chain and β-chain, respectively. When two partials were allowed, the sequence coverage increased significantly to 35.3 ± 11.0 and 33.2 ± 12.1%. In contrast, the sequence coverage of the MALDI-TOF-MS fragment spectra of the trypsin-digested samples does not change significantly with an increasing number of partials. Figure S5b (Supplementary Material) shows the sequence coverage of the samples after extraction with options I and II. There is no significant difference between the two extraction options. The sequence coverage of the α-chain is lower than that of the β-chain for both Santa Sofia and Rio Colorado. Because these samples were measured only once, the differences could not be tested for statistical significance. These preliminary experiments indicate that tryptic digestion is more effective; it was therefore used for further experiments.
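The effect of allowing partials on the in-silico peptide list, and thus on the computed sequence coverage, can be illustrated with a minimal sketch. It assumes the common trypsin rule (cleavage after K or R, but not before P) and uses a short toy sequence that is purely hypothetical and is not the 11S sequence; the function names are likewise illustrative and do not correspond to the software used in this work.

```python
import re


def tryptic_peptides(sequence, max_partials=0):
    """Return (start, end, peptide) tuples of an in-silico tryptic digest.

    Assumed rule: cleave after K or R unless the next residue is P.
    max_partials is the number of missed cleavages allowed per peptide.
    """
    # Cleavage positions are the indices just after each cleavable K/R.
    cuts = [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    bounds = [0] + [c for c in cuts if c < len(sequence)] + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        last = min(i + 2 + max_partials, len(bounds))
        for j in range(i + 1, last):
            peptides.append((bounds[i], bounds[j], sequence[bounds[i]:bounds[j]]))
    return peptides


def sequence_coverage(sequence, detected, max_partials=0):
    """Percentage of residues covered by detected in-silico peptides."""
    covered = [False] * len(sequence)
    for start, end, peptide in tryptic_peptides(sequence, max_partials):
        if peptide in detected:
            for k in range(start, end):
                covered[k] = True
    return 100.0 * sum(covered) / len(sequence)


# Hypothetical toy sequence (NOT the coffee 11S protein).
TOY = "MAKLGGKTFFRPEEKITTVNSQK"

if __name__ == "__main__":
    detected = {"LGGKTFFRPEEK"}  # a peptide containing one missed cleavage
    for partials in (0, 1):
        print(partials, round(sequence_coverage(TOY, detected, partials), 1))
```

In this toy case the detected peptide contains one missed cleavage, so it only contributes to the coverage once one partial is allowed, which mirrors the increase described above for the pepsin digests.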

Phenolic Substances in Protein Extract by HPLC
Phenolic compounds in the protein extracts were determined by HPLC to investigate how their content is influenced by the post-harvest treatment. The percentage distribution of the seven detected phenolic compounds (Figure 3a) differs between the two coffee companies, especially for the de-pulped coffee bean samples. The structures of these seven main compounds are given in our former work [18]. Recent studies determined more than 50 hydroxycinnamic acid derivatives [22,23]. The complexity of following this type of reaction in a coffee-based food matrix arises from the fact that the major phenolic compounds present in coffee beans are liable to isomerization and oxidation, thus themselves producing a series of reaction products, many of which have hardly been characterized [25]. The constituents 3,4-DiCQA, 3,5-DiCQA, and 4,5-DiCQA were not detected in the de-pulped coffee beans from Santa Sofia Company. Otherwise, a similar composition was detected for the initial fermentation, final fermentation, washed, and green coffee beans. The predominant constituent was 5-CQA, with 72.3 and 70.7% for green coffee beans from Santa Sofia and Rio Colorado Company, respectively. During the protein extraction (Section 2.2.1), the proteins were precipitated by the addition of acetone and thereafter washed with methanol. This treatment can remove the loosely bound phenolic compounds, but only the subsequent treatment with urea (desolvation of the protein) and separation under the chromatographic conditions allow the release of those molecules that are more tightly bound to the proteins, as indicated in Figure 3. It can be inferred that the phenolic compounds detected in the protein extract had entered into non-covalent interactions with the proteins. The phenolic composition of the green coffee beans is similar to that of the coffee beans processed in Santa Sofia or Rio Colorado Company.
Individual constituents were added up to obtain an approximation of the total phenol content, and the data are shown in Figure 3b. The highest content of phenolic compounds was found in the green coffee beans, with concentrations of 5.21 ± 0.04 and 8.11 ± 0.02 µg/mg protein for Santa Sofia and Rio Colorado Coffee Company, respectively. Thus, the drying step during the post-harvest treatment promotes a stronger binding of the phenolic compounds. The results further indicate that coffee bean processing by Rio Colorado Company leads to stronger binding of phenolic compounds to the proteins than processing by Santa Sofia Company. Again, the post-harvest treatment seems to play a significant role, allowing the hypothesis that the proteins may undergo a more thorough denaturation via enzymatic and/or redox activation steps due to the re-use of water in Santa Sofia Company processing. This in turn could be responsible for structural changes in the storage protein, resulting in a loss of binding sites and thus in a lower binding of the phenolic compounds.
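The summation of individual constituents and the percentage distribution described above can be sketched as follows. The per-compound values are hypothetical placeholders chosen only for illustration (they are not measured data from this study), only four of the seven compounds are listed, and the function name is an assumption.

```python
def phenolic_profile(contents):
    """Return (total, shares) for per-compound contents.

    contents maps a compound name to its content in µg per mg protein.
    shares maps each compound to its percentage of the summed content.
    """
    total = sum(contents.values())
    if total <= 0:
        raise ValueError("total phenolic content must be positive")
    shares = {name: 100.0 * value / total for name, value in contents.items()}
    return total, shares


# Hypothetical contents in µg/mg protein (illustration only).
EXAMPLE = {
    "5-CQA": 3.6,
    "3,4-DiCQA": 0.4,
    "3,5-DiCQA": 0.6,
    "4,5-DiCQA": 0.4,
}

if __name__ == "__main__":
    total, shares = phenolic_profile(EXAMPLE)
    print(f"total: {total:.2f} µg/mg protein")
    for name, share in shares.items():
        print(f"{name}: {share:.1f}%")
```

Because the total is only the sum of the quantified constituents, it approximates rather than measures the total phenol content, as noted in the text.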

Analysis of the Protein Modification Using HPLC-MS/MS
Proteins can be modified by reaction with phenolic compounds. The bulk of the interactions between these two fractions are initially non-covalent [24,51]. These can be based on hydrophobicity, hydrogen bonds, or ionic bonds [25]. Suryaprakash et al. (2000) showed that proteins from sunflower seeds can bind caffeic and quinic acids non-covalently [52]. In this context, these interactions often occur between caffeic acid and tryptophan, tyrosine, or lysine side chains [52]. If the interaction of protein and phenolic compound leads to a covalent bond, it is a protein modification. Such a modification can also occur in parallel with non-covalent interactions. The formation of covalent bonds is catalyzed both enzymatically and non-enzymatically. Both reactions can be divided into two steps and require the presence of oxygen [25]. Generally, the first step involves the formation of an electrophilic reactive species, the o-quinone. Such quinones are capable of undergoing a nucleophilic addition with nucleophilic groups of proteins, e.g., thiol and free amino groups, thereby covalently modifying the proteins. Green coffee beans have been shown to possess polyphenol oxidase (PPO) activity [17]. Therefore, their proteins are liable to this type of modification [18]. To assess such modifications, the proteins need to be broken down into peptides. Thus, a method was developed as indicated in Section 2.2.11 to encompass such changes. The strategy includes a first step of identifying the unmodified peptides of the major 11S coffee storage protein and a second one that follows a modification by a single or dimerized chlorogenic acid molecule while applying targeted mass spectrometric analysis [18].
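For targeted analysis, the precursor m/z of an unmodified peptide and of its modified counterpart differ by the mass of the adduct. The following minimal sketch computes monoisotopic m/z values from standard residue masses. The mass shift used for a single chlorogenic acid adduct (CQA, C16H18O9, minus two hydrogens, about +352.08 Da) is an assumption made for illustration; the transitions actually monitored in this work are those listed in Table S1. The helper names are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

# ASSUMPTION: adduct shift of one CQA (C16H18O9) minus 2 H, i.e. C16H16O9.
CQA_SHIFT = 16 * 12.0 + 16 * 1.00782503 + 9 * 15.99491462


def core_sequence(notation):
    """Strip flanking residues from notation such as 'R.LGGK.T'."""
    parts = notation.split(".")
    return parts[1] if len(parts) == 3 else notation


def peptide_mz(notation, charge=1, mod_shift=0.0):
    """Monoisotopic m/z of a peptide ion carrying `charge` protons."""
    sequence = core_sequence(notation)
    neutral = sum(MONO[aa] for aa in sequence) + WATER + mod_shift
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    for pep in ("R.LGGK.T", "K.VFDDEVK.Q"):
        plain = peptide_mz(pep, charge=2)
        modified = peptide_mz(pep, charge=2, mod_shift=CQA_SHIFT)
        print(pep, round(plain, 4), round(modified, 4))
```

Passing the shift as a parameter keeps the sketch independent of the assumed adduct chemistry, so a different modification mass can be substituted without changing the calculation.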
The percentage distribution of the detected masses for the α- and β-chain of the 11S protein can be seen in Figure 4. The peak areas of the fragment masses were added for each transition. The fragment K.LNAQEPSFR.F gave the strongest signal; its share ranged from 57.3% (Rio Colorado, initial fermentation sample) to 72.8% (Santa Sofia, green coffee beans). The modified peptides are dominated by the peptide K.FFLANGPQQGGK.E. This peptide was found modified with the CQA monomer. The corresponding proportion among the modified peptides of the α-chain ranges from 74.5% for Santa Sofia de-pulped coffee beans to 90.7% for Rio Colorado final fermentation. The peptide R.LGGK.T was also detected in the modified state. This peptide was found modified with CQA and DiCQA.
The distribution of unmodified peptides of the β-chain is depicted in Figure 4c. A more homogeneous distribution is observed than in the α-chain. In fact, the response for the β-chain was much stronger and more peptides could be allocated, in turn improving the overall sequence coverage.
As Figure 4d shows, only two modified peptides were detected. The peptide R.ITTVNSQK.I with a DiCQA-lysine modification is the predominant peptide in almost all the samples. Washed coffee beans from Santa Sofia show a proportion of 48.5% K.VFDDEVK.Q with a CQA-lysine adduct, differing significantly from the rest of the samples investigated. The reason for this observation is not yet clear, and further experiments are needed to confirm this behavior.
The HPLC-MS/MS method was also applied to protein extracts from green C. arabica beans of various origins. The distributions of the unmodified and modified peptides of the α-chain are depicted in Figure 5a,b. Among the modified peptides, K.IIQK.L was found to be predominant. The lysine-modified R.LGGK.T peptide was found in all the samples.
The proportions of modified peptides in the α-chain are shown in Figure 6a,b as mean values for each type of coffee processing. The percentage of the modified peptide R.LGGK.T in green coffee beans tends to be higher for the wet and monsoon processing than for the dried and half-wet processing. The situation was similar for the modified peptide K.FFLANGPQQGGK.E. A significant increase in the peak area of the modified peptide fraction was observed from the dried (13.0%) to the wet (19.5%) and monsoon processing (42.6%). In the case of the β-chain (Figure 6c,d), the proportion of the modified R.ITTVNSQK.I peptide is higher in the wet than in the half-wet processing. For the peptide K.VFDDEVK.Q, a significantly different (p < 0.05) proportion is seen in the monsoon processing compared to the dried, half-wet, and wet processing.
The peptides K.FFLANGPQQGGK.E and R.LGGK.T of the α-chain show a high level of lysine modifications in the samples analyzed. In the case of the β-chain, peptides were less lysine-modified. Schwenke et al. [53] and Rawel et al. [32] showed that the hydrophilic C-terminal region of the α-chain on the surface of the 11S protein has a protective function for the internal β-chain and is thereby the preferred point of attack for chlorogenic acid. The modeling of selected modification sites for the two chains of the coffee 11S protein monomer is given in Figure 7. The results of these simulations indicate the accessibility of the two reaction sites K.FFLANGPQQGGK.E on the α-chain and K.VFDDEVK.Q on the β-chain, which are two of the most modified peptides among those detected. Interestingly, the modification does not hinder the digestion by trypsin. The processing method of the green coffee samples was associated with the extent of CQA-dependent lysine modification. Further work should confirm the amino acid involved in these reactions. Samples of the half-wet, wet, and monsoon processing contain larger proportions of the modified target peptides. Presumably, the increased contact with water favors the oxidation reactions under consideration.

Conclusions
This study was initiated to encompass the changes taking place in the protein fraction of coffee beans during different post-harvest processing steps. Three different extraction protocols to isolate the protein fraction were compared. Subsequently, a method to detect lysine and cysteine modification in coffee protein was developed. The unmodified peptide K.LNAQEPSFR.F of the α-chain was detected with high signal intensity. The unmodified peptides of the β-chain showed a diverse spectrum and, therefore, several options for selecting peptides as markers were available. The peptides K.FFLANGPQQGGK.E and R.LGGK.T of the α-chain showed a high level of lysine modifications in the samples analyzed. In the case of the β-chain, peptides were less modified. Recirculation of water during coffee processing led to more lysine modification. The main outcome of these preliminary experiments is that post-harvest treatment does affect the protein quality; a larger sample set needs to be analyzed to validate the observed trends and changes. Further work will be directed at a more in-depth analysis of further modifications of the corresponding peptides, in order to establish them as biomarkers and to relate them more closely to coffee cup quality.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/foods11020159/s1, Figure S1. Schematic flowchart of the three protein extraction protocols used; Figure S2. Alignment of the data available for 11S from the UniProt database; Figure S3. Predicted structures of modifications by chlorogenic acid (CQA); Figure S4. Comparison of fluorescence emission spectra of protein extracts; Figure S5. Data of MALDI-TOF-MS measurements of the in-gel digested proteins; Table S1. Analyzed mass transitions of the P93079 sequence of the 11S protein.