Proteometabolomic Analysis Reveals Molecular Features Associated with Grain Size and Antioxidant Properties amongst Chickpea (Cicer arietinum L.) Seeds Genotypes

Legumes are an essential source of nutrients that complement energy and protein requirements in the human diet. They also contribute to the intake of bioactive compounds such as polyphenols, whose content can vary depending on cultivars and genotypes. We conducted a comparative proteomics and metabolomics study to determine if there were significant variations in relevant nutraceutical compounds in the five genotypes of Kabuli-type chickpea grains. We performed an isobaric tandem mass tag (TMT) couple to synchronous precursor selection (SPS)-MS3 method along with a targeted and untargeted metabolomics approach based on accurate mass spectrometry. We observed an association between the overproduction of proteins involved in starch, lipid, and amino acid metabolism with gibberellin accumulation in large grains. In contrast, we visualized the over-accumulation of proteins associated with water deprivation in small grains. It was possible to visualize in small grains the over-accumulation of some phenolics such as vanillin, salicylic acid, protocatechuic acid, 4-coumaric acid, 4-hydroxybenzoic acid, vanillic acid, ferulic acid, and kaempferol 3-O-glucoside as well as the amino acid l-phenylalanine. The activated phenolic pathway was associated with the higher antioxidant capacity of small grains. Small grains consumption could be advantageous due to their nutraceutical properties.


Introduction
Legumes have been incorporated as the second most important vegetable food source after cereals. Their main nutritional value lies in their protein content of 20-50%, in addition to complementing the essential amino acid scheme in grain-based diets [1]. Chickpeas (Cicer arietinum L.), also known as garbanzo beans, are globally the third most produced and second most consumed food legume [2]. Its grains are the main consumable part of the plant. They have high protein content with essential amino acids, unsaturated fats, vitamins, carbohydrates, crude fiber, minerals, and carotenoids that meet the nutritional need of both humans and animals [2][3][4][5][6][7]. India is the leading producer of chickpea grain, with a significant portion of the almost 12 million tons of annual global production [8].
Mexico (25.6345 latitude and −108.420443 longitude). The temperatures for the agricultural cycle of 2017-2018 were 11-28 • C in November and 15-18 • C in February, and for the agricultural cycle of 2018-2019 were 21-29 • C in November and 13-23 • C in February (weatherspark.com (accessed on 27 July 2022)). The grains were sieved and freed from impurities, and rinsed with distilled water from previous protein and metabolites extraction.

Seed Measurement
The length, width, and thickness of 25 seeds per genotype were measured using a digital caliper (Model 3415 Traceble ® , Friendswood, TX, USA). The seed weight (n = 25 per phenotype) was determined using an analytical balance (Model PW254 ADAM, Milton Keynes, UK).

Proximate Analysis
The proximate composition of the chickpea seeds was determined according to the Association of Analytical Communities (AOAC International, 1995) methods for ash (Section 942.05), crude fiber (AOAC Section 978.10), and moisture (AOAC Section 930.15). Crude protein was determined by the micro-Kjeldahl method (AOAC Section 976.05), and crude lipids were determined by the AOAC method (AOAC Section 920.39).

Total Antioxidant Capacity by 2,2-Diphenyl-1-picrylhydrazyl (DPPH) and Oxygen Radical Absorbance Capacity (ORAC)
The crude methanolic extract was obtained from 200 mg of sample (fine flour) suspended in 8 mL of methanol (ACS, Tedia, CRT-Mexico). The mixture was placed into 50 mL centrifuge tubes and sonicated for 30 min (FS20, Fisher Scientific, Waltham, MA, USA), during which they were mixed and vortexed (VX100, Labnet, Madrid, Spain) every 10 min. Then, the samples were incubated (VWR Avantor, Radnor, PA, USA) for 15 h (25 • C/darkness, 200 rpm) and centrifuged at 4000 rpm/10 min in a Hermle Z 366 centrifuge (Labortechnik GmbH, Wehingen, Germany). The supernatant was decanted and stored at −20 • C, and the pellet was re-extracted for 2 h and processed again as described before. The supernatants from the two extractions were combined and concentrated to dryness in a rotary evaporator under reduced pressure (Yamato, Santa Clara, CA, USA) at 37 • C and reconstituted in 1 mL of methanol (ACS, Tedia, CTR-Mexico). The antioxidant capacity of the methanolic extracts was determined by the DPPH and ORAC methods as described by Cardador-Martínez, et al. [22] and Prior et al. [23], respectively. The results were expressed as µmol equivalents of Trolox/g of flour.

Proteomic Analysis
Proteomic analysis, including protein extraction, digestions, peptide fractionation, TMT labeling, SPS-MS3, and data analysis were conducted as described by Monribot-Villanueva et al. [24,25]. Details of the proteomic analysis are presented in Appendix A. We considered two biological replicates for each grain and carried out a 10plex TMT tag according to the manufacturer's instructions as follows: for SmallF1, we used 126 and 127N; for BigF2 we used 127C and 128N; for SmallCo we used 128C and 129N, for BigCo we used 129C and 130N, and for Jumbo we used 130C and 131. For protein identification, we used the chickpea UniProt Reference proteome (UP000087171) as the database. Data are available via ProteomeXchange with identifier PXD030279.

Metabolomic Analysis
Untargeted metabolomic analyses of methanolic extracts were performed as indicated by Monribot-Villanueva, et al. [24,25] in an Ultra-High Performance Liquid Chromatograph coupled to a High-Resolution Mass Spectrometer (UPLC-HRMS-QTOF; Class I-Synapt G2-Si, Waters, Milford, MA, USA). Metabolomics data were first processed with the Masslynx and Markerlynx softwares (versions 4.1, Waters, Milford, MA, USA). Tentative identification was performed using two approaches considering a maximum Antioxidants 2022, 11, 1850 4 of 22 mass error of ±5 ppm. The first approach was by the MetaboAnalyst bioinformatic platform (https://www.metaboanalyst.ca/MetaboAnalyst/home.xhtml, last accessed on 1 July 2022) with the Peaks to Pathways tool using the algorithm Mummichog and the rice reference metabolome. The second strategy was a manual annotation with m/z values using the public database Foodb (https://foodb.ca/, last accessed on 1 July 2022). The identification and quantification of phenolic compounds were based on dynamic multiple reaction monitoring methods as was previously reported in Juárez-Trujillo et al. [26], Camacho-Vázquez et al. [27], and Monribot et al. [21,22]. Detailed information is shown in Appendix A. Phenolics heatmap was performed with MetaboAnalyst bioinformatic platform [28].

Statistical Analysis
Raw data were analyzed to determine whether they differed from a normal distribution (Shapiro-Wilk test). The homoscedasticity of the data was tested using the Barlett test for normally distributed data and the Fligner-Killeen test for data with non-normal distribution. Variables that conformed to parametric assumptions were analyzed using one-way analysis of variance (ANOVA) and the Duncan Means Test (α < 0.01); those that did not were analyzed using the non-parametric Kuskal-Wallis test and pairwise Wilcox test. We also generated a correlation coefficient matrix between protein content and the morphometric variables, among morphometric variables, and among proximate composition variables. All statistical analysis was carried out in the R environment. Origin version 8.5.1 (OriginLab, Nothampton, MA, USA) and CorelDraw version 17.1.0.572 (Corel, Ottawa, Canada) were used to generate graphs and figures, respectively.

Phenotypic Features of Chickpea Genotypes with Different Grain Size
The five genotypes of Kabuli chickpeas studied shared phenotypic characteristics. They were white-cream colored, pronounced rugosity, rounded, and with a small beak ( Figure S1A). The theoretical volume for the large grains (BigF2, Jumbo, and BigCo) was between 1100 and 1540 mm 3 and for the small grains (SmallF1 and SmallCo) was 700-1000 mm 3 ( Figure S1B). The morphometric traits showed statistical differences between the large and small grains in all parameters recorded ( Figure S1C). Small grains exhibited comparable physical parameters between them (SmallF1 and SmallCo). Among the three large grains, BigCo had larger weight values than the other two large grains ( Figure S1D).

Proximate Analysis of Chickpea with Different Grain Size
The proximate composition of chickpea grains was evaluated among the five genotypes (Table S1). The BigF2 genotype had the highest protein content, followed by Jumbo, Small F1, SmallCo, and BigCo. Lipid content was highest in SmallF1 and lowest in SmallCo and BigF2. The crude fiber content varied from 4.06 to 7.33%; the Jumbo genotype had the highest content, followed by SmallCo, BigF2, BigCo, and SmallF1. The ash content was relatively stable among the five genotypes, ranging from 3.02 to 3.23%, with no statistical differences. The nitrogen-free extract content ranged from 59.66 to 53.62%. The BigCo and SmallCo genotypes had the greatest nitrogen-free extract content, followed by BigF2, Jumbo, and SmallF1. There were no correlations between protein content versus morphometric characteristics such as weight, length, etc., but there were strong positive correlations among the morphometric characteristics ( Figure S2A). When considering correlations among proximate variables, there were positive and negative correlations, except for protein content which was not correlated with any other factor (protein vs. moisture, lipids, ash, and crude fiber; Figure S2B). Also, the ash did not correlate with fiber content, nitrogen-free extract content, or moisture ( Figure S2B).

Comparative Proteomics Revealed Contrasting Proteome Profile in Small vs. Large Grain Genotypes
Protein bands in SDS-PAGE had similar patterns in all chickpea genotypes analyzed ( Figure 1A). There were prominent protein bands between 30 and 45 kDa, and there were no visibly noticeable differences among protein profiles. Under comparative proteomics based on TMT labeling and the SPS-MS3 approach, the analytical pipeline identified 1854 proteins associated with 6107 peptide groups and 54,618 MS/MS spectra (ProteomeXchange identifier: PXD030279). The principal component analysis of the identified protein abundances exhibited a grouping pattern. Large grains exhibited similar protein abundances that contrasted with the small grains ( Figure 1B). The heatmap based on protein abundance also showed differences among the chickpea genotypes analyzed. Hierarchical clustering clearly separated the small genotypes from the large ones ( Figure 1C). correlations among proximate variables, there were positive and negative correlations, except for protein content which was not correlated with any other factor (protein vs. moisture, lipids, ash, and crude fiber; Figure S2B). Also, the ash did not correlate with fiber content, nitrogen-free extract content, or moisture ( Figure S2B).
Because of the higher number of differential proteins observed in the BigCo and SmallCo comparison, we focused on analyzing the total number of differential proteins in detail. We analyzed proteins identified in a higher proportion in BigCo compared with SmallCo grains based on GO enrichment and clustering of GO terms ( Figure 3A, Table S2). This allowed us to visualize two central clusters with the term tricarboxylic acid cycle and stress-related proteins. Within this cluster, we observed proteins associated with the biosynthesis of starch, methionine, and 2-oxoglutarate metabolism connected to the production of sugar polymers and proteins. We also analyzed proteins identified in a higher proportion in SmallCo than BigCo grains ( Figure 3B, Table S2). We found three main clusters, including embryo development ending in seed dormancy, stressrelated proteins, and protein folding. It is worth noting that the identification of proteins associated with a response to salt stress was higher in SmallCo than BigCo grains. A pattern of associated proteins with superoxide metabolism was also evident in small grains compared to large ones ( Figure 3B).

Untargeted Metabolomics Exhibited Additional Clues of the Nutraceutical Potential of Chickpeas
To gain more information about the molecular differences among large and small chickpeas, an untargeted metabolomic approach was performed by UPLC-ESI-HRMS in positive (ESI + ) and negative (ESI − ) modes. Principal components analyses (PCA) of the ESI + ( Figure 4A) and ESI − ( Figure 4B) datasets exhibited contrasting groupings based on the chemical composition. In the ESI + PCA, there was no clear grouping tendency; the most chemically similar genotypes were SmallCo and BigF2 ( Figure 4A). In contrast, the ESI − PCA clearly showed a grouping tendency among all the genotypes evaluated except for SmallF1 ( Figure 4B). Venn diagrams of the spectrometric features or signals (mass/charge-retention time [m/z-Rt]) detected in the metabolic profile from each genotype showed a core metabolome of 39 and 77 signals in the ESI + and ESI − datasets, respectively ( Figure 4C,D). The tentative identification of this core metabolome provided information on the physiological status of grains of various sizes ( Figure 4E). We found carbohydrates, organic acid, nitrogen-containing compounds, phenolics, terpenoids, and lipids (Table S4). Our untargeted comparative analysis found that gibberellin and gibberellin A5 were more abundant in BigCo grains than in the other genotypes, as well as glutamyltyrosine, glutamylphenylalanine, some carbohydrates (xylopyranosyl-arabinose, galactopinitol, and an oligosaccharide) and phenolic compounds (trihydroxy-heptamethoxy-biflavan and genistein-rhamnoside) ( Figure 4E). In contrast, SmallF1 genotype grains exhibited an over-accumulation of tryptophan, dihydrophaseic acid, phenolic compounds such as pentamethoxyflavanone and protocatechuic acid glucoside, the triterpenoid saponins, soy saponin I and III, and the polysaccharide amylopectin ( Figure 4E). Interestingly, lipid compounds were enriched in the Jumbo genotype, exhibiting higher content of glycerolipids, lysolecithins, and lysophospholipids ( Figure 4E). Citric acid was the only organic acid identified in the core metabolome, and its content was highest in the Jumbo genotype ( Figure 4E).
We also noticed a high number of signals specifically detected in SmallF1 grains 47 and 68 in the ESI + and ESI − datasets, respectively, that were considered differential chemical markers ( Figure 4C,D). These SmallF1 chemical markers included carbohydrates (mainly oligomers and polymers), nitrogen-containing compounds (mainly amino acid derivatives, an auxin, among others), phenolic compounds (mainly isoflavones), terpenoids (triterpenoid saponins), and lipids (phospholipid related and fatty acids) ( Table 1).

Targeted Metabolomics Confirmed the Accumulation of Antioxidant Polyphenols in SmallF1
Multiple reaction monitoring analyses based on more than 60 polyphenolics allowed us to determine the endogenous content of 13 polyphenolic-related compounds. Molecules such as vanillin, L-phenylalanine, salicylic acid, protocatechuic acid, 4-coumaric acid, 4-hydroxybenzoic acid, vanillic acid, ferulic acid, and kaempferol 3-O-glucoside were more abundant in SmallF1 compared to other grains ( Figure 5A). SmallCo grains exhibited more gentisic and sinapic acids content than the other samples but also were over-accumulated in SmallF1. The high content of the total phenolic compounds in SmallF1 grains positively correlated with the total antioxidant capacity as determined by the DPPH and ORAC assays ( Figure 5B,C), compared to other grains, which supported our proteomic and metabolomic data.

Discussion
Chickpea grains are classified into two main groups-large and small grains. The size of the grain is the most important trait for international trade because consumers demand large grains. Small grains are either sold at a low price or used as seeds for the next agricultural cycle. To our knowledge, no comparative studies have been carried out among Kabuli chickpea genotypes to determine whether there are molecular signatures or nutritionally valuable compounds related to grain size using integrative multi-omics approaches, including proteomics (isobaric tandem mass tag) and metabolomics (untargeted by accurate MS). We selected genotypes (cv. Blanco Sinaloa and Jumbo) with contrasting grain sizes, one group of small grains formed by two genotypes (SmallCo and  (Table S5, (A)). Total antioxidant capacity of chickpea grain flour by oxygen radical absorbance capacity (ORAC; (B)) and 2-diphenyl-1-picrylhydrazyl (DPPH; (C)). We showed average values and the corresponding standard deviation. Different letters indicate statistical differences between samples. Data were calculated based on dry weight content.

Discussion
Chickpea grains are classified into two main groups-large and small grains. The size of the grain is the most important trait for international trade because consumers demand large grains. Small grains are either sold at a low price or used as seeds for the next agricultural cycle. To our knowledge, no comparative studies have been carried out among Kabuli chickpea genotypes to determine whether there are molecular signatures or nutritionally valuable compounds related to grain size using integrative multi-omics approaches, including proteomics (isobaric tandem mass tag) and metabolomics (untargeted by accurate MS). We selected genotypes (cv. Blanco Sinaloa and Jumbo) with contrasting grain sizes, one group of small grains formed by two genotypes (SmallCo and SmallF1) and the other one formed by three large grains (BigCo, BigF2, and Jumbo). Although all five genotypes selected have the white Kabuli chickpea characteristics of white-cream color and pronounced rugosity ( Figure S1A), they can be divided into two groups based on size (weight and morphometric characteristics; Figures S2 and S3).

Large Chickpea Genotypes Have Higher Levels of Starch, Lipid, Amino Acid, and Gibberellin Metabolism-Related Proteins
The grain filling stage is the final period during which the kernel weight is established, a trait that is directly linked to the grain yield [29]. In the BigCo grain, there was an over-accumulation of proteins associated with starch biosynthesis connected with oligosaccharide accumulation (Figures 2 and 3). Similarly, previous proteomic studies of chickpea seeds and sprouts identified proteins associated with the carbon and central metabolisms such as starch synthase 1, glyceraldehyde-3-phosphate dehydrogenase, malate dehydrogenase, fructose-bisphosphate aldolase, pyruvate kinase, and ATP synthase subunit alpha [30]. The over-accumulation of proteins associated with amino acid metabolism (Figures 2 and 3) overlapped with the over-accumulation of glutamyltyrosine, and glutamylphenylalanine, which are amino acid derivatives ( Figure 4E). Previous proteomic and metabolomic studies in different tissues of seed wheat during the primary phase of the grain filling process exhibited the over-accumulation of isoleucine, methionine, threonine, valine, and lysine in most of the analyzed tissues, which agrees with our proteomic data in chickpea [31]. Interestingly, the presence of glutamyltyrosine and glutamylphenylalanine, plus the accumulation of oligosaccharides have been related to the highly appreciated "kokumi" taste in soybean [32]. In addition, we also found that gibberellins were over-accumulated in BigCo compared with the other genotypes ( Figure 4E). Growth regulators have been previously suggested to have an essential role in modulating the grain filling process [33,34]. It was recently reported that gibberellin exogenous application of corn shank and silks improved the grain-filling rate, grain weight, and yield, as well increasing production of auxin, gibberellin, zeatin, and ABA, and activating ROS scavenger enzymes [35]. In rice, the quantitative trait locus (QTL) grain width 6 (GW6) was recently characterized, which encodes a gibberellic acid stimulated transcript (GAST) family [36]. GW6 positively controls grain size and weight, providing an increase in grain yield. Besides, auxin accumulation was connected with the regulation of starch biosynthesis during grain filling in rice, in which the knowledge regulatory network is minimal [37,38]. However, recent studies during rice development suggested that a subunit of heterotrimeric G protein (RGB) may positively regulate the expression of transcription factor OsNF-YB1, which in turn activates the OsYUC11 transcription leading to an increase in auxin level during starch accumulation [39]. We putatively identified the auxin indole-butyric acid present in the SmallF1 genotype grain (Table 1). However, the hormonal regulatory network of chickpea grain size is beyond the scope of our proteometabolomic study.
Chickpeas produce orthodox seeds, which means they will survive during ex situ conservation, where grains cope with desiccation or freezing conditions without any detrimental effect on the dormancy [40]. The mechanism associated with desiccation tolerance is turned on during late seed maturation with the accumulation of crucial proteins observed in our study. These proteins include the LEA, heat shock proteins, and ROS scavenger proteins. It is well known that there is variation among orthodox seeds in sensitivity to drying and storability that can affect grain survival [41]. In chickpeas, limited studies have been conducted to determine the association between grain size and sensitivity to desiccation and storability. However, our proteomic analysis exhibited a differential accumulation of essential proteins among grain genotypes. Small grains over-accumulated stress-related proteins related to desiccation and ABA responses (Figures 2 and 3), which suggested the molecular mechanism responsible for managing drying conditions. Our proteomic data was supported by the over-accumulation of stress-related metabolites, mainly in SmallF1, related phenolics (pentamethoxyflavanone and protocatechuic acid 4-glucoside), and terpenoids (saponins soyasaponin I, and III) ( Table 1). Previously, chickpea germinated in vitro, and the emulsion system boosted the antioxidative activity of extracted phenolic compounds [42]. Moreover, the protocatechuic acid 4-O-glucoside and 6-hydroxydaidzein were detected in the soluble free phenolic compounds [43]. Besides, in whole grain oats, it was suggested that the occurrence of the phenolic alkaloid avenanthramides, and the steroidal saponins avenacosides A and B, with potent antioxidant and anti-inflammatory effects [44].

Chickpea Grains Exhibited Essential Stress Response Connected to the Overproduction of Polyphenols
The significant stress response observed in SmallF1 grains led to the over-accumulation of several key secondary metabolites with nutraceutical properties (Figures 4E and 5A and Table 1). Our proteomic approach also exhibited the overproduction of ROS scavenger proteins (DHAR, PDIL1-1, APX3 CSD1, CSD12, and BOLA2, Figures 2 and 3). Besides, the over-accumulation of ABA-stress-related protein could suggest a tight regulation between ABA and redox balance during seed desiccation. In orthodox seeds, the acquisition of desiccation tolerance involves a complex regulatory network including several transcription factors, growth regulator signal pathways, over-accumulation of seed storage, and LEA proteins, as observed in our proteomic scrutiny ( Figure 3). However, the currently limited annotation of Cicer arietinum genome hiders to find master molecular players associated with seed desiccation tolerance in chickpeas. Nevertheless, our findings related to the accumulation of polyphenols as a response to desiccation stress adds value to less commercialized small grains. In SmallF1, we identified 4-coumaric acid ( Table 1 and Figure 5A), which has been shown to have high free radical scavenging, antiinflammatory, antineoplastic, and antimicrobial activities [45]. In addition, several studies have suggested the antioxidant and therapeutical effects of secondary metabolites, including catechin glucoside and malvidin-acetylglucoside [46]. Isoflavones are well-known phenolic compounds, and in our study, we determined the presence of glycitein, glycitin, and malonylglycitin in SmallF1 grains (Table 1). Glycitein is an abundant isoflavone in soybean and has antioxidant, estrogenic, and anti-cancer properties [47]. Also, some triterpenoid saponins are over-accumulated in SmallF1, including soyasaponin I, III, V, and saponin D ( Table 1 and Figure 4). Plant triterpenoid saponins have shown different pharmacological activities. For example, betulinic acid from Ligustrumlucidum fructus induced antioxidant capacity in mice liver damage induced by ethyl alcohol in vivo [48]. Besides, the ursolic acid derivative combined with kanamycin against Escherichia coli showed a reduction of MIC value from 128 µg/mL to 8 µg/m [49]. Moreover, oleanolic acid exhibited in vitro inhibition of the HIV-1 replication in infected human peripheral mononuclear cells (PBMC, EC 50 value: 22.7 mM), naturally infected PBMC (EC 50 value: 22.7 mM), and monocytes and macrophages (EC 50 value: 57.4 mM) [50]. Soyasaponins and soyasapogenins suppressed the growth of HT-29 colon cancer cells determined by the WST-1 assay over a concentration range of 0-50 ppm [51]. Therefore, the putative identification of saponins in chickpeas could provide the starting point for finding nutraceutical molecules in chickpeas.
Corroborating the highly confident identification and higher endogenous content of essential polyphenols in smallF1 compared to other size grains underpin our proteomic and metabolomic results where stress conditions correlated with the induction in the production of polyphenols which comprise a wide range of molecules involved in the essential biological and physiological process including response to stress. Comprehensive generated information showed that the phenylpropanoid biosynthetic pathway is commonly activated under severe environmental conditions like drought, extreme temperatures, salinity, heavy metal pollution, and ultraviolet radiation [52]. We were able to identify and quantify the endogenous over-accumulation of molecules such as vanillin, L-phenylalanine, salicylic acid, protocatechuic acid, 4-coumaric acid, 4-hydroxybenzoic acid, vanillic acid, ferulic acid, and kaempferol 3-O-glucoside in smallF1 grains. Polyphenols are suggested to confer the ability to maintain redox homeostasis under drought and salt stress, as observed in rice and wild relatives of wheat, respectively, [53,54]. The presence of the above-mentioned compounds in the SmallF1 genotype positively correlates with the high antioxidant capac-ity displayed by this genotype (Figure 5). Previous studies showed that colored chickpea grains have significant antioxidant values than cream and beige seed [15], but in our study, all genotypes analyzed have the same cream color ( Figure S1A). Therefore, we could suggest that SmallF1 grains can over-produce polyphenols probably due to the response to stress conditions ( Figure 5A). The profiling of these polyphenols during germination as it was previously reported for formononetin and biochanin [16], as well as in other experimental conditions, could contribute to unraveling the regulation of the biosynthesis of these bioactive molecules.

Alternative Application of Small Chickpeas as Nutraceutical Source
The dynamic of grain commercialization is focused on yield percentage. Therefore, bigger grains provide a better economic profit, which is justified by the need for protein, fiber, sugar, and lipids. However, after two years of the pandemic due to the SARS-CoV-2 virus and its variants, emerging diseases, including severe acute hepatitis in children and monkeypox, make us reconsider the continuous search for healthy and functional foods. Animal, human, and epidemiologic studies strongly suggest that polyphenols have antioxidant and anti-inflammatory properties that probably prevent and exert a therapeutic effect in several illnesses, for example, in cardiovascular disease, neurodegenerative disorders, cancer, and obesity [55]. However, in whole grains, bioactive molecules have long been undervalued [56]. In buckwheat (Fagopyrum esculentum Moench), amaranth (Amaranthus L.), and quinoa (Chenopodium quinoa Willd.), spectrophotometric approaches and free radical scavenging activity 2,2-diphenyl-1-picrylhydrazyl (DPPH) determinations suggest that the cultivars Bamby (buckwheat), Annapurna (amaranth), and Quinua (quinoa) are excellent sources of bioactive compounds [57]. Cereals contain free and bound forms of polyphenols, and some examples include ferulic, protocatechuic, gallic, vanillic, and syringic acids, among others [56]. In chickpeas, we confidently identified and quantified the endogenous content of vanillin, L-phenylalanine, salicylic, protocatechuic acid, 4-coumaric, 4-hydroxybenzoic, vanillic, ferulic acids, and kaempferol 3-O-glucoside.
Phenolic compounds are widely distributed in plants and exhibit multiple bioactivities including antitumoral, antioxidant, and anti-inflammatory, just to mention some examples [58][59][60]. The three most abundant phenolic compounds identified and quantified in the Kabuli chickpea grains evaluated were 4-hydroxyphenylacetic acid, 4-hydroxybenzoic acid, and gentisic acid. The phenolic compound 4-hydroxyphenylacetic acid was determined mainly in big grains (Table S5 and Figure 5A), and it has exhibited anti-inflammatory and hepatoprot in induced lung injury in rats [61] and in acetaminophen-induced liver injury in mice [62], respectively. On the other hand, 4-hydroxybenzoic acid and gentisic acid were determined over-accumulated in small grains (Table S5 and Figure 5A). 4-Hydroxybenzoic acid exhibited a hypoglycemic effect in rats [63] and an antimicrobial effect [64], while gentisic acid has demonstrated a beneficial effect on human health as an analgesic, anti-inflammatory, antigenotoxic, cardioprotective, hepatoprotective, neuroprotective, nephroprotective, antimicrobial and antioxidant agent [65]. Therefore, the occurrence of the above-mentioned molecules highlights the healthy value as functional food of small grains and other locally produced grains, in which the production of polyphenols surpasses the average content determined in big chickpeas, being a good option for nutraceuticals supplementation, especially for people with vegetable source-based diets or low-income sector in developing countries [66].

Conclusions
The global market demands big grain sizes, and the Blanco Sinaloa and Jumbo Mexican chickpea varieties meet this international standard. As chickpea plants are self-pollinated, agriculture can generate their seeds for the next agriculture cycle providing savings and additional gains. Due to its self-pollination, chickpea segregates genetic and phenotypic features in every agricultural cycle. Thus, farmers always collect two-grain sizes, including small and big ones. The big grain goes to the international market and the small grains for local sale. Thus, there is great interest in finding molecules with nutraceutical properties in small grains, which could provide additional values to this type of phenotype. Using classical and cutting-edge technologies, we showed that protein content was not correlated with the morphometric characteristic related to grain sizes, but protein profiles were contrasting between small and big grains. Proteomics and metabolomics showed that in big grains, starch biosynthesis and gibberellins were more abundant than in small grains, but in the latter, stress signaling correlated with phenolic compounds content and antioxidant activity. Therefore, it represents a healthy source of nutraceuticals. Also, due to global warming, high cost of production, and pathogens, the chickpea breeders are developing new varieties or cultivars focused on increasing grain yield, pathogen resistance, and environmental factors such as drought. Our study showed that some germplasms have better nutraceutical properties and could be used as genetic resources in molecular chickpea breeding programs to develop biofortified grains.  Table S1: Proximate composition of chickpea grains. Detail information of untargeted metabolomic data in chickpea grains. Table S2: Functional annotation analysis-based GO enrichment of biological proceeds and cluster analysis of proteins over-accumulated in big (A) and small grains (B). Table S3: Key differential proteins identified in proteomic comparative analysis-based TMT-SPS-MS 3 . Data associated with figure 3D. Table S4: Putative identification of core compounds detected in chickpea grains. Table S5: Endogenous content of phenolic compounds in chickpea grains. Table S6

Conflicts of Interest:
The authors declare that they have no known competing financial interest or personal relationships that could have appeared to influence the work reported in this paper. Following reduction and alkylation, 100 µg of protein were digested with 100 mM Tris (2-carboxyethyl) phosphine (TCEP) stock solution for a final concentration of 10 mM for 45 min at 60 • C. Then 300 mM of iodoacetamide (IA) stock solution was added for a final concentration of 30 mM and incubated for one hour at room temperature in darkness. We then quenched the remaining IA with 300 mM DTT stock solution for a final concentration of 30 mM for 10 min. Following overnight protein precipitation with acetone, the protein pellet was resuspended with 100 µL of 50 mM triethylammonium bicarbonate (TEAM) containing 0.1% SDS. The proteins were then digested with trypsin (Trypsin Gold, Mass Spectrometry Grade, Promega, Madison, WI, USA) and labeled with TMT 10plex reagent according to the manufacturer's instructions (AB-Sciex; Foster City, CA, USA). For SmallF1, we used 126 and 127N; for BigF2, we used 127C and 128N; for SmallCo, we used 128C and 129N; for BigCo, we used 129C and 130N; and for Jumbo, we used 130C and 131. Labeled samples were pooled and fractionated using strong cation exchange and high pH reversed-phase C18 cartridges (Thermo Scientific). Each fraction was desalted with C18 cartridges and dried using a CentriVap (Labconco, Kansas, MI, USA). Dried samples were redissolved with 0.1% formic acid in LC-MS grade water (solvent A), and 5 µL of this solution was injected into a nano-LC platform (UltiMate 3000 RSLC system, Dionex, Sunnyvale, CA, USA) through a nanoviper C18 trap column (3 µm, 75 µm × 2 cm, Dionex), and separated on an EASY spray C-18 RSLC column (2 µm, 75 µm × 25 cm) at a flow rate of 300 nL/min. The mobile phase gradient was as follows: a 10 min gradient was set using Solvent A and 0.1% formic acid in 90% acetonitrile (Solvent B), 10 min of Solvent A, 7-20% Solvent B over 25 min, 20% Solvent B for 15 min, 20-25% Solvent B over 15 min, 25-95% Solvent B over 20 min, and 8 min of Solvent A. The nano-LC was interfaced with an Orbitrap Fusion Tribid (Thermo-Fisher Scientific, San Jose, CA, USA) mass spectrometer equipped with an "EASY Spray" nano ion source (ThermoFisher Scientific, San Jose, CA, USA). The mass spectrometer was operated in a positive ionization mode with the nanospray voltage set at 3.5 kV and a source temperature of 280 • C. For external calibration, we used caffeine, Met-Arg-Phe-Ala (MRFA) and Ultramark 1621. Full MS scans were carried out in the Orbitrap analyzer at resolution 120,000 (FWHM), scan range 350-1500 m/z, AGC of 2.0 × 10 5 , maximum injection time of 50 ms, intensity threshold of 5.0e3, dynamic exclusion 1 at 70 s and mass tolerance of 10 ppm. For MS2 analysis, the 20 most abundant MS1 were isolated with charge states set to 27. Fragmentation parameters included CID with 35% collision energy and activation Q of 0.25, AGC of 1.0 × 10 4 , maximum injection time of 50 ms, precursor selection mass range of 400-1200 m/z, precursor ion exclusion width low 18 m/z and high 5 m/z, isobaric tag loss TMT and detection was carried out in the ion trap and the MS3 spectra were acquired using an SPS of 10 isolation notches. MS3 precursors were fragmented by HCD with 65% of collision energy and analyzed using the Orbitrap at a resolution of 60,000 with a scan range of 120-500 m/z, isolation window of 2 m/z, 1.0 × 10 5 AGC, and a maximum injection time of 120 ms with 1 microscan.
The spectra acquired by MS/MS and (SPS)-MS 3 were analyzed with the program Proteome Discoverer 2.1 (Thermo Fisher Scientific Inc.). Data were processed using the search engines Amanda, Mascot server (version 2.4.1, Matrix Science, Boston, MA, USA), and Sequest HT algorithm, in which searches were directed against chickpea UniProt Reference proteome (UP000087171). As search parameters, we considered full-tryptic protease specificity, two missed cleavages, carbamidomethylation of cysteine (+57.021 Da) and TMT 9-plex N-terminal/lysine residues (+229.163 Da). Moreover, we considered methionine oxidation (+15.995 Da) and deamidation in asparagine/glutamine (+0.984 Da) as dynamic modifications. Protein identification was carried out at a lower resolution in the linear ion trap with tolerances of ±10 ppm and ±0.6 Da. Peptide hits were filtered for a maximum of 1% FDR using the Percolator algorithm (Käll, Canterbury, Weston, Noble & MacCoss, 2007). In addition, TMT quantification was performed at the MS3 level in the Orbitrap analyzer with the precursor co-isolation filter set to 45%. Data are available via ProteomeXchange with identifier PXD030279 (https://www.ebi.ac.uk/pride/ (Last accessed on 1 July 2022)).
Appendix A.1.3. Bioinformatic Analysis Proteins and peptides abundance was generated on the Proteome Discover 2.4 platform. Normalization was carried out based on the total peptide amount. After that , we determined the fold change of protein abundance in the ratio BigCo/BigF2, BigCo/SmallCo, BigCo/SmallF1, and BigCo/Jumbo. Differential proteins [fold change > 1.5 and <0.66 (log2 fold change > 0.59 and <−0.59)] were analyzed based on biological process gene ontology (GO) enrichment using the platform DAVID Bioinformatics Resources 6.8 (https://david.ncifcrf.gov/summary.jsp (Last accessed on 1 July 2022)), and the resulting GO terms were then summarized by finding a representative subset with a simple clustering algorithm that relies on semantic similarity measures using REVIGO software (http://revigo.irb.hr/ (Last accessed on 1 July 2022)). We used chickpea Uniport annotation as well as protein homologs to Arabidopsis. The output data were presented using Treemaps, displaying a two-level hierarchy of GO terms-each rectangle is a single cluster representative, and the representatives are joined into 'superclusters' of loosely related terms. The untargeted metabolomics analyses were performed in an ultra-high-resolution chromatograph (UPLC) (Waters™, Class I) coupled with a high-resolution QTOF mass spectrometer (HRMS) (Waters™, Synapt G2Si), as previously reported (MonribotVillanueva et al., 2019). An Acquity BEH column (Waters™) (1.7 µm, 2.1 × 50 mm) was used for the chromatographic separation, with a column oven temperature of 40 • C and sample temperature of 15 • C. The mobile phase was a combination of water (A) and acetonitrile (B), both with 0.1% of formic acid (all solvents used were MS grade from SIGMA). The gradient used began with 1% of B, followed by a linear gradient from 1 to 80% of B over 13 min, then isocratic for 1 min at 80% of B, and, finally, a linear gradient from 80 to 1% of B over 1 min. The total run time was 20 min. The flow rate was 0.3 mL/min, and 2 µL of sample solution were injected. For the spectrometric analyses, electrospray ionization (ESI) was used in positive and negative modes. The capillary, sampling cone, and source offset voltages were 3000, 40 and 80 V, respectively. The source and desolvation temperatures were 100 and 20 • C, respectively. The desolvation gas flow rate was 600 L/h and the nebulizer pressure was 6.5 Bar (650,000 Pa). Leucine-enkephaline was used as a lock mass. The mass spectrometer was set with a mass range of 50-1200 Da using an MS e method. The collision energy of Function 1 was 6 V, while for Function 2, we used a ramp from 10 to 30 V. The scan time was 0.5 s. Metabolomics data were first processed with MassLynx (Waters™) and MarkerLynx (Waters™), then the bioinformatic analyses were performed on the MetaboAnalyst bioinformatic platform (https://www.metaboanalyst.ca/ MetaboAnalyst/home.xhtml (Last accessed on 1 July 2022)).

Sample Preparation
Samples were dissolved in methanol (50 mg/mL), filtered with 0.5 µm PTFE membranes, and placed in 2 mL UPLC vials.

Multiple Reaction Monitoring (MRM)
Phenolic compounds were identified and quantified in the same fruit samples with a dynamic multiple-reaction monitoring method using a UPLC (Agilent Technologies, 1290) coupled to a QqQ mass spectrometer (Agilent Technologies, 6460) and based on protocols previously described by our research group (Juárez-Trujillo et al., 2018; Monribot-Villanueva et al., 2019). For the chromatographic separation, a Zorbax SB-C18 column (Agilent Technologies) (1.8 µm, 2.1 × 50 mm) with a column oven temperature of 40 • C was used. The mobile phases were water (A) and acetonitrile (B), both with 0.1% of formic acid (all solvents were MS grade from SIGMA). The gradient used begins with 1% of B, then a linear gradient from 1 to 40% of B over 40 min and a linear gradient from 40 to 90% of B over 2 min, then isocratic for 2 min at 90% of B and, finally, a linear gradient from 90 to 1% of B over 2 min. The total run time was 47 min. The flow rate was 0.1 mL/min, and 1 µL were injected. For the spectrometric analyses, the ESI source was used in positive and negative modes. The desolvation temperature was 300 • C, the cone gas (N 2) flow rate was 5 L/min, and the nebulizer pressure was 45 psi (310,264 Pa). The sheath gas temperature and flow rate values were 250 • C and 11 L/min, respectively. The capillary voltage was 3500 V, and the nozzle voltage was 500 V. All data were processed using the MassHunter software (Agilent Technologies). For quantification, calibration curves were constructed from 0.25 to 17 µM for each compound. The r 2 values for quadratic regressions were 0.99. Data were processed with the MassHunter software (Agilent Technologies). The heatmap shows the clustering result by using the Euclidean distance and Ward algorithm in the MetaboAnalyst bioinformatic platform.
The protocol used was a dynamic MRM (Multiple Reaction Monitoring). The conditions for each compound are described in Table S6.