Mass Spectrometry-Based Flavor Monitoring of Peruvian Chocolate Fabrication Process

Flavor is one of the most prominent characteristics of chocolate and is crucial in determining the price the consumer is willing to pay. At present, two types of cocoa beans have been characterized according to their flavor and aroma profile, i.e., (1) the bulk (or ordinary) and (2) the fine flavor cocoa (FFC). The FFC has been distinguished from bulk cocoa for having a great variety of flavors. Aiming to differentiate the FFC bean origin of Peruvian chocolate, an analytical methodology using gas chromatography coupled to mass spectrometry (GC-MS) was developed. This methodology allows us to characterize eleven volatile organic compounds correlated to the aromatic profile of FFC chocolate from this geographical region (based on buttery, fruity, floral, ethereal sweet, and roasted flavors). Monitoring these 11 flavor compounds during the chain of industrial processes in a retrospective way, starting from the final chocolate bar towards pre-roasted cocoa beans, allows us to better understand the cocoa flavor development involved during each stage. Hence, this methodology was useful to distinguish chocolates from different regions, north and south of Peru, and production lines. This research can benefit the chocolate industry as a quality control protocol, from the raw material to the final product.


Introduction
Chocolate is one of the most popular and recognizable foods worldwide due to the organoleptic properties of the cocoa beans. Cocoa (Theobroma cacao L.) from different geographical origins has a different organoleptic profile, which influences the final flavor and quality of chocolate [1][2][3]. The chocolate aroma is primarily due to the volatile organic compounds (VOCs) from cocoa, a complex mixture of over 500 chemical compounds, mainly pyrazines, esters, amines, amides, acids, and hydrocarbons [4]. Most of these VOCs from cocoa are the foundation for the flavor profile of chocolate products [5][6][7]. Interestingly, besides the cultivation conditions and the genotype (i.e., variety) of the cocoa plant, the fabrication processes can also define the chocolate's final organoleptic properties [1,4,8], because the relative chemical composition of cocoa's VOCs is modified by the many technological processes and metabolic reactions that occur during the various stages of chocolate production (e.g., fermentation, drying, roasting, and conching). Therefore, to ensure the chocolate's final flavor quality, the traceability of cocoa's geographical origin and of the industrial chocolate processing is essential for both the chocolate industry and consumers [9].
The cocoa market distinguishes between "bulk" and "fine-flavor" cocoa (FFC), with bulk cocoa representing 95% of the world cocoa market [10,11]. The features of FFC contrast with those of bulk cocoa. FFC-based chocolates generally have variable fruity flavor and/or floral/spicy aroma expressions, besides the typical "cocoa" flavor [12,13]. The quality of the FFC and related products, such as chocolates, is done by professional

Flavor VOCs from Northern Peru
Of the 93 volatile compounds identified (Table 1), the VOCs that presented a significantly higher relative intensity in the chocolate from northern Peru compared to southern Peru were selected. For this purpose, an analysis of variance (ANOVA) was performed. Those MS signals that presented a value of p < 10⁻⁶ were selected because higher p-values did not give the method enough robustness when the chocolate samples used for determining the FFC bean origin (i.e., northern or southern Peru) were randomly permuted.
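The selection step above rests on a one-way ANOVA per MS signal. The following minimal sketch, using only the Python standard library, computes the F statistic for a single VOC; the intensity values and the function name `one_way_anova_f` are hypothetical and serve only as illustration. The p-value itself would be read from the F distribution with the returned degrees of freedom (for instance, `scipy.stats.f_oneway` returns both the statistic and the p-value), and is not computed here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups (here: two regions)
    n = len(all_values)      # total number of measurements
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each measurement around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical TIC-normalized intensities of one VOC (illustrative values only).
north = [0.052, 0.049, 0.055, 0.051, 0.053]
south = [0.011, 0.013, 0.010, 0.012, 0.014]

f_stat, df_b, df_w = one_way_anova_f([north, south])
print(f"F = {f_stat:.1f} with ({df_b}, {df_w}) degrees of freedom")
```

A large F statistic corresponds to a small p-value; a signal would be retained as a key-VOC candidate only if that p-value fell below the 10⁻⁶ threshold stated in the text.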
In Table 2, the selected key-VOCs are presented according to their retention time and the most intense fragments. These key-VOCs could differentiate chocolate produced with fine flavor cocoa from northern and southern Peru using a multivariate statistical analysis, such as principal component analysis (PCA; Figure 1). The relative intensity pattern of each of the eleven key-VOCs in chocolates from northern and southern Peruvian regions is shown in Supplementary Figure S1. Interestingly, the PCA plot shows four separate clusters. The three clusters in quadrants I and II correspond to northern Peruvian chocolates, while the cluster located in quadrant III corresponds to chocolate made from cocoa from southern Peru. Furthermore, these key-VOCs can also differentiate three chocolate bars produced with the same FFC bean species from northern Peru due to their different formulation (Figure 1 and Supplementary Figure S1). By reducing the focus from 93 to 11 VOCs (shown in Tables 1 and 2, respectively), it is possible to diminish the measurement errors associated with reproducibly monitoring multiple VOCs in food samples [58]. Due to the limited availability of standards, a tandem MS strategy was used for their identification. Thus, the identity of three of them could be validated by their fragmentation pattern measured by a third-party lab using a GC-MS/MS Trace 1300 instrument, Thermo Scientific (MS NIST 2011 Spectral Library). Likewise, since the measured samples are from a known biological origin (i.e., FFC species), peer-reviewed publications of cocoa VOCs were also used to putatively identify the metabolites in Table 2.
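To illustrate how PCA can separate samples by origin, the sketch below extracts the first principal component by power iteration on the covariance matrix, using only the Python standard library. It is not the authors' MATLAB workflow: the three-VOC data set, the sample values, and all function names are hypothetical, and a full PCA would also compute the remaining components.

```python
def center_columns(rows):
    """Subtract the column mean from every value (mean-centering)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of mean-centered data."""
    n = len(centered)
    p = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=500):
    """Approximate the leading eigenvector of cov by power iteration."""
    p = len(cov)
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Hypothetical normalized intensities of three VOCs (rows are samples).
samples = [
    [0.050, 0.020, 0.010],  # north
    [0.052, 0.021, 0.011],  # north
    [0.048, 0.019, 0.009],  # north
    [0.010, 0.022, 0.030],  # south
    [0.012, 0.020, 0.031],  # south
    [0.011, 0.021, 0.029],  # south
]

centered = center_columns(samples)
pc1 = first_component(covariance(centered))
# Project each centered sample onto the first principal component.
pc1_scores = [sum(x * c for x, c in zip(row, pc1)) for row in centered]
north_scores, south_scores = pc1_scores[:3], pc1_scores[3:]
print(north_scores, south_scores)
```

In this toy data set the two regions fall on opposite sides of the first component; the sign of the scores is arbitrary, so only their separation is meaningful.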
Interestingly, eight of the eleven metabolites in Table 2 have been previously reported to have a flavor description associated with cocoa (Table 1).
Furthermore, these eleven key-VOCs can also cluster the chocolate samples based on their manufacturing stages, e.g., pre-roasted beans, roasted beans, conching (liquor prior to food additives), and chocolate bar (Figure 2), because the relative abundances of these 11 key-VOCs change during the chocolate manufacturing process.

Taster's Results vs. MS
To ensure that the eleven compounds can be used to monitor the chocolate bars' flavor, a comparison between the eleven key-VOCs determined by the mass spectrometer and the results from a professional chocolate tasting was made (Figure 3) using a matrix-based polynomial transformation. In more detail, to correlate the eleven key-VOCs with the tasters' results, the key-VOCs were grouped into five classical flavor groups, i.e., roasted, buttery, fruity, floral, and sweet ethereal, based on the PCA loading plot (Figure 2). Subsequently, a matrix-based polynomial transformation formula was used to match the normalized MS peak areas with the values provided by human tasters. For this purpose, a training set and a test set were used, each consisting of chocolate samples and a panel of human tasters. The training set was used to calculate the transformation matrix, while the test set was used to validate the mathematical operation. As the MS measurements of the chocolate samples in both the training and test sets matched the flavor perception of the panel of professional tasters used in each set (Figure 3), it is possible to ascertain that the flavor perceived in the northern Peruvian chocolates could be inferred from these 11 key-VOCs.

Flavor Development during Each Key Step of Chocolate Elaboration
With the hypothesis that the eleven key-VOCs could correlate to the chocolate's flavor, the subsequent step was to study how these VOCs evolved during two of the key chocolate-making stages, namely roasting and conching (Figure 4), using the described GC-MS methodology. Roasting procedures 1 and 2 are done in the same temperature range (between 130 and 160 °C). However, roasting procedure 1 is performed for a shorter time (<30 min), while roasting procedure 2 takes longer (>30 min). The conching procedures 1 and 2 are similar in terms of temperature and time, the only exception being the type of roasted beans used. For the particular case of the nibs and milk chocolate, the liquor sample was taken before adding other ingredients.

Discussion
This work's underlying concept is based on the culinary hypothesis known as food pairing [59]. According to this theory, taste perception is the neuronal response to chemical compounds present in food. Therefore, by characterizing the odors (volatile organic compounds, VOCs, molecules) and the savors (mainly non-volatile molecules) of food, it is possible to predict the taste that a person perceives in a recipe.
To identify the key volatile organic compounds (VOCs) that can be characteristic of the Peruvian chocolate made from white porcelain FFC beans from northern Peru, ninety-three VOCs were first identified using GC-MS. Table 1 shows the list of all these compounds, where the most predominant are esters, followed by alcohols and phenols. In detail, 6.46% are acids; 18.28% alcohols and phenols; 13.98% aldehydes and ketones; 19.35% esters; 5.37% furans, furanones, pyranes, and pyrones; 4.3% hydrocarbons; 5.37% lactones; 1.07% nitrogen compounds; 13.98% pyrazines and piperazines; 2.15% pyridines; 4.3% pyrroles; 2.16% sulfur compounds; and 3.23% terpenes and terpenoids. The ester and alcohol families of organic compounds provide chocolate with a sweet and fruity aroma, very characteristic of chocolates made with FFC beans [37,55]. The aldehydes, ketones, pyrazines, and piperazines are the next most abundant families of compounds that provide a characteristic flavor. Pyrazines are very characteristic of the aroma of roasted cocoa [60,61], which is also one of the most important flavors of cocoa products [41]. Many of the compounds described in Table 1 have an odor descriptor reported in the literature, but there are still chemical compounds that do not have an associated descriptor.
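As a consistency check on the class distribution above, the percentages can be converted back to whole-compound counts out of the 93 identified VOCs. The sketch below does this with the percentages quoted in the text; the per-class counts it produces are inferred by rounding and are not stated in the source, and the dictionary keys are shortened labels chosen here.

```python
TOTAL_COMPOUNDS = 93

# Percentages as quoted in the text, keyed by shortened class labels.
class_percentages = {
    "acids": 6.46,
    "alcohols_phenols": 18.28,
    "aldehydes_ketones": 13.98,
    "esters": 19.35,
    "furans_pyrans": 5.37,
    "hydrocarbons": 4.3,
    "lactones": 5.37,
    "nitrogen_compounds": 1.07,
    "pyrazines_piperazines": 13.98,
    "pyridines": 2.15,
    "pyrroles": 4.3,
    "sulfur_compounds": 2.16,
    "terpenes_terpenoids": 3.23,
}

# Inferred counts: percentage of 93, rounded to the nearest whole compound.
counts = {name: round(pct * TOTAL_COMPOUNDS / 100)
          for name, pct in class_percentages.items()}
total = sum(counts.values())

for name, count in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{name:24s} {count:3d}")
print("total", total)
```

The inferred counts add up to the 93 compounds reported, with esters and then alcohols and phenols as the largest classes, in line with the ranking given in the text.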
Of the ninety-three volatile compounds, eleven were selected as key-VOCs for being highly correlated to the northern Peruvian chocolate. Interestingly, not all of these eleven compounds in Table 2 are unique to the region. Six compounds have been reported as cocoa flavor compounds from West Africa (e.g., tetramethylpyrazine, epoxylinalol, phenethyl acetate), Asia (e.g., tetramethylpyrazine, ethyl isobutyrate), and Latin America (e.g., tetramethylpyrazine, α-phenethyl alcohol). Nevertheless, as shown in Figure 1, the relative abundance between these VOCs (pattern) is characteristic of the northern Peruvian region.
The PCA plot (Figure 1) based on these eleven key-VOCs (Table 2) shows four separate clusters. It must be noted that each chocolate type (Bitter, Nibs, and Milk) produced with the same harvest of white porcelain FFC beans from Piura (northern region in Peru) has a slightly different clustering quadrant in the PCA due to the industrial process they go through. It will be demonstrated later that this separation among northern chocolate types is mainly due to the difference in the manufacturing conditions and not a particular food additive.
Interestingly, key-VOCs of "similar" known flavors (Table 2) cluster together in the loading plot (Figure 2B), while key-VOCs of contrasting flavors cluster in different directions in the loading plot. For example, on the one hand, tetramethylpyrazine has the characteristic aroma of roasted cocoa and is also present in chocolates of Ecuador and West Africa [42]. Tetramethylpyrazine is a key roasted aroma contributor to cocoa, with coffee- and cocoa-like attributes [31,33,35]. On the other hand, phenethyl acetate has been reported to have a sweet floral taste [43], as has α-phenethyl alcohol. Although α-phenethyl alcohol is a volatile compound in milk, it is also present in the cocoa samples [15,16,39,42,55,62]. For the compounds in Table 2 that did not have a reported flavor, we hypothesize that these compounds will share the same flavor characteristics as other "known flavor" compounds that cluster with them. For example, 8-methyl-1,2,4-triazolo[4,3-b]pyridazine, 3-hydroxybutanoic acid, and 3,4-dihydroxy-3,4-dimethyl-2,5-hexanedione were correlated with a sweet and fruity flavor, since their loading plot position clusters them with the putatively identified metabolites ethyl isobutyrate and 2-butoxy ethyl acetate (Figure 2B). Thus, the present method could only monitor five flavors (i.e., buttery, fruity, floral, ethereal sweet, and roasted flavors) associated with our key-VOCs from all possible (10) flavors named in Table 1.
Once all compounds in Table 2 had a putative flavor, a matrix-based polynomial transformation formula was used to match the normalized peak areas of each flavor group with the values provided by the tasters. First, a tasting test of the chocolates was performed (i.e., training set), where the intensities of the 10 flavors, which included the five flavors of interest (i.e., roasted, buttery, fruity, floral, and sweet ethereal), were identified by two professional tasters (provided by Theobroma Inversiones SAC). The objective was for the human tasters to be as specific as possible about the intensity of each flavor they perceived, to better correlate their responses with the MS signals.
The objective of the training set was to mathematically construct the flavor pattern using the MS signals. For this purpose, the flavor pattern given by the human tasters was key for calculating a transformation matrix (see Section 4.6). Subsequently, a new experiment was run (i.e., test set) to validate our estimation of the transformation matrix (see Section 4.6). The test set consists of a new set of chocolate samples measured with our MS approach to predict the human flavor perception values from a new (different) panel of three tasters from Theobroma Inversiones SAC and a third-party institution. In Figure 3, the comparison of the flavor patterns from both groups (human tasters and GC-MS signals) demonstrates that the MS-based analysis could match the flavor perception of the professional tasters. Thus, we are confident that we could use our 11 key-VOCs to infer the human response to the chocolates.
With the certainty that the eleven key-VOCs could represent the flavor that a person can perceive, a polynomial transformation of the relative amounts of these key-VOCs was used to follow the flavor development across the different stages of chocolate manufacturing, since their relative proportions were unique to each stage (Figures 2A and 4). During roasting, it was observed that floral and roasted flavors increase when the beans are roasted at higher temperatures. This relative increase is in accordance with the current scientific knowledge available for chocolate manufacturing. The compounds associated with these flavors are products of the Maillard reaction, which occurs during roasting [4,62,63]. Hence, at higher temperatures, more compounds associated with the Maillard reaction can be seen. During conching, in contrast, time and temperature play a pivotal role in releasing the volatile compounds from the liquor. Hence, there is a small decrease in ethereal sweet flavors since they are highly volatile. In Figure 4, it is also possible to observe a slight difference between the liquor (conching stage prior to additives) and the final product, in particular for the Nibs and Milk chocolate, since they receive additives before the tempering and packaging steps.
In Figure 4, it is observed that the floral flavor is the least intense in the pre-roasted beans. Here, it was noticed that this flavor's perception increased upon roasting the beans for the Nibs/Milk chocolate (Roasted beans 2). In contrast, in roasted beans related to Bitter chocolate (Roasted beans 1), the increase is somewhat less meaningful. This difference may indicate that the roasting conditions affect this particular flavor development in the FFC beans [64,65]. In addition, it was observed that the intensities of buttery and fruity flavor were maintained almost constant during both roasting processes of the cocoa beans.
In the conching stage, where the cocoa liquor is obtained, it was observed that the flavors' intensities that changed significantly were the ethereal sweet for liquor 1 and floral for liquor 2. We estimate that this decrease is associated with frictional heat and the consequent release of volatiles. The loss of flavors such as sweet, floral, and fruity during the conching process has been reported in some studies [47,64,66].
Finally, it is observed that the ratio between the floral/ethereal sweet flavor, which had decreased in intensity in liquor 2, has increased later in the Nibs and Milk chocolate. The explanation for this result lies in the addition of cocoa butter (food additive). Therefore, it can be pointed out that the cocoa butter helps to compensate for the ratio between the floral/ethereal sweet flavors that had strongly diminished after the conching stage (liquor 2).

Materials
Pre- and post-roasted beans, liquor, and chocolates of northern Peruvian origin were obtained from Theobroma Inversiones SAC (Lima, Peru). The chocolate samples of Theobroma Inversiones SAC were bitter chocolate (70% cocoa content) from 5 chocolate bar lots, nibs chocolate (70% cocoa content mixed with nibs and cocoa butter) from 3 chocolate bar lots, and milk chocolate (50% cocoa content mixed with powdered milk and cocoa butter) from 5 chocolate bar lots. These chocolates were produced from the same species of harvested cocoa fruits obtained in the northern Peru region in 2018 and 2019. The southern Peruvian chocolates used as a comparison were obtained from local supermarkets (70% cocoa content, from the southern Peru region). All chocolate bars were analyzed before their expiration date.
Pre- and post-roasted beans were stored in sealed aluminum paper containers at ambient temperature (18-20 °C). Liquor and chocolates were stored in sealed plastic containers in a fridge (4 °C). Theobroma Inversiones SAC (Lima, Peru) provided us with several chocolate bars from their prize-winning product line: Piura Select (cocoa content 70%, named Bitter), Piura Nibs (70%), and Piura Milk (50%). These materials were also analyzed within the expiration date suggested by the company.

Sample Preparation and Volatile Compounds Extraction
One (1) g of each type of chocolate was grated in a mortar to form a fine powder. Then, the chocolate powder was added to a septum vial (20 mL). Volatile compounds of each sample were extracted using the Headspace Solid-Phase Microextraction technique (HS-SPME). The fiber used for the extraction was 50/30 µm divinylbenzene/carboxen/polydimethylsiloxane (DVB/CAR/PDMS, Stableflex 24 Ga, Manual Holder) from Supelco. The use of this fiber for cocoa organoleptic analysis yields a good separation of chromatographic peaks [41]. The SPME fiber was conditioned in the injector of the Agilent 7890B GC-MS for 15 min at 250 °C. The conditioning was milder than that suggested by Supelco for this fiber type (i.e., 30 min at 270 °C) because it was found that our conditions extended the working life of the fiber without compromising carryover. After fiber conditioning, the SPME fibers were exposed to heated chocolate samples (at 60 °C) for 15 min in a thermostat block. Although individuals consume chocolate bars at room temperature, a temperature of 60 °C was used to maximize VOC emission without risking degradation. The VOCs were then desorbed in the injector of the Agilent 7890B GC-MS for 10 min at 250 °C.

HS-SPME-GC-MS Method
Chocolate samples were analyzed by gas chromatography-mass spectrometry (GC-MS), using an Agilent 7890B GC System equipped with a VF-23ms column (high polarity column, length: 30 m, diameter: 0.25 mm, film thickness: 0.25 µm). The GC inlet was at 250 °C, while the oven was set at an initial temperature of 40 °C for 5 min; the temperature was then increased to 200 °C with a gradient of 5 °C/min and finally held at 200 °C for 10 min [41]. Injection was performed manually, exposing the fiber after introducing the SPME needle. The fiber was left exposed for about 10 min and then removed from the inlet.
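The oven program described above fixes the total run time and the oven temperature at any moment of the run. The short sketch below works through that arithmetic; the function names are illustrative, and it assumes that the clock starts at injection and that the ramp is perfectly linear.

```python
def oven_program_minutes(initial_hold=5.0, start_temp=40.0, end_temp=200.0,
                         ramp_rate=5.0, final_hold=10.0):
    """Total run time in minutes: initial hold + linear ramp + final hold."""
    ramp_time = (end_temp - start_temp) / ramp_rate
    return initial_hold + ramp_time + final_hold

def oven_temperature(t_min, initial_hold=5.0, start_temp=40.0,
                     end_temp=200.0, ramp_rate=5.0):
    """Oven temperature in degrees Celsius at t_min minutes after the start."""
    if t_min <= initial_hold:
        return start_temp
    # Linear ramp, capped at the final temperature.
    return min(end_temp, start_temp + ramp_rate * (t_min - initial_hold))

total_run = oven_program_minutes()
temp_at_17 = oven_temperature(17.0)
print(total_run, temp_at_17)
```

Under these assumptions the ramp takes 32 min, so one run lasts 47 min in total, and the oven is at 100 °C when the run reaches 17 min.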
The SPME fiber selection was made by analyzing chocolate samples from three production lots of bitter chocolate samples (70% Cocoa content) from Theobroma Inversiones SAC. Two samples from different production lots are shown in Supplementary Figure S2. The divinylbenzene/carboxen/polydimethylsiloxane (DVB/CAR/PDMS, black line) and the divinylbenzene/polydimethylsiloxane (DVB/PDMS, red line) fibers showed better performance than the carboxen/polydimethylsiloxane (CAR/PDMS, blue line) fiber due to the higher affinity of the latter to acetic acid. The DVB/CAR/PDMS was finally selected because, after 17 min, it showed better chromatographic peaks than DVB/PDMS fiber.
During SPME optimization, every sample was analyzed in triplicate, and blanks (i.e., non-exposed fiber injections) were run between every sample. Once the method was optimized, six additional chocolate samples were analyzed from production lots of two different years, i.e., 2018 and 2019, to identify key-VOCs that can differentiate northern and southern Peru regions. The number of blanks was reduced to one every three sample injections.
After obtaining the chromatograms and spectra of each sample, the signals were integrated using the GC-MS software. In each integration, the NIST 2.0 Mass Spectral Search Program database and the MS NIST 2011 Spectral Library were accessed to identify the compounds. This database provided a list of probable compounds ranked by the percentage of equivalence between experimental and theoretical mass spectra. For each of the ninety-three chromatographic peaks, the compound name with the highest percentage of similarity to the mass spectrum was selected (minimum accepted 60%). A low identification percentage is typical of old quadrupole mass analyzer models since they only work with nominal masses. Therefore, to verify the compounds' identity, we also searched peer-reviewed references in which these putative signals were identified in chocolate samples.
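The acceptance rule described above (take the library candidate with the highest similarity, provided it reaches at least 60%) can be expressed as a small helper. This is only a sketch of that rule: the function name, the candidate lists, and the similarity values are hypothetical, and it does not reproduce how the NIST search software computes its match scores.

```python
MIN_MATCH_PERCENT = 60.0  # minimum accepted similarity stated in the text

def best_library_hit(candidates, min_match=MIN_MATCH_PERCENT):
    """Return the name of the best-matching candidate, or None if no
    candidate reaches the minimum similarity.

    candidates: list of (compound_name, match_percent) tuples.
    """
    if not candidates:
        return None
    name, score = max(candidates, key=lambda item: item[1])
    return name if score >= min_match else None

# Hypothetical search results for two chromatographic peaks.
peak_a = [("tetramethylpyrazine", 82.0), ("ethyl isobutyrate", 41.0)]
peak_b = [("phenethyl acetate", 55.0), ("ethyl isobutyrate", 48.0)]

hit_a = best_library_hit(peak_a)
hit_b = best_library_hit(peak_b)
print(hit_a, hit_b)
```

A peak such as the second one, whose best candidate stays below the threshold, would be left unassigned rather than given a low-confidence name.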

Sensory Analysis
The sensory analysis of the chocolate bars was carried out by qualified tasters from the company Theobroma Inversiones SAC (4 individuals) and one from a third-party company. The number of tasters used is in agreement with standard practices for an international chocolate/cocoa degustation or contest, where a minimum of three tasters is used. Furthermore, two of the authors took steps to be qualified as tasters. However, they were not yet certified at the time of the trials and did not participate in them. Nevertheless, they could ensure that the trials were done according to international standards.
The term "qualified (taster)" refers to a person trained for at least four months to deconvolute the cocoa and cocoa-related products' flavors. The training is based on developing a flavor memory by trying different flavors present in various cocoa-related products. To become an official taster, the candidate must describe the flavors in cocoa and cocoa related products. The candidate results are subsequently compared to those of a certified taster. If the candidates' results are within a 10% difference of the certified taster score, they become themselves certified tasters as well.
Tasters (5 individuals, divided into two groups) evaluated 52 samples (i.e., 10 to 11 samples per taster). These samples came from four chocolate lots produced with FFC beans from northern (Bitter 70%, Nibs 70%, and Milk 50%) and southern (Bitter 70%) Peru. Each person was provided with a sample of chocolate (from a different origin) and a randomly doubled sample as a control. The attributes selected by the human tasters for evaluation were sweet, fresh, ethereal sweet, cocoa, fruity, honey, roasted, caramel, buttery, nut, and floral. We used a larger number of attributes to facilitate our ability to correlate the human tasters' perception with the key-VOCs measured by the MS.
The samples (approximately 2 g) were placed on aluminum foil and coded with random numbers. The tasters were given water and water cookies to neutralize the palate.

Selection of Five Groups of Flavors
The selection of the five flavor groups was based on the PCA loading plot (Figure 2). As a result, the compounds that did not have a reported taste were given the flavors of the compounds they clustered with. Afterwards, the perceived intensities of these five flavors were correlated to the normalized MS peak areas by applying a third-degree polynomial transformation to the MS signal matrix (Section 4.6).

Experimental Design and Statistical Analysis
The multivariate clustering analysis (i.e., principal component analysis) and the polynomial matrix transformation were performed in MATLAB R2019b. In more detail, the statistical analysis was used to correlate certain volatile compounds to previously defined chocolate flavors, thus permitting us to identify particular chocolate flavors with a significant degree of confidence. The volatile compounds' signals were normalized to the total ion current (TIC) value of each spectrum to monitor their variations throughout the industrial process of making different types of chocolates. Eleven (11) of the 93 volatile compounds detected in all samples differentiated the chocolate bars' origin extremely well. Therefore, the number of possible VOC-related flavors observed in Table 1 was reduced using PCA to those observed in Table 2 (i.e., buttery, fruity, floral, ethereal sweet, and roasted flavors), providing an excellent starting point for flavor correlation and for determining the prize-winning chocolate bars' secret flavor pattern.
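The TIC normalization mentioned above amounts to dividing each integrated peak area by the total ion current of the same measurement, so that signals from different injections become comparable. A minimal sketch follows; the peak areas, the TIC value, and the function name are hypothetical.

```python
def normalize_to_tic(peak_areas, tic):
    """Divide every peak area by the total ion current of the same run."""
    if tic <= 0:
        raise ValueError("TIC must be positive")
    return {name: area / tic for name, area in peak_areas.items()}

# Hypothetical integrated peak areas (arbitrary units) from one injection.
peak_areas = {
    "tetramethylpyrazine": 1.2e6,
    "phenethyl acetate": 3.0e5,
}
tic = 6.0e7  # hypothetical total ion current of the same run

normalized = normalize_to_tic(peak_areas, tic)
print(normalized)
```

The resulting dimensionless fractions are the values that would then enter the PCA and the flavor correlation.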
Subsequently, the semi-quantitative correlation between the TIC-normalized MS peak areas and the human tasters' average flavor scores was performed by applying a third-degree polynomial transformation to the MS signal matrix. More specifically, in the transformation equation, A_i is a 3 × 3 matrix with rows given by the normalized MS signals of the metabolites associated with flavor "i" (Table 2) and columns given by the type of chocolate (e.g., bitter, nibs, and milk); B_i is the unknown 3 × 1 transformation matrix for flavor "i"; and C_i (1 × 3 matrix) contains the average tasting values for flavor "i" identified in a given type of chocolate. The matrix operation to identify B_i was performed in MATLAB. Since multiple signals could be correlated to a particular flavor (Table 2), the above-described process was manually repeated several times, exchanging the selected MS values to obtain the result closest to that of the human tasters. Finally, to validate our estimation of the transformation matrix (B_i) for each flavor i, a new set of chocolate samples was measured with our MS approach. The selected normalized MS signals were introduced into the transformation equation to predict the human flavor perception values from a new (different) panel of tasters (three individuals).
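To make the role of the transformation matrix concrete, the sketch below solves for B_i under one explicit assumption: that the relation takes the linear form B_iᵀ·A_i = C_i, which is consistent with the stated matrix dimensions but is our reading rather than the authors' stated equation. How the third-degree terms enter is not specified in the text, so they are omitted. All numerical values and function names are hypothetical.

```python
def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]

def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

# Hypothetical A_i: rows are metabolites, columns are bitter, nibs, milk.
A = [[0.050, 0.040, 0.030],
     [0.020, 0.025, 0.015],
     [0.010, 0.012, 0.020]]
# Hypothetical C_i: average taster scores for bitter, nibs, milk.
C = [7.0, 6.5, 5.0]

# Assumed relation B^T A = C is equivalent to A^T B = C^T.
B = solve_linear_system(transpose(A), C)
# Scores predicted from the MS signals with the fitted B.
predicted = [sum(B[m] * A[m][k] for m in range(3)) for k in range(3)]
print(B, predicted)
```

With a training set of this size the fitted B_i reproduces the training scores exactly, which is why validation on a separate test set, as described above, is the meaningful check.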

Conclusions
Our GC-MS-based method identified eleven chocolate volatile organic compounds (VOCs) that could (a) indicate whether the chocolate was produced with white porcelain FFC beans from Piura, northern Peru (i.e., origin), and (b) be used to infer the perceived flavor of chocolate bars made with white porcelain FFC beans from Piura. More interestingly, it was possible to monitor the changes in the relative abundance of these 11 VOCs through the different stages of chocolate manufacture. We believe that, in the near future, the implementation of these techniques by food security officials may allow them to trace the origin of the chocolate bar and identify adulterations or bad manufacturing practices among fine flavor chocolate producers in Peru.