2.3. Prediction Model
To quantify the cocoa shell in cocoa products, a prediction model was developed using a cocoa shell calibration series with defined shell contents. PLSR models were created for the prediction. The suitability of PLSR models for the prediction of cocoa shell content and for other research topics has already been demonstrated in many papers [28,33,34]. Two different calibration series were prepared to determine which preparation method is best suited for the prediction. To obtain a reliable and robust model that can predict the shell concentration in cocoa products, the investigated key metabolites have to show linearity in the relevant concentration range (0–10%). The linearity of the key metabolites was already confirmed in a previous work for 17 of the 18 compounds using an LC-ESI-QTOF system. The preparation of the shell calibration series is described in more detail in the material and methods chapter.
To verify the suitability of the calibration series, eight samples of different origins with a known shell content were prepared and analysed. Furthermore, 15 different commercially acquired chocolates and a cocoa butter were analysed. In addition, a calibration series in the range of 0–100% cocoa shell was used to verify the linearity of the metabolites over the entire concentration range. The cocoa shell series were analysed in a fivefold determination and in randomized order.
When transferring the method from the LC-ESI-QToF to the LC-ESI-QQQ system, three of the 18 key metabolites could not be included in the targeted method. In addition to the different analytical system, a different extraction solvent was used for the targeted approach, which may have influenced the concentration of the key metabolites in the extracts. As a result of these changes, three of the metabolites no longer showed a linear relationship in the relevant concentration range.
The concentrations of the remaining 15 key metabolites were within the calculated linear ranges. Therefore, the regression models were calculated and evaluated on the basis of these 15 metabolites. To understand the influence of the individual metabolites on the model, the cocoa shell content was also predicted from a linear regression of each key metabolite. The coefficients of determination and regression equations of each key metabolite are shown in Table S4 in the Supplementary Materials. For every key metabolite, a linear regression was calculated, and the cocoa shell content of samples with known shell content was then predicted individually. For illustrative purposes, Table 3 presents the predicted cocoa shell content of samples from Ecuador and Ivory Coast.
The predicted shell contents showed that the values calculated from the individual key metabolites scattered very strongly. The prediction for the sample from Ecuador varied between 1.10% and 22.41% and that for the sample from the Ivory Coast between −0.9% and 5.2%. Moreover, a given metabolite does not consistently indicate a high or a low cocoa shell content across the different samples. The metabolites Hexacosanoic acid tryptamide and Heneicosylic acid serotonin can be considered as examples of this relationship. The prediction of the cocoa shell content by Hexacosanoic acid tryptamide results in a significantly higher value than the average for the sample from Ecuador and a significantly lower value for the sample from Ivory Coast. The opposite can be observed for Heneicosylic acid serotonin. Therefore, it is necessary to use as many metabolites as possible from different substance classes in order to obtain a robust model. Averaging the individual predictions yields a value close to the actual content.
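The per-metabolite evaluation described above can be sketched as follows: a least squares line is fitted to the calibration data of each metabolite, the line is inverted to predict the shell content of a sample, and the individual predictions are averaged. This is a minimal illustration only. The metabolite names, calibration levels and peak areas are hypothetical and are not taken from Table S4 or Table 3.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def predict_shell(area, slope, intercept):
    """Invert the calibration line: shell content (%) from a measured peak area."""
    return (area - intercept) / slope

# Hypothetical calibration series: shell content (%) and peak areas per metabolite
shell_pct = [0, 2, 4, 6, 8, 10]
calibration = {
    "metabolite_A": [100, 300, 500, 700, 900, 1100],
    "metabolite_B": [50, 150, 250, 350, 450, 550],
}
# Hypothetical peak areas measured in one sample
sample_areas = {"metabolite_A": 650, "metabolite_B": 250}

predictions = {}
for name, areas in calibration.items():
    slope, intercept = fit_line(shell_pct, areas)
    predictions[name] = predict_shell(sample_areas[name], slope, intercept)

# The individual predictions scatter; their mean is the combined estimate
mean_prediction = mean(predictions.values())
print(predictions, mean_prediction)
```

In this toy case the two metabolites disagree by 1.5 percentage points, which mirrors on a small scale why a single metabolite is not a reliable predictor on its own.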
Figure 1 shows the PLSR model of the first calibration series. The model was calculated with the Unscrambler X 10.3 software using all 15 key metabolites and a full cross-validation. All variables were weighted equally, and mean-centering was applied to the data set. The model shows a linear dependence between the peak areas of the metabolites and the cocoa shell content with a coefficient of determination of 0.9. The deviations are greater for the samples with a higher cocoa shell concentration than for the samples with a lower concentration, which may be attributed to greater inhomogeneity in these samples.
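To make the modelling step more concrete, the following is a minimal sketch of a single-response PLS regression (PLS1, NIPALS algorithm) with mean-centering and leave-one-out cross-validation, written with NumPy. It is not the Unscrambler implementation. The simulated peak areas, the choice of two components, the absence of variance scaling, and the reading of "full cross-validation" as leave-one-out are all assumptions made for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 with the NIPALS algorithm on mean-centered data."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)        # weight vector
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return y_mean + (X - x_mean) @ coef

def leave_one_out_predictions(X, y, n_components):
    """Predict each sample from a model fitted on all other samples."""
    preds = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = pls1_fit(X[keep], y[keep], n_components)
        preds[i] = pls1_predict(model, X[i:i + 1])[0]
    return preds

# Simulated data (assumption): 6 shell levels x 5 replicates, 15 metabolites
rng = np.random.default_rng(0)
shell = np.repeat(np.arange(0.0, 11.0, 2.0), 5)
slopes = rng.uniform(50, 150, size=15)
areas = np.outer(shell, slopes) + 200 + rng.normal(0, 20, size=(30, 15))

cv_pred = leave_one_out_predictions(areas, shell, n_components=2)
ss_res = np.sum((shell - cv_pred) ** 2)
ss_tot = np.sum((shell - shell.mean()) ** 2)
r2_cv = 1 - ss_res / ss_tot
print(f"cross-validated R^2: {r2_cv:.3f}")
```

The cross-validated coefficient of determination computed this way plays the same role as the value reported for Figure 1, although the simulated data are far less noisy than real calibration samples.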
The samples were prepared by mixing cocoa nibs and cocoa shell powder as described in Section 2.2. The model was then applied to samples with known cocoa shell content to verify its validity. The predicted cocoa shell contents of the samples are shown in Table 4. Overall, satisfactory results were achieved with this model. For 5 out of 8 samples, the predicted shell content deviated only slightly from the actual content: the samples from Ghana, Côte d’Ivoire, Nigeria, Panama and Indonesia. A deviation of 1% cocoa shell can be considered a wholly satisfactory result. The samples from Madagascar and Venezuela contained more than 7% cocoa shell and showed very poor results, with deviations of 3.32% and 5.08%, respectively. As already observed during the creation of the model, the deviations of the model in the higher concentration range were very large, which could explain the large deviations for the samples with a high shell content.
Furthermore, a PLSR model was calculated on the basis of the second calibration series. As with the first model, the metabolites show a linear dependence in the investigated concentration range of 1–10%. The coefficient of determination is 0.82. If only the mean values of the calibration points are taken into account for the calculation of the PLSR model, a regression coefficient of 0.97 results. The same data pre-treatment and validation were performed as for the first calibration series. The dispersion of values in the higher concentration range observed for the first model does not occur in this model. As with the first model, the samples with a known shell content were analysed. The predicted shell contents deviate only slightly from the actual shell contents for all samples. Unlike the first model, suitable results are also obtained for the samples from Madagascar and Venezuela. As powders were used for the preparation of the samples, a certain inhomogeneity remains even after mixing. The potential inhomogeneity of the calibration series and of the prepared samples with a defined shell content can cause the deviations of the calibration points and of the predicted results.
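The effect of using only the mean values of the calibration points, mentioned above, can be illustrated with a simple univariate stand-in for the PLSR model. Averaging replicates removes the replicate scatter, so the fit to the means has a coefficient of determination at least as high as the fit to all individual points when every level has the same number of replicates. The replicate values below are hypothetical.

```python
from statistics import mean

def r_squared(x, y):
    """Coefficient of determination of a least squares line through (x, y)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

# Hypothetical predicted shell contents (%) for three replicates per level
levels = [1, 4, 7, 10]
replicates = {
    1: [0.4, 1.0, 1.6],
    4: [3.1, 4.0, 5.2],
    7: [6.0, 7.1, 7.6],
    10: [9.0, 10.2, 11.1],
}

x_all = [lvl for lvl in levels for _ in replicates[lvl]]
y_all = [value for lvl in levels for value in replicates[lvl]]
y_means = [mean(replicates[lvl]) for lvl in levels]

r2_all = r_squared(x_all, y_all)        # every replicate counted individually
r2_means = r_squared(levels, y_means)   # one averaged point per level
print(f"all points: {r2_all:.3f}, means only: {r2_means:.3f}")
```

A higher value for the averaged data therefore reflects reduced replicate scatter rather than a better underlying model.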
Whereas for the first model the cocoa shell and nibs were weighed in directly, for the second model mixtures of cocoa shell and nibs were produced and a certain quantity was taken from them (the weighings of the samples are shown in the supporting information). For this reason, slightly different results can be expected between the two approaches. A comparison of the applicability and feasibility of the two series shows that the second calibration series is much easier and less time-consuming to carry out. In addition, the deviations of the shell content in the samples with known shell content and in the chocolate samples are much smaller. Therefore, the second model is better suited for calibrating and determining the shell content in various cocoa samples.
After the predictive quality of the PLSR models was confirmed with samples of known shell content, the model was also applied to samples with unknown shell content and from other stages of processing. These samples included 14 different chocolates and three cocoa butters. Since the PLSR model of the second calibration series showed much better results over the whole concentration range, this model was used for the prediction of samples with unknown shell content. The fourteen chocolate samples were purchased from different manufacturers and comprise different chocolate variants, such as white, milk and dark chocolates. The predictions for the chocolates and cocoa butters are shown in Table 5. Besides the predicted cocoa shell content, the table also shows the cocoa content and the calculated cocoa shell content in relation to the cocoa products used. The listed percentages of cocoa are the sum of cocoa mass and cocoa butter.
The shell content of the cocoa butters was predicted to be 6.79%. This result is consistent with the assumption that the lipophilic key metabolites of the shell pass over into the cocoa butter when the cocoa butter is pressed. The white chocolates and the milk chocolate showed the lowest proportion of cocoa shell. White chocolate has a cocoa butter content of approx. 28% [35]. Since the cocoa butter contains approx. 6.8% cocoa shell, the white chocolate should accordingly contain approx. 2% cocoa shell. This calculation corresponds to the obtained results. Milk chocolate has a cocoa mass content of approx. 12% and a cocoa butter content of approx. 18%. Thus, low cocoa shell contents of approx. 2% are also expected here, which is in line with the obtained results. The greater the amount of cocoa, the higher the expected cocoa shell content should be. This correlation could also be observed for the analysed chocolates. An exception was the chocolate with a cocoa content of 99%. A lower cocoa shell content was detected here than in the chocolate of the same manufacturer with 85% cocoa. This could be explained by better process control and the use of high-quality raw materials. When calculating the cocoa shell content of the individual chocolates in relation to the cocoa used, it is noticeable that, for most chocolates, cocoa with a cocoa shell content of 6 ± 2% was used. Only one chocolate showed a markedly higher cocoa shell content of over 12%. The increased content could be attributed to faulty production. However, since only one bar of chocolate was used for the analysis, this could also be an outlier. Furthermore, the cocoa shell could also be non-homogeneously distributed in the chocolate, although the analysis was carried out in triplicate and the individual results showed only a very small deviation.
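The plausibility check used in the paragraph above is simple proportional arithmetic and can be written out as a short sketch. The cocoa butter and cocoa mass fractions and the 6.8% shell content of cocoa butter come from the text. The 6.0% shell content assumed for cocoa mass is the midpoint of the 6 ± 2% range and is an assumption, and the dark chocolate values are purely hypothetical.

```python
def expected_shell_pct(ingredient_pct, shell_in_ingredient_pct):
    """Shell content (%) contributed to the product by one cocoa ingredient."""
    return ingredient_pct * shell_in_ingredient_pct / 100

def shell_relative_to_cocoa(predicted_shell_pct, cocoa_content_pct):
    """Shell content (%) expressed relative to the cocoa fraction of the product."""
    return predicted_shell_pct / cocoa_content_pct * 100

# White chocolate: approx. 28% cocoa butter containing approx. 6.8% shell
white = expected_shell_pct(28, 6.8)

# Milk chocolate: approx. 12% cocoa mass and 18% cocoa butter
# (6.0% shell in cocoa mass is an assumed value, see lead-in)
milk = expected_shell_pct(12, 6.0) + expected_shell_pct(18, 6.8)

# Hypothetical dark chocolate: 4.2% predicted shell at 70% cocoa content
dark_relative = shell_relative_to_cocoa(4.2, 70)

print(f"white: {white:.2f}%, milk: {milk:.2f}%, dark (relative): {dark_relative:.1f}%")
```

Both expected values come out close to 2%, in line with the estimate given in the text.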
In addition to the calibration series between 0% and 10%, samples with a shell content between 0% and 100% were prepared and analysed to verify the linear dependence of the key metabolites outside the previously investigated concentration range. Figure 2 shows the linearity over the entire concentration range of 0–100% shell. Therefore, the metabolites can predict the shell content independently of the concentration present. Cocoa shells are used as a by-product for theobromine extraction or as animal feed [10]. Since the key metabolites are linear up to 100% cocoa shell, the method can also be used for the control of cocoa shells.
In this study, an LC-ESI-QqQ-MS/MS targeted method for the determination of the cocoa shell content in different cocoa and chocolate products was successfully developed and validated. Besides its suitability for different cocoa products, the method is also applicable to samples of different origins. Furthermore, the method is simple and fast to implement. By replacing chloroform with MTBE as the extraction solvent, the negative impact on the environment and users has been reduced. With the developed method, the cocoa shell content of different cocoa products can be predicted with an accuracy of approximately 1% cocoa shell. The accuracy of the prediction could be further increased by using reference substances for every key metabolite for external calibration and isotope-labelled standards for internal calibration. External calibration using reference standards would solve both the inhomogeneity problem of the cocoa shell calibration series and the extensive time involved in producing and weighing the cocoa shell calibration series. Furthermore, calibration by means of external and internal standards could provide absolute quantitation of the key metabolites, and the method could be transferred directly to commercial or industrial laboratories. The presented method can be considered a supplement to the NIR detection method published in 2019 [28]. The NIR method can be regarded as a rapid screening method; should the analysis reveal a shell content close to the limit value, the result can be reviewed using the method presented here. Furthermore, the NIR method is a non-targeted method and therefore does not provide the level of selectivity offered by the LC-ESI-QqQ method. With the NIR method, impurities in the samples that also cause bands at the selected wavelength could influence the results. The multiple reaction monitoring method developed in this work avoids this influence of impurities. In addition, the NIR method has only been applied to cocoa powder and not to cocoa masses or chocolates, whereas the suitability of the method presented here has already been confirmed for these matrices as well.