Amino Acid Profiling with Chemometric Analysis as a Feasible Tool for the Discrimination of Marine-Derived Peptide Powders

Marine-derived peptide powders have suffered from adulteration via the substitution of lower-price peptides or the addition of adulterants in the market. This study aims to establish an effective approach for the discrimination and detection of adulterants for four representative categories of marine-derived peptide powders, namely, oyster peptides, sea cucumber peptides, Antarctic krill peptides, and fish skin peptides, based on amino acid profiling alongside chemometric analysis. The principal component analysis and orthogonal partial least squares discriminant analysis results indicate that four categories of marine-derived peptides could be distinctly classified into four clusters and aggregated with the respective raw materials. Taurine, glycine, lysine, and protein contents were the major discriminants. A reliable classification model was constructed and validated by the prediction dataset, mixture sample dataset, and unclassified sample dataset with accuracy values of 100%, 100%, and 100%, respectively.


Introduction
Marine-derived peptide powders are generally manufactured from marine organisms through enzymatic hydrolysis with proteases [1,2]. Such peptide powders, which mainly contain polypeptides, oligopeptides, and amino acids derived from protein degradation, constitute a single complex protein hydrolysate product [2,3]. It is generally accepted that peptides are easier to absorb than intact proteins owing to their lower molecular weight [3]. Moreover, marine-derived peptides have demonstrated various biological activities, such as antioxidant, anti-inflammatory, anti-fatigue, anti-hypertensive, and anti-obesity activities [4-6]. Consequently, marine-derived peptides are attracting increasing attention and have been widely applied in the food, pharmaceutical, and cosmetic industries [1,7].
Species from Echinodermata, Mollusca, Arthropoda, and Chordata, which include typical edible marine animals, have been widely used to produce marine-derived peptide powders. In particular, four categories of marine-derived peptides, namely, sea cucumber peptides, oyster peptides, Antarctic krill peptides, and fish skin peptides, occupy considerable shares of the marine-derived peptide market. The biological activity claims and prices of marine-derived peptide powders differ considerably depending on the material category. In addition, a large variety of animals, plants, and microorganisms from both marine and non-marine origins are involved in the manufacturing of commercial

Raw Marine Material Preparation
Four categories of raw marine materials were purchased from reliable local suppliers, including 12 oyster meat samples from 4 geographical regions in China, 6 fish skin samples from 2 species, 12 sea cucumber samples covering 5 species and 6 processing techniques, and 12 Antarctic krill samples involving 4 processing techniques. These 42 raw marine material samples were coded according to category, and detailed sample information is listed in Supplementary Table S1. Antarctic krill meal samples were used for analysis without pre-treatment, and the remaining raw material samples were frozen in liquid nitrogen and then ground into powder. All the raw material samples were stored at −20 °C for further analysis.

Marine-Derived Peptide Powder Preparation
A total of 66 marine-derived peptide powder samples, including 18 oyster peptides (OP), 16 sea cucumber peptides (SCP), 18 Antarctic krill peptides (AKP), and 14 fish skin peptides (FSP), were collected. These marine-derived peptide powder samples were mostly purchased from reputable and qualified companies in Chinese markets. Owing to the limited number of reliable suppliers, five OP and eight SCP samples were self-prepared through enzymatic hydrolysis in our laboratory. All peptide powder samples were stored in a desiccator at room temperature and used for analysis directly without pre-treatment.

Unclassified Peptide Powder Preparation
Nine peptide powders from other categories of marine and non-marine origins, involving abalone peptides, octopus peptides, crocodile peptides, cuttlefish peptides, bull backstrap peptides, and donkey hide gelatin peptides, were self-prepared and assembled into an unclassified dataset for application of the classification models.

Moisture and Protein Content Determination
Moisture content was determined by oven drying at 105 °C until a constant weight was obtained according to the standard AOAC method [27]. The crude protein content was determined using an automatic Kjeldahl nitrogen analyzer (Kjeltec™ 8400, FOSS Quality Assurance Co., Ltd., Copenhagen, Denmark) according to the AOAC method [27]. A conversion factor of 6.25 was used to calculate the crude protein contents for all the samples. The crude protein content was expressed as a percentage on a dry weight basis.
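The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that nitrogen is measured on the sample as received and then corrected to dry weight using the moisture content, which the text does not state explicitly. The function names and sample values are hypothetical.

```python
NITROGEN_TO_PROTEIN = 6.25  # conversion factor stated in the text


def moisture_percent(wet_mass_g, dry_mass_g):
    """Moisture (%) from the mass before and after drying to constant weight."""
    return (wet_mass_g - dry_mass_g) / wet_mass_g * 100.0


def crude_protein_dry_basis(nitrogen_pct_as_is, moisture_pct,
                            factor=NITROGEN_TO_PROTEIN):
    """Crude protein (% of dry weight).

    Assumption: nitrogen_pct_as_is is Kjeldahl nitrogen measured on the
    undried sample, so the result is divided by the dry matter fraction.
    """
    protein_as_is = nitrogen_pct_as_is * factor
    dry_matter_fraction = 1.0 - moisture_pct / 100.0
    return protein_as_is / dry_matter_fraction


# Hypothetical sample: 2.000 g before drying, 1.900 g after, 12.0% N as received
moisture = moisture_percent(2.000, 1.900)
protein = crude_protein_dry_basis(12.0, moisture)
print(f"moisture = {moisture:.2f}%, crude protein = {protein:.2f}% (dry basis)")
```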

Amino Acid Profile Analysis
Amino acid compositions were analyzed using an automatic amino acid analyzer (L-8900, Hitachi Global Co., Ltd., Tokyo, Japan) following the method described by Cao et al. with minor modifications [28]. Briefly, samples were hydrolyzed with 6 M HCl containing 0.5% 2-mercaptoethanol at 110 °C for 22 h. Following hydrolysis, the sample was evaporated under a nitrogen stream at 50 °C to remove HCl. The residue was dissolved in 0.02 M HCl and then passed through a 0.22-µm membrane filter before injection into the amino acid analyzer. The content of each amino acid was expressed as g/100 g dry protein.
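As an illustration of the unit used above, the following sketch converts amino acid contents measured per 100 g of dry sample into g/100 g dry protein. It assumes the analyzer output has already been expressed per 100 g of dry sample; the function name and the numbers are hypothetical and are not taken from the study.

```python
def per_100g_protein(aa_g_per_100g_dry_sample, protein_pct_dry):
    """Re-express amino acid contents relative to protein instead of sample mass.

    aa_g_per_100g_dry_sample: dict of amino acid -> g per 100 g dry sample
    protein_pct_dry: crude protein content, % of dry weight
    """
    return {
        name: grams / protein_pct_dry * 100.0
        for name, grams in aa_g_per_100g_dry_sample.items()
    }


# Hypothetical sample with 80% crude protein on a dry weight basis
profile = per_100g_protein({"GLY": 10.0, "GLU": 8.0}, protein_pct_dry=80.0)
print(profile)
```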

Statistical Analysis
Experimental data were subjected to one-way analysis of variance (ANOVA) using the SPSS 17.0 software package (SPSS Inc., Chicago, IL, USA). Duncan's test was used to determine significant differences between samples (p < 0.05). Chemometric analysis was performed to discriminate peptide samples in different categories using the SIMCA software package (Sartorius, Malmö, Sweden). Variable importance in projection (VIP) analysis was used to find the most influential variables for classification [29]. A PCA-Class model was used to construct the classification model for each marine-derived peptide sample category. The collected data of 108 samples, including raw materials and marine-derived peptides, were randomly split into a training dataset and a prediction dataset in a ratio of 2:1. The prediction dataset consisted of 36 samples from the 4 categories, including raw materials (n = 12) and their derived peptides (n = 24), while the remaining 72 samples formed the training dataset. The Mahalanobis distance (DModX PS+) was used to detect outliers in the four PCA-Class submodels. A sample whose Mahalanobis distance was larger than the Dcrit (95% confidence interval) of a given submodel was considered an outlier for that submodel [30]. Accuracy, defined as the proportion of correctly classified samples among all samples, was used to evaluate the classification performance of the PCA-Class model.
Table 1 lists the amino acid compositions and average crude protein contents of the four marine material categories, including the oyster, Antarctic krill, sea cucumber, and fish skin samples. The oyster samples contained significantly lower protein contents than the other three categories of samples (p < 0.05), while the fish skin samples contained significantly higher protein contents than the other three categories of samples (p < 0.05).
In terms of amino acid composition, the four categories of raw materials contained abundant amounts of GLU, which was the amino acid with the highest content in both the oyster and Antarctic krill groups. The oyster group showed a high level of TAU, while TAU was present in very small amounts in the other three categories. The amino acid with the highest content in the sea cucumber and fish skin groups was GLY. The sea cucumber and fish skin samples also contained a higher level of PRO than the oyster and Antarctic krill samples. [31]. In the present study, the significantly lower crude protein contents and the high amounts of GLU, ASP, and TAU in oyster meat samples show good agreement with the previous study [32]. Chen et al. determined that whole Antarctic krill (Euphausia superba) contained 76.5% crude protein on a dry weight basis, and that GLU was the amino acid with the highest content after ASP [33]. Wen et al. found that the protein contents of eight commercially processed sea cucumber species ranged from 40.7% to 63.3% and that the most abundant amino acids in sea cucumbers were GLY, GLU, ASP, ALA, and ARG [34]. The high levels of GLY in sea cucumber and fish skin samples were related to the presence of high amounts of collagens, which generally contain a special structure with the sequence of "GLY-X-Y" [35,36]. The differing protein contents and amino acid distribution profiles for these four marine material categories are in accordance with the previously reported results.

Classification of Four Categories of Raw Marine Materials
The contents of protein and 18 types of amino acids were included in the multivariate statistical analysis for both the PCA and OPLS-DA for the 42 raw material samples. According to the score plot of the PCA model (Figure 1A), the first two principal components (PC1 and PC2) explain 87.40% of the total variance with an R2X (cum) value of 0.991 and a Q2 (cum) value of 0.917. The score plot could be divided into four regions without a clear boundary according to samples in the four categories (Figure 1A). The sea cucumber samples SC7 and SC10 were close to the oyster meat samples in the score plot, which might result from the low protein contents of SC7 (45.79 ± 0.94%) and SC10 (54.27 ± 0.07%). Among all the tested samples, the oyster group showed an average protein content of 49.41 ± 11.74%, while the sea cucumber group contained an average protein content of 72.06 ± 12.22%. The Antarctic krill samples (AK7, AK8, and AK9) were also close to the oyster meat samples owing to their lower protein contents (64.22 ± 0.40%, 64.22 ± 0.40%, and 64.22 ± 0.40%) compared with the average protein content (73.56 ± 7.78%) of the Antarctic krill group. The PCA model was unable to provide good clustering among the four categories of marine materials. The loading plot (Figure 1B) showed a strong negative correlation between the TAU content and the protein content on both PC1 and PC2. With a negative loading score, TAU was the amino acid that accounted for the discrimination of oyster meat samples, while GLY and PRO, with high positive loading scores, were the amino acids that accounted for the discrimination of sea cucumber and fish skin samples into different clusters by PC1. The results of the loading plot are consistent with the data shown in Table 1.
A supervised OPLS-DA model was further employed to achieve clearer clustering and reveal the major variables. The four raw marine material categories could be distinctly classified into four clusters (Figure 1C). In particular, the fish skin and sea cucumber groups were distributed along the first principal component, t[1], with negative coordinates, while the oyster and Antarctic krill groups appeared along t[1] with positive coordinates. Moreover, the latter two groups could be further discriminated along the second principal component, t[2]. The oyster group was projected along t[2] with negative coordinates, while the Antarctic krill group was distributed along t[2] with positive coordinates. The varying sample distribution profiles could be attributed to the different protein compositions. Previous studies have reported that myofibrillar and sarcoplasmic proteins represent the dominant proteins in oyster and Antarctic krill meat samples, resulting in similar compositions of amino acids from muscle proteins [37,38]. Saito et al. reported that the collagen content was estimated to be about 70% of the total protein in the sea cucumber body wall [35]. The collagen content of cod skins amounts, on average, to 71.2% on a dry weight basis [36]. The similarly high concentrations of collagens in the sea cucumber and fish skin samples led to closer distribution profiles in the score plots. A VIP plot was employed to indicate the weight of each variable in the OPLS-DA model for discriminating the different classes. The VIP plot shows that the TAU, GLY, LYS, and protein contents are the major discriminants with VIP values above 1 (Figure 1D). It was not surprising to find that the TAU, GLY, LYS, and protein contents, with great discriminative power, were also the variables recorded with high loading scores in Figure 1B.
Accordingly, these four categories of marine material could be clearly divided into four clusters depending on the protein contents and amino acid composition profiles.
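For reference, the VIP threshold of 1 used in this section can be related to the usual textbook definition of VIP for a PLS-type model. The expression below is that general definition, given here as an assumption; the exact variant computed by the software for OPLS-DA may differ. Here $p$ is the number of variables (19 in this study: protein plus 18 amino acids), $A$ is the number of model components, $w_{ja}$ is the weight of variable $j$ in component $a$, and $SSY_a$ is the sum of squares of the response explained by component $a$.

```latex
\mathrm{VIP}_j = \sqrt{\, p \cdot
  \frac{\sum_{a=1}^{A} SSY_a \left( w_{ja} / \lVert \mathbf{w}_a \rVert \right)^{2}}
       {\sum_{a=1}^{A} SSY_a} \,}
```

Under this definition the squared VIP values average to 1 across all variables, which is why variables with VIP > 1 are commonly read as contributing more than average to the discrimination.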

Classification of Four Categories of Marine-Derived Peptide Powders
As shown in Table 2, the OP group contained significantly lower protein contents (p < 0.05) and higher TAU contents than the other three groups. The FSP group contained the highest protein content among the four categories of marine-derived peptide powders. The OP and AKP groups contained high levels of GLU and ASP, while the SCP and FSP groups showed high contents of GLY and GLU. The protein contents and amino acid composition profiles of the peptide powders were consistent with those of the respective raw material samples. The protein contents and 18 types of amino acids were included in the multivariate statistical analysis of both PCA and OPLS-DA for the 66 marine-derived peptide samples. The PCA model exhibited good discriminant power with an R2X (cum) value of 0.916 and a Q2 (cum) value of 0.812 (Figure 1E). The first two principal components (PC1 and PC2) explained 86.8% of the total variance. The amino acid compositions and protein contents of the sea cucumber peptide and oyster peptide samples might be influenced by geographical location and processing technique, which resulted in a dispersed distribution in the score plot (Figure 1E). The loading plot of the PCA model for the marine-derived peptide samples (Figure 1F) shows high similarity with that of the raw marine material samples shown in Figure 1B. Interestingly, the OPLS-DA score plot illustrates that the four categories of marine-derived peptide samples could be clearly discriminated (Figure 1G). The protein content and certain amino acids (GLY, LYS, GLU, ASP, and TAU) represent the significant variables with VIP values > 1 (Figure 1H).
In the present study, the peptide samples were derived from raw marine materials covering various species and different pre-processing techniques. Good clustering of the four types of peptide samples could be achieved through multivariate statistical analysis based on their protein contents and amino acid composition profiles. The results obtained from the score and loading plots of the peptide powder samples are consistent with those of the raw material samples.

Establishment and Verification of Classification Models for Marine-Derived Peptide Samples
To verify the raw material categories of marine-derived peptide powders, the marine-derived peptide samples and corresponding raw material samples were assigned to a new dataset and all 108 samples were divided into four new groups, namely, the oyster group, sea cucumber group, Antarctic krill group, and fish skin group. The contents of protein and 18 types of amino acids were included in the multivariate statistical analysis of both the PCA and OPLS-DA models (Figure 2). The first two principal components (PC1 and PC2) explain 86% of the total variance, with an R2X (cum) value of 0.912 and a Q2 (cum) value of 0.811 (Figure 2A). The results of the loading plot were in line with those of the four categories of raw marine materials and their derived peptide samples (Figure 2B). The four categories of marine-derived peptide samples were clearly classified into four clusters together with their corresponding raw material samples by the OPLS-DA model (Figure 2C). Protein content and certain amino acids (GLY, TAU, LYS, and ASP) were the most important variables that accounted for the discrimination of the four sample categories (Figure 2D). The PCA and OPLS-DA models exhibited tight cluster formation between peptide samples and their respective raw materials and good separation between samples in different categories. This result demonstrates that the marine-derived peptides included in the present study share similar amino acid compositions with the respective raw materials, confirming the authenticity of the marine-derived peptide samples. Moreover, the results demonstrate that multivariate statistical analysis is capable of capturing the variance in amino acid composition between peptide samples from marine materials in different categories, which further demonstrates the potential of using multivariate statistical analysis for the adulteration assessment of peptide powder samples.
This was also shown by Seow et al., who reported that the multivariate statistical analysis of amino acid composition data is an effective method to differentiate between cave and house bird nests [39]. Azevedo et al. determined free amino acid profiles of bracatinga honeydew honey for geographical classification by using a chemometric approach [21]. Botoran et al. ascertained that multivariate statistical analysis, in combination with amino acid profiles, could provide valuable information for the authenticity verification of the varietal origins of fruit juices [19]. Further, Wistaff et al. studied whole amino acid profiles of various fruit types and validated the capability of adulteration detection for blond orange juice added in blood orange juice [18]. As such, multivariate statistical analysis tools, such as PCA and OPLS-DA, when combined with amino acid profiles, provide a feasible and promising strategy for the quality control of marine-derived peptides.
A PCA-Class model was applied to establish a classification model for distinguishing between the four categories of marine-derived peptide samples. Four PCA-Class submodels were constructed on the basis of the training set, namely the oyster PCA-Class submodel, sea cucumber PCA-Class submodel, Antarctic krill PCA-Class submodel, and fish skin PCA-Class submodel (Figure 3A-D). All samples in the oyster group showed a DModX PS+ < Dcrit (0.05) in the oyster PCA-Class submodel and a DModX PS+ > Dcrit (0.05) in the other three submodels. The same pattern was observed for samples in the sea cucumber group, Antarctic krill group, and fish skin group with their respective submodels (Figure 3A-D). This demonstrates that the four PCA-Class submodels could correctly classify the 72 training samples into four groups, indicating that the accuracy of the PCA-Class model was 100% (Figure 3E). A prediction dataset was employed as external validation to assess the predictive ability of the constructed classification model. The four submodels classified all 36 samples from the prediction dataset into their respective groups with a classification accuracy of 100% (Figure 3F). It has been reported that a PCA-Class model built for the authentication of the protected denomination of origin of paprika powder had an accuracy of 91%, and that the corresponding PLS-DA model had an accuracy of 96% [30]. These values are lower than those in the present study. The PCA-Class model based on amino acid profiles could therefore be employed to distinguish the raw material categories of marine-derived peptides.
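The decision rule and the accuracy measure described above can be sketched as follows. This is an illustrative sketch only: it takes the DModX PS+ distances and Dcrit values as already computed by the chemometrics software, and all numbers, function names, and the handling of ambiguous samples (accepted by more than one submodel) are assumptions rather than details from the study.

```python
def assign_class(dmodx, dcrit):
    """Return the class whose submodel accepts the sample, or None.

    dmodx: dict of submodel name -> distance of the sample to that submodel
    dcrit: dict of submodel name -> critical distance of that submodel
    A sample is accepted by a submodel when its distance is below Dcrit.
    Assumption: a sample accepted by zero or by several submodels is
    reported as None (not assigned to any single category).
    """
    accepted = [name for name, d in dmodx.items() if d < dcrit[name]]
    return accepted[0] if len(accepted) == 1 else None


def accuracy(predicted, expected):
    """Proportion of correctly classified samples among all samples."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical critical distances for the four submodels
DCRIT = {"oyster": 1.5, "sea_cucumber": 1.6, "krill": 1.4, "fish_skin": 1.5}

# Hypothetical samples: one oyster peptide, one krill peptide, and one
# mixture that should be rejected by all four submodels (expected None)
samples = [
    {"oyster": 0.9, "sea_cucumber": 4.2, "krill": 2.8, "fish_skin": 5.1},
    {"oyster": 2.6, "sea_cucumber": 3.9, "krill": 1.1, "fish_skin": 4.7},
    {"oyster": 1.9, "sea_cucumber": 3.5, "krill": 1.5, "fish_skin": 4.0},
]
expected = ["oyster", "krill", None]

predicted = [assign_class(s, DCRIT) for s in samples]
print(predicted, accuracy(predicted, expected))
```

Treating "rejected by every submodel" as the correct outcome for mixtures and unclassified samples mirrors how accuracy is reported for those datasets later in the paper.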

Application of the Classification Model to Adulteration Detection with Marine-Derived Peptide Mixture Samples
Considering that marine-derived peptides might be partially substituted with peptides from other marine species, in silico calculation of mixture samples could serve as a good test to validate the classification model. In addition, the discrimination of mixture samples was a challenge for the classification model because of the high similarity between the mixture samples and the modeling samples. To increase the variability of the mixture samples, binary, ternary, and quaternary peptide mixtures with different combination ratios of the four types of marine-derived peptide powder samples were generated by in silico calculation and used to apply the classification model to adulteration detection. The discrimination results of the 28 marine-derived peptide mixture samples are shown in Figure 4A-D. All 28 mixture samples showed a DModX PS+ > Dcrit (0.05) in the classification model, indicating that none of them was classified as belonging to any of the four types of marine-derived peptides; however, one binary mixture sample (M2-4, AKP:OP = 1:1) in Figure 4C was very close to the red line (Dcrit 0.05). Accordingly, the accuracy of classification was 100%.
It can be concluded that the classification model built in this study displayed a reasonably good predictive capability with a correct classification rate of 100%, which is higher than those previously reported in the literature for mixtures of other foods, such as 90.03% to 96.52% for edible oil [40], 93.70% for virgin olive oil [41], and 92.00% to 96.60% for milk [42]. Sample M2-4 (AKP:OP = 1:1) was a mixture of Antarctic krill peptides and oyster peptides, and its proximity to the threshold seemed to be in line with the high similarity between the Antarctic krill and oyster materials in terms of muscle protein composition. It was hypothesized that extending the dataset with a larger number of samples to construct the models might improve the robustness and accuracy of the model for classification [18]. Overall, the PCA-Class model constructed with a limited number of samples in the present study exhibited a clear tendency toward the correct classification of marine-derived peptides.
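The text does not give the formula used for the in silico mixtures, so the following is only one plausible sketch. It assumes that the components are combined by dry mass ratio, that the protein content of the mixture is the mass-weighted mean, and that amino acid contents (g/100 g protein) are weighted by the protein each component contributes. The function name and all sample values are hypothetical.

```python
def mix_profiles(components, ratios):
    """Compute the profile of an in silico mixture.

    components: list of dicts with keys
        "protein": crude protein, % of dry weight
        "aa": dict of amino acid -> g/100 g protein
    ratios: mass ratios of the components, e.g. [1, 1] for a 1:1 mixture
    """
    total = sum(ratios)
    fractions = [r / total for r in ratios]
    # Protein content mixes linearly with mass fraction.
    protein = sum(f * c["protein"] for f, c in zip(fractions, components))
    # Amino acids are expressed per unit protein, so each component is
    # weighted by the protein it brings into the mixture.
    aa = {
        name: sum(f * c["protein"] * c["aa"][name]
                  for f, c in zip(fractions, components)) / protein
        for name in components[0]["aa"]
    }
    return {"protein": protein, "aa": aa}


# Hypothetical Antarctic krill peptide (AKP) and oyster peptide (OP) profiles
akp = {"protein": 80.0, "aa": {"TAU": 0.5, "GLY": 5.0}}
op = {"protein": 60.0, "aa": {"TAU": 5.0, "GLY": 6.0}}

binary_mix = mix_profiles([akp, op], ratios=[1, 1])
print(binary_mix)
```

The same function covers ternary and quaternary mixtures by passing three or four components with the corresponding ratios.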

Application of the Classification Model for Adulteration Detection with Other Marine and Non-Marine Peptide Samples
To evaluate the discriminative ability of the constructed classification model for peptide samples from other materials, nine peptide powder samples derived from both marine and non-marine materials were also employed and defined as an unclassified dataset. The amino acid compositions and protein contents of the nine unclassified samples are shown in Supplementary Table S2. As presented in Figure 4E-H, all unclassified peptide samples showed a DModX PS+ > Dcrit (0.05) in the four PCA-Class submodels, indicating that the classification accuracy of the model for the nine unclassified peptide samples was 100%. These unclassified peptide samples were correctly differentiated from the four categories of marine-derived peptide powder samples, demonstrating the accuracy and feasibility of the PCA-Class model, based on amino acid profiles, in the classification of marine-derived peptides.
Marine-derived peptides from various materials in different categories have been prepared and applied in the food industry. Furthermore, marine-derived peptides might be produced from different raw materials that share the same common name. For example, sea cucumber peptides might be prepared from different sea cucumber species, such as Apostichopus japonicus, Holothuria floridana, and Cucumaria frondosa. In addition, raw marine materials with different pre-treatment processes, such as fresh sea cucumbers, dried sea cucumbers, and salted sea cucumbers, might be used for the production of peptides. Considering the various material species and production processes of peptide powders, distinguishing between different categories of marine-derived peptides, such as sea cucumber peptides, oyster peptides, krill peptides, and fish skin peptides, seems to be difficult. Inspired by the advancement of chemometric analysis methods, we have established an effective PCA-Class model for the discrimination of marine-derived peptides in different categories based on amino acid profiles, which might be used for the quality control of marine-derived peptide powders in the food industry.

Conclusions
This study has confirmed the feasibility of discrimination and adulteration detection for four representative categories of marine-derived peptides, including sea cucumber peptides, oyster peptides, Antarctic krill peptides, and fish skin peptides, by employing amino acid profiles combined with chemometric analysis. TAU, GLY, and LYS were found to be the major amino acids for the discrimination of the marine-derived peptide powders and the respective raw marine materials by the PCA and OPLS-DA models. A PCA-Class model was further applied to construct a classification model for the discrimination of the four categories of marine-derived peptide powders, and the model showed excellent predictive capability, with 100% of the samples in the prediction dataset being correctly classified. The classification accuracy for 28 in silico peptide mixture samples, consisting of binary, ternary, and quaternary peptide mixtures with different combination ratios of the four categories of peptide powders, was 100%. Furthermore, another validation dataset consisting of nine unclassified samples was correctly classified by the established PCA-Class model, confirming the robustness of the classification model. The methodology developed in this study seems to be promising and reliable for the classification of the four categories of marine-derived peptide powders and may be suitable for the discrimination of other animal-based peptides with the objective of adulteration detection.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/foods10061294/s1. Table S1: Sample information of the raw marine materials. Table S2: Crude protein content and hydrolyzed amino acid composition of nine unclassified peptide samples.