1H NMR Combined with Multivariate Statistics for Discrimination of Female and Male Flower Buds of Populus tomentosa

1H nuclear magnetic resonance (1H NMR) combined with multivariate statistics was adopted to discriminate female and male flower buds of Populus tomentosa. Samples of 11 female and 16 male flower buds of P. tomentosa were collected in Beijing, China. 1H NMR spectra were acquired on a 400 MHz spectrometer. In total, 30 chemical compounds were identified with standards and literature according to chemical shifts, peak areas, and multiplicity. Principal component analysis (PCA), hierarchical clustering analysis (HCA), and supervised orthogonal partial least squares-discriminant analysis (OPLS-DA) were applied to discriminate female and male flower buds. PCA and HCA showed a clear grouping trend between the female and male groups (R2X, 0.809; Q2, 0.903). The two groups were also well discriminated with OPLS-DA (R2X, 0.808; R2Y, 0.976; Q2, 0.960). With variable importance in projection (VIP) > 1.0 and p < 0.05 in OPLS-DA as criteria, the content of daucosterol, β-sitosterol, ursolic acid, and betulonic acid was found to be higher in the male group than in the female group; these compounds are likely the key chemical differences between female and male flower buds of P. tomentosa. The study demonstrates that 1H NMR combined with multivariate statistics can discriminate female and male plants and clarify their chemical differences, providing a novel method to identify the gender of dioecious plants.


Introduction
Dioecious plants are plants that bear female and male flowers on separate individuals. Although the incidence of dioecy in flowering plants globally is relatively low (6~7%) [1], several traditional Chinese medicines are derived from dioecious plants, such as Duzhong (Eucommiae Cortex) from Eucommia ulmoides Oliv. (Fam. Eucommiaceae), Yinxingye (Ginkgo Folium) from Ginkgo biloba L. (Fam. Ginkgoaceae), and Tianhuafen (Trichosanthis Radix) from Trichosanthes kirilowii Maxim. and T. rosthornii Harms (Fam. Cucurbitaceae) [2]. Most traditional Chinese medicines derived from dioecious plants are applied in clinical and pharmaceutical practice regardless of the gender of the source plant.
Populus tomentosa Carrière (Fam. Salicaceae), a typical dioecious plant, is widely distributed in the North China Plain [3] and is planted along roadsides as shelterbelts owing to its strong adaptability, rapid growth, and capacity for cooling [4], noise reduction [5], air humidification [6], and soil reinforcement [7]. The dried male inflorescence of P. tomentosa, called Yangshuhua in Chinese, is recorded in the Pharmacopoeia of the People's Republic of China for the clinical treatment of bacillary dysentery and acute enteritis [8]. Studies have shown that the aqueous extract of the male inflorescence of P. tomentosa possesses anti-inflammatory, anti-diarrheal, anti-microbial, and analgesic activities [9,10]. The fresh male inflorescence is used as a food material in some parts of China. However, the female inflorescence is neither included in the Chinese pharmacopoeia nor adopted as a food material. Therefore, it is of great significance to discriminate female and male inflorescences. Inflorescences develop from flower buds, so identifying the female and male flower buds makes it possible to know the gender of the inflorescence in advance. However, the female and male flower buds of P. tomentosa are highly similar in appearance and cannot be distinguished visually. In addition, because flower buds are reproductive organs, differences in their chemical compositions can better reflect the characteristics of each gender. Therefore, it is necessary to establish an analytical method to discriminate them and to screen for the differences in chemical compositions.
At present, some studies have been carried out on female and male P. tomentosa. For example, 13 chemical constituents were isolated and identified from the male bark of P. tomentosa, including sakuranin, salicyltomenside, and siebolside B [11,12]. The content of micranthoside, siebolside B, sakuranin, and isosakuranin was found to differ between female and male barks of P. tomentosa by a high-performance liquid chromatography (HPLC) fingerprint method with multivariate statistical analyses [13]. Volatile components in female and male flower buds were compared with HS-SPME-GC-MS (headspace-solid phase microextraction-gas chromatography-mass spectrometry). It was found that the content of 2-cyclohexen-1-one, benzyl benzoate, and methyl benzoate in female buds was significantly higher than that in male buds, and the content of ethyl benzoate in male buds was significantly higher than that in female buds [14]. However, the differences in non-volatile components between female and male flower buds of P. tomentosa have not been reported yet.
HPLC, liquid chromatography-mass spectrometry (LC-MS), and GC-MS usually require chromatographic separation of chemical compositions before detection, whereas 1H NMR can be applied directly without prior chromatographic separation. Therefore, chemical compositions can be reflected more comprehensively with 1H NMR. Besides, the application of GC-MS is limited by the volatility of analytes. Non-volatile compounds must first be derivatized, a step that is unstable, time-consuming, and complex. 1H NMR can detect a variety of chemical compositions at low concentrations in a non-destructive manner without complex pre-treatment [15–17]. It shows the signal peaks of all protons as a whole, and the areas of the signal peaks are proportional to the number of protons and to the relative concentrations in samples. Therefore, it can be applied in both qualitative and quantitative analysis. It has become a useful discrimination tool in recent years, for example in the detection of dairy food fraud [18], adulteration of herbal medicines [19], and classification of wine varieties [20]. At present, 1H NMR has been widely applied in medical care [21,22], food [23–25], chemistry [26], and other fields.
In order to discriminate female and male flower buds of P. tomentosa and clarify the differences in chemical compositions, 11 female and 16 male flower buds of mature P. tomentosa were collected in Beijing, China. 1H NMR combined with PCA (principal component analysis), HCA (hierarchical clustering analysis), and OPLS-DA (orthogonal partial least squares-discriminant analysis) was applied. With variable importance in projection (VIP) > 1.0 and p < 0.05 in OPLS-DA as criteria, differences in chemical compositions were screened in the study.

Visual Inspection of 1H NMR Spectra
1H NMR spectra of 11 female samples and 16 male samples are shown in Figure 1. The representative 1H NMR spectra of female (F1) and male (M11) samples at the same amplification are shown in Figure 2. It can be observed that female and male samples have a similar profile on the whole, with several differences in peak intensities. Some obvious differences by visual inspection are highlighted with red boxes in Figure 2. For example, the intensities at the chemical shifts of δ 1.30 (unknown), 4.13 (sucrose), 7.23 (myricetin), and 7.31 (caffeic acid) in M11 are higher than those in F1, which indicates that the content of chemical compositions differs between female and male flower buds of P. tomentosa.

Compound Assignment
According to standards and literature [11,27–30], combined with chemical shifts, peak areas, and multiplicity, 30 compounds are identified and summarized in Table 1. Some chemical structures are displayed in Figure 3 together with the specific hydrogen atoms corresponding to the mentioned chemical shifts in Table 1.

Figure 3. Some chemical structures with the specific hydrogen atoms corresponding to the mentioned chemical shifts.

PCA and HCA
PCA is performed to evaluate the differences between female and male groups. The PCA score plot is displayed in Figure 4. A clear grouping trend between the female and male groups can be observed, which indicates that female and male flower buds of P. tomentosa can be discriminated with PCA. The first two principal components together account for 80.9% of the total variance (R2X), with the first principal component (PC1) describing 69.2% and the second (PC2) describing 11.7% of the sample variability. The predictive ability of the model (Q2) is 90.3%, which demonstrates that it is a good model. HCA is carried out based on the first four PCs from the PCA model and displays the relationships between female and male groups in the form of a dendrogram. As shown in Figure 5, the 27 samples are clearly separated into two groups: all 11 female samples are classified into Group 1 (left) and all 16 male samples into Group 2 (right). Although some male samples are dispersed in the PCA score plot, they are clearly assigned to Group 2 in the HCA dendrogram. The result of HCA confirms the classification obtained with the PCA model. These results demonstrate that there are differences in chemical compositions between female and male flower buds of P. tomentosa.

OPLS-DA
OPLS-DA is performed to explore the chemical composition differences between female and male flower buds of P. tomentosa. The score plot is shown in Figure 6. R2X of the OPLS-DA model is 80.8%, demonstrating that 80.8% of the variation can be modeled by the selected components. R2Y is 97.6%, indicating that the model is well fitted. Q2 is 96.0%, demonstrating that the model has good predictability.
With VIP > 1.0 and p < 0.05 used to filter the variables responsible for differentiating the female and male groups, it is found that the areas of δ 0.80~2.00 and δ 6.96~8.08 in the 1H NMR spectra of the male group are significantly higher than those of the female group. In other regions, there is no apparent difference between the two groups. Furthermore, combining the 30 assigned compounds with the paired Student's t-test for significance analysis, the content of daucosterol, β-sitosterol, ursolic acid, and betulonic acid in the male group is higher than that in the female group (p < 0.05). It can be concluded that the unequal content of these compounds is likely the key difference in chemical constituents between female and male flower buds of P. tomentosa.

Apparatus
A DFT-50A grinder was purchased from Wenling Linda Machinery Co., Ltd.

Sample Collection
Samples of 11 female (F1~F11) and 16 male (M1~M16) flower buds of P. tomentosa were collected on 23~28 February 2020 in Beijing, China. The detailed sample information is shown in Table 2. The gender of all samples was clearly identified on 6 April 2020 based on their mature flowers and/or fruits.

Sample Preparation
Samples were dried in a shaded and ventilated environment for 10 days at a temperature of 18 ± 5 °C and a relative humidity of 25~40%, crushed into powder with a grinder, and passed through a 20-mesh sieve. A powder sample of 1.0 g was accurately weighed into a 50 mL conical flask and 25 mL of methanol was added precisely. The flasks were weighed, ultrasonically extracted for 30 min (100 W, 40 kHz), and weighed again after cooling to room temperature. Any weight loss was made up with methanol. The obtained extracts were then filtered, and the filtrates were placed into 35 mL evaporating dishes and concentrated at room temperature. The residue was dissolved ultrasonically in 1.50 mL of CD3OD containing 0.03% TMS and filtered through a 0.22 µm membrane filter. An aliquot of 0.8 mL was transferred into an NMR tube for analysis.

1H NMR Measurement
The 1H NMR spectra were acquired on a Bruker AVANCE II 400 MHz spectrometer at 293 K. TMS and CD3OD provided the chemical shift reference (1H, δ 0.00) and the field frequency lock, respectively. A zg30 pulse sequence was used to suppress the residual H2O signal.

Data Processing

Spectra Pre-Treatment
Free induction decay files were imported into MestReNova software (Version 14.0, Mestrelab Research SL, Santiago de Compostela, Spain). Automatic baseline correction and manual phase correction were performed. The proton signal of TMS was calibrated to δ 0.00. The 1H NMR spectra were divided into bins 0.04 ppm wide and integrated. The total integration range was δ 0.00~9.00, excluding the regions of δ 3.31~3.35 corresponding to methanol and δ 4.70~5.10 corresponding to residual water. The signal areas of the bins were normalized to the total spectral area to compensate for differences in concentration and served as the variable values for further statistical analysis [31,32].
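The binning and normalization described above can be illustrated with a short Python sketch. It is not the MestReNova procedure itself: the function name `bin_and_normalize`, the handling of bin edges, and the choice to drop individual data points inside the excluded regions (rather than whole bins) are assumptions made for illustration, and summing points only approximates integration on a uniformly sampled spectrum.

```python
def bin_and_normalize(ppm, intensity, start=0.00, stop=9.00, width=0.04,
                      excluded=((3.31, 3.35), (4.70, 5.10))):
    """Sum intensities into fixed-width bins and normalize to the total area.

    ppm, intensity: equal-length sequences describing one processed spectrum.
    excluded: chemical-shift regions (methanol, residual water) that are skipped.
    Assumption: points inside an excluded region are dropped individually.
    """
    n_bins = round((stop - start) / width)  # 225 bins for 0.00~9.00 at 0.04 ppm
    bins = [0.0] * n_bins
    for shift, value in zip(ppm, intensity):
        if not (start <= shift < stop):
            continue  # outside the total integration range
        if any(low <= shift <= high for low, high in excluded):
            continue  # solvent or water region
        index = min(int((shift - start) / width), n_bins - 1)
        bins[index] += value
    total = sum(bins)
    if total == 0:
        raise ValueError("no signal inside the integration range")
    # Normalizing to the sum compensates for concentration differences.
    return [b / total for b in bins]


# Toy spectrum with made-up intensities (not measured data).
toy_ppm = [0.01, 0.02, 1.30, 3.33, 4.90, 7.23, 9.50]
toy_intensity = [1.0, 1.0, 2.0, 50.0, 80.0, 4.0, 9.0]
toy_bins = bin_and_normalize(toy_ppm, toy_intensity)
```

In the toy spectrum, the points at δ 3.33 and δ 4.90 fall in the excluded regions and the point at δ 9.50 lies outside the integration range, so only the remaining four points contribute to the normalized bins.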
To investigate the repeatability of the method, six replicates each of F1 and M11 were evaluated. The results were expressed as correlation coefficients [33]. The correlation coefficients for both F1 and M11 were higher than 0.99, indicating good repeatability.
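A repeatability check of this kind could be sketched as follows. The text does not state which correlation coefficient was used or how replicates were compared, so the Pearson coefficient and the pairwise comparison are assumptions, and the function names and replicate values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def min_pairwise_r(replicates):
    """Smallest correlation over all pairs of replicate spectra."""
    return min(
        pearson_r(a, b)
        for i, a in enumerate(replicates)
        for b in replicates[i + 1:]
    )


# Made-up binned intensities for three replicates of one sample.
toy_replicates = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.1, 2.0, 2.9, 4.2],
]
worst_r = min_pairwise_r(toy_replicates)
```

Reporting the smallest pairwise value is a conservative choice: if it exceeds 0.99, every pair of replicates does.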

Statistical Analysis
All multivariate analyses and calculations were performed in SIMCA-P software (version 14.1, Umetrics, Malmö, Sweden). The data were imported into the software and scaled by the Pareto scaling method to reduce the relative importance of large values while keeping the data structure partially intact [34]. The data were then submitted to PCA, HCA, and OPLS-DA [35,36].
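For readers unfamiliar with Pareto scaling, the following minimal sketch shows the usual definition: each variable is mean-centered and divided by the square root of its standard deviation. It is an illustration rather than SIMCA-P's implementation; the function name `pareto_scale` and the handling of constant columns are assumptions.

```python
import math
import statistics


def pareto_scale(rows):
    """Pareto-scale a samples-by-variables table given as a list of rows.

    Each column is mean-centered and divided by sqrt(standard deviation).
    Assumption: constant columns are only centered (divisor set to 1).
    """
    scaled_columns = []
    for column in zip(*rows):
        mean = statistics.fmean(column)
        sd = statistics.stdev(column)  # sample standard deviation (n - 1)
        divisor = math.sqrt(sd) if sd > 0 else 1.0
        scaled_columns.append([(value - mean) / divisor for value in column])
    return [list(row) for row in zip(*scaled_columns)]


# Made-up table: 3 samples, 2 variables (second variable is constant).
toy_table = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
toy_scaled = pareto_scale(toy_table)
```

Dividing by the square root of the standard deviation, rather than the standard deviation itself, shrinks large signals less aggressively than unit-variance scaling, which is why the data structure stays partially intact.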
PCA is an unsupervised statistical method that reduces the dimensions of a data set by forming linear combinations of the starting variables based on their maximum variance [37], converting the original variables into new independent variables named principal components (PCs) [7]. It allows a preliminary judgement on the distribution, natural aggregation, and abnormal samples [38]. R2 and Q2 are the important metrics for evaluating a PCA model. R2 indicates the ability of the PCs to explain the variation in a variable and Q2 is the predictability of the model. Values of R2 and Q2 close to 1.0 demonstrate that the model is reliable and has good fitting accuracy [36]. HCA was also carried out in SIMCA-P software with the first four principal components of the PCA model. The distances among samples were calculated by Ward's minimum-variance method. The result was presented as a dendrogram in the study.
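To make the explained-variance metric concrete, the sketch below computes the fraction of total variance captured by each leading PC using power iteration on the covariance matrix. It is a teaching illustration in pure Python, not the SIMCA-P algorithm; the function name, the starting vector, and the iteration count are assumptions, and Q2 (which requires cross-validation) is not computed.

```python
import math


def explained_variance_ratios(rows, n_components=2, n_iter=500):
    """Fraction of total variance explained by each leading PC.

    Uses power iteration with deflation on the covariance matrix.
    Assumption: the fixed starting vector is not orthogonal to the
    leading eigenvector (true for typical data, not guaranteed).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_variance = sum(cov[j][j] for j in range(p))
    ratios = []
    for _component in range(n_components):
        norm0 = math.sqrt(sum((j + 1) ** 2 for j in range(p)))
        vec = [(j + 1) / norm0 for j in range(p)]
        eigenvalue = 0.0
        for _step in range(n_iter):
            product = [sum(cov[i][j] * vec[j] for j in range(p))
                       for i in range(p)]
            norm = math.sqrt(sum(v * v for v in product))
            if norm == 0.0:
                break
            vec = [v / norm for v in product]
            eigenvalue = norm  # converges to the largest eigenvalue
        ratios.append(eigenvalue / total_variance)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - eigenvalue * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    return ratios


# Made-up data: variance 8/3 along x and 2/3 along y.
toy_points = [[2.0, 0.0], [0.0, 1.0], [-2.0, 0.0], [0.0, -1.0]]
toy_ratios = explained_variance_ratios(toy_points)
```

Summing the per-component ratios gives the cumulative explained variance, in the same way that the PC1 and PC2 contributions reported for the PCA model add up to its R2X. For real spectra with hundreds of bins, a linear algebra library would be a more practical choice than this loop-based version.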
OPLS-DA is a supervised regression extension of PCA that uses the class membership to maximize the variation between groups and introduces an orthogonal signal correction filter to handle separately the systematic variation that is correlated or uncorrelated with the Y variable. Therefore, it has better discriminant ability than PCA for samples with larger within-class divergence [39]. R2X, R2Y, and Q2 are the parameters frequently used to evaluate an OPLS-DA model. R2X gives the proportion of variation in the X data modeled by the selected components. R2Y and Q2 reflect the goodness-of-fit and the predictive ability of the model, respectively. Values of the three parameters close to 1.0 indicate reasonably good explanatory power, fit, and prediction for the constructed model [40].
To further explore the chemical compositions that contribute significantly to the sample discrimination, VIP scores were adopted. VIP scores larger than 1.0 indicate that variables are important for sample discrimination. In this study, VIP > 1.0 and p < 0.05 were used to select variables that contribute greatly to the differences between female and male groups [41]. The paired Student's t-test was performed for the significance analysis.
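The two-criterion selection can be expressed as a simple filter. In the sketch below, the VIP scores and p values are assumed to have been exported from the modeling software already; the function name, the tuple layout, and all numeric values are hypothetical and are not results from this study.

```python
def select_discriminating_bins(bins, vip_threshold=1.0, p_threshold=0.05):
    """Return chemical shifts of bins with VIP above and p below the thresholds.

    bins: iterable of (shift_ppm, vip, p_value) tuples (hypothetical layout).
    """
    return [
        shift
        for shift, vip, p_value in bins
        if vip > vip_threshold and p_value < p_threshold
    ]


# Hypothetical (shift, VIP, p) triples for illustration only.
toy_bin_stats = [
    (0.84, 1.62, 0.003),  # passes both criteria
    (1.28, 1.10, 0.210),  # high VIP but not significant
    (3.60, 0.45, 0.010),  # significant but low VIP
    (7.24, 1.35, 0.020),  # passes both criteria
]
toy_selected = select_discriminating_bins(toy_bin_stats)
```

Requiring both conditions keeps only variables that are influential in the model and also differ significantly between groups in a univariate test.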

Chemical Compound Assignment
The chemical compounds were identified by adding standards or consulting literature according to chemical shifts, peak areas, and multiplicity. Standards of compounds considered to be present in the samples were added. The presence of an added standard was confirmed if the existing peaks in the spectra increased, whereas its presence was ruled out if new peaks appeared.

Molecules 2021, 26, 6458

Discussion
There are no visually significant differences among the 1H NMR spectra of the 11 female samples, nor among those of the 16 male samples. The female and male samples have a similar profile in chemical shifts and multiplicity but several differences in peak intensities, which implies that the content of the related compounds is different. In addition, 30 compounds are assigned from the 1H NMR spectra. All of the compounds are secondary metabolites of plants. The 27 samples could be divided successfully into two groups with PCA, HCA, and OPLS-DA, and the groups coincide with their gender. The result shows that using 1H NMR combined with multivariate statistics to discriminate gender is feasible, which provides a novel idea for the classification of dioecious plants.
The content of compounds is reflected in the related intensities of 1H NMR spectra. Based on this principle, it is found that the content of four compounds, i.e., daucosterol, β-sitosterol, ursolic acid, and betulonic acid, is significantly higher in the male group than in the female group. All four compounds have notable pharmaceutical activities. Daucosterol has been reported to suppress cancer, promote neural stem cell proliferation, induce a Th1 immune response, and especially suppress acute enteritis [42]. As mentioned in the Introduction, only the male inflorescence is applied in clinical practice for bacillary dysentery and acute enteritis. This study shows that daucosterol is one of the markers that distinguish the female and male flower buds of P. tomentosa. Furthermore, its higher content in male samples may be the reason why the male is adopted for acute enteritis. The pharmacological activities of ursolic acid include antimicrobial [43], antiviral [44], anticancer [45], vascular protective [46], and Gram-positive bacteria inhibitory [44] activities. The antimicrobial activity and high content of ursolic acid in the male may play an important role in its use to treat bacillary dysentery and acute enteritis, which are mainly caused by bacterial infection.
β-Sitosterol can regulate tumor growth by decreasing membrane fluidity, reduce the fluidity of the fatty acyl groups of phospholipids, increase the thickness of the lipid bilayer of liposomes, and maintain liposome stability [47]. Betulonic acid is reported to be a bioactive substance exhibiting antiviral, cytotoxic, and antiangiogenic activities. It is also widely used as an intermediate in the synthesis of various triterpenoid derivatives with anti-inflammatory, antiviral, and antiproliferative properties [48]. The anti-inflammatory activity of β-sitosterol and betulonic acid, together with their high content in the male, may be conducive to the treatment of bacillary dysentery and acute enteritis. In summary, the four compounds with higher content are not only markers for distinguishing the female and male, but may also account for the application of the male in clinical and pharmaceutical practice.

Conclusions
In this study, 1H NMR combined with multivariate statistics was used to study differences in chemical compositions between female and male flower buds of P. tomentosa. Several differences in signal intensities could be observed by visual inspection. The female and male groups were clearly differentiated with PCA, HCA, and OPLS-DA. Furthermore, to discover the differences in chemical compositions between them, 30 compounds were assigned with standards and literature. Four compounds, i.e., daucosterol, β-sitosterol, ursolic acid, and betulonic acid, were selected as markers for their higher content in male samples. The pharmaceutical activities of the four markers are consistent with the clinical use of the male inflorescence. This is the first time that differences in the chemical composition of this dioecious plant have been revealed with 1H NMR, which sheds light on the differences in secondary metabolites between the genders of dioecious plants.