1H NMR Fingerprinting of Soybean Extracts, with Emphasis on Identification and Quantification of Isoflavones

1H NMR spectra were recorded for methanolic extracts of seven soybean (Glycine max) varieties cultivated using traditional and organic farming techniques. Signals belonging to amino acids, carbohydrates, organic acids and aromatic substances were identified in the spectra. In the aromatic region, the isoflavone signals were of particular interest: the signals of genistein, daidzein, genistin, daidzin, malonylgenistin, acetylgenistin and malonyldaidzin were assigned and these compounds were quantified. The results agree with published data and further demonstrate the potential of the NMR technique in food science.


Introduction
Soybean (Glycine max L.), traditionally produced and consumed in China, is today one of the most important agricultural commodities in the world, cultivated mainly for its high content of protein and oil. In recent decades, several studies have shown the health benefits of soy components; regular consumption of soy foods can reduce the incidence of breast, colon, and prostate cancers [1], prevent heart disease and osteoporosis, and reduce menopausal symptoms [2]. These discoveries have resulted in the development and commercialization of many functional foods and food supplements based on soy ingredients. Isoflavones are most likely the components responsible for the health benefits of soy [3]. Isoflavones belong to a group of compounds that share a basic structure consisting of two benzene rings joined by a three-carbon bridge. Isoflavones in soybeans and soy products exist as aglycones (daidzein, genistein, and glycitein), 7-O-β-glucosides and two glucoside conjugate forms, acetylglucosides and malonylglucosides (Figure 1). The 7-O-β-glucosides are commonly known as daidzin, genistin and glycitin. Soybeans contain high amounts of isoflavones, normally in the range of 1-4 mg/g dry weight [4]. The content of soybean components, in particular isoflavones, can vary widely depending on variety, cultivation technique and other factors [5], so it is important to develop specific and rapid analytical methods for the characterization of soybeans and soy products.
NMR spectroscopy has become increasingly important in food science [6], both as a fingerprinting technique [7] and as a quantitative analysis tool [8,9]. This advance in the development of NMR methods for food characterization and control is mainly due to the simple preparation of samples, the speed of analysis and the possibility of obtaining structural information in a complex food matrix. In this experimental work, the 1H NMR spectra of methanolic extracts of different defatted soybean varieties, from traditional and organic farming, were recorded in order to determine the potential of the NMR technique for soybean composition control. Figure 2 shows the 1H NMR spectrum of the methanolic extract of a defatted soybean meal, recorded with PRESAT water suppression. Many signals in the spectra (alanine, organic acids, choline, sugars and isoflavones) were identified and are reported in Table 1. A group of signals was tentatively attributed to soy saponins characterized by the linkage to a 2,3-dihydro-2,5-dihydroxy-6-methyl-4H-pyran-4-one (DDMP) moiety.

1H NMR Signal Assignment
The substances were identified by recording NMR spectra of pure compounds, or by comparison with spectra previously reported in the literature or in databases. For isoflavones, the signal assignments were further confirmed by spiking the soybean extracts with appropriate standards, as reported in section 2.2. The identification of DDMP saponin signals was based on the 1H NMR data previously reported for soy DDMP saponins recorded in deuterated DMSO [10]. Table 1 reports only the observable, non-overlapping signals; other signals were present in the 0-2.4 ppm region, but they strongly overlapped with lipid signals.

Assignment of Isoflavones Signals
Isoflavones are one of the most interesting classes of substances in soybean. These compounds are characterized by the polycyclic structure reported in Figure 3, with aromatic rings A and B that give signals in a region of the 1H NMR spectrum (above 6 ppm) generally free from interferences.
The spectra of soybean samples were carefully studied in this zone (6-10 ppm) in order to establish that the isoflavones signals did not overlap and were reliable for quantitative analysis.
In order to obtain reliable signal assignment, the 1H NMR spectra of the standard solutions of genistein, daidzein, genistin and daidzin were recorded using the same conditions as the samples. The complete signal assignment of the standard compounds is reported in Table 2. Atoms were designated according to the numbering scheme of the carbon skeleton defined in Figure 3.
Comparison of the chemical shifts of the aglycones (daidzein and genistein) with those of their 7-O-glucosides (daidzin and genistin) shows that the hydrogens of ring B are not influenced by the bond with glucose, while those of ring A are shifted to lower field. For daidzin the shifts were 0.28 ppm for the hydrogen linked to C6, 0.38 ppm for the hydrogen linked to C8, 0.10 ppm for the hydrogen linked to C5 and 0.07 ppm for the hydrogen linked to C2. For genistin they were 0.29 ppm for the hydrogen linked to C6, 0.36 ppm for the hydrogen linked to C8 and 0.07 ppm for the hydrogen linked to C2.
The isoflavone signals were then identified in the 1H NMR spectra of the soybean extracts. To obtain unambiguous assignments, samples were spiked with standards. As noted above, the aromatic region of the 1H NMR spectrum was largely free from interferences and the isoflavone signals were easily detectable. A specific NMR signal was selected for each isoflavone, and its area was used for the quantitative analysis. The signals chosen are those reported in the last part of Table 1.
For the malonylglucoside and acetylglucoside forms of isoflavones, which were not available as standards, the signal assignment was based on the specific characteristics of the H signals (chemical shifts, multiplicities and coupling constants), assuming small shifts of the aromatic signals with respect to the simple glucosides, due to the binding of the acids to the C6-hydroxy group of glucose. Figure 4 shows the group of signals attributed to H6 of acetylgenistin and malonylgenistin, on the basis of the following considerations. In general, the linkage of an acetyl group to a molecule shifts 1H NMR signals to lower field, and this property was previously demonstrated in the specific case of acetic esters of glucose [11]. It was therefore reasonable to assume that the deshielding effect also propagates to the isoflavone moiety, and the signal at 6.529 ppm, showing the same multiplicity and coupling constant as H6 of genistin, was attributed to H6 of acetylgenistin. A similar shift was previously reported for genistin and acetylgenistin 1H NMR spectra recorded in DMSO [12]. A similar signal at 6.495 ppm was therefore assigned to malonylgenistin. To explain the unexpected shift to higher field induced by the malonyl group at H6, we suppose that the free acid group of malonic acid could interact weakly with the hydroxyl group at position 5 of genistin via hydrogen bonding. The modified electron distribution could cause slight shielding near H6, hence a shift to higher field. This hypothesis is supported by the fact that only H6 of malonylgenistin shows this behavior; the other observable hydrogen that can be attributed to malonylgenistin (H8, 6.71 ppm, Figure 4) is shifted to lower field with respect to the genistin signal, as expected, as are all the observable hydrogens of malonyldaidzin, which lacks the hydroxyl group at position 5.
The signal at 7.250 ppm was chosen for the quantification of malonyldaidzin, while it was not possible to identify a clearly separate signal for acetyldaidzin. No forms of glycitein were detected in the spectra, probably because of their low abundance in the soybean varieties analyzed.

1H NMR Quantification of Isoflavones in Soybean Samples
For a reliable quantitative analysis by 1H NMR it is very important to optimize the phase and the baseline; the phase was corrected manually for each sample, and a polynomial baseline correction was applied over the entire spectral range. In some cases a further manual adjustment of the baseline was performed. The selected signals were integrated manually, and comparison of their areas with the TSP area enabled the quantitative determination of the soybean isoflavones. The data obtained are reported in Table 3.
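The comparison with the TSP area can be illustrated with the standard internal-standard relation for quantitative NMR. The following is only a minimal sketch, not the authors' actual calculation. It assumes that the whole extract from 0.5 g of defatted meal ends up in the NMR tube together with 0.02 mg of TSP (as described in the Experimental Section), that TSP contributes nine equivalent hydrogens, and that TSP is the sodium salt of the d4 form with a molar mass of about 172.27 g/mol. The function name, the parameter names and the example areas are hypothetical.

```python
def isoflavone_mg_per_g(area_signal, area_tsp, n_h_signal, mw_analyte,
                        tsp_mg=0.02, mw_tsp=172.27, n_h_tsp=9, sample_g=0.5):
    """Estimate analyte content (mg per g of extracted meal) from peak areas.

    area_signal : integrated area of the selected analyte signal
    area_tsp    : integrated area of the TSP signal at 0 ppm
    n_h_signal  : number of hydrogens giving the analyte signal
    mw_analyte  : molar mass of the analyte in g/mol
    tsp_mg      : mass of TSP in the tube (0.02 mg in the source)
    mw_tsp      : assumed molar mass of TSP (sodium salt, d4), g/mol
    n_h_tsp     : hydrogens in the TSP trimethylsilyl signal
    sample_g    : mass of defatted meal extracted (0.5 g in the source)
    """
    if area_tsp <= 0 or n_h_signal <= 0 or sample_g <= 0:
        raise ValueError("areas, hydrogen counts and sample mass must be positive")
    # Molar ratio analyte/TSP: each area is normalized by its hydrogen count.
    mol_ratio = (area_signal / n_h_signal) / (area_tsp / n_h_tsp)
    # mg divided by g/mol gives mmol of TSP in the tube.
    tsp_mmol = tsp_mg / mw_tsp
    # mmol times g/mol gives mg of analyte in the tube.
    analyte_mg = mol_ratio * tsp_mmol * mw_analyte
    return analyte_mg / sample_g


if __name__ == "__main__":
    # Hypothetical areas for a one-hydrogen genistin signal (molar mass ~432.38 g/mol).
    content = isoflavone_mg_per_g(area_signal=12.0, area_tsp=9.0,
                                  n_h_signal=1, mw_analyte=432.38)
    print(f"genistin: {content:.3f} mg/g defatted meal")
```

The result is expressed per gram of defatted meal; converting to whole-seed dry weight would additionally require the fat content removed during extraction, which is not given here. The relation also assumes that both signals are sufficiently relaxed between scans.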
The data allow some preliminary conclusions to be drawn: the most abundant isoflavones in the soybean varieties analyzed are the malonylglucosides, followed by the glucosides, which is in agreement with previously reported data [4,5]. Aglycone forms are present in very low concentrations, especially in samples from organic farming. The total isoflavone content of the soybean varieties analyzed spanned a wide concentration range, and soybeans from traditional farming generally show a higher isoflavone content, except for varieties T5 and T7.
The quantities of isoflavones obtained from soybean samples are in accordance with previously published data [4].

Experimental Section
Samples. 1H NMR analyses were performed on the seeds of seven experimental soy varieties (Table 4), each coming from two different cultivation techniques (traditional and organic).
Sample preparation. Finely ground soybean samples were defatted by ethyl ether extraction in an automated Soxhlet apparatus (VELP, Milan). A 0.5 g portion of the defatted soybean meal was then extracted for 1 hour at room temperature under magnetic stirring with 25 mL of a methanol/water mixture (4:1 v/v). The solution was filtered through filter paper and then through a nylon filter (0.45 μm), taken to dryness, dissolved in 1 mL of CD3OD containing 0.02 mg of 3-(trimethylsilyl)-propionate-d4 (TSP) and transferred to a 5 mm NMR sample tube. TSP was used both as a chemical shift reference (δ = 0) and as an internal standard for the quantitative analysis.
1H NMR conditions. Spectra were recorded on a VARIAN INOVA-600 MHz spectrometer, operating at 14.1 T and equipped with a 5 mm triple-resonance inverse probe. The 1H NMR spectra were acquired with low-power selective irradiation of the water signal during the relaxation delay (d1). Data were collected at 308 K, with sample rotation (20 Hz) and 32K complex points, using a 90° pulse. For each spectrum, 128 scans were acquired with a spectral width of 8000 Hz, an acquisition time of 1.892 s and a recycle delay of 1.5 s. The NMR spectra were processed with MestreC software. The spectra were Fourier transformed with an FT size of 64K and a 0.3 Hz line-broadening factor, phased and baseline corrected, and referenced to the TSP peak (0 ppm). Phase correction was performed manually for each sample, and a polynomial baseline correction was applied over the entire spectral range.
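As a rough consistency check on these acquisition parameters, the short sketch below derives the 1H Larmor frequency from the field strength, the spectral width in ppm, and the approximate acquisition time per sample. It is an illustration only: the function name is hypothetical, the 1H gyromagnetic ratio is taken as about 42.577 MHz/T, and the time per scan is approximated as acquisition time plus recycle delay, ignoring pulse durations and any dummy scans.

```python
GAMMA_1H_MHZ_PER_T = 42.577  # 1H gyromagnetic ratio divided by 2*pi (approximate)


def acquisition_summary(field_t=14.1, sw_hz=8000.0, scans=128,
                        at_s=1.892, d1_s=1.5):
    """Derive simple quantities from the reported acquisition parameters."""
    larmor_mhz = GAMMA_1H_MHZ_PER_T * field_t   # 1H resonance frequency in MHz
    sw_ppm = sw_hz / larmor_mhz                 # Hz divided by MHz gives ppm
    total_s = scans * (at_s + d1_s)             # approximate time per spectrum
    return {
        "larmor_mhz": larmor_mhz,
        "sw_ppm": sw_ppm,
        "total_s": total_s,
        "total_min": total_s / 60.0,
    }


if __name__ == "__main__":
    for name, value in acquisition_summary().items():
        print(f"{name}: {value:.3f}")
```

With the reported values this gives a Larmor frequency close to 600 MHz, a spectral width of roughly 13.3 ppm (enough to cover the 0-10 ppm range discussed above) and a little over 7 minutes of acquisition per sample.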

Conclusions
This study provides further confirmation that 1H NMR is a powerful tool for the analysis of food matrices, offering the advantages of simple sample preparation and rapid data acquisition. In the particular case of soybean, the approach presented could have many interesting applications, for example in studying the different distribution of metabolites in GMO soy, or in screening foods and food supplements based on soy components (e.g., isoflavone supplements) present on the market. The method could also be useful for rapid quantification of the main soybean isoflavones, as it requires minimal sample pre-treatment.