Rapid Screening and Identification of Daidzein Metabolites in Rats Based on UHPLC-LTQ-Orbitrap Mass Spectrometry Coupled with Data-Mining Technologies

Daidzein, the main bioactive soy isoflavone in nature, has been found to possess many biological functions. It has been investigated in particular as a phytoestrogen owing to the similarity of its structure to that of the human hormone estrogen. Because comprehensive studies on daidzein metabolism are lacking, further research is still required to clarify its in vivo metabolic fate and intermediate processes. In this study, an efficient strategy was established using UHPLC-LTQ-Orbitrap mass spectrometry to profile the metabolism of daidzein in rats. Multiple data-mining methods, including high-resolution extracted ion chromatogram (HREIC), multiple mass defect filtering (MMDF), neutral loss fragment (NLF), and diagnostic product ion (DPI), were utilized to investigate daidzein metabolites from the HR-ESI-MS1 to the ESI-MSn stage in both positive and negative ion modes. Consequently, 59 metabolites, including prototype compounds, were positively or tentatively elucidated based on reference standards, accurate mass measurements, mass fragmentation behaviors, chromatographic retention times, and corresponding calculated ClogP values. As a result, dehydration, hydrogenation, methylation, dimethylation, glucuronidation, glucosylation, sulfonation, ring-cleavage, and their composite reactions were ascertained to interpret its in vivo biotransformation. Overall, our results not only revealed the potential pharmacodynamic forms of daidzein, but also aided in establishing a practical strategy for the rapid screening and identification of metabolites of natural compounds.


Introduction
Daidzein (4′,7-dihydroxyisoflavone), one of the most prominent soy isoflavones, is largely restricted to leguminous plants, such as Trifolium pratense L., Medicago sativa L. and Pueraria lobata Ohwi [1,2]. In recent years, the observed health benefits and versatile pharmacological properties of daidzein, including its anti-cancer (anti-breast cancer and anti-prostate cancer), anti-cardiovascular disease, anti-osteoporosis, anti-diabetic, anti-aging, anti-oxidant, and anti-inflammatory activities, have been extensively investigated [3][4][5][6]. In addition, daidzein is also reported to exhibit various bio-activities against dermatosis and neurodegenerative diseases [7][8][9][10]. However, no comprehensive study has yet focused on its in vivo metabolism, which is important for revealing its pharmacologically active substances.
In the past decades, metabolism studies were normally initiated when a molecule had cleared the discovery process and entered the development phase, and they were usually carried out using liquid chromatography/ultraviolet and visible spectroscopy (LC/UV) and gas chromatography/mass spectrometry (GC/MS). In particular, the advantages of liquid chromatography coupled with high-resolution mass spectrometry (LC-HR-MS), such as LTQ-Orbitrap MS and FT-ICR-MS, for the structural characterization of known and unknown metabolites, owing to its high speed, efficiency, selectivity, and detection sensitivity, have been demonstrated by various research teams [11][12][13]. Besides, it can also provide a precise elemental composition from accurate mass measurement, which is tremendously helpful for identifying major-to-minor metabolites in vitro and in vivo [14]. Accordingly, drug metabolism can be characterized with a high degree of certainty even in situations where the corresponding reference standards are lacking. Thus, it is feasible to establish a comprehensive and integrated analytical workflow-interpretation sequence to obtain useful information from complex backgrounds [15][16][17][18]. Recently, efficient data-mining methods including isotope pattern filtering (IPF) [19], diagnostic product ion (DPI) [15,20], neutral loss filtering (NLF) [21], extracted ion chromatogram (EIC) [22], mass defect filter (MDF), and multiple mass defect filters (MMDFs) [23] have been successfully applied to systematically profile the in vivo metabolism of drugs.
Herein, a highly sensitive and specific UHPLC-LTQ-Orbitrap MS-based method in both negative and positive ion modes, combined with multiple data-mining methods, was established to profile and identify the major-to-minor metabolites in Sprague-Dawley (SD) rats after oral administration of daidzein. Meanwhile, the potential metabolic pathways of daidzein were also proposed in this study. To the best of our knowledge, this is the first time that the in vivo metabolism of daidzein has been comprehensively investigated.

Establishment of the Analytical Workflow-Interpretation Method
A novel and integrated strategy (Figure 1) was established for profiling the in vivo metabolism of daidzein based on UHPLC-LTQ-Orbitrap MS coupled with multiple post-acquisition data-mining methods. First, the reported metabolites of isoflavones were summarized via literature searches to ascertain the necessary reference standards. Second, ESI-MSn datasets of the samples and five selected reference standards were obtained in data-dependent scan (DDS) acquisition mode. After that, HREIC and MMDF were used to screen the daidzein metabolite candidates at the HR-MS1 level. Of these, HREIC was employed to ascertain the known and predicted metabolites, while MMDF was utilized to obtain HR-MS1 information on the unknown and unpredicted metabolites. Then, the PIL-DE data-acquisition method was adopted to obtain the ESI-MSn datasets of the screened metabolite candidates. Afterwards, the potential structures of daidzein metabolites were elucidated in accordance with reference standards, chromatographic retention times, accurate mass measurements, the proposed DPIs and NLFs (summarized from the mass fragmentation behaviors of the reference standards), and the corresponding calculated ClogP values. With the established strategy, daidzein metabolites were positively or tentatively identified and the corresponding metabolic pathways were proposed.
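As an illustration of the HREIC screening step, the following minimal Python sketch sums, scan by scan, the intensity of all peaks falling within a narrow ppm window around a target m/z. The scan data, the function name, and the 5 ppm window are assumptions made for illustration only; they are not taken from the study, and the actual extraction was performed in the instrument software.

```python
from typing import List, Tuple

# One scan = (retention time in min, [(m/z, intensity), ...]).
Scan = Tuple[float, List[Tuple[float, float]]]


def extract_ion_chromatogram(scans: List[Scan], target_mz: float,
                             tol_ppm: float = 5.0):
    """Return [(rt, summed intensity)] for peaks within tol_ppm of target_mz."""
    half_width = target_mz * tol_ppm / 1e6  # window half-width in Da
    trace = []
    for rt, peaks in scans:
        total = sum(inten for mz, inten in peaks
                    if abs(mz - target_mz) <= half_width)
        trace.append((rt, total))
    return trace


# Hypothetical full-scan data; only the target m/z (429.0815, the
# glucuronide conjugate of daidzein) comes from the text.
scans = [
    (4.30, [(429.0816, 1.0e4), (255.2330, 5.0e5)]),
    (4.35, [(429.0814, 8.0e4), (429.0990, 2.0e4)]),
    (4.40, [(255.2330, 4.0e5)]),
]
trace = extract_ion_chromatogram(scans, target_mz=429.0815)
```

A narrow window keeps the nearby ion at m/z 429.0990 out of the trace, which is the practical benefit of high-resolution extraction over a nominal-mass chromatogram.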

The Establishment of MMDF Approach
To obtain the HR-MS1 datasets of common-to-uncommon and major-to-minor metabolites accurately and comprehensively, the MMDF approach was implemented as a complement to the HREIC method. In negative ion mode, the MMDF method was set up as follows. First, according to the results of literature studies and HREIC searches, three common metabolites (daidzein, daidzin, and equol) were selected as MDF templates, which covered drug and conjugation filters. Second, the mass range and mass defect range were confirmed based on the substitution of the various templates. As a result, each MDF window was set to ±50 mDa around the mass defects of the templates over a mass range of ±50 Da around the filter template masses. Therefore, the filter templates were set as follows: (1) parent drug template (m/z 253.0494) and its conjugation templates (m/z 335.0219 for sulfate conjugation and m/z 429.0815 for glucuronide conjugation); (2) puerarin template (m/z 415.1023) and its conjugation templates (m/z 495.0591 for sulfate conjugation and m/z 591.1344 for glucuronide conjugation); (3) equol template (m/z 241.0858) and its conjugation templates (m/z 321.0427 for sulfate conjugation and m/z 417.1179 for glucuronide conjugation). With this method, even when heterogeneous ions exist, minor metabolites can be screened out from the complex background noise and endogenous components.
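The filtering logic described above can be sketched in a few lines of Python. This is a simplified illustration rather than the vendor implementation: the template m/z values and the ±50 Da / ±50 mDa windows are those listed in the text, whereas the function names, the definition of the mass defect as the distance to the nearest integer, and the test ions are assumptions.

```python
# Template [M-H]- m/z values as listed in the text (negative ion mode).
TEMPLATES = {
    "parent drug": 253.0494,
    "parent drug sulfate": 335.0219,
    "parent drug glucuronide": 429.0815,
    "puerarin": 415.1023,
    "puerarin sulfate": 495.0591,
    "puerarin glucuronide": 591.1344,
    "equol": 241.0858,
    "equol sulfate": 321.0427,
    "equol glucuronide": 417.1179,
}
MASS_WINDOW_DA = 50.0     # +/- 50 Da around each template mass
DEFECT_WINDOW_DA = 0.050  # +/- 50 mDa around each template mass defect


def mass_defect(mz):
    """Signed distance of m/z from the nearest integer (assumed definition)."""
    return mz - round(mz)


def matching_templates(mz):
    """Names of all templates whose mass and mass-defect windows contain mz."""
    hits = []
    for name, template_mz in TEMPLATES.items():
        in_mass_window = abs(mz - template_mz) <= MASS_WINDOW_DA
        in_defect_window = (abs(mass_defect(mz) - mass_defect(template_mz))
                            <= DEFECT_WINDOW_DA)
        if in_mass_window and in_defect_window:
            hits.append(name)
    return hits


# m/z 269.0444 (a hydroxylated metabolite from the text) passes the filter;
# a hypothetical lipid-like background ion at m/z 255.2330 does not.
kept = matching_templates(269.0444)
rejected = matching_templates(255.2330)
```

An ion is retained if it matches at least one template, so adding conjugate templates widens coverage without loosening any single window.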

Mass Fragmentation Behavior Analyses of Daidzein and Its Homologues
For a better understanding of the ESI-MSn fragmentation patterns of daidzein and the other four reference standards (daidzin, genistein, genistin, and puerarin), the mixed standard solution was analyzed by UHPLC-LTQ-Orbitrap MS. Taking daidzein in negative ion mode as an example, it showed the [M−H]− ion, and a series of product ions [...] were respectively generated by loss of CO, CHO, CO2, 2CO, 2CO + C, and C8H6O. In particular, the successive neutral loss of CO (m/z 225.0556 and m/z 197.0606) could be considered the distinctive fragmentation behavior of soy isoflavones and used for rapid metabolite identification. The proposed mass fragmentation patterns of daidzein are illustrated in Figure 2. Daidzin, genistein, and genistin possessed mass fragmentation patterns similar to those of daidzein, which are also illustrated in Figure 2 [24]. In addition, the corresponding mass fragmentation behaviors in positive ion mode are shown in Figure S1.
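The successive CO losses can be checked numerically. The sketch below, offered as an illustration under stated assumptions, compares every pair of ions against a small table of neutral-loss masses computed from monoisotopic atomic masses. The two fragment m/z values come from the text; the precursor value is taken from the parent drug template reported for the MMDF method, and the function name and the 5 mDa tolerance are assumptions.

```python
from itertools import combinations

# Monoisotopic atomic masses (Da).
C, O = 12.0, 15.99491462

# Neutral losses named in the text for daidzein.
NEUTRAL_LOSSES = {
    "CO": C + O,
    "2CO": 2 * (C + O),
    "CO2": C + 2 * O,
}


def find_neutral_losses(ions, tol_da=0.005):
    """Return (heavier m/z, lighter m/z, label) for each matching ion pair."""
    hits = []
    for heavy, light in combinations(sorted(ions, reverse=True), 2):
        for label, loss in NEUTRAL_LOSSES.items():
            if abs((heavy - light) - loss) <= tol_da:
                hits.append((heavy, light, label))
    return hits


# Precursor (assumed: parent drug template m/z) and the two CO-loss fragments.
ions = [253.0494, 225.0556, 197.0606]
losses = find_neutral_losses(ions)
```

Under these assumptions the three ions form a ladder: each adjacent pair differs by one CO, and the outer pair by two.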

Puerarin, a common C-glycoside, possessed specific fragmentation behaviors, which were totally different from those of the other four reference standards. It generated the [M−H]− ion at m/z 415.1059 (C21H19O9, 3.51 ppm) in the ESI-MS spectrum in negative ion mode. In the ESI-MS2 spectrum, it yielded a series of product ions arising from the heteroside moiety by neutral loss of H2O, 2H2O, 2H2O + CH2O, 2H2O + 2CH2O, 2H2O + 2CH2O + C, 2H2O + 2CH2O + 2C, and 3H2O + 2CH2O + 2C, including m/z 397.0917 (C21H17O8, 0.82 ppm), m/z 379.0812 (C21H15O7, 1.14 ppm), m/z 349.0706 (C20H13O6, 0.79 ppm), m/z 319.0600 (C19H11O5, 1.01 ppm), m/z 307.0600 (C18H11O5, 0.55 ppm), m/z 295.0600 (C17H11O5, 2.07 ppm), and m/z 277.0495 (C17H9O4, 0.44 ppm) [25]. Among them, the successive neutral loss of water could be employed as a characteristic of C-glycoside-based structures. In particular, the neutral loss of 120 Da (m/z 415.1059 to m/z 295.0600) could also be utilized to differentiate C-glycoside metabolites. The proposed mass fragmentation patterns of puerarin in negative ion mode are illustrated in Figure 3, and the corresponding spectra in positive ion mode are shown in Figure S1C.
Compounds with similar substructures exhibit similar fragmentation behaviors in their ESI-MSn spectra, and thus produce certain common DPIs and regular NLFs. In complex matrices, ascertaining the DPIs can facilitate the rapid and comprehensive identification of drug metabolites. For example, the DPI at m/z 135, which arose from retro-Diels-Alder (RDA) rearrangement at the 1,4-positions of the C-ring of daidzein, provides information about whether bioreactions occurred on the A-ring or not. Thus, in the ESI-MS2 spectra of daidzein metabolites, the occurrence of DPIs at m/z 135 or m/z 135 + X (X = mass of the substituent group, such as 14, 16, 80, 162, 176, etc.) gave information similar to that described above. For example, in the MS2 spectrum of genistein, the DPI at m/z 151, which was 16 Da more than the DPI at m/z 135.0086 yielded by daidzein, further validated the above deduction. In addition, NLFs could provide further information for structural elucidation. For example, the successive NLFs of 28 Da (CO) in the ESI-MSn spectra of daidzein and the NLF of 120 Da (C4H8O4) in the ESI-MSn spectra of puerarin also provided tremendous help in identifying these metabolites.
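The "m/z 135 + X" rule lends itself to a simple lookup. The following sketch is a hypothetical helper, not part of the original workflow: it maps the nominal shift of an observed RDA fragment relative to m/z 135 onto the substituent that shift would suggest on the A-ring. The shift values are those listed in the text; pairing each of them with a named reaction is an assumption based on the reaction types reported in this study.

```python
BASE_DPI = 135  # nominal m/z of the RDA fragment with an unmodified A-ring

# Nominal shifts X from the text, paired with the reaction each one suggests
# (the pairing is an assumption for illustration).
SHIFTS = {
    0: "unmodified A-ring",
    14: "methylation",
    16: "hydroxylation",
    80: "sulfonation",
    162: "glucosylation",
    176: "glucuronidation",
}


def a_ring_modification(fragment_mz):
    """Return the A-ring modification suggested by an RDA fragment, or None."""
    shift = round(fragment_mz) - BASE_DPI
    return SHIFTS.get(shift)


daidzein_like = a_ring_modification(135.0086)  # DPI of daidzein in the text
genistein_like = a_ring_modification(151.0)    # DPI of genistein in the text
unassigned = a_ring_modification(199.0)        # no single listed shift fits
```

A return value of None only means that no single listed substituent explains the shift; composite modifications would need the shifts to be combined.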

Identification of Daidzein Metabolites in Rats
According to the total ion chromatograms (TICs) provided by Xcalibur 2.1, the FS mass spectra of rat urine and plasma samples after oral administration of daidzein were compared with those of the corresponding control samples for the identification of metabolites. Processing the data acquired by the UHPLC-LTQ-Orbitrap instrument led to the discovery of 59 metabolites across the negative and positive ion modes. Among them, 40 metabolites were found in positive ion mode, while 50 metabolites were found in negative ion mode. All the related mass data are summarized in Table 1, and the HREICs of daidzein metabolites are shown in Figure 4.

[...] ion at m/z 253, which was attributed to the presence of a glucosyl group in its structure. By comparing the retention time and mass fragmentation behavior with those of the reference standard, M6 was unambiguously assigned as daidzin [28]. Metabolite M10 generated the [M−H]− ion at m/z 415.1023 (C21H19O9, 0.00 ppm). Its ESI-MSn spectra were the same as those of daidzin, which indicated that it might be a daidzin isomer. Thus, M10 was tentatively identified as daidzein-4′-O-glucoside due to the existing hydroxyl group in the C-ring. Meanwhile, in the ESI-MSn spectra of M4, NLFs of 162 Da (from m/z 579 to m/z 415) and 324 Da (from m/z 415 to m/z 253) indicated that it might be a glucosylation product of daidzin. The observed base peak ion at m/z 253 further indicated that the glucosyl group might be introduced to the heteroside moiety. However, owing to the existence of various active sites in the glucosyl moiety, the actual reaction site could not be determined, and M4 was tentatively identified as daidzin-O-glucoside.
Metabolites M7 and M11 showed retention times of 3.85 and 4.28 min, respectively. They exhibited the same molecular ion at m/z 429.0815 (C21H17O10, error ≤ ±1.50 ppm), which was 176 Da higher than that of daidzein. In their ESI-MS2 spectra, the DPI at m/z 253 resulted from the neutral loss of a glucuronide moiety. According to the structure of daidzein, there are two possible conjugation sites, which could form positional isomers with identical masses. Therefore, M7 and M11 were respectively characterized as daidzein-7-O-glucuronide (ClogP, −0.10) and daidzein-4′-O-glucuronide (ClogP, 0.08) [1].
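The assignment of M7 and M11 can be read as a rank-matching step: under the usual reversed-phase assumption that a compound with a larger ClogP is retained longer, the peaks sorted by retention time are paired with the candidate structures sorted by ClogP. The sketch below illustrates this reasoning only; the function name is hypothetical, and the elution-order assumption is stated here rather than taken from the text.

```python
def assign_isomers(peaks, candidates):
    """Pair peaks (sorted by retention time) with candidates (sorted by ClogP).

    Assumes reversed-phase elution order: larger ClogP, longer retention.
    """
    by_rt = sorted(peaks, key=lambda p: p[1])
    by_clogp = sorted(candidates, key=lambda c: c[1])
    return {peak[0]: cand[0] for peak, cand in zip(by_rt, by_clogp)}


# Retention times (min) and ClogP values as reported in the text.
peaks = [("M7", 3.85), ("M11", 4.28)]
candidates = [
    ("daidzein-4'-O-glucuronide", 0.08),
    ("daidzein-7-O-glucuronide", -0.10),
]
assignment = assign_isomers(peaks, candidates)
```

Because the ClogP values of the two isomers differ by only 0.18, such a ranking supports a tentative assignment rather than a confirmed one.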
Metabolites M13, M19, M28, M29, and M45 were eluted at 4.37, 4.61, 5.27, 5.51, and 8.05 min, respectively. All of them gave rise to [M−H]− ions at m/z 269.0444 (C15H9O5, error ≤ ±1.00 ppm), which were 16 Da more than that of M0. In their ESI-MS2 spectra, the DPI at m/z 151, which was also 16 Da more than the DPI at m/z 135 yielded by daidzein, indicated that a hydroxyl group was introduced to the A-ring. Among those metabolites, M45 showed a retention time precisely matching that of the genistein reference standard; thus, M45 was positively identified as genistein, while M13, M19, M28, and M29 were tentatively proposed as positional isomers of genistein [29,30].
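The mass errors quoted for each metabolite are relative deviations expressed in parts per million. A minimal sketch of the calculation and of a tolerance check is given below; the theoretical m/z is the value reported for these metabolites, while the measured values and function names are hypothetical and serve only to illustrate the ±1.00 ppm criterion.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, theoretical_mz, tol_ppm=1.0):
    """True if the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tol_ppm


THEORETICAL_MZ = 269.0444  # [M-H]- of C15H9O5 as reported in the text

# Hypothetical measured values, for illustration only.
measured = [269.0442, 269.0446, 269.0452]
accepted = [mz for mz in measured if within_tolerance(mz, THEORETICAL_MZ)]
```

At this m/z, 1 ppm corresponds to roughly 0.27 mDa, so the third value, about 0.8 mDa away, falls outside the window.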
Metabolites M8, M24, and M41 were extracted in the HREIC at m/z 283.0608 (C16H11O5, error ≤ ±1.00 ppm) in negative ion mode with retention times of 3.97, 4.94, and 6.62 min. Their masses were 14 Da more than that of genistein, and the DPI at m/z 268 [M−H−CH3]− indicated that they might be methylated products of genistein. The DPI at m/z 165 (yielded by RDA rearrangement occurring at positions 1 and 3) of M8 and M41 further indicated that the methyl group was introduced [...]
Metabolite M51 was eluted at 8.95 min with its [M + H]+ ion at m/z 243.1015 (C15H15O3, −2.80 ppm). In its ESI-MS2 spectrum, abundant DPIs at m/z 123 (RDA rearrangement occurring at positions 1 and 3) and m/z 107 (RDA rearrangement occurring at positions 1 and 4) were observed, which were consistent with those of the metabolites detected in negative ion mode and with published literature [31]. Therefore, M26, M33, M46, and M51 were tentatively identified as equol or its positional isomers [32,33].
[...] In their ESI-MS2 spectra, the product ion at m/z 199 was produced through neutral loss of a CO moiety from the parent ion of daidzein, with no other characteristic product ions being observed. They were tentatively identified as decarbonylation products of daidzein on the basis of accurate mass and elemental composition.

Proposed Metabolic Pathways of Daidzein
In this paper, a total of 59 daidzein metabolites (prototype compound included) with different structures were observed and identified in rats. The proposed metabolic pathways of daidzein are illustrated in Figure 5. The principal bio-reactions found in vivo can be divided into several categories, including dehydration, hydrogenation, methylation, dimethylation, glucuronidation, glucosylation, sulfonation, ring-cleavage, and their composite reactions. In addition, it should be noted that some unusual products were detected. For example, the carbonyl group in the C-ring of M31, M32, M34, and M40 was transformed into a hydroxyl group during in vivo biotransformation. Then, dehydroxylation of this newly generated hydroxyl group occurred, and thus metabolites M26, M27, M33, and M46 were attributed to iso-flavonol-like compounds. In addition, metabolites M3 and M12 were attributed to C-glycoside-type compounds in this study, which have not been reported before.

Blood samples were taken from the suborbital venous plexus of rats at 0.5, 1, 1.5, 2, and 4 h after each oral administration. Each sample was centrifuged at 3500 rpm (4 °C) for 15 min to separate plasma. After that, plasma samples from the same group were merged into a collective one.

Urine Sample Collection
The rats were maintained in metabolic cages to collect test and control urine samples (0-24 h), which were also centrifuged at 3500 rpm (4 °C) for 10 min to remove the residue. Finally, as described above, all samples from the same group were merged together.

Biological Sample Preparation
An approach involving protein and solid residue precipitation and concentration was used to prepare all biological samples. Plasma and urine samples (1 mL) were each loaded onto an SPE cartridge that had been pretreated with methanol (5 mL) and deionized water (5 mL). The SPE cartridge was then successively washed with deionized water (5 mL) and methanol (3 mL). The methanol eluate was collected and evaporated under nitrogen at room temperature. The residue was then redissolved in 80 µL of acetonitrile/water (5:95, v/v) and centrifuged at 14,000 rpm (4 °C) for 30 min. The supernatant was used for instrumental analysis.
The optimized operating parameters in both negative and positive ion modes were set as follows: capillary voltage of 35 V, electrospray voltage of 3.0 kV, capillary temperature of 350 °C, sheath gas flow rate of 40 (arbitrary units), auxiliary gas flow rate of 20 (arbitrary units), and tube lens of 110 V. Metabolites were detected using full-scan MS analysis from m/z 100-800 at a resolving power of 70,000. Data-dependent ESI-MS2 analyses were triggered for the three most abundant precursor ions, followed by ESI-MS3 analyses of the most abundant product ions. Collision-induced dissociation (CID) was performed with an isolation width of 2.0 Da. The collision energy was set to 40%.
In the full scan (FS) experiment, HRMS data were recorded at a mass resolving power of 70,000 full width at half maximum (FWHM, calculated for m/z 200). To minimize the total analysis time, data-dependent MS/MS scanning was performed to trigger fragmentation spectra of target ions. The collision energy for collision-induced dissociation (CID) was adjusted to 40% of maximum. Dynamic exclusion (DE) was employed to prevent repeated fragmentation of the same ion; the repeat count was set at 5, with a repeat duration of 30 s and an exclusion duration of 60 s. In addition, the parent ion list (PIL)-DE dependent acquisition mode was also employed as a complementary method to obtain the MSn-stage datasets [15].
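The effect of the dynamic exclusion settings can be illustrated with a small simulation. The sketch below assumes the common interpretation of these parameters: once a precursor has been selected `repeat_count` times within `repeat_duration`, it is excluded from selection for `exclusion_duration`. The class, its method names, and the matching of precursors by rounded m/z are hypothetical simplifications and do not reproduce the instrument firmware.

```python
from collections import defaultdict, deque


class DynamicExclusion:
    """Toy model of dynamic exclusion with the settings reported in the text."""

    def __init__(self, repeat_count=5, repeat_duration=30.0,
                 exclusion_duration=60.0, mz_decimals=2):
        self.repeat_count = repeat_count
        self.repeat_duration = repeat_duration        # seconds
        self.exclusion_duration = exclusion_duration  # seconds
        self.mz_decimals = mz_decimals                # assumed matching precision
        self._history = defaultdict(deque)            # key -> recent selection times
        self._excluded_until = {}                     # key -> time exclusion ends

    def should_fragment(self, mz, t):
        """Return True if the precursor at m/z may be selected at time t (s)."""
        key = round(mz, self.mz_decimals)
        if t < self._excluded_until.get(key, float("-inf")):
            return False
        history = self._history[key]
        # Forget selections that fall outside the repeat-duration window.
        while history and t - history[0] > self.repeat_duration:
            history.popleft()
        history.append(t)
        if len(history) >= self.repeat_count:
            self._excluded_until[key] = t + self.exclusion_duration
            history.clear()
        return True


de = DynamicExclusion()
# The same precursor is offered at 0-5 s and again at 64 s.
decisions = [de.should_fragment(253.0494, t) for t in (0, 1, 2, 3, 4, 5, 64)]
```

In this model the fifth selection at 4 s starts a 60 s exclusion, so the request at 5 s is refused and the one at 64 s is accepted again, freeing scan time for less abundant ions in between.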