Software Assisted Multi-Tiered Mass Spectrometry Identification of Compounds in Traditional Chinese Medicine: Dalbergia odorifera as an Example

The complexity of metabolites in traditional Chinese medicine (TCM) hinders the comprehensive profiling and accurate identification of metabolites. In this study, an approach that integrates enhanced column separation, mass spectrometry post-processing and result verification was proposed and applied in the identification of flavonoids in Dalbergia odorifera. Firstly, column chromatography fractionation, followed by liquid chromatography–tandem mass spectrometry was used for systematic separation and detection. Secondly, a three-level data post-processing method was applied to the identification of flavonoids. Finally, fragmentation rules were used to verify the flavonoid compounds. As a result, a total of 197 flavonoids were characterized in D. odorifera, among which seven compounds were unambiguously identified in level 1, 80 compounds were tentatively identified by MS-DIAL and Compound Discoverer in level 2a, 95 compounds were annotated by Compound discoverer and Peogenesis QI in level 2b, and 15 compounds were exclusively annotated by using SIRIUS software in level 3. This study provides an approach for the rapid and efficient identification of the majority of components in herbal medicines.


Introduction
Traditional Chinese medicine (TCM) plays an important role in the treatment and prevention of diseases within China and the world at large. To remove the mask and achieve the scientific utilization of TCM, ongoing devotions have been made to probe the chemical basis, metabolism and mechanism of action, among which the chemical basis is a cornerstone [1][2][3]. However, due to the complexity of intrinsic metabolites, the in-depth identification of TCM components still remains to be a bottleneck. Moreover, unknowns in complex matrices obtained by phytochemical isolation and purification can be unambiguously identified by nuclear magnetic resonance [4]. The low-throughput plus tedious isolation process of this method confines its application for a holistic and fast analysis of TCM components. In the past decades, due to its unprecedented selectivity and sensitivity, liquid chromatography-tandem mass spectrometry (LC-MS), especially multistage high-resolution MS (HRMS), has become increasingly popular and a most widely used technique for the separation and identification of TCM components [5][6][7]. In addition to the masses of the targets, HRMS/MS can convey certain structural information through abundant spectral fragments, which are critical for metabolites identification. However,

Results and Discussion
At the first step, the extract was fractionated by column chromatography before LC-MS detection. This step makes it possible to detect more components in D. odorifera by increasing the peak capacity, reducing the interference of co-eluting components and enriching the minor components [24]. Subsequently, reference substances and three in silico data-processing solutions were used to improve the reliability and efficiency of the identification [25]. Following Schymanski rules, the identification results were divided into three levels (1, 2a, 2b and 3), according to the confidence and the analysis mechanism. Finally, the identification results were confirmed by manual interpretation of the fragments in levels 2a-3.

Column Chromatography Fractionation
In this section, the extract was segmented into eight fractions (Fr1~Fr8) by silica gel, and a total of 3456 mass features were detected, with 666, 719, 754, 647, 591, 534, 293 and 281 mass features in Fr1~Fr8, respectively. Compared with 851 mass features detected in the total extract, silica gel column fraction enhanced chromatographic separation and improved sensitivity. In this experiment, we mainly characterized small molecule flavonoids, and a total of 197 flavonoids were annotated (Supplementary Table S1). In addition to monoflavones, a large number of compounds with molecular weights within 500-800 Da in Fr1~Fr2 were also detected. These compounds were deduced to be biflavones and flavonoids with fatty acid side chains, which were not discussed in detail.

Level 1: Identified by Comparing with Reference Standards
To identify known compounds, all the available reference standards of flavonoids which potentially exist in D. odorifera were collected in the author's laboratory. Independent and orthogonal data, including retention time, accurate mass and MS/MS, were acquired for these compounds under identical experimental conditions [26]. The candidates were filtered according to the following rules: (1) a retention time error not exceeding 0.05 min, (2) an exact mass error not exceeding 5 mDa and (3) matching fragment ions. Seven compounds (F 12 , F 13 , F 14 , F 15, F 20, F 22 and F 24 ) were identified by using this method, and their structures are shown in Figure 2. Albeit unambiguous identification of the targets, it is difficult to obtain enough reference materials for extensive compound identification, due to high price of reference substances, difficulty of collection and lack of reference substances for some compounds. Therefore, the remaining 190 compounds were performed with the following level 2a identification. min, (2) an exact mass error not exceeding 5 mDa and (3) matching fragment ions. Seven compounds (F12, F13, F14, F15, F20, F22 and F24) were identified by using this method, and their structures are shown in Figure 2. Albeit unambiguous identification of the targets, it is difficult to obtain enough reference materials for extensive compound identification, due to high price of reference substances, difficulty of collection and lack of reference substances for some compounds. Therefore, the remaining 190 compounds were performed with the following level 2a identification.

Level 2a: Searching in Spectral Databases
Given the targets without chemical reference standards, searching the public or commercially available spectral databases acts as a supplementary approach. At present, the common databases, containing fragmentation spectrum information of small molecules from different sources, mainly include METLIN (https://metlin.scripps.edu/, accessed on 5 February 2022), MassBank (http://www.massbank.jp, accessed on 5 February 2022), mzCloud (https://www.mzcloud.org/, accessed on 5 February 2022) and so on. There are also a small number of plant-specific databases, such as ReSpect [27] and PlaSMA [28]. A total of 2957 flavonoids were included in ReSpect, which is a part of the MS-DIAL database. METLIN reportedly has more than 500,000 molecular standard data with fragmentary spectra. Massbank records 89,826 unique spectra, 15,059 unique compounds and 16,840 unique isomers. Since mzCloud contains the database of high-resolution and lowresolution MS n spectrum obtained under many experimental conditions, it solves the problem of spectral reproducibility to a certain extent.
To enable batch processing and the visualization of results, database searching can be implemented by using software, such as MS DIAL, Compound Discoverer (CD), RAM, etc. Database searching is usually accomplished by calculating the total score integrating MS1 similarity and MS/MS similarity, and RT similarity in some cases. For the MS/MS spectral similarity, the "dot product" scoring method is often adopted to perform spectral

Level 2a: Searching in Spectral Databases
Given the targets without chemical reference standards, searching the public or commercially available spectral databases acts as a supplementary approach. At present, the common databases, containing fragmentation spectrum information of small molecules from different sources, mainly include METLIN (https://metlin.scripps.edu/, accessed on 5 February 2022), MassBank (http://www.massbank.jp, accessed on 5 February 2022), mzCloud (https://www.mzcloud.org/, accessed on 5 February 2022) and so on. There are also a small number of plant-specific databases, such as ReSpect [27] and PlaSMA [28]. A total of 2957 flavonoids were included in ReSpect, which is a part of the MS-DIAL database. METLIN reportedly has more than 500,000 molecular standard data with fragmentary spectra. Massbank records 89,826 unique spectra, 15,059 unique compounds and 16,840 unique isomers. Since mzCloud contains the database of high-resolution and low-resolution MS n spectrum obtained under many experimental conditions, it solves the problem of spectral reproducibility to a certain extent.
To enable batch processing and the visualization of results, database searching can be implemented by using software, such as MS DIAL, Compound Discoverer (CD), RAM, etc. Database searching is usually accomplished by calculating the total score integrating MS1 similarity and MS/MS similarity, and RT similarity in some cases. For the MS/MS spectral similarity, the "dot product" scoring method is often adopted to perform spectral matching [29]. A high score of spectral database matching often indicates high confidence in structural annotations. Recently, a method called entropy similarity that outperformed dot product similarity was proposed [30].
Although a spectral database search is a simple and efficient approach for metabolite identification, insufficient MS/MS spectra from authentic compounds in databases and differences in mass spectrometry instruments are the main limiting factors for this approach. Moreover, components from herbal medicines contributed to only a small proportion of these databases. In addition, positional isomers are widespread in herbal medicines, and they often display closely similar mass spectra. It is difficult to distinguish isomers with similar structures solely by this method. To overcome this problem, com-plementing other orthogonal information, such as retention time, collision cross-section and biosynthesis pathways, will lead to a higher identification confidence. Here, based on software of MS DIAL 4.70 and CD 3.3, we identified 80 flavonoids in D. odorifera by searching spectral databases.

MS DIAL
MS DIAL is an untargeted metabolomics data processing pipeline that is suitable for both data-dependent and data-independent data [31]. According to the latest statistics, the MS-DIAL database, integrating publicly available databases (such as MassBank and GNPS), contains 13,303 unique compounds in positive ion mode and 12,879 unique compounds in negative ion mode. Here, we identified the flavonoids in D. odorifera by searching the positive spectral database. Moreover, the identification score cutoff was set at 80%, a high confidence in structure annotation. Finally, a total of 50 flavonoids from D. odorifera were effectively characterized.
Compound 104 (Rt = 24.13 min) displayed a precursor ion [M + H] + at m/z 269.0794, matching the molecular formula [C 16 H 13 O 4 ] + . By comparing the MS/MS spectrum with positive spectral database, it was identified as 7-hydroxy-3-(4-methoxyphenyl)-4H-chromen-4-one (Biochanin B) with a total score at 96.6%. As shown in Figure 3a, the fragmentation spectrum of compound 104 highly matched with the reference spectrum in fragmentation and peak intensity. Furthermore, the instrument of the reference spectrum was the same as our instrument, thus further illustrating the reliability of the result. Correspondingly, compound 68 was also assigned to Biochanin B, with a lower similarity of 86.40%. However, the fragmentation spectrum of compound 68 was visually different from the reference spectrum. As shown in Figure 3b Figure S1a shows that the fragments of compound 89 are highly matched with the reference spectrum, but the relative intensities of corresponding ions are quite different, with low a similarity score of 80.6%. It may be due to the use of different instruments or the great difference in fragmentation energy. Such structure annotation results may have certain deviations, because the correct structure may be isomers that are very relevant to the annotated structure. By taking this into account, we compared the fragment spectra of compound 89 collected under different instruments and different energies (Supplementary Figure S1b). The results showed that the base peak ions were always at m/z 229.0499, although the relative intensity of some ions varied greatly. The discrepancy indicates the possible wrong automatic identification of compound 89. Therefore, careful manual inspections are usually necessary in similarity, matching of the spectra. In addition, mass spectrometry fragments of samples with different concentrations were compared, and no significant difference was observed with consistent identification results.
Molecules 2022, 27, x FOR PEER REVIEW 6 of 17 compound 89. Therefore, careful manual inspections are usually necessary in similarity, matching of the spectra. In addition, mass spectrometry fragments of samples with different concentrations were compared, and no significant difference was observed with consistent identification results.

Compound Discoverer (CD) based on mzCloud Database
CD software can make full use of the rich high-resolution accurate-mass (HRAM) data produced by an Orbitrap mass spectrometer to annotate the structures of small molecule metabolites. In addition to spectral database exact searching, similarity searching of mzCloud and automated annotation of spectra with predicted fragments enable CD software to identify more metabolites [32]. Here, 63 components were successfully identified.
By comparing the MS/MS spectrum with mzCloud database, compound 147 was annotated as Biochanin A, of which the mzCloud Best Match is 98.2%. The mirror plot of the mzCloud search result of compound 147 indicates confident identification. As shown in Supplementary Figure S2, some matched fragments were automatically annotated with accurate masses, structural fragments and charge states, and this is an improvement compared with MS-DIAL. For compound 147, the product ions 269.0455, 241.0507, 213.0555 and 153.0189 are annotated as [C15H9O5] + , [C14H9O4] + , [C13H9O3] + and [C7H5O4] + , respectively, which helps to infer the structure of questions. In this way, even if some product ions do not match well with the reference spectrum, the structures could be elucidated from the annotated product ions.

Level 2b: Searching in Molecular-Structure Databases
Compared with the spectral database containing only a few compounds, the molecular-structure databases are more comprehensive. Molecular-structure databases commonly used in the identification of small molecule metabolites include ChemSpider

Compound Discoverer (CD) Based on mzCloud Database
CD software can make full use of the rich high-resolution accurate-mass (HRAM) data produced by an Orbitrap mass spectrometer to annotate the structures of small molecule metabolites. In addition to spectral database exact searching, similarity searching of mz-Cloud and automated annotation of spectra with predicted fragments enable CD software to identify more metabolites [32]. Here, 63 components were successfully identified.
By comparing the MS/MS spectrum with mzCloud database, compound 147 was annotated as Biochanin A, of which the mzCloud Best Match is 98.2%. The mirror plot of the mzCloud search result of compound 147 indicates confident identification. As shown in Supplementary Figure S2, some matched fragments were automatically annotated with accurate masses, structural fragments and charge states, and this is an improvement compared with MS-DIAL. For compound 147, the product ions 269. 0455 ] + , respectively, which helps to infer the structure of questions. In this way, even if some product ions do not match well with the reference spectrum, the structures could be elucidated from the annotated product ions.

Level 2b: Searching in Molecular-Structure Databases
Compared with the spectral database containing only a few compounds, the molecularstructure databases are more comprehensive. Molecular-structure databases commonly used in the identification of small molecule metabolites include ChemSpider (http://www. Theoretical fragmentation can be obtained in software, such as Progenesis QI (MetFrag algorithm), to rank the potential targets. MetFrag [33] takes a target compound structure and iteratively breaks bonds to produce a series of possible fragments, which are then compared to each ion in acquired MS/MS spectra. However, because the molecular struc-ture databases contain a huge number of closely similar structures, and their theoretical fragments are independent of experimental data, this often leads to ambiguous identifications. As for a certain group of compounds, fragmentation rules can be elucidated from experimental data; therefore, in silico MS/MS spectra can be achieved from chemical structures. Databases, such as LipidBlast [34] and Acyl-CoA libraries, have been successfully constructed and applied for the identification of some sequential constituents. These libraries greatly improved the identification confidence and provide quasi-level 2 results.
In this work, two kinds of software with different approaches of searching in molecular structure databases were selected for the structural annotation of flavonoids in D. odorifera. As a result, 95 compounds were identified.

Progenesis QI
Progenesis QI is a small molecule discovery analysis vehicle for processing metabolic profiling data. It is worth noting that, on top of the commercial databases, Progenesis QI software also supports users when they define their own structure databases for compound characterization [35]. Here, we identify the flavonoids in D. odorifera by searching four molecular-structure databases with Progenesis QI. Two online databases, HMDB and Natural Products (NPDB), and the other two self-built databases, namely a structure database of D. odorifera components obtained from Reaxys (DODB, 112 components) and a theoretical structure database of flavones (TSDB, 40,095 components), were consulted, respectively. As a result, 157 components were annotated in HMDB, 177 components in NPDB, 175 components in DODB and 178 components in the TSDB. Although the number of annotated components in different databases varied slightly, they worked out in a similar way. For instance, compound 58 (Rt = 15.05 min) exhibited a precursor ion at m/z 269.0806 and was characterized as formononetin in NPDB, DODB and TSDB. In addition, for those isomers with similar structures, they usually produce similar fragmentation spectra, causing equivocal annotations. As shown in Figure 4a  In addition to an online search of the mzCloud spectral database, CD can also annotate compound structures by searching Chemspider. The potential structure candidates generated by the ChemSpider search were ranked by comparing the experimental frag- suggested that their structures were different. However, the relative intensity changes of the product ions in the spectra are often ignored when annotating compounds from structural databases, and this is usually the main reason for the wrong annotation of compounds. Figure 4c shows that compound 167 and compound 194 generate the same product ions at m/z 255.0663, 210.0684, 199.0763, 181.0656, 168.0578 and 157.0655, whereas there was a visibly striking difference in the peak relative response of these product ions. Hence, these two compounds were tentatively annotated as the same compound. From this result, we can clearly see the drawbacks of the structural database search. In addition, multiple compounds with similar structures may be obtained simultaneously when searching in structure databases, which is more conducive to the classification of compound structures. For instance, according to the results of the top one of the candidate compound ranking, compound 27 was identified as a flavonoid. As shown in Supplementary Figure S3, the top five candidates are flavonols, and the only differences in their structures are the positions of hydroxyl group and methoxy group.

Level 3: De Novo Molecular Structure Identification
Continuous efforts have been made to explore the chemical basis of herbal medicines, and there are still plenty of unreported compounds. These brand-new compounds can be annotated by de novo molecular structure elucidation. De novo molecular design is a process of automatically proposing new chemical structures to optimally meet the desired molecular profile [37]. In the process of structure annotation, the de novo analysis method does not rely on the spectral database, but reconstructs the structure of compounds according to the distribution information of fragmentation spectrum of unknown compounds. In recent years, de novo technology has been widely used in protein or peptide structure reconstruction, but the application in the identification of small molecular compounds is relatively scare. In this paper, we introduce a method developed by Sebastian Böcker's team [38] combining structural fingerprint with fragmentation trees (implemented by SIRIUS 4.9.8 software) for the de novo identification of unknown compounds in D. odorifera.
Mass spectrometry fragments are produced by the fragmentation of precursor ions; therefore, all the molecular formulas of mass spectrometry fragments are theoretically the substructures of precursor ions. The fragment tree consists of nodes, corresponding to precursors and fragments, and the entire fragmentation tree displays the predicted multistage mass spectrogram. The structural fingerprint predicted from fragmentation tree and fragment spectrum contains the fragment features and intensity information of fragmentation spectrum [39]. For the unknowns, SIRIUS can perform de novo structure elucidation by predicting molecular fingerprints, or annotate them by searching theoretical databases. Here, we annotate a total of 197 components in D. odorifera through SIRIUS software, 15 of which were used for the results in level 3.
When SIRIUS is used to identify unknowns, firstly, the software will generate a fragment tree with the highest confidence, based on the fragmentation spectrum of queries. From the fragment tree, the relative intensity and neutral loss information of product ions can be easily obtained (Supplementary Figure S4), and these greatly enrich our cognition of the mass spectrometry fragmentation behavior of unknowns and make product ion annotation more convenient. As shown in Supplementary Figure S4, H 6 ), respectively, and the compound has two hydroxyl substituents in the A ring and no substituent in the B ring. Furthermore, according to the fragment tree, the SIRIUS software predicts a corresponding molecular fingerprint with the CSI: finger ID. For compounds existing in the molecular structure database, this predicted molecular fingerprint can be used to search for compounds with similar structures in the database. For novel compounds that have not been reported, this predicted fingerprint supports de novo structure elucidation or searching for similar structures from theoretical databases [38].

Result Validation
In order to verify the identification results in levels 2a-3, the fragmentation spectra of flavonoids were selected for the study of cracking rules. Traditionally, only reference compounds were adopted for the elucidation of cracking rules. However, due to the limit of the reference standards (a), the rules are usually fragmentary and incomplete. In this study, compounds with a score higher than 90% in level 2 (b) and other rare flavonoids in mzCloud database (c) (fragment spectra were collected in HCD mode) were ticketed for the quasi reference compound. Therefore, cracking rules were elucidated from 39 (quasi) reference compounds 2.3 times more than the traditional method. The 39 compounds include 12 types of flavonoids, namely four flavones, six flavonols, nine isoflavones, one aurone, six flavanones, two chalcones, two flavanols, four neoflavones I, one neoflavonoid II, one neoflavonoid III, two neoflavonoids IV and one neoflavonoid VIII. The detailed structures are shown in Figure 2 and Supplementary Figure S5 Figure S6). Accordingly, compounds 31, 46 and 86 identified by SIRUS were substantiated.

Isoflavones
According to compounds F 11 -F 19 , the common fragmentation mode of isoflavones is 1,3 X fragmentation, and strong [ 1,3 A + H] + and [ 1,3 B + H] + can be observed in the secondary spectrum, such as m/z 137.02 and 119.05 of F 14 . In addition, a series of peak ions showing the loss of neutral molecule CO (28Da) are detected in the spectrum, such as m/z 243.06, 215.07 and 159.04 of F 12 , which are characteristic for isoflavones (Supplementary Figure  S7). Compounds 61, 98 and 113 conformed to the above rules.

Flavanones, Chalcones and Flavanols
The MS/MS of compounds F 20 -F 27 indicated that both the flavanones and chalcones showed main fragments at 1,3 X and 1,4 X, and the loss of ring B ([M-B + H] + ). The base peak ion is generally [ 1,3 A + H] + , contributing more than 40% of the total intensity of all fragments, which is the most prominent feature of flavanones and chalcones. In addition, [ 1,3 Figure S8a). It is generally difficult to distinguish between flavanones and chalcones only by using an MS/MS spectrum. However, according to literature reports [40], due to the existence of the double bond at the C2-C3 position of chalcones, the conjugated structure displays a different UV spectrum from that of flavanones.

B + H] + and [ 1,4 B + H] + are second only to [ 1,3 A + H] + in intensity (Supplementary
Our analysis of F 28 and F 29 showed that the flavanols mainly fragment around 3-OH. Moreover, the common fragments are 1,3 A, 0,2 A, 0,3 A and 2,4 X. Due to the presence of 3-OH, flavanols are prone to produce dehydrated fragments during fragmentation, such as  Figure S8b).

Neoflavonoids I
By analyzing F 30 -F 33 , it can be observed that, due to the existence of epoxy bonds, the fragmentation sites of the neoflavonoids I are in the B ring and the C ring, and the main fragments are 5,10a D, 5,11a D, 6,10a A and 6,11a A (Figure 5a). Affected by the 5-and 11-position oxygen atoms, the neoflavonoids I are prone to produce high-abundance [ 6,11a A + H] + fragments, such as the m/z 137.0589 produced by F 30 , which is expected to be a key to characterize such compounds. Compounds 145 and 150 were identified as neoflavonoids I, and the abundant ions 152.0472 and 123.0444 in the MS/MS spectra corresponded to [ 6,11a A + H] + fragments, and the identification results were verified.

Neoflavonoids Ⅱ
The fragmentation behavior of neoflavonoids Ⅱ was predicted from compound F34. As shown in Figure 5b, F34 is prone to neutral loss of CO and 2CO to obtain the abundant ion [M-CO + H] + (m/z 241.0495) and [M-2CO + H] + (m/z 213.0546). In addition, due to the existence of ester bonds, ring B is prone to ring opening in mass spectrometry. For example, the abundant ion at m/z 157.0648 is obtained after ring A and ring B fragmentation. In contrast, ring C is relatively stable and generally does not open.

Aurones
As shown in Figure 6a, compound F35 produced high-abundance complementary ions at m/z 134.04 and 137.03 corresponding to 1,3 B and 1,3 A, respectively. The former is base peak ion and a radical ion, which may be its characteristic fragment. In addition, F35 generates the [M-CH3 + H] + fragment by losing the neutral molecule CH3 (15 Da) and generates the [ 1,4 A + H] + fragment.
2.5.7. Neoflavonoids Ⅲ The fragmentation behavior of the neoflavonoids Ⅲ was predicted from compound

Neoflavonoids II
The fragmentation behavior of neoflavonoids II was predicted from compound F 34 . As shown in Figure 5b, F 34 is prone to neutral loss of CO and 2CO to obtain the abundant ion [M-CO + H] + (m/z 241.0495) and [M-2CO + H] + (m/z 213.0546). In addition, due to the existence of ester bonds, ring B is prone to ring opening in mass spectrometry. For example, the abundant ion at m/z 157.0648 is obtained after ring A and ring B fragmentation. In contrast, ring C is relatively stable and generally does not open.

Aurones
As shown in Figure 6a, compound F 35 produced high-abundance complementary ions at m/z 134.04 and 137.03 corresponding to 1,3 B and 1,3 A, respectively. The former is base peak ion and a radical ion, which may be its characteristic fragment. In addition,

Neoflavonoids Ⅳ
By analyzing the MS spectra of F37 and F38, it was found that, due to the C1 epoxy and the C3 hydroxyl group, the neoflavonoids Ⅳ are prone to the cleavage of the single bond between C1and C2. Therefore, the common fragments are 1,2 B, 0,2 B and B + . The base peak ion is generally [ 1,2 B + H] + , and its abundance is much higher than other ions, which is the most characteristic fragmentation mode of neoflavonoids Ⅳ (Figure 6c). For example, compound 70 was attributed to neoflavonoid Ⅳ, and its base peak ion was m/z 138.0318, corresponding to [ 1,2 B + H] + , demonstrating the identification reliability.

Neoflavonoids III
The fragmentation behavior of the neoflavonoids III was predicted from compound F 36 . As shown in Figure 6b, the main MS/MS fragments of F 36 were 1,3 AB, 1,4 Band 0,3 X, where [ 1,3 A + H] + (m/z 123.04) is the base peak ion. Since there is no C4 carbonyl group, the [ 1,3 A + H] + fragment is one degree larger in unsaturation than other flavonoids, and this may be its characteristic ion.

Neoflavonoids IV
By analyzing the MS spectra of F 37 and F 38 , it was found that, due to the C1 epoxy and the C3 hydroxyl group, the neoflavonoids IV are prone to the cleavage of the single bond between C1and C2. Therefore, the common fragments are 1,2 B, 0,2 B and B + . The base peak ion is generally [ 1,2 B + H] + , and its abundance is much higher than other ions, which is the most characteristic fragmentation mode of neoflavonoids IV (Figure 6c). For example, compound 70 was attributed to neoflavonoid IV, and its base peak ion was m/z 138.0318, corresponding to [ 1,2 B + H] + , demonstrating the identification reliability.

Standard Solution Preparation
Seven reference compounds were mixed and dissolved in 100% methanol to prepare stock solution (approximately 50 µg·mL −1 for each compound). The solution was centrifuged at 14,000 rpm for 10 min, and the supernatant was stored at 4 • C before analysis.

Sample Preparation
The dried and powdered heartwood of D. odorifera (50 g) was extracted two times with 70% (v/v) ethanol (total amount 1000 mL) under reflux and then filtered by filter paper. The solution was evaporated under reduced pressure to yield EtOH extract (25 g). The residue was dissolved in 75 mL methanol, and 3 mL of the solution (about 1 g residue) was submitted to column chromatography over silica gel and eluted with a gradient solvent system of petroleum ether-acetone (v/v, 20:1, 10:1, 8:1, 6:1, 4:1, 2:1, 1:1 and 0:1, each 200 mL) to give eight fractions (Fr1-Fr8). The solutions of eight fractions were evaporated under reduced pressure, and each residue was dissolved in 10 mL methanol. Samples of eight fractions were centrifuged at 14,000 rpm for 10 min, and the supernatant was stored at 4 • C before UHPLC-MS analysis.

UHPLC-LTQ-Orbitrap MS System
The UHPLC separation coupled to MS detection was conducted on an Ultimate ® 3000 UHPLC system (Thermo Fisher Scientific, San Jose, CA, USA). An ACQUITY UPLC ® HSS T3 column (2.1 × 100 mm, 1. A LTQ-Orbitrap Velos Pro hybrid mass spectrometer (Thermo Fisher Scientific, San Jose, CA, USA) equipped with an ESI source was employed to acquire high-resolution MS data in the positive and negative ion mode. The following source parameters were applied: spray voltage, 3.8 kV; capillary temperature, 350 • C; source heater temperature, 300 • C; sheath gas (N 2 ), 40 arbitrary units; and auxiliary gas (N 2 ), 10 arbitrary units. Two different fragmentation modes, CID and HCD, were employed in data-dependent acquisition mode. Normalized collision energies (NCEs) in CID mode and HCD mode were set at 40% and 80%, respectively. A duty circle included two events (I and II). Full scan over m/z 240-1200 at a resolution of 30,000 (FWHM defined at m/z 400) in profile format (p) was performed in Event I. Event II recorded MS2 spectrum of most intense ions. The resolution of Event II was 7500. The minimum signal intensity to trigger MS2 fragmentation was set to 10,000. Dynamic exclusion parameters were set as follows: repeat count, 2; repeat duration, 10 s; exclusion list size, 50; exclusion duration, 10 s; low exclusion mass width, 1.5 Da; and high exclusion mass width, 1.5 Da.

Data Analysis
Data post-processing was performed on the obtained raw data with different software. MSConvert software was used to convert the raw data into mzML files. For MS-DIAL, the mzML files were uploaded for MS features extraction, and the MS/MS positive ion public spectrum database was imported. For Progenesis QI, the raw data were imported for peak extraction. Four databases were respectively imported for the characterization of unknowns, and the accurate mass tolerance was set to 10 ppm. For SIRIUS 4.9.8, the mzML files were analyzed, and the parameters were set as follows: instrument, orbitrap; MS/MS isotope scorer, score; MS2 MassDev, 10 ppm; candidates, 10; candidates per ion, 1; adducts, [M + H] + , [M + Na] + and [2M + H] + ; elemental composition, C, H and O; use heuristic above m/z, 300; use heuristic only above m/z, 650; search in DBs, all; and canopus, parameter-free.

Conclusions
In this study, an approach that integrates enhanced column separation, intelligent data post-processing and result verification was proposed and successfully applied to the characterization of flavoniods in D. odorifera. As a result, 3456 mass features were detected and a total of 197 flavonoids were identified or tentatively characterized, including 7, 80, 95 and 15 compounds in levels 1, 2a, 2b, and 3, respectively. Furthermore, the results of levels 2a-3 were validated by fragmentation rules. Through the above process, unknown compounds in complex matrices were quickly and efficiently identified with existing software bioinformatics tools with maximal identification confidence. It is expected that the discussed method above could be a reference for the analysis of other TCMs.
Supplementary Materials: The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/molecules27072333/s1, Figure S1: Fragment spectra of compound 89; Figure S2: Matching diagram of compound 147 in Compound Discoverer 3.3; Figure S3: The top 5 candidates of compound 27; Figure S4: The fragment tree of compound 138; Figure S5: Structure and structure numbering of different subtypes of flavonoids; Figure S6: The proposed MS fragmentation pathways of compounds F 1 -F 10 ; Figure S7: The proposed MS fragmentation pathways of compounds F 11 -F 19 ; Figure S8: The proposed MS fragmentation pathways of compounds F 20 -F 29 ; Table S1

Data Availability Statement:
The data presented in this study are available in supplementary materials.

Conflicts of Interest:
The authors declare no conflict of interest.
Sample Summary: Liquid chromatography tandem high resolution mass spectrometry has been used to probe the complex ingredients of traditional Chinese medicines. However, the data were mainly manually interpreted, which hiders the high-throughput and accurate analysis of herbal samples. In this study, multiple types of software were adopted to identify compounds in tradi-tional Chinese medicine by comparing with reference standards, searching spectral databases, matching the structural databases and in silico fragments, and de novo structural elucidation. Thus, the flavonoids in Dalbergia odorifera, a famous traditional Chinese medicine, were quickly identi-fied with maximal confidences.