1. Introduction
Caffeic acid (CA), also known as 3,4-dihydroxy cinnamic acid, is an organic compound that has two functional groups (phenolic hydroxyl and acrylic acid) [
1]. Caffeic acid derivatives refer to a large class of compounds that contain caffeic acid structural units [
2]. Caffeic acid and its derivatives are widely distributed in medicinal plants, vegetables and fruits [
3]. As a kind of safe and effective natural phenolic acid compound with a wide range of sources, caffeic acid exhibits many pharmacological effects, such as antioxidation [
4], antibacterial [
5], antiviral [
6], antitumor [
7], anti-inflammatory [
8] and neuroprotection [
9] effects and the ability to regulate blood glucose and blood lipids [
10].
This paper summarizes the structural and nuclear magnetic resonance (NMR) spectral features of plant-derived caffeic acid and its derivatives, which are of interest because of their physiological activities and wide distribution in nature. The results provide a reference for the rapid structural identification of these compounds. Extracting these compounds from plants is complicated and is affected by the plant growth cycle, climate and other factors; thus, plants cannot provide a stable supply of raw material for natural product extraction, which greatly limits large-scale production. For this reason, the biosynthetic pathways that generate caffeic acid and its derivatives are also summarized. They mainly involve the shikimic acid pathway, in which phenylalanine is deaminated to cinnamic acid and then converted into caffeic acid [
11].
In this review, the
13C-NMR and
1H-NMR data (
Tables S1–S18) and biosynthetic pathways (
Figure 1) of 173 plant-derived compounds (caffeic acid and its derivatives) with different types of substituents (
Figure 2,
Figure 3,
Figure 4,
Figure 5,
Figure 6,
Figure 7,
Figure 8 and
Table 1,
Table 2,
Table 3,
Table 4,
Table 5,
Table 6 and
Table 7) were summarized to provide a reference for further research on the structural identification and biofabrication of caffeic acid and its derivatives.
2. Methodology
A comprehensive survey of the structural information, NMR data and biosynthetic pathways of caffeic acid and its derivatives was conducted by searching the scientific literature published in online databases (including PubMed, CNKI and SciFinder) and other sources (such as Ph.D. dissertations and M.Sc. theses). The search terms “caffeic acid”, “caffeic acid derivatives”, “caffeic acid and NMR”, “caffeic acid derivatives and NMR”, “caffeic acid and biosynthetic pathways” and “caffeic acid derivatives and biosynthetic pathways” were used for data collection. In total, 162 publications dated from 1984 to 2023 were included. EndNote was used to collate the published literature. ChemDraw 20.0 software was used to draw the chemical structures so that the caffeic acid derivatives could be classified according to their structures.
3. Structure and Classification of Caffeic Acid and Its Derivatives
In this paper, 173 compounds (caffeic acid and its derivatives) are compared. The skeletons of these caffeic acid derivatives can be classified into the following types according to the type of substituent: caffeoyl ester derivatives (
Figure 2 and
Table 1), caffeyltartaric acid (
Figure 3 and
Table 2), caffeic acid amide derivatives (
Figure 4 and
Table 3), caffeoyl shikimic acid (
Figure 5 and
Table 4), caffeoyl quinic acid (
Figure 6 and
Table 5), caffeoyl danshensu (
Figure 7 and
Table 6) and caffeoyl glycoside (
Figure 8 and
Table 7).
Caffeoyl ester derivatives are mainly formed by esterification of caffeic acid with different alcohols. Caffeic acid amide derivatives are produced by condensation of caffeic acid with amino acids. Caffeoyl tartaric acids are formed by esterification of tartaric acid with caffeic acid. Caffeoyl shikimic acid is formed by esterification of shikimic acid with caffeic acid. Caffeoyl quinic acids are a class of natural phenolic acids formed by esterification of quinic acid with a varying number of caffeic acid units. Because the three hydroxyl groups on the alicyclic ring of quinic acid are easily acylated by the carboxyl group of caffeic acid, isomers are particularly abundant. Caffeoyl danshensu is formed by esterification of caffeic acid with its hydrated product, 3,4-dihydroxyphenyllactic acid. The main sugars in caffeoyl glycosides are glucose, rhamnose, xylose, furanose and glucuronic acid. The sugar type is mainly identified by acid hydrolysis, gas chromatography-mass spectrometry (GC–MS), NMR and other techniques.
4. 13C-NMR and 1H-NMR Data of Caffeic Acid and Its Derivatives
First, the number of caffeoyl groups is determined by 13C-NMR and 1H-NMR. There are five proton signals in the 1H-NMR spectrum (CD3OD, 500 MHz) of caffeic acid. In the aromatic region,
δH 6.99 (1H, d,
J = 1.8 Hz), 6.84 (1H, dd,
J = 1.8, 8.2 Hz) and 6.73 (1H, d,
J = 8.2 Hz) are characteristic signals of the protons of the ABX system of the benzene ring.
δH 7.27 (1H, d,
J = 15.9 Hz) and 6.28 (1H, d,
J = 15.9 Hz) are the characteristic signals of the vicinal olefinic protons of a trans double bond. The chemical shifts of the α and β olefinic protons on the side chain are strongly influenced by conjugation with the terminal carbonyl group, with the α-H located at higher field (
δH 6.2~6.5) and the β-H located in the lower field (
δH 7.4~7.7). There were nine carbon signals in the
13C-NMR spectrum (CD
3OD, 125 MHz), of which
δC 147.9, 146.5, 129.3, 123.1, 121.7 and 116.4 were carbon signals of the benzene ring skeleton.
δC 141.5 and 114.6 were trans-double-bonded carbon signals, and
δC 176.2 was a carboxyl carbon signal [
128]. The trans configuration of the side-chain double bond is more common than the cis configuration. The cis and trans isomers of the α,β-double bond can be distinguished by the coupling constant of the olefinic protons. In the
1H-NMR, most compounds have trans-alkene bonding signals
δH 7.61, 6.35 (each 1H,
J = 16.0 Hz, -CH=CH-), and a few have cis-alkene bonding signals
δH 5.93 (1H, d,
J = 12.8 Hz) and 7.11 (1H, d,
J = 12.8 Hz). The α,β-double bond is less stable in the cis configuration than in the trans configuration [
129].
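As an illustration of the cis/trans assignment described above, the following minimal Python sketch classifies the side-chain double bond from the coupling constant of its olefinic protons. The function name and the cut-off values (14.0 Hz and 6.0 Hz) are assumptions chosen only to bracket the values quoted in the text (about 16 Hz for trans, about 12.8 Hz for cis); they are not taken from the cited literature.

```python
def alkene_geometry(j_hz):
    """Classify a -CH=CH- unit from the vicinal H-H coupling constant (Hz).

    The thresholds are illustrative assumptions, not literature limits.
    """
    if j_hz >= 14.0:
        # Large couplings such as 15.9 or 16.0 Hz are typical of trans alkenes.
        return "trans (E)"
    if j_hz >= 6.0:
        # Smaller couplings such as 12.8 Hz are typical of cis alkenes.
        return "cis (Z)"
    # Very small values are outside the range discussed in the text.
    return "undetermined"


if __name__ == "__main__":
    for j in (15.9, 12.8):
        print(j, alkene_geometry(j))
```

In practice the value of J would be read from the doublets of the two olefinic protons; borderline values should be checked against other evidence.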
The structure of monoacyl compounds can be determined by 1D NMR spectroscopy. For compounds with two or more substituents, 2D NMR spectra are needed to locate the linkages accurately. The parent nucleus is determined first, and the substituent positions are then determined. Generally, the hydrogens of the 3- and 4-hydroxyl groups and of the 9-carboxyl group of caffeic acid are replaced.
4.1. Caffeoyl Ester Derivatives
Table S1 shows the
13C-NMR and
1H-NMR data of caffeoyl ester derivatives.
Examples are as follows. See
Table 8 below.
4.1.1. 1H-NMR Data Obtained for Caffeoyl Ester Derivatives
When C3-OH, C4-OH or C9-OH is esterified, there is little effect on the chemical shift values of H-2, H-5 or H-8. Because the inductive effect is transmitted through bonding electrons, its influence diminishes with increasing distance from the electronegative substituent, and effects over more than three bonds are usually negligible [
130].
4.1.2. 13C-NMR Data Obtained for Caffeoyl Ester Derivatives
When C3-OH (or C4-OH) undergoes esterification, -OCOCH3 is the electron-donating group, which increases the electron cloud density of C-3 and C-2 (or C-4 and C-5) and decreases the chemical shift values of these carbons. The effect is strongest at the α-position of the substituent, followed by the β-position, and the γ-position is shifted to higher field because of the γ-effect. In general, the inductive effect is negligible for carbons beyond the γ-position.
When C9-OH undergoes esterification, -O(CH2)nCH3 is an electron-donating group, which increases the electron cloud density of C-9 and C-8 and decreases the chemical shift values of these carbons.
4.2. Caffeyltartaric Acid
Table S2 shows the
13C-NMR and
1H-NMR data of caffeyltartaric acid.
The example is as follows. See
Table 9 below.
4.2.1. 1H-NMR Data Obtained for Caffeyltartaric Acid
Tartaric acid.
1H-NMR (600 MHz, CDCl
3) δ
H: 4.34 (1H, H-2,3) [
131].
Tartaric acid has a symmetric structure. When the C-9 carboxyl group of caffeic acid is esterified with the hydroxyl group on C-2′ or C-3′ of tartaric acid, the chemical shift values of C2′-H and C3′-H increase.
4.2.2. 13C-NMR Data Obtained for Caffeyltartaric Acid
Tartaric acid.
13C-NMR (150 MHz, CDCl
3) δ
C: 178.0 (C-1, C-4), 75.0 (C-2, C-3) [
131].
This carbon chemical shift value decreases when esterification of tartaric acid C2′-OH (or C3′-OH) and caffeic acid C9-OH occurs.
4.3. Caffeic Acid Amide Derivatives
Table S3 shows the
13C-NMR and
1H-NMR data of caffeic acid amide derivatives.
The example is as follows. See
Table 10 below.
4.3.1. 1H-NMR Data Obtained for Caffeic Acid Amide Derivatives
When the C-9 carboxyl group forms an amide, there is little effect on the chemical shift value of H-8. Because the inductive effect is transmitted through bonding electrons, its influence diminishes as the distance from the electronegative substituent increases, and effects over more than three bonds are usually negligible.
4.3.2. 13C-NMR Data Obtained for Caffeic Acid Amide Derivatives
When the C-9 carboxyl group forms an amide, the electron cloud density of C-9 increases and the chemical shift value of this carbon decreases.
4.4. Caffeoyl Shikimic Acid
Table S4 shows the
13C-NMR and
1H-NMR data of caffeoyl shikimic acid.
The example is as follows. See
Table 11 below.
4.4.1. 1H-NMR Data Obtained for Caffeoyl Shikimic Acid
Shikimic acid C3-OH, C4-OH and C5-OH can be acylated by the C-9′ carboxyl group of caffeic acid to form esters. Once a hydroxyl group is esterified, the signal of the hydrogen attached to the same carbon shifts downfield by δH 1.1~1.6.
4.4.2. 13C-NMR Data Obtained for Caffeoyl Shikimic Acid
Shikimic acid C3-OH, C4-OH and C5-OH can be acylated by the C-9′ carboxyl group of caffeic acid to form esters. The chemical shift value of the shikimic acid carbon (C3, C4 or C5) directly linked to the caffeoyl group shifts to lower field, and the chemical shift value of the caffeoyl C9′ shifts to higher field.
4.5. Caffeoyl Quinic Acid
Tables S5–S8 show the
13C-NMR and
1H-NMR data of caffeoyl quinic acid.
The example is as follows. See
Table 12 below.
Because it couples with H-3 and H-5, H-4 usually appears as a doublet of doublets. For C4-OH without esterification, H-4 usually occurs between
δH 3.7 and 4.1. When C4-OH is esterified, H-4 is shifted downfield by
δH 1.0~1.6. Due to the presence of H-4″ at
δH 5.11, compound
41 can be identified as a 4-O-caffeoyl-substituted dicaffeoylquinic acid. H-3 (or H-5) generally appears as a multiplet because of its coupling to H-2 (or H-6) and H-4. For C3-OH (or C5-OH) without esterification, H-3 (or H-5) usually occurs between
δH 4.0 and 4.6. When C3-OH (or C5-OH) is esterified, H-3 (or H-5) shifts downfield by
δH 1.0~1.6. The caffeoyl group at the C-3 position is on the axial bond, and H-4 and H-5 retain the ax-ax coupling of neighboring protons, so H-4 appears as a doublet of doublets. From these signals, compound
41 was identified as 3,4-dicaffeoylquinic acid [
43].
4.5.1. 1H-NMR Data Obtained for Caffeoyl Quinic Acid
If C1-OH is not esterified in the quinic acid parent nucleus, the chemical shifts of H-2 and H-6 are typically between δH 2.1 and 2.3 and appear as multiplets. When C1-OH is esterified, the resonance frequencies of the four H-2 and H-6 protons become significantly different, and they appear in the 1H-NMR spectrum as four doublets of doublets with different chemical shifts (δH 2.0~3.5). Because it couples with H-3 and H-5, H-4 usually appears as a doublet of doublets. For C4-OH without esterification, H-4 usually occurs between δH 3.7 and 4.1. When C4-OH is esterified, H-4 is shifted downfield by δH 1.0~1.6. H-3 (or H-5) generally appears as a multiplet because of its coupling to H-2 (or H-6) and H-4. For C3-OH (or C5-OH) without esterification, H-3 (or H-5) usually occurs between δH 4.0 and 4.6. When C3-OH (or C5-OH) is esterified, H-3 (or H-5) shifts downfield by δH 1.0~1.6 [
44].
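The shift ranges above can be turned into a simple screening rule. The sketch below is a hedged illustration only: it flags a quinic acid position as acylated when the proton on that carbon lies above the upper limit quoted for the unesterified case. The function name, the dictionary layout and the decision rule are assumptions made for this example, and 2D NMR is still needed to confirm the linkage.

```python
# Upper and lower limits (ppm) quoted in the text for protons on carbons
# whose hydroxyl group is NOT esterified.
UNESTERIFIED_RANGE = {
    "H-3": (4.0, 4.6),
    "H-4": (3.7, 4.1),
    "H-5": (4.0, 4.6),
}


def acylated_positions(proton_shifts):
    """Return the quinic acid carbons whose protons suggest acylation.

    proton_shifts maps "H-3", "H-4" or "H-5" to an observed shift in ppm.
    A proton above the unesterified upper limit is treated as shifted
    downfield by acylation (an assumed, simplified criterion).
    """
    flagged = []
    for proton, delta in proton_shifts.items():
        _, upper = UNESTERIFIED_RANGE[proton]
        if delta > upper:
            flagged.append(proton.replace("H-", "C-"))
    return sorted(flagged)


if __name__ == "__main__":
    # H-4 at 5.11 ppm is the value quoted for compound 41; the H-3 and H-5
    # values are hypothetical and used only to make the example complete.
    print(acylated_positions({"H-3": 5.4, "H-4": 5.11, "H-5": 4.3}))
```

With these inputs the sketch flags C-3 and C-4, which is consistent with the 3,4-dicaffeoylquinic acid assignment discussed for compound 41.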
Once a hydroxyl group is esterified, the signal of the hydrogen attached to the same carbon shifts downfield by δH 1.1~1.6, and the effect is more pronounced at the 5-position than at the 3-position. For molecules with two acyl groups, the downfield shift of the hydrogen signal is more obvious, which may result from an additive effect. Acylation in caffeoyl quinic acids generally also shifts the two trans-olefinic hydrogens (H-7′ and H-8′) of the caffeic acid unit downfield.
Coupling constants are also important in structural inference, especially for stereochemical and conformational problems. For example, when the acyl group occupies an axial (ax) bond, the coupling constants between neighboring hydrogens in the eq-eq or eq-ax arrangement are 2~3 Hz. When the acyl group occupies an equatorial (eq) bond, the coupling constant between neighboring ax-ax hydrogens is 10 Hz, and that for the eq-ax arrangement is 5 Hz [
129].
4.5.2. 13C-NMR Data Obtained for Caffeoyl Quinic Acid
13C NMR showed two carbonyl carbons (
δC 165~175). The chemical shifts of C-2 and C-6 in the quinic acid unit are usually between
δC 30 and 40. The chemical shifts of C-1, C-3, C-4 and C-5 are within
δC 60~80 due to hydroxyl substitution [
129].
The C1-OH, C3-OH, C4-OH and C5-OH groups of quinic acid can be acylated by caffeic acid to form esters. This produces acylation shifts: the chemical shift value of the acylated carbon increases, and the chemical shift value of the caffeoyl C-9′ (or C-9″) decreases. If the chemical shift values of H-2 and H-6 and of C-2 and C-6 are very similar, the molecule may have symmetry. When the C-7 carboxyl group of quinic acid forms a methyl ester, the C-7 signal shifts to higher field.
4.6. Caffeoyl Danshensu
Tables S9–S11 show the
13C-NMR and
1H-NMR data of caffeoyl danshensu. As a basis for spectral analysis, the spectral characteristics of two different structural types of compounds, rosmarinic acid and prolithospermic acid, are described below.
The structure of rosmarinic acid (74) is characterized by the absence of a substituent at the 2-position of the caffeoyl group; thus, the aromatic protons of the caffeoyl group (H-2, H-5 and H-6) appear as a 1,2,4-trisubstituted coupling system. The danshensu side-chain protons in the high-field region (δH 2.8~5.0) form an ABX spin system (H2-7, H-8), which is an important feature of the 1H-NMR spectra of these compounds. The relevant characteristic peaks are also readily found in the 13C-NMR spectrum. Because each polymerization unit of these compounds contains an o-diphenol group, the four oxygenated aromatic quaternary carbons that appear in the δC 140~150 interval indicate a degree of polymerization of two. The CH2 peak at δC 38.1 and the CH peak at δC 78.4 indicate that the dimer contains a 3,4-dihydroxyphenyllactic acid unit.
The structural difference between prolithospermic acid (91) and rosmarinic acid is that prolithospermic acid contains a dihydrofuran ring. The absolute configurations of the two chiral carbons of the dihydrofuran ring are R and S. In its high-resolution 1H-NMR spectrum, the characteristic alicyclic hydrogens of the dihydrofuran ring appear at δH 5.85 and δH 4.31 (H-7, H-8), with a coupling constant of approximately 4.0 Hz. In 13C-NMR, one CH peak is at δC 88.0 and the other CH peak (C-8) is at δC 57.2.
The spectral characteristics of the trimers and tetramers among the salvianolic acids reflect the two condensation modes described above. In structural analysis, the degree of polymerization is first determined from the number of oxygenated aromatic quaternary carbons (δC 140~150) in the 13C-NMR spectrum, and the characteristic peaks in the high-field region of the 1H-NMR or 13C-NMR spectrum are then used to determine the polymerization mode.
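The counting rule for the degree of polymerization can be sketched in a few lines of Python. This is a simplified illustration that assumes the quaternary carbons have already been picked out (for example, with a DEPT experiment) and that each unit contributes exactly two oxygenated aromatic quaternary carbons from its o-diphenol group; the function name and the example shift values are hypothetical.

```python
def degree_of_polymerization(quaternary_carbon_shifts):
    """Estimate the number of phenylpropanoid units from 13C shifts (ppm).

    Counts quaternary carbons in the 140-150 ppm window and assumes two
    oxygenated aromatic quaternary carbons per unit.
    """
    oxygenated = [d for d in quaternary_carbon_shifts if 140.0 <= d <= 150.0]
    return len(oxygenated) // 2


if __name__ == "__main__":
    # Hypothetical quaternary-carbon list for a dimer: four signals fall
    # in the 140-150 ppm window, two lie outside it.
    shifts = [149.7, 146.8, 146.1, 145.3, 129.3, 127.7]
    print(degree_of_polymerization(shifts))
```

Four signals in the window give a degree of polymerization of two, as described for rosmarinic acid. Compounds in which a phenolic oxygen is part of a dihydrofuran ring may need a modified count.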
Examples are as follows. See
Table 13 below.
4.6.1. 1H-NMR Data Obtained for Caffeoyl Danshensu
The danshensu part shows two benzylic hydrogen signals (δH 3.0~3.5) and the signal of the proton on the carbon bearing the acyloxy group (δH 5.18~5.33); the latter is split by the benzylic hydrogens into a doublet of doublets (J = 7 Hz and 4 Hz). The chemical shifts of the aromatic hydrogens of danshensu generally occur at δH 6.7~6.9, and the splitting is insignificant when the hydroxyl group is methylated.
Usually, when danshensu C8-OH forms an ester with the C-9′ carboxyl group of caffeic acid, the chemical shift value of H-8 moves to lower field, and caffeoyl C-2′ and C-3′ are connected with dihydrofuran rings.
4.6.2. 13C-NMR Data Obtained for Caffeoyl Danshensu
When danshensu C8-OH forms an ester with the C-9′ carboxyl group of caffeic acid, the C-5, C-6, C-7, C-8 and C-9 chemical shift values shift to higher field, as do the caffeoyl C-8′ and C-9′ chemical shift values. When caffeoyl C-2′ and C-3′ are linked to the dihydrofuran ring, the chemical shift values of C-2′ and C-3′ shift to lower field.
4.7. Caffeoyl Glycoside
Tables S12–S18 show the
13C-NMR and
1H-NMR data of caffeoyl glycoside.
Caffeoyl glycosides generally contain one to four sugar units. The characteristic anomeric proton signals at δH 4.3~6.0 and anomeric carbon signals at δC 95~105 can be used to make an initial estimate of the number of sugars. Furthermore, 2D NMR techniques, such as heteronuclear multiple quantum coherence (HMQC), 1H-detected heteronuclear multiple bond correlation (HMBC) and total correlation spectroscopy (TOCSY), are used to determine the type of sugar and assign the sugar signals.
The example is as follows. See
Table 14 below.
In the HMBC spectra, the correlation between
δH 5.03 (H-5″) and
δC 169.1 (C-9),
δC 60.4 (C-6″),
δC 36.9 (C-4″) and
δC 103.1 (C-3″) indicates that the caffeoyl group is adjacent to the 5″-OH. In addition, HMBC spectra also showed that the
1H NMR signal of
δH 5.16 (H-1″) correlated with the
13C NMR signal of
δC 142.5 (C-2″),
δC 67.0 (C-7″), and
δC 43.3 (C-8″). The
1H NMR signal of
δH 4.80 (H-1′) was related to the
13C NMR signal of
δC 95.2 (C-1″), indicating that the sugar residue is attached to 1″-OH. All
1H and
13C NMR signals of compound
121 were resolved by
1H-
1H COSY, HSQC and HMBC spectra. The ROESY spectra and coupling constants were analyzed to determine the relative configuration of compound
121. The large coupling constant (15.6 Hz) between H-7 and H-8 indicated that the caffeoyl portion has the E-configuration. The NMR chemical shift values of compound 121, combined with the GC analysis of the sugar obtained by acid hydrolysis and of D-glucose, showed that the hexose part was D-glucose. The large coupling constant (7.8 Hz) from
3JH-1′,H-2′ indicates that the glucosyl unit is β-oriented. Based on the above inferences, compound
121 was identified as Verminoside [
90].
4.7.1. 1H-NMR Data Obtained for Caffeoyl Glycoside
The type of sugar in caffeoyl glycosides can be determined from the chemical shift and coupling constant of the characteristic anomeric proton signal of the sugar. In general, the anomeric proton signals of sugars in 1H NMR appear at approximately δH 5.0 (range δH 4.3~6.0, 1H, d), for example, glucose at δH 4.2~4.4 (d, J = 8.0 Hz) and rhamnose at δH 5.1~5.3 (d, J = 1.0 Hz). Most compounds show characteristic doublets, while a few show broad singlets. The sugar ring proton signals are between δH 3.5 and 4.5. The methyl proton signal of methyl pentoses (such as rhamnose) is at approximately δH 1.0. The anomeric and methyl proton signals are well separated from other signals and are easily recognized, so the number of sugars, the types of sugars and the linkage positions can be inferred.
The relative configuration of the glycosidic bond is determined from the coupling constant between H-1 and H-2 in the 1H-NMR spectrum. In most monosaccharides, such as glucose, and their glycosides, H-2 is located on the axial bond; when the oxygen on the anomeric carbon is β-oriented, the dihedral angle between the anomeric proton and H-2 is 180°, and the 3JH1,H2 value is approximately 6~8 Hz. For the α-configuration, the dihedral angle is 60°, and the 3JH1,H2 values are 1~3 Hz. The anomeric configuration of a pyranose with an axial H-2 can therefore be determined from the 3JH1,H2 value of the anomeric proton measured in the 1H-NMR spectrum. However, in rhamnosides, differentiation through the 3JH1,H2 value is impossible because H-2 is located on the equatorial bond, and the dihedral angle between the two protons is 60° in both the α and β anomeric configurations. For furanoses, the J value does not change much (it remains at 0~5 Hz) regardless of whether the anomeric proton and H-2 are cis or trans, so the configuration of the glycosidic bond cannot be judged.
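The rule relating the H-1/H-2 coupling constant to the configuration at the anomeric (end-group) carbon can be written as a small decision function. The sketch below is illustrative only: the function name, the `sugar` argument and its accepted labels are assumptions, and the numerical windows are simply those quoted in the preceding paragraph.

```python
# Sugar classes for which the text states that 3J(H1,H2) cannot
# distinguish the alpha and beta configurations.
NOT_DIAGNOSTIC = {"rhamnose", "furanose"}


def anomeric_configuration(j_h1_h2_hz, sugar="glucose"):
    """Suggest alpha or beta from 3J(H1,H2) in Hz for a pyranose with axial H-2."""
    if sugar in NOT_DIAGNOSTIC:
        return "undetermined"
    if 6.0 <= j_h1_h2_hz <= 8.0:
        # Dihedral angle near 180 degrees: beta-oriented anomeric oxygen.
        return "beta"
    if 1.0 <= j_h1_h2_hz <= 3.0:
        # Dihedral angle near 60 degrees: alpha configuration.
        return "alpha"
    return "undetermined"


if __name__ == "__main__":
    # 7.8 Hz is the value quoted in the text for the glucosyl unit of compound 121.
    print(anomeric_configuration(7.8))
    print(anomeric_configuration(1.0, sugar="rhamnose"))
```

Values between the two windows are returned as undetermined on purpose, since the text gives no rule for them.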
4.7.2. 13C-NMR Data Obtained for Caffeoyl Glycoside
Type and Amount of Sugar
The diversity of caffeoyl glycosides arises from the type of aglycone and the sugar moiety, as there are differences in the number of sugars, the types of sugars, the way the sugars are connected to each other and the way the sugars are connected to the aglycone.
The chemical shift of the methyl carbon of a sugar is around δC 18, and the presence of multiple such signals (excluding methyl groups of the aglycone) indicates several methyl pentoses. CH2OH is at approximately δC 62, and CHOH is at approximately δC 68~85. The carbon signals of a furanose ring appear at lower field than those of a pyranose ring, which distinguishes the ring size of the sugar. For the furanose ring, CH-OH (C3, C5) >80 ppm; for the pyranose ring, CH-OH (C3, C5) <78 ppm. Most anomeric carbon signals of glycosides are between δC 95 and 105, such as those of glucose and rhamnose at δC 105.1 and δC 103.8, respectively. The number of such signals indicates the number of sugars in the repeating units of the sugar chain; most of the sugar signals can be assigned by comparison with similar sugars or glycoside derivatives.
The anomers of glycosides, such as those of glucose, show large differences in the chemical shift values of the anomeric carbons, so the relative configuration (α or β) of the sugar can be determined from the chemical shift value of the anomeric carbon. In common sugars, the anomeric carbon chemical shift of β-D and α-L glycosides is usually greater than
δC 100. When ester glycosides, tertiary alcohol glycosides, and individual phenolic glycosides are present, the chemical shift values can drop to
δC 98. The anomeric carbon chemical shift values for α-D- and β-L-type glycosidic bonds are usually less than
δC 100. Therefore, the number of sugars and the configuration of the glycosidic bonds in oligosaccharides and glycosides can be roughly inferred from the number of carbon signals and chemical shift values in the
δC 95~105 region [
132].
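A rough first pass over a 13C peak list, following the rules above, might look like the sketch below. It counts signals in the δC 95~105 window as anomeric (end-group) carbons and attaches a tentative label based on the δC 100 boundary. The function name, the label wording and the example peak list are assumptions for illustration; signals slightly outside the window (such as the δC 105.1 quoted for glucose) would need a wider window.

```python
def anomeric_summary(carbon_shifts):
    """Count anomeric carbons (95-105 ppm) and label them tentatively.

    Returns (number_of_sugars, [(shift, label), ...]) sorted by shift.
    """
    anomeric = sorted(d for d in carbon_shifts if 95.0 <= d <= 105.0)
    labelled = []
    for delta in anomeric:
        if delta > 100.0:
            label = "beta-D or alpha-L type"
        else:
            # Ester, tertiary alcohol and some phenolic glycosides can also
            # fall below 100 ppm, so this label is only a first guess.
            label = "alpha-D or beta-L type, or an ester-type glycoside"
        labelled.append((delta, label))
    return len(anomeric), labelled


if __name__ == "__main__":
    # Hypothetical peak list; 103.8 ppm is the rhamnose value quoted in the text.
    count, details = anomeric_summary([168.3, 103.8, 96.5, 75.2, 62.0, 18.1])
    print(count)
    for delta, label in details:
        print(delta, label)
```

The count is only an initial estimate; the assignments should be confirmed with 2D NMR and hydrolysis data as described in the text.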
Determining the Binding Position of Sugar (the Glycosylation Position)
Currently, 13C NMR methods are often used to determine the location of sugar linkages in caffeoyl glycosides, which primarily involves assigning the signals to individual carbons to identify the carbon that shows the glycosylation shift. In practice, chemical shifts are assigned mainly by comparison with analogs and by reasonable prediction using the glycosylation shift rule, and the reference compounds selected are generally free glycosides and methyl glycosides.
The linkage between sugars and aglycones in caffeoyl glycosides is formed between the hydroxyl groups of the sugar and the aglycone. Where a carboxyl group is involved, the sugar and the aglycone combine through an ester bond. In hydroxyl glycosylation, the glycosylated carbon generally shifts downfield by δC 8~10, and the shifts of neighboring carbons are also affected. Glycosylation at the linkage position between sugars generally moves the signal downfield by approximately δC 3~8. However, when sugars form ester glycosides with carboxyl groups, the glycosylation shift differs: the glycosidic shift of the carboxyl carbon is approximately 2 ppm, and the anomeric carbon of the sugar is generally shifted to δC 95~96. When sugars form glycosides with carboxyl groups, phenolic hydroxyl groups and enol hydroxyl groups, the glycosylation shift is unusual: the α-C shifts upfield by 0~4 ppm, and the β-C shifts downfield. The anomeric carbon of the sugar is displaced downfield in phenolic and enol glycosides and upfield in ester glycosides, with small displacements (0~4 ppm). Typically, acetylation of a hydroxyl group shifts the signal of the carbon bearing it (α-C) downfield (+2~+4 ppm) and the signal of the neighboring carbon (β-C, which is γ-C with respect to the acetyl group) upfield (−6~−2 ppm).
To determine the position of the linkage between the two monosaccharides in a disaccharide glycoside, the 13C spectral data of the disaccharide glycoside are compared with those of the corresponding monosaccharide. If the signal of a carbon atom of the inner sugar is shifted downfield (usually by 4~7 ppm) and the signals of its two neighboring carbon atoms are shifted slightly upfield (by approximately 1~2 ppm), this carbon atom of the inner sugar is the linkage position of the sugar.
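The comparison described above can be sketched as follows. The code looks for a carbon of the inner sugar that moves downfield by 4~7 ppm relative to the reference monosaccharide while its ring neighbors move upfield. The function name, the six-carbon (hexopyranose) labelling and all shift values in the example are hypothetical and serve only to show the bookkeeping.

```python
CARBONS = ["C-1", "C-2", "C-3", "C-4", "C-5", "C-6"]


def linkage_positions(inner_sugar, reference):
    """Return inner-sugar carbons that match the glycosylation-shift pattern.

    inner_sugar and reference map carbon labels to 13C shifts in ppm.
    A hit needs a +4 to +7 ppm change on the carbon itself and a negative
    (upfield) change on each neighboring carbon in the list.
    """
    change = {c: inner_sugar[c] - reference[c] for c in CARBONS}
    hits = []
    for i, carbon in enumerate(CARBONS):
        neighbours = [CARBONS[j] for j in (i - 1, i + 1) if 0 <= j < len(CARBONS)]
        downfield = 4.0 <= change[carbon] <= 7.0
        neighbours_upfield = all(change[n] < 0 for n in neighbours)
        if downfield and neighbours_upfield:
            hits.append(carbon)
    return hits


if __name__ == "__main__":
    # Illustrative values only: a glucose unit substituted at C-3.
    reference = {"C-1": 104.0, "C-2": 74.1, "C-3": 76.8,
                 "C-4": 70.6, "C-5": 76.8, "C-6": 61.8}
    inner = {"C-1": 104.0, "C-2": 73.2, "C-3": 81.6,
             "C-4": 69.4, "C-5": 76.8, "C-6": 61.8}
    print(linkage_positions(inner, reference))
```

Because acylation also perturbs nearby carbons, a hit from such a comparison is a hypothesis to be checked with HMBC correlations rather than a final assignment.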
To assign the signals of individual carbon and hydrogen atoms, spectroscopic techniques such as HMBC and nuclear Overhauser effect spectroscopy (NOESY) are used, and the linkage order and linkage positions of the sugar chain are inferred from the observed long-range C–H or H–H correlations.
5. Possible Biosynthetic Pathways for the Generation of Caffeic Acid and Its Derivatives
Most of the phenolics in higher plants are synthesized by the shikimic acid pathway. In photosynthesis, carbon dioxide is converted into primary carbon metabolites, namely glucose and other carbohydrates. Through glycolysis and other routes, these primary metabolites give rise to erythrose 4-phosphate and phosphoenolpyruvate, which are converted enzymatically into shikimic acid; shikimic acid is then converted into phenylalanine, tyrosine, tryptophan and other aromatic amino acids [
133]. Phenylalanine is converted to cinnamic acid by phenylalanine ammonia-lyase (PAL); cinnamic acid is converted to 4-coumaric acid by cinnamic acid 4-hydroxylase (C4H), and 4-coumaric acid is converted to 4-coumaroyl-CoA by 4-coumarate:coenzyme A ligase (4CL) [
134,
135]. C3H catalyzes the formation of caffeic acid from coumaric acid [
136].
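The core steps described above can be captured as a small directed graph, which makes it easy to list the enzymes on the route between two metabolites. The sketch below is a minimal illustration using only the steps named in this section; the data structure and the function name are assumptions made for the example.

```python
from collections import deque

# Each substrate maps to (enzyme, product) pairs named in the text.
STEPS = {
    "phenylalanine": [("PAL", "cinnamic acid")],
    "cinnamic acid": [("C4H", "4-coumaric acid")],
    "4-coumaric acid": [("4CL", "4-coumaroyl-CoA"), ("C3H", "caffeic acid")],
}


def enzyme_route(start, target):
    """Return the enzymes on the shortest route from start to target, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, enzymes = queue.popleft()
        if compound == target:
            return enzymes
        for enzyme, product in STEPS.get(compound, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, enzymes + [enzyme]))
    return None


if __name__ == "__main__":
    print(enzyme_route("phenylalanine", "caffeic acid"))
    print(enzyme_route("phenylalanine", "4-coumaroyl-CoA"))
```

Further branches, such as those leading to chlorogenic acid or rosmarinic acid, could be added to the same dictionary in the same way.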
The downstream biosynthesis comprises three main pathways: first, rosmarinic acid and salvianolic acids are synthesized from 4-coumaric acid and 4-hydroxyphenyllactic acid as precursors; second, chlorogenic acids are synthesized from quinic acid and caffeic acid as precursors; and third, caffeic acid, tyrosol and hydroxytyrosol are used as precursors to synthesize caffeoyl glycosides, such as acteoside (
Figure 1).
5.1. Caffeoyl Ester Derivatives, Caffeyltartaric Acid, Caffeic Acid Amide Derivatives, Caffeoyl Shikimic Acid and Caffeoyl Quinic Acid
Three biosynthetic pathways have been proposed by which 4-coumaroyl-CoA is converted to chlorogenic acid (CGA), and these pathways are still debated. In the first pathway, hydroxycinnamoyl-CoA shikimate hydroxycinnamoyl transferase (HCT) catalyzes the reaction of 4-coumaroyl-CoA with shikimic acid to produce 4-coumaroyl shikimic acid ester, which is further converted to caffeoyl shikimic acid and finally to caffeoyl-CoA; hydroxycinnamoyl-CoA quinate hydroxycinnamoyl transferase (HQT) then catalyzes the transesterification of caffeoyl-CoA and quinic acid to synthesize CGA. In the second pathway, CGA is derived from quinic acid and caffeoyl D-glucose in a reaction catalyzed by hydroxycinnamoyl D-glucose:quinate hydroxycinnamoyl transferase (HCGQT). In the third pathway, p-coumaroyl quinic acid is produced by HCT catalysis, and CGA is then produced by hydroxylation catalyzed by p-coumarate 3-hydroxylase (C3H) [
137,
138].
The biosynthesis of chicoric acid involves a two-step process. In the cytosol, two BAHD acyltransferases, EpHTT and EpHQT, catalyze the production of the intermediates caftaric acid and chlorogenic acid, respectively. Both compounds are then transported to the vacuole, where EpCAS catalyzes the formation of chicoric acid [
139].
However, the biosynthetic pathway that generates caffeoyl ester derivatives and caffeic acid amide derivatives in nature is not well understood.
5.2. Caffeoyl Danshensu
The caffeoyl danshensu biosynthesis pathway includes two parallel branches, the phenylalanine pathway and the tyrosine pathway. In the tyrosine branch, tyrosine is converted by tyrosine aminotransferase (TAT) to 4-hydroxyphenylpyruvic acid, which is reduced by 4-hydroxyphenylpyruvate reductase (HPPR) to 4-hydroxyphenyllactic acid (pHPL). Biochemical studies have shown that the initial stage of rosmarinic acid (RA) biosynthesis in Salvia miltiorrhiza is the hydroxylation of pHPL at C-3 of the aromatic ring, which is catalyzed by an unknown CYP450 to produce 3,4-dihydroxyphenyllactic acid (DHPL) [
140]. Rosmarinic acid synthase (RAS) then couples DHPL with the 4-coumaroyl moiety to form the ester 4-coumaroyl-3,4-dihydroxyphenyllactic acid, which is hydroxylated by the cytochrome P450-dependent monooxygenase CYP98A14 to form RA. This differs from the pathway in some other plants, where pHPL is the direct substrate of RAS and is coupled with 4-coumaroyl-CoA, and RA is formed by two hydroxylations of the resulting ester [
141,
142,
143,
144,
145]. RA is formed by hydroxylation catalyzed by 3-hydroxylase (3-H) and 3′-hydroxylase (3′-H) [
141,
146,
147,
148].
Caffeic acid is converted to caffeoyl-CoA, which is coupled by RAS with 4-hydroxyphenyllactic acid to form caffeoyl-4′-hydroxyphenyllactic acid, which is then hydroxylated by CYP98A14 to form RA [
140].
The biosynthetic pathway from rosmarinic acid to salvianolic acid B is still not fully understood. However, in a study by Di et al. [
140], the following synthetic route was suggested: salvianolic acid B is produced by direct polymerization of two molecules of rosmarinic acid, which involves a redox reaction catalyzed by an unknown oxidase. After performing a comprehensive analysis of key enzyme-encoding genes in the biosynthesis pathway of active ingredients in salvianolic acid, Xu et al. [
149] detected five laccase-encoding genes in the biosynthesis pathway of salvianolic acid. Two of these genes are closely related to the content of salvianolic acid and other macromolecules, such as salvianolic acid B. Therefore, they speculated that the conversion of rosmarinic acid to salvianolic acid is likely to be catalyzed by laccase in
Salvia miltiorrhiza [
11].
5.3. Caffeic Acid Glycoside
Acteoside is among the most widely distributed disaccharide caffeoyl esters, consisting of the following components: CA, glucose, rhamnose and hydroxytyrosol (3,4-dihydroxyphenylethanol, HT). At present, there is a general consensus on the potential metabolic modules of acteoside biosynthesis, which mainly include the phenylalanine metabolic pathway, dopamine pathway/tyramine pathway and downstream acyl transfer and glycosylation crossing pathway.
The intermediates of the dopamine/tyramine pathway, tyrosol and hydroxytyrosol, are further key precursors in the biosynthesis of acteoside. Both intermediates can generate hydroxytyrosol glucoside, the precursor of acteoside, which can be produced through different routes and represents an important branch of acteoside biosynthesis. In the dopamine pathway, tyrosine is converted to L-DOPA by polyphenol oxidase (PPO)/tyrosine hydroxylase (TH), and DOPA decarboxylase (DODC)/tyrosine decarboxylase (TyDC) catalyzes the production of dopamine from L-DOPA, which is subsequently converted to hydroxytyrosol by the action of copper amine oxidase (CuAO) and alcohol dehydrogenase (ALDH) [
150,
151,
152,
153,
154]. In the other pathway, dopamine is oxidized by copper amine oxidase (CuAO)/monoamine oxidase (MAO) to generate 3,4-dihydroxyphenylacetaldehyde (3,4-DHPAA), after which ALDH catalyzes the generation of hydroxytyrosol from 3,4-DHPAA [
153]. The tyramine pathway can provide tyrosol or hydroxytyrosol precursors for the acteoside biosynthesis pathway. Tyrosine is decarboxylated by TyDC to form tyramine, which is oxidized by CuAO/tyramine oxidase (TYO) to 4-hydroxyphenylacetaldehyde (4-HPAA). 4-HPAA can be reduced to tyrosol by 4-hydroxyphenylacetaldehyde reductase (4HPAR)/ALDH, and tyrosol is then converted to hydroxytyrosol by tyrosol hydroxylase (TLH) [
150,
151,
153,
155,
156].
4-Hydroxyphenylpyruvic acid (4-HPPDC) generates 4-HPAA by the action of 4-hydroxyphenylpyruvate decarboxylase (HPPADC) [
157,
158]. Torrens-Spence et al. [
159] first identified pyridoxal phosphate-dependent 4-hydroxyphenylacetaldehyde synthase (4HPAAS), which directly catalyzes the conversion of tyrosine to 4-HPAA.
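The routes from tyrosine to hydroxytyrosol described above can be summarized as a small directed graph. The following Python sketch is only an illustrative bookkeeping aid: the step list is transcribed from the text, whereas the graph encoding and the names `STEPS`, `build_graph` and `enumerate_routes` are assumptions introduced here and do not come from the cited studies.

```python
# Reaction steps as (substrate, product, enzyme label), transcribed from the text.
# The graph encoding and all function names are illustrative assumptions.
STEPS = [
    ("tyrosine", "L-DOPA", "PPO/TH"),
    ("L-DOPA", "dopamine", "DODC/TyDC"),
    ("dopamine", "3,4-DHPAA", "CuAO/MAO"),
    ("3,4-DHPAA", "hydroxytyrosol", "ALDH"),
    ("tyrosine", "tyramine", "TyDC"),
    ("tyramine", "4-HPAA", "CuAO/TYO"),
    ("4-HPAA", "tyrosol", "4HPAR/ALDH"),
    ("tyrosol", "hydroxytyrosol", "TLH"),
    ("4-hydroxyphenylpyruvic acid", "4-HPAA", "HPPADC"),
    ("tyrosine", "4-HPAA", "4HPAAS"),
]


def build_graph(steps):
    """Map each substrate to the list of (product, enzyme) pairs it can form."""
    graph = {}
    for substrate, product, enzyme in steps:
        graph.setdefault(substrate, []).append((product, enzyme))
    return graph


def enumerate_routes(graph, start, goal, visited=()):
    """Return every cycle-free metabolite sequence leading from start to goal."""
    if start == goal:
        return [[start]]
    seen = visited + (start,)
    routes = []
    for product, _enzyme in graph.get(start, []):
        if product in seen:
            continue  # never revisit a metabolite within one route
        for tail in enumerate_routes(graph, product, goal, seen):
            routes.append([start] + tail)
    return routes


ROUTES = enumerate_routes(build_graph(STEPS), "tyrosine", "hydroxytyrosol")

if __name__ == "__main__":
    for route in ROUTES:
        print(" -> ".join(route))
```

Under this encoding, three routes are found: one through dopamine, one through tyramine, and one in which 4HPAAS converts tyrosine directly to 4-HPAA.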
The structural analogs of acteoside share a glucose core that is esterified with a caffeoyl group and modified by a rhamnosyl group at the C3 position, but the molecular catalytic mechanism leading to their acylation and rhamnosylation has not been reported. From hydrolysis and metabolism experiments and the chemical structure of acteoside, it can be inferred that there are two possible routes for the synthesis of acteoside [
160,
161,
162]. In the first, caffeoyl-CoA and hydroxytyrosol glucoside generate derhamnosylacteoside under the action of HCT, and derhamnosylacteoside is then converted by UDP-rhamnose glucosyltransferase (URT) to acteoside. In the second, hydroxytyrosol is converted to hydroxytyrosol glucoside by UDP-glucose:glycosyltransferase (UGT), hydroxytyrosol glucoside is converted by URT to decaffeoylacteoside, and decaffeoylacteoside is finally condensed with caffeoyl-CoA by HCT to generate acteoside.
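The two proposed routes differ only in the order of acylation and rhamnosylation. The Python sketch below illustrates this under a deliberately crude assumption: each molecule is modelled as a set of building blocks, and each enzyme simply adds one block. The names `ROUTE_ACYLATION_FIRST`, `ROUTE_RHAMNOSYLATION_FIRST` and `run_route` are hypothetical, and the model carries no information about linkage positions or reaction mechanisms.

```python
# Molecules are modelled as sets of building blocks only; this is a
# bookkeeping toy under stated assumptions, not a chemical model.
ACTEOSIDE = frozenset({"hydroxytyrosol", "glucose", "rhamnose", "caffeoyl"})

# Route 1 in the text starts from hydroxytyrosol glucoside.
ROUTE_ACYLATION_FIRST = {
    "start": {"hydroxytyrosol", "glucose"},
    "steps": [("HCT", "caffeoyl"), ("URT", "rhamnose")],
}

# Route 2 in the text starts from hydroxytyrosol.
ROUTE_RHAMNOSYLATION_FIRST = {
    "start": {"hydroxytyrosol"},
    "steps": [("UGT", "glucose"), ("URT", "rhamnose"), ("HCT", "caffeoyl")],
}


def run_route(route):
    """Return the composition after the start and after each enzymatic step."""
    composition = set(route["start"])
    trace = [frozenset(composition)]
    for _enzyme, moiety in route["steps"]:
        composition.add(moiety)
        trace.append(frozenset(composition))
    return trace


if __name__ == "__main__":
    for name, route in [("acylation first", ROUTE_ACYLATION_FIRST),
                        ("rhamnosylation first", ROUTE_RHAMNOSYLATION_FIRST)]:
        trace = run_route(route)
        print(name, [sorted(c) for c in trace], trace[-1] == ACTEOSIDE)
```

Both routes end at the same composition, but the last intermediate lacks rhamnose in the first route and the caffeoyl group in the second, which is why the unverified catalytic order matters.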
6. Conclusions
Caffeic acid and its derivatives are widely distributed in plants, exhibit many physiological activities, undergo rapid metabolism, have a relatively simple chemical structure and are natural active ingredients with good application prospects. In this paper, the structural information and NMR data of 173 compounds (caffeic acid and its derivatives) are reviewed and compiled to summarize the patterns of chemical shifts and the effects of their neighboring and interstitial substituents for the seven major classes of compounds. From the statistical results, in general, acetylation of a hydroxyl group shifts the signal of its alkyl carbon (α-C) to low field (+2~4 ppm) and the signal of the neighboring carbon (β-C, relative to the γ-C of the acetyl group) to high field (−6~−2 ppm).
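The acetylation rule stated above can be applied as a simple screening calculation. The Python sketch below is a minimal illustration that uses only the two ranges given in the text (+2 to +4 ppm for the α-C, −6 to −2 ppm for the β-C); the function names and the numerical shifts in the usage lines are hypothetical and are not taken from the supplementary tables.

```python
# Shift ranges on acetylation of a hydroxyl group, as summarized in the text.
ALPHA_SHIFT_PPM = (2.0, 4.0)    # alpha-carbon moves to low field
BETA_SHIFT_PPM = (-6.0, -2.0)   # beta-carbon moves to high field


def expected_window(parent_shift_ppm, shift_range_ppm):
    """Return the (low, high) window expected for the acetylated compound."""
    low, high = shift_range_ppm
    return (parent_shift_ppm + low, parent_shift_ppm + high)


def consistent_with_acetylation(parent_alpha, parent_beta, obs_alpha, obs_beta):
    """Check whether both observed shifts fall inside the expected windows."""
    a_low, a_high = expected_window(parent_alpha, ALPHA_SHIFT_PPM)
    b_low, b_high = expected_window(parent_beta, BETA_SHIFT_PPM)
    return a_low <= obs_alpha <= a_high and b_low <= obs_beta <= b_high


if __name__ == "__main__":
    # Hypothetical parent shifts, chosen only to illustrate the arithmetic.
    print(expected_window(70.0, ALPHA_SHIFT_PPM))
    print(expected_window(75.0, BETA_SHIFT_PPM))
    print(consistent_with_acetylation(70.0, 75.0, 73.0, 71.0))
```

Such a check can only flag candidates for closer inspection, because the ranges are statistical generalizations rather than fixed increments.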
When a sugar forms a glycoside with an aglycone, the chemical shifts of the α-C and β-C of the aglycone and of the anomeric (end-group) carbon of the sugar change; this change is called the glycosylation shift. The value of the glycosylation shift is related to the structure of the aglycone but not to the type of sugar. If the aglycone is a chain structure, the glycosylation shift of the anomeric carbon of the sugar decreases in the order primary, secondary, tertiary for the aglycone hydroxyl group. In the glycoside, the chemical shift of the α-C, which is usually directly connected to the anomeric carbon, varies the most, the β-C is slightly affected, and the other carbon atoms are less affected. When a sugar and an alcoholic hydroxyl group form a glycoside, the anomeric carbon of the sugar shifts to lower field, and the magnitude of the shift is related to the type of alcohol in the aglycone.
The 13C-NMR and 1H-NMR shift characteristics observed for different substitution types of caffeic acid and its derivatives can provide a basis and reference for the future identification of these compounds; in addition, this information supports the prediction of new structures and provides strong evidence for the study of the metabolites of caffeic acid and its derivatives in vivo. 2D NMR spectroscopy also plays an important role in determining the structure of new phenolic acids. The application of C,H-COSY and C,H-COLOC enables rapid structural determination of new compounds and more accurate assignment of chemical shifts to individual protons and carbons.
The biosynthetic pathways of caffeic acid and its derivatives are also summarized in this paper. Caffeic acid and its derivatives are first synthesized in plants through the shikimic acid pathway, from which phenylalanine is deaminated to cinnamic acid and converted to caffeic acid. Three main downstream pathways are distinguished: first, rosmarinic acid and salvianolic acids are synthesized from 4-coumaric acid and 4-hydroxyphenyllactic acid as precursors; second, chlorogenic acids are synthesized from quinic acid and caffeic acid as precursors; and third, caffeic acid, tyrosol and hydroxytyrosol are used as precursors to synthesize caffeoyl glycosides such as acteoside.
However, because effective methods for mining gene elements and identifying gene function are lacking, progress in analyzing the biosynthesis of caffeic acid and its derivatives has been slow, and their biosynthetic pathways have not been fully elucidated. Of the pathways reviewed above, that of chlorogenic acid-like components is relatively clear. The biogenic pathway of salvianolic acids is not completely clear: only the upstream rosmarinic acid pathway has been partially clarified, and the origin of other phenolic acids downstream of rosmarinic acid, such as salvianolic acid B, is still at the preliminary stage of exploration and speculation. Although laccase has been identified and hypothesized to play an important role in the biosynthetic pathway of salvianolic acid B, the laccase gene has not been completely cloned or validated. Therefore, it is necessary to continue exploring the role of laccase in catalyzing the conversion of rosmarinic acid to salvianolic acid B and to elucidate its mechanism. The key elements and pathways of acteoside synthesis have also not been fully resolved. In particular, the acyltransferase and rhamnosyltransferase of the downstream acyl transfer and glycosylation pathways have not been reported, and the catalytic sequence of acylation and rhamnosylation has not been verified; this has hindered the biosynthesis of acteoside and requires further exploration.
At present, much research on caffeic acid derivatives is being carried out, and researchers are trying to find new caffeic acid derivative drugs with richer biological activities. With the deepening of research, the biosynthetic pathway system of caffeic acid and its derivatives will be increasingly clarified, and more natural drugs or synthetic drugs will be developed based on the caffeic acid derivative family.