Plant-Derived Caffeic Acid and Its Derivatives: An Overview of Their NMR Data and Biosynthetic Pathways
Molecules 2024, 29, 1625

In recent years, caffeic acid and its derivatives have received increasing attention due to their pronounced physiological activities and wide distribution in nature. In this paper, to clarify the status of research on plant-derived caffeic acid and its derivatives, nuclear magnetic resonance spectroscopy data and possible biosynthetic pathways of these compounds were collected from scientific databases (SciFinder, PubMed and China National Knowledge Infrastructure (CNKI)). According to the type of substituent, 17 caffeic acid and its derivatives can be divided into the following classes: caffeoyl ester derivatives, caffeyltartaric acid, caffeic acid amide derivatives, caffeoyl shikimic acid, caffeoyl quinic acid, caffeoyl danshensu and caffeoyl glycoside. Generalization of their 13C-NMR and 1H-NMR data revealed that acylation with caffeic acid to form esters involves acylation shifts, which increase the chemical shift value of the acylated carbon and decrease the chemical shift value of the corresponding caffeoyl carbon. Once the hydroxyl group is esterified, the signal of the hydrogen attached to the same carbon shifts to lower field (by δH 1.1~1.6). The biosynthetic pathways were summarized, and it was found that caffeic acid and its derivatives are first synthesized in plants through the shikimic acid pathway, in which phenylalanine is deaminated to cinnamic acid and then transformed into caffeic acid and its derivatives. The purpose of this review is to provide a reference for further research on the rapid structural identification and biofabrication of caffeic acid and its derivatives.


Introduction
Caffeic acid (CA), also known as 3,4-dihydroxycinnamic acid, is an organic compound that has two functional groups (phenolic hydroxyl and acrylic acid) [1]. Caffeic acid derivatives are a large class of compounds that contain caffeic acid structural units [2]. Caffeic acid and its derivatives are widely distributed in medicinal plants, vegetables and fruits [3]. As a safe and effective natural phenolic acid with a wide range of sources, caffeic acid exhibits many pharmacological effects, such as antioxidant [4], antibacterial [5], antiviral [6], antitumor [7], anti-inflammatory [8] and neuroprotective [9] effects and the ability to regulate blood glucose and blood lipids [10].
Because of these physiological activities and the wide distribution of the compounds in nature, this paper summarizes the structural and nuclear magnetic resonance (NMR) spectral features of plant-derived caffeic acid and its derivatives. The results provide a reference for the rapid structural identification of these compounds. The process of extracting these compounds from plants is complicated and affected by the plant growth cycle, climatic environment and other factors; thus, plants cannot provide stable raw materials for natural product extraction, which greatly limits large-scale production. Therefore, the biosynthetic pathways that generate caffeic acid and its derivatives are also summarized and found to mainly involve the shikimic acid pathway, from which phenylalanine is deaminated to cinnamic acid and then converted into caffeic acid [11].
Accordingly, 13C-NMR and 1H-NMR data (Tables S1-S18) and biosynthetic pathways (Figure 1) of 173 caffeic acid and its derivatives from plants, grouped by type of substituent (Figures 2-8 and Tables 1-7), were summarized to provide a reference for further research on the structural identification and biofabrication of caffeic acid and its derivatives.

Methodology
A comprehensive survey of the structural information, NMR data and biosynthetic pathways of caffeic acid and its derivatives was conducted by searching the scientific literature published in online databases (including PubMed, CNKI and SciFinder) and other sources (such as Ph.D. dissertations and M.Sc. theses). The search terms "caffeic acid", "caffeic acid derivatives", "caffeic acid and NMR", "caffeic acid derivatives and NMR", "caffeic acid and biosynthetic pathways" and "caffeic acid derivatives and biosynthetic pathways" were used for data collection. In total, 162 publications from 1984 to 2023 were included. EndNote was used to collate the published literature. To classify caffeic acid derivatives according to their structures, ChemDraw 20.0 software was used to draw chemical structures.

Structure and Classification of Caffeic Acid and Its Derivatives
In this paper, 1743 caffeic acid and its derivatives are compared. The skeletons of these caffeic acid derivatives can be classified into the following types according to the type of substituent: caffeoyl ester derivatives (Figure 2 and Table 1), caffeyltartaric acid (Figure 3 and Table 2), caffeic acid amide derivatives (Figure 4 and Table 3), caffeoyl shikimic acid (Figure 5 and Table 4), caffeoyl quinic acid (Figure 6 and Table 5), caffeoyl danshensu (Figure 7 and Table 6) and caffeoyl glycoside (Figure 8 and Table 7).
Caffeoyl ester derivatives are mainly formed by the esterification of caffeic acid with different alcohols. Caffeic acid amide derivatives are produced by the condensation reaction between caffeic acid and amino acids. Caffeoyl tartaric acids are produced by the condensation of tartaric acid and caffeic acid through esterification. Caffeoyl shikimic acid is formed from shikimic acid and caffeic acid by an esterification reaction. Caffeoyl quinic acids are a class of natural phenolic acids formed by the condensation of quinic acid with a varying number of caffeic acid units through esterification. Because the carboxyl group of caffeic acid readily acylates the three hydroxyl groups on the alicyclic ring of quinic acid, isomers are particularly abundant. Caffeoyl danshensu is formed by the esterification and condensation of caffeic acid and its hydrated product 3,4-dihydroxyphenyllactic acid. The main types of sugars in caffeoyl glycosides are glucose, rhamnose, xylose, furanose and glucuronic acid. The identification of the sugar type mainly depends on acid hydrolysis, gas chromatography-mass spectrometry (GC-MS), NMR and other techniques.

13C-NMR and 1H-NMR Data of Caffeic Acid and Its Derivatives
First, the number of caffeoyl groups is determined by 13C-NMR and 1H-NMR. The 1H-NMR spectrum (CD3OD, 500 MHz) of caffeic acid shows five proton signals. In the aromatic region, δH 6.99 (1H, d, J = 1.8 Hz), 6.84 (1H, dd, J = 1.8, 8.2 Hz) and 6.73 (1H, d, J = 8.2 Hz) are characteristic signals of the ABX system of the benzene ring. δH 7.27 (1H, d, J = 15.9 Hz) and 6.28 (1H, d, J = 15.9 Hz) are the characteristic signals of the two adjacent olefinic protons of a trans double bond. The chemical shifts of the α and β olefinic hydrogens on the side chain are strongly influenced by conjugation with the terminal carbonyl, with the α-H located at higher field (δH 6.2~6.5) and the β-H located at lower field (δH 7.4~7.7). There were nine carbon signals in the 13C-NMR spectrum [128]. The cis-alkenyl carbon is more abundant than the trans-alkenyl carbon. The cis or trans geometry of the side chain α,β double bond can be determined from the coupling constants of the olefinic protons. In the 1H-NMR, most compounds show trans-alkene signals at δH 7.61 and 6.35 (each 1H, d, J = 16.0 Hz, -CH=CH-), and a few show cis-alkene signals at δH 5.93 (1H, d, J = 12.8 Hz) and 7.11 (1H, d, J = 12.8 Hz). The α,β double bond is less stable in the cis form than in the trans form [129].
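The cis/trans assignment from the olefinic coupling constant can be sketched in a few lines of Python. This is only an illustration: the function name and the 14.0 Hz cutoff are assumptions chosen to separate the values quoted above (about 12.8 Hz for cis, about 16 Hz for trans) and are not taken from the cited references.

```python
def classify_alkene_geometry(j_hz: float, cutoff_hz: float = 14.0) -> str:
    """Return 'trans' or 'cis' for a -CH=CH- pair from its coupling constant.

    cutoff_hz is an assumed dividing line between the values quoted in the
    text (about 12.8 Hz for cis, about 16 Hz for trans).
    """
    if j_hz <= 0:
        raise ValueError("coupling constant must be positive")
    return "trans" if j_hz >= cutoff_hz else "cis"


# Coupling constants quoted in the text above.
for j in (15.9, 16.0, 12.8):
    print(f"J = {j} Hz: {classify_alkene_geometry(j)}")
```

Values close to the cutoff should be treated as undecided and checked against 2D NMR data.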
The structure of monoacyl compounds can be determined by 1D NMR spectroscopy. For disubstituted or more highly substituted compounds, a 2D NMR spectrum is needed to accurately localize the linkages. First, the parent nucleus is determined, and then the substituent position is determined. Generally, substitution occurs at the hydrogens of the 3- and 4-hydroxyl groups and the 9-carboxyl group of caffeic acid.

Caffeoyl Ester Derivatives
Table S1 shows the 13C-NMR and 1H-NMR data of caffeoyl ester derivatives. Examples are as follows. See Table 8 below.
a Data of 1 were from reference [12] and recorded in CD3OD.
b Data were from reference [13] and recorded in CD3OD.

1H-NMR Data Obtained for Caffeoyl Ester Derivatives
When C3-OH, C4-OH or C9-OH is esterified, there is little effect on the chemical shift values of H-2, H-5 or H-8. Because the inductive effect is transmitted through bonding electrons, its influence diminishes with increasing distance from the electronegative substituent, and effects more than three bonds away are usually negligible [130].

13C-NMR Data Obtained for Caffeoyl Ester Derivatives
When C3-OH (or C4-OH) undergoes esterification, -OCOCH3 acts as an electron-donating group, which increases the electron cloud density of C-3 and C-2 (or C-4 and C-5) and decreases the chemical shift values of these carbons. The α-position of the substituent is the most strongly affected, followed by the β-position, and the γ-position is shifted to higher field, which is caused by the γ-effect. In general, the inductive effect is negligible for carbons beyond the γ-position.
When C9-OH undergoes esterification, -O(CH2)nCH3 acts as an electron-donating group, which increases the electron cloud density of C-9 and C-8 and decreases the chemical shift values of these carbons.

Caffeyltartaric Acid
Table S2 shows the 13C-NMR and 1H-NMR data of caffeyltartaric acid. The example is as follows. See Table 9 below.

Caffeic Acid Amide Derivatives
Table S3 shows the 13C-NMR and 1H-NMR data of caffeic acid amide derivatives. The example is as follows. See Table 10 below.
Data were from reference [26] and recorded in MeOH-d4.

1H-NMR Data Obtained for Caffeic Acid Amide Derivatives
When C9-OH is amidated, there is little effect on the chemical shift value of H-8, because the inductive effect is transmitted through bonding electrons, its influence diminishes as the distance from the electronegative substituent increases, and effects more than three bonds away are usually negligible.

13C-NMR Data Obtained for Caffeic Acid Amide Derivatives
When C9-OH is amidated, the C-9 electron cloud density increases and the chemical shift value of this carbon decreases.

Caffeoyl Shikimic Acid
Table S4 shows the 13C-NMR and 1H-NMR data of caffeoyl shikimic acid. The example is as follows. See Table 11 below.
Data were from reference [29] and recorded in DMSO-d6.

Caffeoyl Quinic Acid
Tables S5-S8 show the 13C-NMR and 1H-NMR data of caffeoyl quinic acid. The example is as follows. See Table 12 below.
Data were from reference [43] and recorded in CD3OD.
Because it couples with H-3 and H-5, H-4 usually appears as a doublet of doublets. When C4-OH is not esterified, H-4 usually occurs between δH 3.7 and 4.1. When C4-OH is esterified, H-4 is shifted downfield by δH 1.0~1.6. Because H-4 appears at δH 5.11, compound 41 can be identified as a 4-O-caffeoyl-substituted dicaffeoylquinic acid. H-3 (or H-5) generally appears as a multiplet due to its coupling to H-2 (or H-6) and H-4. When C3-OH (or C5-OH) is not esterified, H-3 (or H-5) usually occurs between δH 4.0 and 4.6. When C3-OH (or C5-OH) is esterified, H-3 (or H-5) is shifted downfield by δH 1.0~1.6. The caffeoyl group at the C-3 position is on the axial bond, and H-4 and H-5 retain the ax-ax coupling of neighboring protons, so H-4 appears as a doublet of doublets. From these signals, compound 41 was identified as 3,4-dicaffeoylquinic acid [43].

1H-NMR Data Obtained for Caffeoyl Quinic Acid
If C1-OH is not esterified in the quinic acid parent nucleus, the chemical shifts of H-2 and H-6 are typically between δH 2.1 and 2.3 and appear as multiplets. When C1-OH is esterified, the resonance frequencies of the four hydrogens at C-2 and C-6 become significantly different, and they appear in the 1H spectrum as four doublets of doublets with different chemical shifts (δH 2.0~3.5). Because it couples with H-3 and H-5, H-4 usually appears as a doublet of doublets. When C4-OH is not esterified, H-4 usually occurs between δH 3.7 and 4.1. When C4-OH is esterified, H-4 is shifted downfield by δH 1.0~1.6. H-3 (or H-5) generally appears as a multiplet due to its coupling to H-2 (or H-6) and H-4. When C3-OH (or C5-OH) is not esterified, H-3 (or H-5) usually occurs between δH 4.0 and 4.6. When C3-OH (or C5-OH) is esterified, H-3 (or H-5) is shifted downfield by δH 1.0~1.6 [44].
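As a rough illustration of how these ranges can be used to screen for acylated positions, the following Python sketch flags a carbinol proton as esterified when its shift lies at least 1.0 ppm above the lower end of the free-hydroxyl range. The function name and the decision rule are assumptions made for this sketch. The H-4 value of 5.11 is the one quoted for compound 41, whereas the H-3 and H-5 values are hypothetical placeholders.

```python
# Free-hydroxyl ranges (ppm) quoted in the text for the quinic acid ring protons.
FREE_RANGES = {"H-3": (4.0, 4.6), "H-4": (3.7, 4.1), "H-5": (4.0, 4.6)}
MIN_ACYL_SHIFT = 1.0  # smallest downfield shift on esterification quoted in the text


def acylated_positions(shifts: dict[str, float]) -> list[str]:
    """Return the protons whose shift suggests an esterified hydroxyl group.

    Assumed rule: a proton is flagged when its shift is at least
    MIN_ACYL_SHIFT ppm above the lower end of its free-hydroxyl range.
    """
    flagged = []
    for proton, delta in shifts.items():
        low, _high = FREE_RANGES[proton]
        if delta >= low + MIN_ACYL_SHIFT:
            flagged.append(proton)
    return flagged


# H-4 = 5.11 is quoted for compound 41; H-3 and H-5 are hypothetical values.
example = {"H-3": 5.60, "H-4": 5.11, "H-5": 4.30}
print(acylated_positions(example))
```

Such a screen only narrows down the candidates; the linkage positions still need to be confirmed by 2D NMR.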
Once the hydroxyl group becomes an ester, the signals of the hydrogens attached to the same carbon are shifted downfield by δH 1.1~1.6, and the effect is more pronounced at the 5-position than at the 3-position. For molecules with two acyl groups, the downfield shift of the hydrogen signal is more obvious, which may result from an additive effect. Acylation of caffeoyl quinic acid generally also shifts the two trans-alkene hydrogens (H-7′ and H-8′) on the caffeic acid unit to lower field.
Coupling constants are also important in structural inference, especially for stereochemical and conformational problems. For example, when the acyl group is on the ax bond, the coupling constants of the two neighboring hydrogens in the eq-eq or eq-ax arrangement are 2~3 Hz. When the acyl group is on the eq bond, the coupling constant of neighboring ax-ax hydrogens is 10 Hz, and that of the eq-ax arrangement is 5 Hz [129].

13C-NMR Data Obtained for Caffeoyl Quinic Acid
The 13C-NMR spectrum shows two carbonyl carbons (δC 165~175). The chemical shifts of C-2 and C-6 in the quinic acid unit are usually between δC 30 and 40. The chemical shifts of C-1, C-3, C-4 and C-5 are within δC 60~80 due to hydroxyl substitution [129].
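The 13C ranges quoted for the quinic acid unit lend themselves to a simple lookup. The Python sketch below is a minimal illustration; the table name, function name and example shifts are hypothetical, and only the three ranges come from the text.

```python
# 13C ranges (ppm) quoted in the text for the quinic acid part of the molecule.
REGIONS = [
    ((165.0, 175.0), "carbonyl carbon"),
    ((60.0, 80.0), "hydroxyl-bearing carbon (C-1, C-3, C-4 or C-5)"),
    ((30.0, 40.0), "methylene carbon (C-2 or C-6)"),
]


def assign_region(delta_c: float) -> str:
    """Map a 13C shift to the quinic acid region it falls in, if any."""
    for (low, high), label in REGIONS:
        if low <= delta_c <= high:
            return label
    return "outside the quinic acid ranges quoted here"


# Hypothetical shifts, used only to show the lookup.
for delta in (38.0, 72.5, 168.4, 115.2):
    print(delta, "->", assign_region(delta))
```

Signals that fall outside these ranges, such as the aromatic and olefinic carbons of the caffeoyl unit, are deliberately left unassigned by this lookup.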
The C1-OH, C3-OH, C4-OH and C5-OH groups of the quinic acid fragment can be acylated with caffeic acid to form esters, which gives rise to acylation shifts: the chemical shift value of the acylated carbon increases and that of the caffeoyl C-9 (or C-9′) decreases. If the chemical shift values of H-2 and H-6 and of C-2 and C-6 are very similar, the molecule may be symmetric. When a methoxy group binds to the C-7 carbonyl group of quinic acid to form an ester, the C-7 carboxyl signal shifts to higher field.

Caffeoyl Danshensu
Tables S9-S11 show the 13C-NMR and 1H-NMR data of caffeoyl danshensu. As a basis for spectral analysis, the spectral characteristics of two different structural types of compounds, rosmarinic acid and prolithospermic acid, are described below.
The structure of rosmarinic acid (74) is characterized by the absence of a substituent at the 2-position of the caffeoyl group; thus, the aromatic protons of caffeoyl (H-2, H-5 and H-6) appear as a 1,2,4-trisubstituted coupling system. A group of danshensu side chain protons in the high field region δH 2.8~5.0 forms an ABX spin system (H2-7, H-8), which constitutes an important feature of the 1H spectrum of these compounds. The relevant characteristic peaks are also easy to find in the 13C spectrum. Each unit of these compounds contains an o-diphenol group, and four oxygenated aromatic quaternary carbons appear in the δC 140~150 interval; the degree of polymerization is therefore two. The CH2 peak at δC 38.1 and the CH peak at δC 78.4 indicate that the dimer contains the structural unit of 3,4-dihydroxyphenyllactic acid.
The structural difference between prolithospermic acid (91) and rosmarinic acid is that prolithospermic acid contains a dihydrofuran ring. The absolute configurations of the two chiral carbons of the dihydrofuran ring are R and S. In its high-resolution spectrum, the chemical shifts of the characteristic alicyclic hydrogens of the dihydrofuran ring are δH 5.85 and δH 4.31 (H-7, H-8), with a coupling constant of approximately 4.0 Hz in 1H-NMR. In 13C-NMR, the CH peak (C-7) is at δC 88.0 and the CH peak (C-8) is at δC 57.2.
The trimers and tetramers among the salvianolic acids show the spectral characteristics of the above two condensation modes. In the analysis of the structure, the degree of polymerization is first determined from the number of oxygenated aromatic quaternary carbons (δC 140~150) in the 13C spectrum, and then the characteristic peaks of the high field region in the 1H or 13C spectrum are used to determine the polymerization mode.
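The counting rule for the degree of polymerization can be illustrated with a short Python sketch. It assumes that the input list already contains only quaternary carbon shifts (for example, after excluding CH carbons with a DEPT experiment) and that each unit contributes two oxygenated aromatic carbons from its o-diphenol group; the function name and the example shifts are hypothetical.

```python
def degree_of_polymerization(quaternary_shifts, window=(140.0, 150.0), per_unit=2):
    """Estimate the number of phenylpropanoid units from 13C data.

    quaternary_shifts: shifts (ppm) of quaternary carbons only (assumption:
        CH carbons in the same window have already been excluded).
    window: range for oxygenated aromatic quaternary carbons, from the text.
    per_unit: assumed number of such carbons per unit (one o-diphenol group).
    """
    low, high = window
    n = sum(1 for delta in quaternary_shifts if low <= delta <= high)
    if n % per_unit != 0:
        raise ValueError(f"{n} carbons in window is not a multiple of {per_unit}")
    return n // per_unit


# Hypothetical quaternary carbon list for a dimer such as rosmarinic acid:
# four oxygenated aromatic carbons plus other quaternary carbons.
shifts = [127.5, 129.3, 144.9, 145.2, 145.8, 148.6, 168.4, 173.5]
print(degree_of_polymerization(shifts))
```

For the dimer described above, four signals in the window give a degree of polymerization of two, in line with the text.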
Examples are as follows. See Table 13 below.
a Data were from reference [55] and recorded in CD3OD.
b Data were from reference [68] and recorded in (CD3)2CO-D2O (10:1).

1H-NMR Data Obtained for Caffeoyl Danshensu
The danshensu part shows two benzylic hydrogen signals (δH 3.0~3.5) and the signal of the proton on the carbon that bears the acyloxy group (δH 5.18~5.33); the latter is split by the benzylic hydrogens into a doublet of doublets (J = 7 Hz and 4 Hz). The chemical shifts of the aromatic hydrogens in danshensu generally occur at δH 6.7~6.9, and the splitting is insignificant when the hydroxyl group is methylated.
Usually, when the C-9 carboxyl group of caffeic acid forms an ester with the C8-OH of danshensu, the chemical shift value of H-8 moves to lower field, and caffeoyl C-2 and C-3 are connected with dihydrofuran rings.

Caffeoyl Glycoside
Tables S12-S18 show the 13C-NMR and 1H-NMR data of caffeoyl glycoside. Caffeoyl glycosides generally contain one to four sugar units. The characteristic anomeric (end-group) proton signals at δH 4.3~6.0 and anomeric carbon signals at δC 95~105 can be used to initially determine the number of sugars. Furthermore, 2D NMR techniques, such as heteronuclear multiple quantum coherence (HMQC), 1H-detected heteronuclear multiple bond correlation (HMBC) and total correlation spectroscopy (TOCSY), are used to determine the type of sugar and to assign the sugar signals.
The example is as follows. See Table 14 below.
Data were from reference [90] and recorded in CD3OD.
In the HMBC spectra, the correlation between δH 5.03 (H-5 ) and δC 169.1 (C-9), δC 60.4 (C-6 ), δC 36.9 (C-4 ) and δC 103.1 (C-3 ) indicates that the caffeoyl group is adjacent to the 5 -OH. In addition, the HMBC spectra also showed that the 1H NMR signal at δH 5.16 (H-1 ) correlated with the 13C NMR signals at δC 142.5 (C-2 ), δC 67.0 (C-7 ) and δC 43.3 (C-8 ). The 1H NMR signal at δH 4.80 (H-1 ) correlated with the 13C NMR signal at δC 95.2 (C-1 ), indicating that the sugar residue is attached to 1 -OH. All 1H and 13C NMR signals of compound 121 were assigned by 1H-1H COSY, HSQC and HMBC spectra. The ROESY spectra and coupling constants were analyzed to determine the relative configuration of compound 121. The large coupling constant (15.6 Hz) between H-7 and H-8 indicated that the caffeoyl portion has the E-configuration. The NMR chemical shift values of compound 121, combined with the GC analysis of the sugar obtained by acid hydrolysis against D-glucose, showed that the hexose part was D-glucose. The large coupling constant (7.8 Hz) of 3J H-1 ,H-2 indicates that the glucosyl unit is β-oriented. Based on the above inferences, compound 121 was identified as verminoside [90].

1H-NMR Data Obtained for Caffeoyl Glycoside
The type of sugar in caffeoyl glycosides can be determined by the chemical shift and coupling constant observed for the characteristic anomeric (end-group) hydrogen signal of the sugar. In general, the anomeric proton signals of sugars in 1H-NMR appear at approximately δH 5.0 (δH 4.3~6.0, 1H, d), for example glucose at δH 4.2~4.4 (d, J = 8.0 Hz) and rhamnose at δH 5.1~5.3 (d, J = 1.0 Hz). Most compounds show characteristic doublets, while a few show broad singlets. The sugar ring proton signals lie between δH 3.5 and 4.5. The methyl proton signal of methyl pentoses (such as rhamnose) is at approximately δH 1.0. The signals of the anomeric and methyl protons are far away from other signals and can be easily recognized, so the number of sugars, the types of sugars and the location of the connections can be inferred.
The relative configuration of the glycosidic bond is determined by 1H-NMR from the coupling constant between C1-H and C2-H. In most monosaccharides, such as glucose, and their glycosides, H-2 is located on the axial bond; when the oxygen on the anomeric carbon is β-oriented, the dihedral angle between the anomeric proton and H-2 is 180°, and the 3JH1,H2 value is approximately 6~8 Hz. For the α-configuration, the dihedral angle is 60°, and the 3JH1,H2 values are from 1 to 3 Hz. The anomeric configuration of a pyranose with H-2 on the axial bond can therefore be determined from the 3JH1,H2 value of the anomeric hydrogen measured in the 1H-NMR spectrum. However, in rhamnosides, differentiation through the 3JH1,H2 values is impossible because H-2 is located on the equatorial bond, and the dihedral angle between the two protons is 60° in both the α and β configurations. For furanoses, regardless of whether the anomeric proton and the C2 proton are cis or trans, the J value does not change much (it remains within 0~5 Hz), so the configuration of the glycosidic bond cannot be judged.

13C-NMR Data Obtained for Caffeoyl Glycoside
Type and Amount of Sugar
The diversity of caffeoyl glycosides arises from the type of aglycone and the sugar moiety, as there are differences in the number of sugars, the types of sugars, the way the sugars are connected to each other and the way the sugars are connected to the aglycones.
The chemical shift of the methyl carbon of the sugar is around δC 18, and the presence of multiple signals (excluding the methyl groups of the aglycone) can indicate the presence of several methyl pentoses. CH2OH is at approximately δC 62, and CHOH is at approximately δC 68~85. The carbon signals of a furanose ring appear at lower field than those of a pyranose ring, which makes it possible to distinguish the size of the sugar ring. The anomers of glycosides, such as those of glucose, show large differences in the chemical shift values of the anomeric carbons, and the relative configuration (α or β) of the sugar can be determined from the chemical shift value of the anomeric carbon. In common sugars, the anomeric carbon chemical shift of β-D and α-L glycosides is usually greater than δC 100. For ester glycosides, tertiary alcohol glycosides and individual phenolic glycosides, the chemical shift values can drop to δC 98. The anomeric carbon chemical shift values for α-D- and β-L-type glycosidic bonds are usually less than δC 100. Therefore, the number of sugars and the configuration of the glycosidic bonds in oligosaccharides and glycosides can be roughly inferred from the number of carbon signals and the chemical shift values in the δC 95~105 region [132].
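The rules of thumb for the anomeric (end-group) signals, taken from the 1H-NMR and 13C-NMR discussion of caffeoyl glycosides, can be collected in a small Python sketch. The function names are hypothetical and the numeric windows are the ones quoted in the text. The coupling-constant rule is assumed to apply only to pyranoses with H-2 on the axial bond, and the carbon rule is only rough because ester, tertiary alcohol and some phenolic glycosides can appear near δC 98.

```python
def anomeric_config_from_j(j_h1_h2_hz: float) -> str:
    """Configuration from 3J(H1,H2) for a pyranose with axial H-2 (e.g. glucose).

    Not valid for rhamnosides or furanoses, where J does not discriminate.
    """
    if 6.0 <= j_h1_h2_hz <= 8.0:
        return "beta"
    if 1.0 <= j_h1_h2_hz <= 3.0:
        return "alpha"
    return "undetermined"


def count_anomeric_carbons(carbon_shifts, window=(95.0, 105.0)) -> int:
    """Count 13C signals in the anomeric window as a first estimate of sugar number."""
    low, high = window
    return sum(1 for delta in carbon_shifts if low <= delta <= high)


def anomeric_series_from_carbon(delta_c: float) -> str:
    """Rough series assignment from the anomeric carbon shift (100 ppm dividing line)."""
    return "beta-D or alpha-L" if delta_c > 100.0 else "alpha-D or beta-L"


# 7.8 Hz is the coupling constant quoted for compound 121; the carbon list is hypothetical.
print(anomeric_config_from_j(7.8))
print(count_anomeric_carbons([104.2, 102.9, 62.5, 18.4]))
print(anomeric_series_from_carbon(104.2))
```

Both estimates are starting points that still need to be confirmed by 2D NMR and hydrolysis data.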
Determining the Binding Position of Sugar (the Glycosylation Position)
Currently, 13C NMR methods are often used to determine the location of sugar linkages in caffeoyl glycosides, which primarily involves assigning the signals to individual carbons in order to identify the carbon that shows the glycosylation shift. In practice, the assignment of chemical shifts is mainly based on comparison with analogs and on reasonable prediction by the rules of glycosylation shifts, and the selected reference compounds are generally free aglycones and methyl glycosides.
The linkage between sugars and aglycones in caffeoyl glycosides is formed between the hydroxyl groups of sugars and aglycones. Where a carboxyl group of the sugar or the aglycone is involved, an ester bond is formed. When a hydroxyl group is glycosylated, the carbon that bears it generally shifts downfield by δC 8~10, and the values of the neighboring carbons are also affected. Glycosylation at the linkage position between sugars generally shifts the signal downfield by approximately δC 3~8. However, when sugars form ester glycosides with carboxyl groups, the glycosylation shift value is high, the glycosylation shift of the carboxyl carbon is approximately 2 ppm, and the anomeric carbon of the sugar is generally shifted to δC 95~96. When sugars form glycosides with carboxyl groups, phenolic hydroxyl groups and enol hydroxyl groups, the glycosylation shift is unusual: the α-C shifts to higher field by 0~4 ppm, and the β-C shifts to lower field. The anomeric carbon of the sugar is displaced to lower field in phenolic and enol glycosides and to higher field in ester glycosides, with small displacements (0~4 ppm). Typically, acetylation of a hydroxyl group shifts its alkyl carbon (α-C) signal to lower field (+2~+4 ppm) and the neighboring carbon (β-C, which is the γ-C with respect to the acetyl group) signal to higher field (−6~−2 ppm).
To determine the position of the linkage between the two monosaccharides in a disaccharide glycoside, the 13C spectral data of the disaccharide glycoside are compared with the 13C spectral data of the corresponding monosaccharide. If the chemical shift of a carbon atom of the inner sugar is shifted downfield (usually by 4~7 ppm) and the chemical shifts of its two neighboring carbon atoms are slightly shifted upfield (by approximately 1~2 ppm), this carbon atom of the inner sugar is the linkage position of the sugar.
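The comparison described above can be sketched in Python as follows. The function name, the carbon labels and all shift values are invented for illustration and are not data from the reviewed compounds. The sketch flags a carbon when its shift rises by 4~7 ppm relative to the monosaccharide reference while its ring neighbors move upfield.

```python
RING_ORDER = ["C-1", "C-2", "C-3", "C-4", "C-5", "C-6"]


def find_linkage_carbons(glycoside, reference, downfield=(4.0, 7.0)):
    """Return inner-sugar carbons that show a glycosylation shift.

    glycoside / reference: dicts mapping carbon labels to 13C shifts (ppm)
    for the inner sugar of the disaccharide and for the monosaccharide.
    A carbon is reported when it moves downfield by an amount inside
    `downfield` and every neighboring carbon moves upfield.
    """
    delta = {c: glycoside[c] - reference[c] for c in RING_ORDER}
    hits = []
    for i, carbon in enumerate(RING_ORDER):
        neighbors = [RING_ORDER[j] for j in (i - 1, i + 1) if 0 <= j < len(RING_ORDER)]
        if downfield[0] <= delta[carbon] <= downfield[1] and all(
            delta[n] < 0 for n in neighbors
        ):
            hits.append(carbon)
    return hits


# Invented shifts: a hexose reference and the same sugar substituted at C-3.
reference = {"C-1": 104.0, "C-2": 75.0, "C-3": 77.8, "C-4": 71.4, "C-5": 78.0, "C-6": 62.5}
glycoside = {"C-1": 104.0, "C-2": 74.0, "C-3": 83.0, "C-4": 70.0, "C-5": 78.0, "C-6": 62.5}
print(find_linkage_carbons(glycoside, reference))
```

In practice the reference data should be recorded in the same solvent as the glycoside, since solvent effects can be of the same size as the small upfield shifts of the neighboring carbons.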
To assign the signals of individual carbon and hydrogen atoms, spectroscopic techniques such as HMBC and nuclear Overhauser effect spectroscopy (NOESY) are used to infer the linkage order and linkage position of the sugar chain by observing the long-range C-H or H-H correlations.

Possible Biosynthetic Pathways for the Generation of Caffeic Acid and Its Derivatives
Most of the phenolics in higher plants are synthesized by the shikimic acid pathway. Carbon dioxide is converted by plant photosynthesis into primary carbon metabolites, glucose and some other carbohydrates. Through glycolysis and other routes, these primary metabolites generate erythrose and phosphoenolpyruvate, which are converted by the relevant enzymes into shikimic acid and then into phenylalanine, tyrosine, tryptophan and other aromatic amino acids [133]. Phenylalanine is converted into cinnamic acid by phenylalanine ammonia-lyase (PAL); cinnamic acid is converted into 4-coumaric acid by cinnamic acid 4-hydroxylase (C4H), and 4-coumaric acid is converted into 4-coumaroyl-CoA by 4-coumarate:coenzyme A ligase (4CL) [134,135]. C3H catalyzes the formation of caffeic acid from coumaric acid [136].
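The enzymatic steps named above can be written down as a small directed graph, which makes the order of the reactions explicit. The Python sketch below is only an illustration of that bookkeeping: the data structure and function name are hypothetical, and the step list contains only the conversions stated in the text.

```python
from collections import deque

# (substrate, enzyme, product) steps as stated in the text.
STEPS = [
    ("phenylalanine", "PAL", "cinnamic acid"),
    ("cinnamic acid", "C4H", "4-coumaric acid"),
    ("4-coumaric acid", "4CL", "4-coumaroyl-CoA"),
    ("4-coumaric acid", "C3H", "caffeic acid"),
]


def route(start: str, goal: str):
    """Breadth-first search for the enzyme sequence leading from start to goal.

    Returns a list of (enzyme, product) pairs, or None if no route exists.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, steps = queue.popleft()
        if compound == goal:
            return steps
        for substrate, enzyme, product in STEPS:
            if substrate == compound and product not in seen:
                seen.add(product)
                queue.append((product, steps + [(enzyme, product)]))
    return None


for enzyme, product in route("phenylalanine", "caffeic acid"):
    print(f"{enzyme} -> {product}")
```

Further branches, such as the routes to chlorogenic acid or rosmarinic acid, could be added as extra entries in the step list.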
The specific biosynthetic pathways contain the following main pathways: first, rosmarinic acid and salvianolic acids are synthesized from 4-coumaric acid and 4-hydroxyphenyllactic acid as precursors; second, chlorogenic acids are synthesized from quinic acid and caffeic acid as precursors; and third, caffeic acid, tyrosol and hydroxytyrosol are used as precursors to synthesize caffeoyl glycosides, such as acteoside (Figure 1).

Caffeoyl Ester Derivatives, Caffeyltartaric Acid, Caffeic Acid Amide Derivatives, Caffeoyl Shikimic Acid and Caffeoyl Quinic Acid
Three biosynthetic pathways have been proposed by which 4-coumaroyl-CoA is converted into chlorogenic acid (CGA), and these pathways are still debated. In the first pathway, hydroxycinnamoyl CoA shikimate hydroxycinnamoyl transferase (HCT) catalyzes the reaction of 4-coumaroyl-CoA with shikimic acid to produce 4-coumaroyl shikimic acid ester, which is further converted into caffeoyl shikimic acid and finally into caffeoyl-CoA; hydroxycinnamoyl CoA quinate hydroxycinnamoyl transferase (HQT) then catalyzes the transesterification of caffeoyl-CoA and quinic acid to synthesize CGA. The second pathway suggests that CGA is derived from quinic acid and caffeoyl D-glucose and is catalyzed by hydroxycinnamoyl D-glucose:quinate hydroxycinnamoyl transferase (HCGQT). In the third pathway, p-coumaroyl quinic acid is produced through catalysis by HCT, and CGA is then produced by p-coumarate 3-hydroxylase (C3H) hydroxylation [137,138].
The biosynthesis of chicoric acid involves a two-step process. In the cytosol, two BAHD acyltransferases, EpHTT and EpHQT, catalyze the production of the caftaric acid and chlorogenic acid intermediates, respectively. Both compounds are transported to the vacuole, where EpCAS catalyzes the formation of chicoric acid [139].
However, the biosynthetic pathway that generates caffeoyl ester derivatives and caffeic acid amide derivatives in nature is not well understood.
Caffeic acid is converted into caffeoyl-CoA, which is then coupled with 4-hydroxyphenyllactic acid by RAS to form caffeoyl-4′-hydroxyphenyllactic acid, and this intermediate is converted by CYP98A14 into rosmarinic acid (RA) [140].
The biosynthetic pathway from rosmarinic acid to salvianolic acid B is still not fully understood. However, in a study by Di et al. [140], the following synthetic route was suggested: salvianolic acid B is produced by direct polymerization of two molecules of rosmarinic acid, which involves a redox reaction catalyzed by an unknown oxidase. After a comprehensive analysis of the key enzyme-encoding genes in the biosynthesis pathway of salvianolic acids, Xu et al. [149] found that five genes encoding laccases were detected in this pathway. Among them, two genes are closely related to the content of salvianolic acid and other macromolecules, such as salvianolic acid B. Therefore, they speculate that the conversion of rosmarinic acid into salvianolic acid is likely to be catalyzed by laccase in Salvia miltiorrhiza [11].

Caffeic Acid Glycoside
Acteoside is among the most widely distributed disaccharide caffeoyl esters and consists of the following components: CA, glucose, rhamnose and hydroxytyrosol (3,4-dihydroxyphenylethanol, HT). At present, there is a general consensus on the potential metabolic modules of acteoside biosynthesis, which mainly include the phenylalanine metabolic pathway, the dopamine pathway/tyramine pathway and the downstream acyl transfer and glycosylation pathways.
Acteoside and its structural analogs share a central glucose that is esterified with a caffeoyl group and modified by a rhamnosyl group at the C3 position, but the molecular catalytic mechanism leading to this acylation and rhamnosylation has not been reported. From the hydrolysis and metabolism experiments and the chemical structure of acteoside, two potential routes for its synthesis can be inferred [160][161][162]. In the first, caffeoyl-CoA and hydroxytyrosol glucoside generate derhamnosylacteoside under the action of HCT, and derhamnosylacteoside is converted by UDP-rhamnose glucosyltransferase (URT) into acteoside. In the second, hydroxytyrosol is converted into hydroxytyrosol glucoside by UDP-glucose:glycosyltransferase (UGT), hydroxytyrosol glucoside is converted by URT into decaffeoylacteoside, and decaffeoylacteoside is finally condensed with caffeoyl-CoA by the action of HCT to generate acteoside.

Conclusions
Caffeic acid and its derivatives are widely distributed in plants, exhibit many physiological activities, undergo rapid metabolism, have a relatively simple chemical structure and are natural active ingredients with good application prospects.In this paper, the structural information and NMR data of 1743 caffeic acid and its derivatives are reviewed and compiled to summarize the patterns of chemical shifts and the effects of their neighboring and interstitial substituents for the seven major classes of compounds.From the statistical results, in general, the acetylation of the hydroxyl group will shift its alkyl carbon (α-C) signal to the low field (+2~4 ppm) and its neighboring carbon (β-C, relative to the γ-C of the acetyl group) signal to the high field (−6~−2 ppm).
After a sugar forms a glycoside with an aglycone, the chemical shift values of the α-C and β-C of the aglycone and of the anomeric (end-group) carbon of the sugar change; this change is called the glycosylation shift. The value of the glycosylation shift is related to the structure of the aglycone but not to the type of sugar. If the aglycone is a chain structure, the glycosylation shift of the anomeric carbon decreases as the aglycone alcohol goes from primary to secondary to tertiary. In the glycosylated molecule, the chemical shift of the α-C, which is usually linked directly to the anomeric carbon, varies the most, the β-C is slightly affected, and the other carbon atoms are less affected. When a sugar and an alcoholic hydroxyl group form a glycoside, the anomeric carbon shifts to a lower field, and the magnitude of the shift is related to the type of alcohol in the aglycone.
The 13C-NMR and 1H-NMR shift characteristics observed for the different substitution types of caffeic acid and its derivatives can provide a basis and reference for their future identification; in addition, this information supports the prediction of new structures and provides strong evidence for the study of the metabolites of caffeic acid and its derivatives in vivo. 2D NMR spectroscopy plays an important role in determining the structure of new phenolic acids. The application of C,H-COSY and C,H-COLOC enables rapid structural determination of new compounds and more accurate assignment of chemical shifts to individual protons and carbons.
The biosynthetic pathways of caffeic acid and its derivatives are also summarized in this paper. Both caffeic acid and its derivatives are first synthesized in plants through the shikimic acid pathway, in which phenylalanine is deaminated to cinnamic acid and then converted to caffeic acid. Three main specific pathways follow: first, rosmarinic acid and salvianolic acids are synthesized from 4-coumaric acid and 4-hydroxyphenyllactic acid as precursors; second, chlorogenic acids are synthesized from quinic acid and caffeic acid as precursors; and third, caffeic acid, tyrosol and hydroxytyrosol are used as precursors to synthesize caffeoyl glycosides such as acteoside.
However, because effective methods for mining gene elements and identifying gene function are lacking, progress in the analysis of caffeic acid and its derivatives has been slow, and their biosynthetic pathways have not been fully elucidated. Of the pathways reviewed above, that of the chlorogenic acid-like components is relatively clear. The biogenic pathway of salvianolic acids is not completely clear, and only the upstream rosmarinic acid pathway has been partially clarified. The origin of other phenolic acids downstream of rosmarinic acid, such as salvianolic acid B, is not completely clear and remains at the preliminary stage of exploration and speculation. Although laccase has been identified and hypothesized to play an important role in the biosynthetic pathway of salvianolic acid B, the laccase gene has not been completely cloned or validated. Therefore, it is necessary to continue exploring the role of laccase in catalyzing the conversion of rosmarinic acid to salvianolic acid B and to elucidate its mechanism. The key elements and pathways of acteoside synthesis have also not been fully resolved: the acyltransferase and rhamnosyltransferase in the downstream acyl transfer and glycosylation pathways have not been reported, and the catalytic sequence of acylation and rhamnosylation has not been verified. These gaps have hindered the biosynthesis of acteoside and require further exploration.
At present, much research on caffeic acid derivatives is being carried out, and researchers are trying to find new caffeic acid derivative drugs with richer biological activities. As research deepens, the biosynthetic pathways of caffeic acid and its derivatives will be increasingly clarified, and more natural or synthetic drugs will be developed based on the caffeic acid derivative family.
Author Contributions: J.Y. collated documents and wrote the manuscript; J.X. and L.L. collaborated with the selection, preparation, and revision of the manuscript; S.X., M.S. and C.X. polished the language; C.L., M.L. and Z.Z. collaborated in the revision of the manuscript. All authors have read and agreed to the published version of the manuscript.

4.4.1. 1H-NMR Data Obtained for Caffeoyl Shikimic Acid
Shikimic acid C3-OH, C4-OH and C5-OH can be acylated with caffeic acid C9-COOH to form esters. Once the hydroxyl group is esterified, the signal of the hydrogen attached to the same carbon shifts to the low field by 1.1~1.6 ppm.

4.4.2. 13C-NMR Data Obtained for Caffeoyl Shikimic Acid
Shikimic acid C3-OH, C4-OH and C5-OH can be acylated with caffeic acid C9-COOH to form esters; the chemical shift of the shikimic acid carbon (C3, C4 or C5) directly linked to the caffeoyl group shifts to the low field, and the chemical shift of the caffeoyl C9 shifts to the high field.
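The 1H-NMR criterion for caffeoyl shikimic acid described above (a downfield shift of 1.1~1.6 for the proton on the esterified carbon) can be sketched as a lookup that compares an ester with its parent. The function name `acylated_positions`, the `tol` parameter and all shift values in the usage example are hypothetical placeholders chosen for illustration; they are not measured or literature data.

```python
from typing import Dict, List

# Downfield window for the proton on an esterified carbon, from the text.
DOWNFIELD_MIN = 1.1
DOWNFIELD_MAX = 1.6


def acylated_positions(parent: Dict[str, float],
                       observed: Dict[str, float],
                       tol: float = 0.05) -> List[str]:
    """Return positions whose proton moved downfield by 1.1~1.6 (within tol)."""
    hits: List[str] = []
    for position, reference in parent.items():
        if position not in observed:
            continue  # no matching signal to compare
        delta = observed[position] - reference
        if DOWNFIELD_MIN - tol <= delta <= DOWNFIELD_MAX + tol:
            hits.append(position)
    return sorted(hits)


if __name__ == "__main__":
    # Placeholder shifts only; replace with assigned values from real spectra.
    parent = {"H-3": 4.40, "H-4": 3.70, "H-5": 4.00}
    ester = {"H-3": 4.45, "H-4": 3.85, "H-5": 5.25}
    print(acylated_positions(parent, ester))
```

In practice both spectra would need to be recorded in the same solvent for the difference to be meaningful.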

4.6.2. 13C-NMR Data Obtained for Caffeoyl Danshensu
When danshensu C8-OH forms an ester with caffeic acid C9-COOH, the C-5, C-6, C-7, C-8 and C-9 chemical shift values shift to the high field, and the caffeoyl C-8 and C-9 chemical shift values also shift to the high field. When caffeoyl C-2 and C-3 are linked to a dihydrofuran ring, the chemical shift values of C-2 and C-3 shift to the low field.
For the furanose ring, CH-OH (C3, C5) >80 ppm; for the pyranose ring, CH-OH (C3, C5) <78 ppm. Most of the anomeric (end-group) carbon signals of glycosides lie between δC 95 and 105, such as those of glucose and rhamnose with δC 105.1 and δC 103.8, respectively. The presence of several such signals indicates several sugars in the repeating units of the sugar chain; most of the sugar signals can be assigned by comparison with similar sugars or glycoside derivatives.
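The sugar-signal rules above lend themselves to a small screening helper. The sketch below is illustrative only: the function names and the `tol` parameter are our assumptions, while the thresholds (>80 ppm, <78 ppm, 95 to 105 ppm) and the two example anomeric shifts (105.1 and 103.8) are taken from the text.

```python
from typing import Iterable

# Window for anomeric (end-group) carbon signals, in ppm, from the text.
ANOMERIC_MIN = 95.0
ANOMERIC_MAX = 105.0


def count_anomeric(shifts_ppm: Iterable[float], tol: float = 0.0) -> int:
    """Count 13C signals inside the anomeric window, widened by tol."""
    return sum(
        1 for s in shifts_ppm
        if ANOMERIC_MIN - tol <= s <= ANOMERIC_MAX + tol
    )


def ring_form(choh_ppm: float) -> str:
    """Classify a CH-OH (C3 or C5) signal by the thresholds quoted above."""
    if choh_ppm > 80.0:
        return "furanose"
    if choh_ppm < 78.0:
        return "pyranose"
    return "ambiguous"  # 78 to 80 ppm is not covered by the rule


if __name__ == "__main__":
    signals = [105.1, 103.8, 75.0]
    print(count_anomeric(signals))           # strict window
    print(count_anomeric(signals, tol=0.5))  # slightly widened window
    print(ring_form(82.0), ring_form(74.0))
```

Because the text says only that most anomeric signals fall within the window, a small tolerance is useful: the glucose value of 105.1 lies just outside the strict limits.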

Funding: This work was supported by the Hunan Graduate Research Innovation Project (Grant No: 2023CX150); Hunan University of Chinese Medicine Graduate Research Innovation Project (Grant No: CX20230789); Hunan Province Science and Technology Innovation Leading Talent Project (Grant No: 2021RC4034); Hunan Science and Technology Innovation Team Project (Grant No: 2021RC4064); Hunan Provincial Natural Science Foundation (Grant No: 2022JJ80085); Key project at central government level: The ability establishment of sustainable use for valuable Chinese medicine resources (Grant No: 2060302).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
[22] were from reference [22] and recorded in D2O.