RNA-Seq and Iso-Seq Reveal the Important Role of COMT and CCoAOMT Genes in Accumulation of Scopoletin in Noni (Morinda citrifolia)

Scopoletin, a main component of clinical drugs and a functional component of health products, is highly abundant in noni fruit (Morinda citrifolia). Multiple enzyme genes regulate scopoletin accumulation. In the present study, differentially expressed genes of noni were analyzed by RNA sequencing (RNA-Seq) and full-length genes by isoform sequencing (Iso-Seq) to find the critical genes in the scopoletin accumulation pathway. A total of 32,682 full-length nonchimeric reads (FLNC) were obtained, out of which 16,620 non-redundant transcripts were validated. Based on KEGG (Kyoto Encyclopedia of Genes and Genomes) annotation and differential expression analysis, two differentially expressed genes, caffeic acid 3-O-methyltransferase (COMT) and caffeoyl-CoA O-methyltransferase (CCoAOMT), were found in the scopoletin accumulation pathway of noni. Real-time quantitative polymerase chain reaction (q-PCR), phylogenetic tree analysis, gene expression analysis, and the change in scopoletin content confirmed that these two proteins are important in this pathway. Based on these results, the current study proposes that COMT and CCoAOMT play a significant role in the accumulation of scopoletin in noni fruit, with COMT (gene number: gene 7446, gene 8422, and gene 6794) and CCoAOMT (gene number: gene 12,084) being the most significant copies. These results highlight the importance of COMT and CCoAOMT and provide a basis for further understanding the accumulation mechanism of scopoletin in noni.


Introduction
Noni (Morinda citrifolia Linn.) is an evergreen shrub growing in tropical and subtropical areas [1,2]. At the early stage of fruit formation, the surface remains rough. However, when the fruit ripens, the surface becomes smooth. The fruit is dark green when young, pale green to white when ripe, and soft and transparent grayish brown when fully mature [3].
For a long time, noni was considered a medicinal and edible plant by people in many tropical regions [4][5][6][7][8]. Previous studies confirmed that noni contains various functional phytochemicals used clinically as adjuvant medicine in hypertension, hyperglycemia, and cancer [9][10][11]. Scopoletin is one of the important functional phytochemicals of noni [12]. In addition, scopoletin is also found in a variety of plants such as Arabidopsis thaliana [13], Chenopodium murale [14], Canscora decussata [15], Hypochaeris radicata [16], and Fagraea ceilanica [17]. The metabolic pathway of scopoletin biosynthesis involves various types of chemical reactions and the catalysis of various enzymes. Glucose and lignin-derived aromatics are used as substrates, tyrosine and phenylalanine are produced by the shikimate acid pathway, phenylalanine ammonia-lyase (PAL) catalyzes phenylalanine to cinnamic acid, which is catalyzed into coumaric acid by cinnamate-4-hydroxylase (C4H), and tyrosine directly generates p-coumaric acid under tyrosine ammonia-lyase (TAL) catalysis. P-coumaric acid is converted to caffeic acid by O hydroxylated of 4-hydroxyphenyl acetic

PacBio Iso-Seq Library Preparation and Sequencing
Oligo(dT) was used to enrich poly(A)-containing mRNA, which was then reverse-transcribed to cDNA using the SMARTer PCR cDNA Synthesis Kit (Takara, Beijing, China). The synthesized cDNA was enriched by PCR amplification, and the optimal PCR conditions were determined by cycle optimization. Part of the cDNA was size-selected with BluePippin to enrich for fragments above 4 kb, and the selected fragments were subjected to large-scale PCR to obtain sufficient total cDNA. The full-length cDNA then underwent damage repair, end repair, and ligation of the SMRT (Single Molecule Real-Time) dumbbell adapters, and an equimolar mixed library was constructed from the unselected fragments and the fragments larger than 4 kb. Exonuclease digestion removed cDNA sequences that were not ligated to adapters at both ends, and finally, primers were annealed and DNA polymerase was bound to form a complete SMRTbell library.

Illumina RNA-Seq Library Preparation and Sequencing
The library was constructed with 5 µg of total RNA. After the mRNA was isolated by magnetic bead separation, it was fragmented by ions. Double-stranded cDNA was synthesized, end-repaired, 3′ A-tailed, and ligated to index adapters (TruSeq™ RNA Sample Prep Kit, Illumina, San Diego, CA, USA). Afterward, the target fragments were enriched by 15 cycles of PCR amplification and recovered by 2% agarose gel electrophoresis. The libraries were then quantified using TBS380 (Picogreen, Invitrogen, Carlsbad, CA, USA), pooled in proportion to the required data volume, and loaded onto the sequencer. Next, bridge PCR amplification was performed on the cBot to generate clusters. Finally, 2 × 150 bp paired-end sequencing was performed on the Illumina HiSeq platform.

Iso-Seq Data Analysis
After PacBio sequencing was completed, adapters and low-quality reads were removed from the raw data. The output was filtered and processed with SMRTlink v5.1 software, and the final data obtained were considered valid. Subreads were self-corrected to form CCSs (Circular Consensus Sequences), giving high-quality consensus transcript sequences. Nonchimeric consensus sequences containing the 5′ primer, the 3′ primer, and the poly-A tail are called full-length nonchimeric (FLNC) sequences. Because the FLNC set contains a large number of redundant sequences, these were clustered with the ICE algorithm [35] to remove redundancy and obtain corrected consensus sequences.

Illumina RNA-Seq Data Quality Control
In this study, the software used for quality control was Trimmomatic (http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic, accessed on 20 September 2018). First, adapter sequences in the reads, and reads without inserted fragments due to adapter self-ligation, were removed. Then, bases with low quality (quality value less than 20) at the 3′ end of the sequence were trimmed. If bases with a quality value of less than 10 remained in the rest of the sequence, the whole sequence was removed. In addition, reads with an N ratio of more than 10% were removed. Finally, sequences shorter than 75 bp after quality trimming were discarded, and high-quality sequencing data (clean data) were obtained.
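The filtering rules above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not Trimmomatic's implementation: the function name `quality_filter` and its parameter names are hypothetical, quality values are assumed to be already decoded to integers, and adapter removal is assumed to have been done beforehand.

```python
def quality_filter(seq, quals, trim_q=20, min_q=10, max_n_frac=0.10, min_len=75):
    """Apply the read-level rules described in the text to one read.

    seq   -- read sequence (string)
    quals -- per-base quality values (list of ints, same length as seq)
    Returns (trimmed_seq, trimmed_quals), or None if the read is discarded.
    """
    # 1. Trim bases with quality < trim_q from the 3' end.
    end = len(seq)
    while end > 0 and quals[end - 1] < trim_q:
        end -= 1
    seq, quals = seq[:end], quals[:end]
    if not seq:
        return None

    # 2. Discard the read if any remaining base has quality < min_q.
    if any(q < min_q for q in quals):
        return None

    # 3. Discard reads whose proportion of N bases exceeds max_n_frac.
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return None

    # 4. Discard reads shorter than min_len after trimming.
    if len(seq) < min_len:
        return None

    return seq, quals
```

A read that survives is returned in its trimmed form, so the same function can be mapped over both mates of a pair and the pair kept only when neither result is `None`.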

Transcript Correction and Redundancy Removal
The disadvantage of Iso-Seq is that the additional insertion of bases leads to a higher frequency of errors, especially in homopolymer regions. However, such errors occur randomly and do not show the sequencing bias seen in RNA-Seq. The drawback of RNA-Seq is that the read length is short and splicing errors may occur. The high-accuracy sequences of RNA-Seq can therefore be used to further calibrate the sequencing result of Iso-Seq. LoRDEC [36] was used to further correct the polished consensus sequences with the RNA-Seq data. CD-HIT [37,38] was then used to cluster and compare the nucleic acid sequences through sequence alignment, and redundant and similar sequences were removed. We clustered the corrected transcript sequences at 95% similarity between sequences and compiled statistics on the distribution of transcript length and frequency before and after redundancy removal.
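The idea of removing redundancy by clustering at a 95% similarity threshold can be illustrated with a toy greedy clustering: sequences are visited from longest to shortest, and each one either joins the first representative it resembles or becomes a new representative. This is a sketch under stated assumptions, not CD-HIT itself: the function names are hypothetical, and `difflib.SequenceMatcher.ratio()` is used as a simple stand-in for an alignment-based identity, which CD-HIT defines differently.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Crude similarity in [0, 1]; a stand-in for alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs, threshold=0.95):
    """Greedy incremental clustering of a {name: sequence} mapping.

    Returns {representative_name: [member_names]}; each representative is
    the first (longest) sequence of its cluster.
    """
    representatives = []
    clusters = {}
    # Longest sequences first, so each cluster is represented by its longest member.
    for name, seq in sorted(seqs.items(), key=lambda kv: len(kv[1]), reverse=True):
        for rep in representatives:
            if identity(seqs[rep], seq) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            representatives.append(name)
            clusters[name] = [name]
    return clusters
```

Keeping only the keys of the returned mapping gives a non-redundant set in the sense used here; real tools add word-based prefiltering to avoid the all-against-all comparisons this toy version performs.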

Gene Functional Annotation
The following databases were used to annotate the gene functions: NCBI

Differential Expression Analysis
In RNA-Seq analysis, gene expression levels were calculated from the number of clean reads in the genomic region. The reads obtained by sequencing were then mapped to the Iso-Seq full-length transcripts. According to the mapping result between all RNA-Seq reads and Iso-Seq transcripts, the RPKM (Reads Per Kilobase per Million mapped reads) value of each transcript in the samples was calculated, and the resulting value was used as the expression level of the transcript. Finally, the expression of the transcripts in the samples of each group was tested for significant differences, and the differentially expressed transcripts were identified and visualized. A difference in expression between samples is generally considered significant at p < 0.05; the q value is the p value corrected for multiple testing and is the more stringent measure. The RPKM value was the standard to measure gene expression level, and the software used for differential expression analysis was edgeR (10.18129/B9.bioc.edgeR, accessed on 12 October 2018). RSEM (http://deweylab.biostat.wisc.edu/rsem/, accessed on 12 October 2018) [39] was used to quantify gene and isoform abundances. In this study, genes with a q value of less than 0.05 were used for GO enrichment analysis to identify which DEGs (differentially expressed genes) were significantly enriched in GO terms and metabolic pathways at a Bonferroni-corrected p-value ≤0.05 compared with the whole-transcriptome background. GO functional enrichment and KEGG pathway analysis were carried out using Goatools

Phylogenetic Analysis of CCoAOMTs and COMTs
Based on the functional annotation of KEGG and the transcriptome sequencing analysis of fresh and two-day ripened fruit, some genes were differentially expressed in the scopoletin pathway. Two differentially expressed genes, COMT and CCoAOMT, were identified in the part of the pathway related to the synthesis of ferulic aldehyde, a precursor in scopoletin synthesis. Then, eight COMTs and four CCoAOMTs in noni were found in the annotation library of Iso-Seq. Two phylogenetic trees were constructed based on 42 CCoAOMT nucleotide sequences and 30 COMT nucleotide sequences covering different species to analyze evolutionary relationships and predict gene function. Accession numbers of the sequences used in this article are listed in Supplementary Tables S1 and S2. These sequence data can be found in the Phytozome and GenBank/EMBL data libraries. The nucleotide sequences of the noni CCoAOMTs and COMTs used in the corresponding phylogenetic trees are shown in Supplementary Table S3.

q-PCR
Noni fruit RNA was extracted with an RNA kit (Vazyme Biotech Co., Ltd., Nanjing, China), and 450 ng of RNA was used as the template for reverse transcription with a reverse transcriptase kit (Thermo Scientific™ EP0733, Santa Clara, CA, USA). The reverse transcription reaction was carried out at 25 °C for 10 min, then at 50 °C for 30 min, and finally at 85 °C for 5 min. The obtained cDNA was used as a template for q-PCR amplification using gene-specific primers and SG Fast q-PCR Master Mix (High Rox, Sangon Biotech (Shanghai) Co., Ltd., Beijing, China). Each sample was analyzed three times and normalized with Actin as an internal control. The primers used in the experiment are shown in Supplementary Tables S4 and S5. Following the q-PCR application guidance of the manufacturer Bio-Rad (Santa Clara, CA, USA), the melting curve (from 65 °C to 95 °C) was analyzed, and the 2^(−ΔΔCt) method was used to evaluate relative gene expression. The q-PCR conditions were: 95 °C for 3 min, followed by 95 °C for 7 s, 57 °C for 10 s, and 72 °C for 15 s. Means and standard errors were estimated from three independent biological replicates. q-PCR amplification efficiency was measured for each gene to ensure the reliability of the result.
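The 2^(−ΔΔCt) calculation referred to above can be written out as a short sketch. The function name and argument names are hypothetical, the Ct values in the usage note are made up for illustration, and the formula assumes roughly 100% amplification efficiency for both the target gene and the Actin reference.

```python
from statistics import mean


def relative_expression(target_ct_sample, ref_ct_sample,
                        target_ct_control, ref_ct_control):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of technical-replicate Ct values:
      target_ct_*  -- gene of interest
      ref_ct_*     -- internal control (Actin in this study)
      *_sample     -- time point of interest (e.g. 12 h after harvest)
      *_control    -- calibrator (e.g. 0 h)
    """
    # dCt: target normalized to the internal control within each condition.
    delta_sample = mean(target_ct_sample) - mean(ref_ct_sample)
    delta_control = mean(target_ct_control) - mean(ref_ct_control)
    # ddCt: sample relative to calibrator; one cycle corresponds to a 2-fold change.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)
```

For instance, with illustrative Ct values of 22 (target) and 18 (Actin) in the sample and 24 and 18 in the calibrator, ΔΔCt is −2 and the relative expression is 4-fold.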

Treatment of Noni Fruit and Scopoletin Content Determination
Ripe noni fruit (cream-colored) of constant size was selected randomly. Then, the noni fruit was stored at 25 °C and sampled at 0 h, 2 h, 12 h, 24 h, and 48 h. The methods of sample processing and determination of scopoletin content were consistent with the previous description [30].

Transcriptome Sequencing and Annotation
RNA from freshly harvested fruits and from fruits stored for two days was extracted for RNA-Seq, producing six sets of Illumina RNA-Seq clean reads (Supplementary Table S6). Trinity (http://trinityrnaseq.sourceforge.net/trinityrnaseq-r2013-02-25, accessed on 20 September 2018) was then used to assemble the short sequences produced by RNA-Seq, and 35,721 unigenes were obtained.
RNA from root, stem, leaf, and noni fruit was mixed for Iso-Seq. A total of 25,575,577 raw subreads (35.04 G) with an average read length of 1370 bp were obtained. After removing adapters and artifacts, 483,741 CCSs were obtained in total. Then, 362,241 full-length (FL) reads and 32,682 full-length nonchimeric reads (FLNC) were obtained with SMRTlink to detect the 5′-primer, 3′-primer, and poly-A tails.
Lordec was used to correct the high error rate of Iso-Seq in the polished consensus sequence based on the Illumina RNA-Seq reads. Then, CD-HIT was used for sequence alignment and clustering, and the corrected transcript sequences were clustered to remove redundancy according to the 95% similarity between the sequences. In total, the current study obtained 32,600 polished consensus reads and 16,620 transcripts.
To ensure structural accuracy, sequence integrity, and sequence expression accuracy, the results of RNA-Seq and Iso-Seq were combined for further analysis. In the analysis, the sequenced reads were mapped to the third-generation (PacBio) full-length transcripts, and the mapping ratio was obtained (Supplementary Table S7).
Based on the mapping results between PacBio and Illumina data, the RPKM value was used as the measure of gene expression level, and genes with a p-value (the probability from the statistical hypothesis test) of less than 0.05 were identified as DEGs. A total of 3913 differentially expressed genes were found among all 11,281 single genes; 2934 were up-regulated, and 979 were down-regulated (Figure 1, Supplementary Tables S8 and S9). The p-value was then corrected for multiple hypothesis testing, and genes with a corrected p-value of less than 0.05 were used for GO enrichment analysis and KEGG pathway analysis (Figure 2). These genes were categorized into biological processes (BPs), cellular components (CCs), and molecular functions (MFs). In the MF classification, DEGs are involved in hydrolase activity, oxidoreductase activity, catalytic activity, and dioxygenase activity. In the BP classification, DEGs are involved in the organic hydroxy compound catabolic process, oxoacid metabolic process, oxidation-reduction process, organic hydroxy compound metabolic process, and monocarboxylic acid metabolic process. These reactions are related to secondary metabolism in plants and are highlighted by red boxes in the figure.
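Because RPKM is the expression measure throughout this analysis, a minimal sketch of the normalization may be useful. The function name and the numbers in the usage note are illustrative assumptions, not values from this study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads for one transcript.

    read_count           -- reads mapped to the transcript
    transcript_length_bp -- transcript length in base pairs
    total_mapped_reads   -- all mapped reads in the sample
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # Divide by length in kilobases (bp / 1e3) and by depth in millions
    # (reads / 1e6), which together give the factor 1e9.
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
```

For example, 500 reads on a 2000 bp transcript in a sample with 10 million mapped reads give an RPKM of 25, so transcripts of different lengths and samples of different depths become comparable.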
In addition, KEGG analysis showed that most of the differential genes were enriched in the metabolic pathway of phenylpropanoid biosynthesis, and this pathway contains scopoletin synthesis. Among these DEGs, some of them are involved in phenylpropanoid biosynthesis and phenylalanine metabolism, which is also the first step of the biosynthesis of scopoletin (Figure 3).
Genes 2022, 13, x FOR PEER REVIEW 7 of 18
Figure 3. Enrichment of KEGG. The horizontal axis represents the enrichment factor, that is, the ratio of the number of differential genes enriched to a certain KEGG term to the number of background genes obtained by sequencing. The ordinate represents the function enriched in this KEGG term: the larger the circle, the more differential genes enriched in this function. The color spectrum from blue to red represents the uncorrected p-value.

Transcriptome Data Were Verified by q-PCR
To verify the accuracy of transcriptome sequencing, 10 up-regulated and 10 down-regulated genes were selected for q-PCR. The q-PCR result is consistent with that of transcriptome sequencing, indicating the accuracy of transcriptome data (Supplementary Figure S2).

COMT and CCoAOMT Play a Key Role in the Accumulation of Scopoletin
Given the strong functional activity of scopoletin reported in previous studies [30], the position of the scopoletin accumulation pathway within the phenylpropanoid pathway attracted our attention. In the scopoletin accumulation pathway, scopoletin is produced from phenylalanine by a series of enzymatic reactions. In this pathway, the related genes are PAL, CYP73A, 4CL, COMT, CCoAOMT, REF1, and CCR (Figure 4). COMT and CCoAOMT were the only genes of which all members were up-regulated and differentially expressed (the result is shown in Supplementary Table S10), and q-PCR was used to verify the importance of these two genes in noni.

COMT unigenes were annotated in the full-length sequences obtained by Iso-Seq. Moreover, the combined RNA-Seq and Iso-Seq data showed two genes (gene number: gene 7446 and gene 8422) with high differential expression and two genes (gene number: gene 10,983 and gene 6794) with low differential expression. These four genes were therefore selected for q-PCR verification, and the results are shown in Figure 5. Their expression followed a trend consistent with RNA-Seq, and all reached a significant difference within 48 h after harvest. Except for the weak expression of gene 10,983, the expression of the other three genes was significant. The relative expression of gene 8422 reached 3.5-fold at 2 h after harvest, so the current study proposes that this gene was the first of the four to respond. The relative expression of gene 6794 reached 22.5-fold at 12 h after harvest, and that of gene 7446 reached 8.9-fold at 48 h after harvest. The sharp rises in the expression of these three genes occurred sequentially, which led us to consider this a plant energy-saving mechanism.
Two unigenes (gene number: gene 11,326 and gene 12,084) were annotated as CCoAOMT in the full-length sequence obtained by Iso-Seq. The expression of these two genes in the two days after the natural ripening of noni fruit was verified by q-PCR, and the results are shown in Figure 6.

Phylogenetic Analysis of McCCoAOMTs and McCOMTs
The result of phylogenetic relationships (Figure 7) indicates that the CCoAOMTs could be classified into four clades (1a, 1b, 1c, and 2). Gene 11,326 was grouped in clade 1a, and gene 12,084 was grouped in clade 1b. The result of phylogenetic relationships (Figure 8) indicates that the COMTs could be classified into two clades (clade I and clade II). Gene 10,983 was grouped in clade II, and gene 7446, gene 8422, and gene 6794 were grouped in clade I.

Accumulation of Scopoletin in Noni Fruit
To further demonstrate the important role of COMT and CCoAOMT in the accumulation of scopoletin in noni fruit at the metabolite level, we determined the scopoletin content in noni fruit within 48 h after harvest. As shown in Figure 9, the scopoletin content increased rapidly and significantly in the second hour after harvest and reached its maximum value of 0.35 mg/g at the 12th hour. After 12 h, the scopoletin content began to decline and was 0.18 mg/g at the 48th hour after harvest. The content at the 48th hour was about half that at the 12th hour but double that at 0 h, and the difference was significant.
The content of scopoletin in noni fruit increased significantly within a short time and then decreased. However, the general trend was a gradual increase as post-ripening proceeded. These results suggest that shortly after harvest, the change in scopoletin content responded rapidly, to some extent, to the regulation of key genes in its synthesis pathway. The q-PCR results suggest that different copies of the same gene are expressed in a certain order, possibly to save energy. For example, scopoletin increased rapidly in the second hour, which might be due to the simultaneous up-regulation of four copies of COMT. At the 12th hour, although scopoletin continued to increase, only gene 6794 was strongly up-regulated, and the expression of the other three genes decreased; gene 10,983 was even down-regulated, which might reflect some kind of gene-saving program.

Discussion
Natural biochemicals are an important source of drug discovery [42]. Noni, which has the secondary metabolite scopoletin, has attracted research attention as one of the adjuvant drugs in clinical treatment. Therefore, elucidating the biosynthesis pathway will lay a foundation for increasing the supply of this important substance.
Fruit still has to undergo a series of physiological changes, such as ripening, aging, and death after harvest, and still changes the internal material through respiration. Previous studies showed that in the process of fruit ripening, the metabolism of reactive oxygen species was strengthened, which would produce a toxic effect on cells and lead to the destruction of cell membrane structure [43]. Therefore, during the process from fresh fruit to two days after harvest, the fruit of noni was still undergoing complex physiological activities. In previous studies, the scopoletin content increased in the fruit two days after harvest compared with the fresh fruit of noni [30]. It was found that a large number of phenylpropanoid metabolic pathway genes were highly expressed in plants under biotic and abiotic stress, resulting in the accumulation of a variety of secondary metabolites, including scopoletin [44]. Scopoletin synthesis involves the first catalytic steps of the phenylpropanoid pathway, leading to p-coumaric acid [45]. Therefore, active phenylpropanoid metabolism will provide a sufficient precursor for the accumulation of scopoletin.
The current study hypothesized that enzyme genes related to the accumulation of scopoletin are differentially expressed. Through transcriptome analysis, phylogenetic tree analysis, and q-PCR verification, the current study determined that McCOMT and McCCoAOMT played an important role in the accumulation of scopoletin in noni. McCOMT and McCCoAOMT are in the phenylpropanoid pathway; all enzymes in this pathway have been characterized, and in many cases, gene regulation occurs at the transcriptional level.
The preferred substrates of COMT are caffeoyl aldehyde and 5-hydroxyconiferaldehyde [25,46]. According to the annotation of metabolic pathways in transcriptome sequencing, the transformation from caffeic acid to ferulic acid takes place during the accumulation of scopoletin in noni. Thus, COMT could catalyze the conversion of caffeic acid to ferulic acid. Transgenic tobacco plants with down-regulated COMT gene expression demonstrate that COMT plays a crucial role in controlling lignin and phenol content in plants. Moreover, COMT activity may be related to flavonoid production in the plant lignin pathway [47]. In this regard, studies have shown that plants strengthen their cell walls through the accumulation of phenolic compounds, which is thought to be a ubiquitous defense response [48]. After RNA-Seq, Iso-Seq, phylogenetic tree analysis, and q-PCR validation of the differentially expressed genes in our study, we found that three copies of McCOMT (gene number: gene 6794, gene 8422, and gene 7446) were clearly regulated, similar to the COMTs of tobacco, in which certain genes are regulated. In the phylogenetic tree analysis, the McCOMTs were grouped into two clades, and gene 7446, gene 8422, and gene 6794 were grouped in clade I, whose members are active on phenolic compounds involved in lignin biosynthesis, such as caffeic acid and 5-hydroxyferulic acid and their respective aldehydes and alcohols. Moreover, class I was found to include flavonoid, simple phenol, and multifunctional COMT genes in Populus, and these COMTs are regulated to defend against biotic and abiotic stresses [49]. The ripening process of blueberry fruit is also accompanied by fruit softening. In blueberry, COMT was also found to be differentially expressed, and VcCOMT38, VcCOMT57, VcCOMT40, and VcCOMT92 belonged to clade I and are regulated consistently with lignin during fruit development [50].
However, gene 6794, gene 8422, and gene 7446 show no consistent up-regulation, which may indicate gene redundancy. Similarly, functional redundancy has also been reported in this gene family [51].
CCoAOMT could catalyze the methylation of acyl-coenzyme, and the function has been detected in a variety of other plants [52]. CCoAOMT has been shown to successfully methylate various flavonoids, caffeoyl CoA, anthocyanins, coumarins, and aromatic esters [53,54]. Consistent with the results of previous research, the phylogenetic tree analysis of noni McCCoAOMT was grouped into four clades, and gene 12,084 was grouped in clade 1b that was involved in the biosynthesis of some metabolites, such as flavonoids and phenylpropanoids [55]. In addition, the phenomenon of fruit softening was also observed after the harvest of noni fruits, and fruit softening was a significant manifestation of reduced lignin content [56]. Many research studies have confirmed that CCoAOMT is a methyltransferase that plays an important role in lignin biosynthesis. The down-regulation of the CCoAOMT in different plant species resulted in significant decreases in lignin contents. In contrast, the overexpression of this gene led to an increase in the lignin content, indicating the essential role of this gene in the process of lignin biosynthesis [57].
Increasing studies have confirmed the relationship between COMT and CCoAOMT and the synthesis of scopoletin in recent years. CCoAOMT1 in Arabidopsis thaliana contributed to the formation of soluble sinapoyl conjugates in leaves and was crucial for the accumulation of coumarin scopoletin [58]. Feruloyl CoA was a key precursor in scopoletin accumulation, and COMT and CCoAOMT were the key enzymes in the synthesis of ferulic acid [59].
In this study, two copies of McCCoAOMT were found to be up-regulated. Combined with previous studies on the changes in scopoletin content during the post-maturation process of noni fruit, the current study preliminarily concludes that McCCoAOMT is another key gene for the synthesis of scopoletin in noni.

Conclusions
Based on the above verification and analysis, we propose that McCOMT and McCCoAOMT both play a significant role in the accumulation of scopoletin in noni fruit. Among the four COMT unigenes, we speculate that gene 7446, gene 8422, and gene 6794 are the more important, while among the CCoAOMT unigenes, gene 12,084 is the more important. Nevertheless, this needs to be verified further. For McCOMT, a protein expression vector of the corresponding target gene needs to be constructed, and, based on protein purification, further study is needed to determine whether McCOMT acts on caffeic acid, caffeoyl aldehyde, and coniferyl aldehyde, or on just one of them. In addition, based on the established noni genetic transformation system [60,61], the homologous transformation of McCOMT and McCCoAOMT can be performed to obtain direct evidence of gene function. At that point, the accumulation pathway of scopoletin will be clearer.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes13111993/s1, Figure S1: Results of genome size determination; Figure S2: q-PCR verification of RNA-Seq (10 up-regulated and 10 down-regulated genes); Table S1: Gene ID used in the phylogenetic tree CCoAOMT; Table S2: Gene ID used in the phylogenetic tree COMT; Table S3: Nucleotide sequence of COMT and CCoAOMT; Table S4: q-PCR primers (up-regulated and down-regulated); Table S5: q-PCR primers of COMT and CCoAOMT; Table S6: Statistical table of RNA-Seq data after quality control; Table S7: Mapping ratio; Table S8: Annotation of Iso-Seq; Table S9: Different expression genes; Table S10: Related genes in scopoletin pathway.

Data Availability Statement:
All data generated or analyzed during this study are included in this published article and its supplementary information files. The transcript assemblies have been deposited and are publicly available at NCBI with accession PRJNA503490. The Iso-Seq data have been deposited and are publicly available at NCBI with accession SRR12716286.

Conflicts of Interest:
The authors declare no conflict of interest.