Secondary Metabolite Differences between Naturally Grown and Conventional Coarse Green Tea

Crop culture conditions are one of the important interfaces between food, the environment, and health, and an essential research area for maintaining social-ecological integrity. In recent years, it has been reported that the difference in culture conditions between monoculture with external inputs (in cultura) and self-organized ecological niches (in natura) is significant for the resulting physiological property of plants. It has also been suggested that there exist metabolic proxies in various foods that can separate these two culture conditions, which does not depend on a single component but on the distribution of various compounds. However, little has been studied in a time series of replicated production to quantify the reproducibility of these metabolomic features associated with culture conditions. In this study, we obtained metabolome data of coarse green tea (Camellia sinensis) grown in the same region in Japan under both in cultura and in natura culture conditions over the course of six years, and constructed a list of multiple components that separated the effects of culture conditions by statistical analysis, and estimated the metabolic functions of the compounds that contributed to the separation. The results suggest that naturally grown samples are rich in allelochemicals, such as phytochemicals, alkaloids, phenylpropanoids, steroids, as well as the compounds related to microorganisms and vitamin B6 that imply the interactions with the soil microbiome. The estimated physiological functions of the distinctive compounds suggest that the in natura crop production is not only beneficial with known properties of maintaining ecosystem health such as soil functions and pathogen control, but also for the augmentation of the plant secondary metabolites that support long-term health protective effects.


Introduction
Crop culture conditions represents one of the essential subjects for social-ecological sustainability, which links human and ecosystem health through affecting the plant's metabolic state [1]. According to previous researches, two categories of culture condition are distinguished as a dominant parameter on plant metabolites: (1) in cultura condition based on the monoculture with external inputs that maximizes the individual plant growth known as the physiological optimum, and (2) in natura condition based on the self-organization of ecological niches known as the ecological optimum [2][3][4]. Although dominant in present crop production, the in cultura culture conditions are inevitably associated with biodiversity loss and imposing a heavy burden on the global environment. This tendency has evolved since prehistoric times, and there are some reports that it could lead to the sixth mass extinction in Earth's history [5,6]. A globally imminent crisis such as the planetary state shift and massive loss

Materials and Methods
All coarse green tea samples used in this study were produced by traditional tea farms at Watarai-Cho, Mie prefecture in Japan. Harvest and processing of coarse green tea leaves were homogenized with a conventional method: 2nd cropping (foliage lower than shoots) of the yearly 1st harvest during late May to early June between 2014 and 2019, followed by the standard processing of steaming, kneading, and drying in a local mechanized factory. Leaves from more than 7000 m 2 culture within 2 km distance were blended for averaging plot-wise variation and to compare samples from different culture methods with a common geographical range.
Two culture conditions of tea were differentiated to examine the metabolomic difference of tea extract: 1. in cultura conventional monoculture condition following the standard protocol of Japan Agricultural Cooperatives Ise branch, with the routine application of inter-crop cultivation, synthetic and organic fertilizers, pesticides, fungicides, and herbicides. 2. in natura mixed polyculture condition without the application of tillage, fertilizer, and chemicals, defined as Synecoculture [4]. Intercropping Agriculture 2020, 10, 632 3 of 23 with a variety of vegetables and fruit trees was introduced, along with spontaneous generation of weeds [22].
The conventional condition is controlled in physiological optimum range of production, while Synecoculture is based on ecological optimum dominant in natural vegetation. These qualitatively different growing conditions are biologically termed as in cultura and in natura conditions of field culture, respectively, in relation to the definition of sustainable diet [2].
Then, eleven sets of samples of dried coarse green tea were produced, from the Synecoculture fields from 2014 to 2019 and conventional tea culture from 2015 to 2019. We hereafter call these samples as  For 2014-2017 samples: Each 100 µL sample was mixed with 300 µL methanol and centrifuged with 10,000× g, 10 min, 4 • C. The supernatant was filtered with PTFE filter (Millipore, Cat.SLLGH04NK) and centrifuged through Monospin C18 spin columns with 5000× g, 2 min, 4 • C in order to remove insoluble matters and low polarity components. A mock sample of ultrapure water was prepared with the same procedure, and was used to evaluate and remove background noise contained in the sample preparation and/or LC-MS analysis.
For 2018-2019 samples: Each 100 µL sample was mixed with 300 µL methanol and centrifuged with 15,000 rpm, 10 min. 100% methanol centrifuged through Monospin C18 spin columns with 5000× g, 2 min. Then, 75% methanol centrifuged through the same columns with 5000× g, 2 min. These processes were pretreatment for column equilibration. Then, the supernatant of the sample was centrifuged through the same columns with 5000× g, 2 min. After that, the supernatant of the sample was filtered with a 0.2-µm filter. A mock sample of ultrapure water was prepared with the same procedure, and was used to evaluate and remove background noise contained in the sample preparation and/or LC-MS analysis.
All of the methanol was obtained from FUJIFILM Wako Pure Chemical Corporation.

LC-MS Analysis of 2014-2017 Samples
LC−MS analysis was performed with a combination of Agilent 1200 series [23] and Thermo fisher scientific LTQ ORBITRAP XL [24]. The parameters of measurement are summarized in Supplementary Materials File S1.
After converting raw data (obtained from LTQ ORBITRAP XL) to a text file with the use of ProteoWizard [25], LC-MS data were analyzed using PowerGet ver. 3.5.7 [26] with the following procedure to attribute each MS peak to a chemical formula:

1.
Empirical detection of compound peaks, calculation of accurate mass, calculation of compound peak intensity.

2.
Differentiation of simultaneous elution peaks with respect to the profile of adduct ion peaks, ionization mode, and natural 13C isotopic compound peaks.

3.
Matching between MS peaks and MS/MS data, calculation of 13C/12C isotope ratio with ion intensity in order to estimate C number in each compound, and estimation of ionization mode.

4.
Aggregation and sorting of compound peaks with respect to the elution time, accurate mass, and MS/MS patterns for all samples.

5.
Matching of calculated mean accurate mass with monoisotopic compounds in public databases [20,21] with the use of MF Searcher [27] and derivation of a corresponding chemical formula. 6.
Truncate the compound peaks with less than 2 times intensity of the mock sample.
The parameters of these analyses are summarized in Supplementary Materials File S1 and Table 1.  [28] and Q Exactive (Thermo Fisher Scientific) [29]. Samples were analyzed 3 times for each sample.
After converting raw data (obtained from Q Exactive) to a text file with the use of ProteoWizard, LC-MS data were analyzed using PowerGetBatch [30] with the following procedure to attribute each MS peak to a chemical formula:

1.
Empirical detection of compound peaks, calculation of accurate mass, calculation of compound peak intensity 2.
Alignment of compound peaks 4.
Matching of calculated mean accurate mass with monoisotopic compounds in public database with the use of MF Searcher and derivation of a corresponding chemical formula The parameters of these analyses are summarized in Supplementary Materials File S1 and Table 2. As a premise, this metabolome analysis is a result of projecting measured exact masses onto the public database, so there is not sufficient resolution for the accurate distinction between structural isomers.

Integration of Metabolite Data of 2014-2019 Samples
Based on the exact mass detected, MF Searcher [27] was used to match the same compound every year with an error of 1 ppm in KEGG database, and integrated the metabolomic data of all years. The same procedure was also performed in Flavonoid Viewer [31] to compare the total estimated amount of flavonoids (see Supplementary Materials File S5). When multiple structural isomers were detected from the KEGG database, the intensity was taken as the average value for each detected year, because these cannot be distinguished at the present resolution of LC-MS.

Biological and Technical Replicate
The sample tea leaves were obtained from the mixed harvest over entire fields, at a total of 3 times mixing (triple homogenization) at the time of harvesting, kneading, and drying, so it was not possible to sample each small area or individual tea tree as biological replicates. Therefore, this was regarded as a representative of the entire field for each year, and biological variance was greatly averaged through harvesting and processing in this study. Since the tea leaves were inevitably homogenized through the processing, it was not possible to take plant-wise or area-wise biological replicates for the product analyzed.
LC-MS analysis was performed once for the 2014-2017 and three times for the 2018-2019 samples. To assess the homogeneity of the tea products and reproducibility of measurement, the 2014-2019 samples were extracted three times with the same procedure of sample extraction as "product replicate", and the temporal changes in absorbance of the samples were measured with a spectrophotometer. There was almost no difference in the extraction and no change after a certain period of time, 6 h at the maximum (1/40-1/10,000 error of the absorbance). The summary of the sample replicates and LC-MS analysis method are shown in Table 1.

Statistical Analysis
Normality of all the formulae were tested with the Shapiro-Wilk test, and 41 out of 342 Syneco formulae and 51 out of 342 Conv formulae had passed under 5% significance threshold. The Welch's t-test (applicable even if the normally distributed data do not have homoscedasticity) and the Brunner-Munzel test [32] (that does not assume either homogeneity of variance nor normality) were performed with Syneco 2014-2019 and Conv 2015-2019 for the LC-MS raw and logarithmic intensity values of the identified compounds. For the 2018 and 2019 samples, the mean values of the three measurements on the same samples were used. Welch's t-test was performed with Microsoft Excel ver. 16.16.21, and a Brunner-Munzel test was performed with statistical analysis software R ver. 3.5.0.
The variance value representing the magnitude of the year-to-year variation of each compound was calculated, and the distribution of the variances was compared by F-test between Syneco and Conv samples. For the 2018 and 2019 samples, the mean values of the three measurements on the same samples were used. F-test was performed with Microsoft Excel ver. 16.16.21. These results were summarized in Figure 1.
In order to investigate the overall effects of culture conditions on the metabolic state, we performed the principal component analysis (PCA) after the renormalization of the intensity values for each compound. Then, PCA was applied repeatedly by increasing the number of compounds having a high positive/negative loading to a PC to see the separability of Syneco and Conv samples on the PC plane (hereafter denoted as LS-PCA, meaning linear separation with PCA). As a result, the top 65 compounds of positive/negative loadings concerning PC3 were extracted (in total 130 compounds), which were enough to thoroughly separate the two culture conditions (Figures 2 and 3). The LS-PCA was performed by a statistical analysis software R ver. 3.5.0. The Shapiro-Wilk test was performed on the three distributions D1, D2, and D3, to examine their normality.
Besides LRM, the logarithmic ratio of intensity variance LRV: = log (intensity variance of Syneco compounds)-log (intensity variance of Conv compounds) was calculated for all compounds, and used for the characterization in Figure 1. . So if these values are positive, it means that the Syneco sample is larger than the Conv sample in the mean intensity or the variance. The green circles represent PC3 negative loadings in PCA (as shown in Figure 3), and the orange circles represent PC3 positive loadings. As an exception, in the t-test of the intensity of Syneco vs. Conv, a compound with p < 0.05 is represented by a triangle instead of a circle. The yellow and light blue hollow circles represent Syneco-distinctive loadings and Conv-distinctive loadings, respectively, which are the top 130 loadings that can completely separate the two culture conditions with PCA (see Figure 3). The saltires represent the compounds with p < 0.01 in the F-test of Syneco vs. Conv, and the crosses represent the compounds with p < 0.05. . So if these values are positive, it means that the Syneco sample is larger than the Conv sample in the mean intensity or the variance. The green circles represent PC3 negative loadings in PCA (as shown in Figure 3), and the orange circles represent PC3 positive loadings. As an exception, in the t-test of the intensity of Syneco vs. Conv, a compound with p < 0.05 is represented by a triangle instead of a circle. The yellow and light blue hollow circles represent Syneco-distinctive loadings and Conv-distinctive loadings, respectively, which are the top 130 loadings that can completely separate the two culture conditions with PCA (see Figure 3). The saltires represent the compounds with p < 0.01 in the F-test of Syneco vs. Conv, and the crosses represent the compounds with p < 0.05. In other words, the errors that can occur in our method were aggregated in PC1 and PC2, and PC3 was considered to represent a robust characteristic that separates Syneco and Conv samples. As shown in Figure 3, The Syneco and Conv samples were completely separated with LS-PCA using the top 65 compounds of PC3 negative/positive loadings (in total 130 compounds). Hereafter we call these top 130 negative and positive loading parameters as Syneco and Conv-distinctive parameters, respectively. We used these 130 distinctive compounds for the ontological categorization using KEGG BRITE and KEGG PATHWAY databases.
Using the MF searcher, 97 flavonoid compounds were listed by matching the metabolome data The top border of the grey area represents the number of overlaps of Syneco and Conv samples (right Y-axis) when LS-PCA is performed, which means the degree of inseparability between the two culture conditions. When the top 65 negative and positive loadings or more were applied to LS-PCA, the number of overlap is equal to 0, which means the Syneco and Conv samples were completely separated. Hereafter we call these 65 negative and positive loadings "Syneco-distinctive loadings" and "Conv-distinctive loadings" which are plotted with yellow and light blue hollow circles, respectively. All structural isomers of the 130 distinctive loadings on KEGG were projected onto the "map01110 Biosynthesis of secondary metabolites" of KEGG PATHWAY(see the Supplementary Materials File S17).
In addition, the logarithmic ratio of mean intensity LRM: = log (mean intensity of Syneco compounds)-log (mean intensity of Conv compounds) was calculated for all compounds, and these data were divided into the following three sets in

Metabolome Categorization
We projected the top 130 (65 positive and 65 negative) of PC3 loadings to KEGG BRITE (Table 2 and Figure 5) and KEGG PATHWAY (Table 3), and looked to the KEGG BRITE compound classification. The negative PC3 loadings that characterized Syneco samples expressed more diversity of allelochemicals than Conv samples such as phytochemicals, alkaloids, phenylpropanoids, and steroids. The "Phytochemicals" include the subcategories of alkaloids, flavonoids, phenylpropanoids, shikimate/acetate-malonate pathway derived compounds, terpenoids, polyketides, fatty acids related compounds, amino acid related compounds, and others, according to the notation in KEGG BRITE database. The Shapiro-Wilk test was performed on the three distributions D1, D2, and D3, to examine their normality.
Besides LRM, the logarithmic ratio of intensity variance LRV: = log (intensity variance of Syneco compounds)-log (intensity variance of Conv compounds) was calculated for all compounds, and used for the characterization in Figure 1.

Metabolome Categorization
The list of 130 chemical formulae obtained with LS-PCA was projected to KEGG (Kyoto Encyclopedia of Genes and Genomes) [20,21] databases to annotate possible physiological functions. KEGG API [33] was used to mine the KEGG BRITE [34] database and KEGG PATHWAY [35] in order to categorize the compounds according to the functional classification. Each chemical formula was attributed to the hierarchical ontology of these databases including the matching with structural isomers, as an extensive interpretation of obtained metabolome data on known physiological functions.
Welch's t-test and Brunner-Munzel test were performed in each category of KEGG BRITE and KEGG PATHWAY, with Syneco 2014-2019 and Conv 2015-2019 compounds intensity and logarithmic intensity, in order to investigate compound category-wise intensity differences between Syneco and Conv. The mean value of the intensity of the same compound and each value of the intensity were used for each test.

Statistical Analysis
Using the MF searcher, the exact masses of 342 compound peaks were matched with known compounds in the KEGG database (see Supplementary Materials File S4). Among them, the average raw intensity of each 125 compounds was greater for Syneco samples than Conv samples, and 217 compounds for Conv samples than Syneco samples. The result of Welch's t-test for the raw intensity showed that only one compound (C19H32O8) was significantly different with a 5% significance level (p = 0.0143), but the others were not significant. The result of Brunner-Munzel test for the raw intensity showed that only four compounds (C33H40O21, C19H32O8, C5H11N3O2, C16H18O8) was significantly different with a 5% significance level (p = 0.0123, 0.00852, 0.0161, 0.0336, respectively), but the others were not significant.
The result of Welch's t-test for the logarithmic intensity showed that only one compound (C5H11N3O2) was significantly different with a 5% significance level (p = 0.0166), but the others were not significant. The result of the Brunner-Munzel test for the logarithmic intensity showed that only 4 compounds (C33H40O21, C19H32O8, C5H11N3O2, C16H18O8; the same as for the raw intensity) were significantly different with a 5% significance level (p = 0.0123, 0.00852, 0.0161, 0.0336, respectively), but the others were not significant.
The results of F-test between the yearly variances of the samples for each of the 342 compounds showed that 58 compounds were significantly different between the Syneco and Conv samples with the significance level of 5%.
In terms of the variance values representing the yearly fluctuation for each compound, 132 compounds in Syneco were greater than Conv, and 210 compounds in Conv were greater than Syneco. The F-test comparing the variances between Syneco and Conv for the intensity distribution of 342 compounds for all years of sampling showed a statistically significant difference (p = 7.84 × 10 −49 ). Welch's two-sided t-test was also performed on the difference between the mean values of the variances, but there was no significant difference (p = 0.651).
The result of PCA in Figure 2 revealed that PC3 could linearly separate Syneco and Conv completely (horizontal axis in Figure 2b In other words, the errors that can occur in our method were aggregated in PC1 and PC2, and PC3 was considered to represent a robust characteristic that separates Syneco and Conv samples. As shown in Figure 3, The Syneco and Conv samples were completely separated with LS-PCA using the top 65 compounds of PC3 negative/positive loadings (in total 130 compounds). Hereafter we call these top 130 negative and positive loading parameters as Syneco and Conv-distinctive parameters, respectively. We used these 130 distinctive compounds for the ontological categorization using KEGG BRITE and KEGG PATHWAY databases.
Using the MF searcher, 97 flavonoid compounds were listed by matching the metabolome data to the Flavonoid Viewer (see Supplementary Materials File S5). The total intensity of the detected flavonoids in all 2014-2019 samples were compared between Syneco and Conv samples. Although Conv samples tended to contain more flavonoids, it showed no statistically significant difference (p = 0.786, two-sided Welch's t-test) in total intensity of flavonoids.
With the distribution analysis of LRM, the distribution of Syneco-intrinsic compounds D1 and Conv-intrinsic compounds D2 were regarded as normal distribution with the normality test, and the common compounds D3 was not (p-value of D1: 0.9691, D2: 0.4873, D3: 3.498 × 10 −9 , Shapiro-Wilk test). The histograms and estimated probability density of D1, D2, and D3 are drawn in Figure 4. The compounds that were expressed only in Syneco or Conv (D1, D2) showed distinctive offsets of mean values compared to D3 (with Welch's two-sided t-test p = 6.193 × 10 −26 for D1 > D3; p = 3.829 × 10 −38 for D2 < D3; Brunner-Munzel test is not applicable because D1, D2, and D3 are completely separated), which implies that D1 and D2 are based on the qualitative shift of gene expressions compared to the quantitative difference in D3.

Metabolome Categorization
We projected the top 130 (65 positive and 65 negative) of PC3 loadings to KEGG BRITE (Table 2 and Figure 5) and KEGG PATHWAY (Table 3), and looked to the KEGG BRITE compound classification. The negative PC3 loadings that characterized Syneco samples expressed more diversity of allelochemicals than Conv samples such as phytochemicals, alkaloids, phenylpropanoids, and steroids. The "Phytochemicals" include the subcategories of alkaloids, flavonoids, phenylpropanoids, shikimate/acetate-malonate pathway derived compounds, terpenoids, polyketides, fatty acids related compounds, amino acid related compounds, and others, according to the notation in KEGG BRITE database.   Table 2.   Table 2.
The common 4 compounds of Syneco tended to have larger variance values in Syneco than Conv samples (F-test, p = 0.0795), while the common 8 compounds of Conv had significantly larger variance values in Conv than Syneco samples (F-test, p = 2.576 × 10 −13 ).
In addition, epigallocatechin gallate, aromadendrin, pesticide compounds (framprop, aldicarb or butocarboxim), and amino acids that might potentially be derived from fertilizers were detected as Conv-distinctive compounds.
The two-sided Welch's t-test and Brunner-Munzel test were performed in each category of KEGG BRITE and KEGG PATHWAY (KEGG category-wise tests), with the use of total intensity and total logarithmic intensity of Syneco2014-2019 and Conv2015-2019 samples. The mean value of intensity of the same chemical formula and each value of the intensity were used for the tests.    The categories of "1. Metabolism," "1.4 Nucleotide metabolism," "1.5 Amino acid metabolism," "1.6 Metabolism of other amino acids," "Abietanes," "Drugs with new active ingredients," "Guaianolide," "M MUSCULO-SKELETAL SYSTEM," "M01 ANTIINFLAMMATORY AND ANTIRHEUMATIC PRODUCTS," "M01A ANTIINFLAMMATORY AND ANTIRHEUMATIC PRODUCTS, NON-STEROIDS," "map00300 Lysine biosynthesis," "map00310 Lysine degradation," "map00330 Arginine and proline metabolism," "map00830 Retinol metabolism," "map01100 Metabolic pathways," "N-glycosides," "New drug approvals in Japan [br08318]," "PR0109 Retinoids," and "Sinapate derivatives," were significantly different(p < 0.05) in at least one of the tests (Tables 4 and 5). Notably, the results showed that Conv samples expressed significantly greater total intensity in the categories of amino acids-and nucleotide-related primary metabolites. Table 4. Results of KEGG PATHWAY category-wise tests with p < 0.05: The difference of intensity in each KEGG PATHWAY category (column "Category in KEGG PATHWAY") is tested with two-sided Welch's t-test and Brunner-Munzel test ("Test"). The signs of p-values (in "p-Value") represent the magnitude relationship of mean intensity between Syneco and Conv ("Magnitude Relationship"); negative sign means Conv was greater than Syneco, and positive sign means Syneco was greater than Conv. The column "Scale" indicates whether the intensity was used in linear or logarithmic scale for the tests. The intensity value was used both as chemical formula-wise aggregated mean and as separated intensity peaks (indicated as Formula and NA, respectively, in the column "Averaging"). The column "# Formulae" represents the number of chemical formulae estimated in each category.

Discussion
In the pre-analysis with a Welch's t-test and Brunner-Munzel test for each compound, only one and four over 342 detected compound peaks had significant differences between Syneco and Conv samples, respectively, concerning the 5% threshold on the p-value. Therefore, the simple comparison of compound-wise intensity does not support any statistically significant differences, since about 17 peaks (5% of 342) could stochastically take p-values under the 5% threshold in random data. Considering also biological fluctuations and technical errors, the results of compound-wise statistical tests did not seem relevant enough to require further investigation.
However, the variation between different years for each compound measured with F-test was significantly different between the two categories (for 58 compounds, about 17% of 342), suggesting that the in cultura samples had larger yearly variance. This result supports the relative consistency of metabolic profile in ecological optimum compared to the larger deviation in monoculture data, as pointed out in a previous study [2]. It means that the human intervention under the in cultura culture conditions introduces larger fluctuations in the metabolic state of the crop than the in natura self-organized growth under ecological optimum conditions. The observed facts may be relevant to the general relationship between biodiversity and ecological resilience: the in cultura culture condition with low biodiversity may behave less stable in the consistency of its metabolic profile because of insufficient ecological feedbacks that are abundant in the in natura culture condition.
Previous study has pointed out that there exist statistical invariant features of metabolic profiles that distinguished between the in cultura and in natura culture conditions [2]. The principal component analysis in this study, which reflected the entire LC-MS measurements over six years of repeated production, also supports this notion. The distinctive metabolic profiles were found with respect to the PC3 loading section, which was classified with the ontologies in KEGG BRITE and PATHWAY databases.
Syneco samples produced in the in natura condition contained a larger number of allelochemicals such as phytochemicals, alkaloids, phenylpropanoids, and steroids, according to the classification in KEGG BRITE (Table 2), especially for the compounds with top 130 PC3 loadings that provided complete linear separation between the in cultura and in natura samples ( Figure 2b).
As for the KEGG PATHWAY classification of the top 130 PC3 loadings (Table 3), 9 compounds of in natura samples and five compounds of in cultura samples were categorized into "map01120 Microbial metabolism in diverse environments" (see the Supplementary Materials Files S9 and S10). This category is related to the metabolism of microorganisms, which implies that the in natura samples were raised in more complex microbiological interactions than in cultura samples. In this study, the microbiological interaction of tea plants should mainly come from soil microbiota. Indeed, it is reported that Synecoculture largely promotes soil microbial diversity and activities [36]. These results suggest that the in natura samples tend to contain more diverse compounds related to the interaction with soil microorganisms.
Plants are known to synthesize repellent and attractant substances called allelochemicals in competitive and cooperative interactions with other plants and insects [37]. Allelochemicals such as alkaloids, phenylpropanoids, steroids, and flavonoids, may be harmful to the human body in excessive amounts, but are known to have anti-inflammatory, anti-cancer, and antioxidant effects with appropriate dose [38]. Our results suggest that the in natura culture condition associated with rich interactions between species could support a more diverse production of such health-protective compounds.
Concerning the diversity of allelochemicals, the only exception in this study was the total intensity of flavonoids that was larger in Conv than in Syneco in the total comparison of 2014-2019 samples. This result was different from the previous study that reported larger flavonoids expression in Syneco 2014-2015 samples [39]. Actually, the total amount of flavonoids was superior in Syneco 2014-2016 to Conv 2015-2016 samples (results not shown). For the Syneco 2017-2019 and Conv 2017-2019 samples, the total amount of flavonoids was not effective as a distinctive parameter that constantly separates the two culture conditions. Since flavonoids can be enhanced not only by ecological interactions but also by physical environmental stress such as sunlight, various flavonoids may have been produced in response to the environmental particularity in Syneco 2014-2015 samples.
Among the three PCA groups (2014-2017, 2018-2019, and 2014-2019), compounds such as glucosylpyridoxine, rutin, heptyloxyphenol, and methylcatechol were mutually estimated in Syneco samples. Glucosylpyridoxine, a glycoside of vitamin B6, was detected as an exclusively characteristic compound of in natura samples. Furthermore, rutin is known as an allelopathic chemical that improves plant disease resistances and inhibits the growth of other competing plants [40,41]. These in natura distinctive compounds can be interpreted as the results of the enhanced ecological interactions in Synecoculture.
Vitamin B6 is known to be synthesized by soil microorganisms of the genus Aspergillus. It is required for controlling the immune response, and its deficiency is known to cause immune system disorders and a decrease in antibody production [42,43]. In recent years, the diversity of human gut microbiota in city environments has been reduced due to the abuse of antibiotics and pesticides [44][45][46], and non-infectious immune-related diseases have become serious in many countries around the world. Also, the risk of vitamin B2 and B6 deficiency has been reported in vegan populations [47]. Vitamin B6 and B12 have been reported to be effective in treating Alzheimer's disease [48][49][50]. Our results suggest that vitamin B6 is contained more in the tea samples cultivated under the in natura culture condition in which the topsoil ecosystem is not disturbed.
On the other hand, theasinensin A, l-tyrosine, guanidinobutyric acid, theogallin, isovitexin, coumaroylquinic acid, l-phenylalanine and heptylparaben were detected as the characteristic compounds common to conventional samples. Amino acids were detected as a strong characteristic of in cultura samples, which may be derived from the application of synthetic fertilizers. This is in line with the previous study that also reported amino acids were one of the characteristics of the conventionally cultured tea product [39]. The results of the KEGG category-wise tests in Tables 4 and 5 also support this notion.
The overall results suggest that the distinction between in natura and in cultura conditions only becomes possible at the distribution level of metabolome, beyond single-component comparison: in the results of the normality test for the distribution of the intrinsic and common compounds for each culture condition (Figure 4), both in cultura and in natura intrinsic compounds showed normality in the distribution of intensity, with distinctive offsets of mean values. Each culture condition thus comprised the qualitatively different expression of culture-specific compounds.

Conclusions
Culture conditions strongly relate to the health of both environment and humans through our diet as an interface. In this article, the results suggest that the tea plants grown in in natura condition produce more diverse and abundant allelochemicals than in in cultura conditions, which is expected to contribute to the health of the plants themselves and those consumers. Such enhancement of secondary metabolite in tea plant may be considered as an example of the generic interaction between crops and field biodiversity, especially soil microbiota.
In the formation of ecological niches, it is known that some species specifically rely on the symbiosis with other species and does not tolerate single isolated culture (e.g., [51]). This implies the presence of the "in natura effects" that maintain the coexistence of various species in natural environments and associated particular expression patterns of metabolite, which occurs only within the complex interactions of the self-organized plant and animal communities. Similarly, our results suggest that there exists a metabolite-level property that is particularly based on the in natura state of culture condition, which has been generally treated as the irreplaceable support of the whole food quality that cannot be produced with synthetic nutritional supplements. Generally, the health effects of plant products cannot be totally evaluated with a single component, but should be considered within the whole diet composed of many compounds acting synergistically on human health [52,53]. The intrinsic distribution of compounds particular to in natura metabolite should further be considered when evaluating the sustainability of food systems toward the resolution of the health-diet-environment trilemma [3].