1. Introduction
China ranks among the world’s top three core tomato-producing regions and is the world’s largest exporter of tomato products. Xinjiang leads the nation in tomato processing volume [1], and tomato processing has become a vital component of the region’s “red industry.” Raw materials are the first link in the production chain of tomato processing enterprises, and the requirements for raw materials include high lycopene content, high soluble solids content, and low acidity [2]. High soluble solids content not only increases product yield but also significantly reduces material consumption ratios [3]. Meanwhile, high lycopene content improves product color and nutritional value, enhancing market competitiveness [4]. Total sugar and total acidity are key indicators for evaluating the taste of tomato products. Tomatoes contain significant amounts of vitamin C and lycopene, both important antioxidants, and changes in their content under different processing methods are also a major focus in the tomato processing field [5,6]. These factors determine the color, flavor, and viscosity of processed products and serve as crucial indicators for assessing the quality and processing characteristics of processing tomatoes. Currently, the excessive pursuit of high-yielding varieties in the production of processing tomato raw materials often leads to a decline in fruit quality, which in turn increases production costs for enterprises and reduces product quality. Cultivating high-yielding processing tomato varieties with excellent overall quality is therefore crucial for the healthy development of the industry [7]. Quality analysis in the variety selection and breeding phase is also essential.
Germplasm resources not only serve as the material foundation for genetic improvement but also act as the parental sources for harnessing hybrid vigor. Breakthrough advancements in plant breeding invariably necessitate core germplasm resources. Prolonged use of parent materials with similar genetic backgrounds can lead to narrowed genetic diversity, ultimately limiting improvements in yield, quality, and resistance [8]. The comprehensive selection of germplasm resources with distinct and specific traits is a critical link in variety improvement [9].
Currently, comprehensive evaluation methods combining genetic diversity analysis, correlation analysis, principal component analysis (PCA), and cluster analysis have been widely applied in the quality assessment of various fruit and vegetable varieties such as tomatoes [10], peppers [11], potatoes [12], garlic [13], eggplants [14], and melons [15]. While previous studies have extensively reported on the comprehensive evaluation of tomato germplasm resources [16,17,18], research on the integrated assessment of quality traits in high-generation inbred lines of processing tomatoes, specifically tailored to regions like Xinjiang with temperate continental climates (characterized by abundant sunlight, significant diurnal temperature variation, and arid conditions), remains relatively scarce. This study utilized 113 processing tomato germplasm resources systematically bred under long-term Xinjiang ecological conditions. Key quality traits of mature fruits (lycopene, soluble solids, total sugar, total acidity, and vitamin C) were measured. Using genetic diversity analysis, correlation analysis, principal component analysis, and cluster analysis, a comprehensive evaluation of the differences in quality traits of the 113 processing tomato germplasm resources was conducted. The aim was to establish a comprehensive evaluation system covering processing efficiency, nutritional quality, and flavor characteristics. Ultimately, this approach enables the classification and selection of locally suitable parental materials, providing material selection strategies and core germplasm resources for processing tomato germplasm innovation and new variety breeding.
2. Materials and Methods
2.1. Experimental Materials
The 113 tomato germplasm resources used in the experiment were provided by the processing tomato breeding research group at Shihezi University (Table 1). All materials were high-generation inbred lines obtained through systematic selection of commercial varieties and breeding lines extensively cultivated in China’s Xinjiang region. The tested materials were all of the determinate growth type, with a fruit shape index > 1.0 and large red fruit.
2.2. Experimental Method
The experimental materials were grown in 72-cell seedling trays in March 2023. At 50 days old, when the seedlings had reached the four-leaf-one-heart stage, they were transplanted into open-field experimental plots at the Agriculture College Experimental Station of Shihezi University (44°18′ N, 85°59′ E; altitude 450 m). During the growing season, temperatures ranged from 3 to 40 °C, relative humidity ranged from 25 to 65%, and rainfall totaled 124.9 mm. The region receives an average annual sunshine duration of 2610 h and has a frost-free period of 147–191 days. The soil is predominantly loam, with a pH of 7.5. All materials received identical agronomic management practices, including irrigation, fertilization, and pest control, during the same period. This minimized environmental error from management variation, so that differences in tomato fruit quality traits could be attributed primarily to the lines themselves. The planting layout used a row spacing of 1.2 m and 90 cm plastic mulch, with two rows planted under each mulch strip at a row spacing of 50 cm on the mulch and a plant spacing of 35 cm. Each test material had 3 biological replicates, each replicate being a 5 m long plot with 30 plants, arranged in a randomized complete block design. At the full red fruit stage (approximately 60 days after flowering), five red-ripe fruits of consistent maturity were randomly harvested from each of the front, middle, and rear positions of each plot (collected from multiple plants near each position), totaling 15 fruits per plot, to reduce the effects of microenvironmental variation. Fresh tomatoes were transported within 2 h of harvest to the Key Laboratory of Specialized Fruit and Vegetable Cultivation Physiology and Germplasm Resource Utilization of the Xinjiang Production and Construction Corps. The peel and pulp of all 15 fruits from the same plot were blended into a pulp mixture. This mixture was rapidly frozen in liquid nitrogen for 15 min and stored at −80 °C for subsequent quality parameter measurements.
Each quality parameter measurement yielded a single value per biological replicate (plot).
2.3. Measurement Indices and Methods
Lycopene content was determined using ultra-high-performance liquid chromatography (LC-2010, Shimadzu Corporation, Kyoto, Japan), with slight modifications to GB/T 41133-2022 [19]. Chromatographic conditions were as follows: C18 column (150 × 4.6 mm, 5 μm); methanol–acetonitrile (7:3) as mobile phase; flow rate, 1.2 mL/min; detection wavelength, 472 nm; injection volume, 20 μL; column temperature, 35 °C. Lycopene standards and samples were analyzed under these conditions. The entire process was conducted under light-shielded conditions.
Soluble solid content was measured using a portable digital refractometer (PAL-1, ATAGO Corporation, Tokyo, Japan) as follows: After zeroing the device in a horizontal position, place 2–3 drops of tomato juice on a clean prism. Press the START button; the display will show “---” before displaying the measured value.
Total sugar content was determined using the anthrone–sulfuric acid method [20]: Transfer 0.5 mL of tomato homogenate into each of three graduated test tubes. Add 10 mL of distilled water to each, seal with plastic film, and extract in boiling water for 30 min (repeat extraction twice). Cool and filter the extract into a 100 mL volumetric flask. Rinse the test tubes and residue repeatedly, then dilute to the mark. Transfer 0.5 mL of sample extract into a 20 mL test tube (repeat three times), add 5 mL anthrone reagent, shake thoroughly, and immediately place the test tubes in a boiling water bath. Ensure each tube is held at a constant temperature for 1 min. Remove and allow to cool naturally to room temperature. For the blank tubes, replace the extract with an equal volume of distilled water. Measure the absorbance at 625 nm. Standard curve: Y = 0.007x + 0.0236, where Y is the OD value and x is the glucose concentration (μg/mL).
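The standard curve can be inverted to recover a glucose concentration from a measured absorbance. The following is a minimal sketch, assuming the reading is blank-corrected and lies within the linear range of the curve; the function name and the example reading are hypothetical, and the conversion to % fresh weight (which depends on the dilution steps) is not shown.

```python
# Standard curve from the text: Y = 0.007 * x + 0.0236
SLOPE = 0.007       # OD units per (ug/mL)
INTERCEPT = 0.0236  # OD units


def glucose_ug_per_ml(od_625: float) -> float:
    """Invert the standard curve to get glucose concentration (ug/mL)."""
    return (od_625 - INTERCEPT) / SLOPE


# Hypothetical absorbance reading, for illustration only
example_od = 0.3736
print(f"{glucose_ug_per_ml(example_od):.2f} ug/mL")
```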
Total acid content was determined using acid–base titration (GB 12456-2021) [21]: Weigh 25 mL of tomato homogenate (accurate to 0.01 g) into a 150 mL conical flask fitted with a condenser. Add approximately 50 mL of carbon-dioxide-free water at 80 °C, mix thoroughly, and boil in a water bath for 30 min (shaking 2–3 times to ensure complete dissolution of organic acids in the sample). Remove the sample, cool it to room temperature, dilute to 250 mL with carbon-dioxide-free water, filter it through rapid-flow filter paper, and collect the filtrate for analysis. Transfer 25 mL of the filtrate to a 250 mL conical flask. Add 2–4 drops of phenolphthalein indicator solution (10 g/L). Titrate with 0.1 mol/L sodium hydroxide standard titration solution until a faint pink color persists for 30 s without fading. Record the volume of 0.1 mol/L sodium hydroxide standard titrant consumed. Express the result in citric acid equivalents.
Vitamin C content was determined using the 2,6-dichlorophenolindophenol redox titration method (GB 5009.86-2025) [22]: Weigh 100 mL of tomato homogenate and add 100 mL of oxalic acid solution. Accurately weigh 10 g to 40 g of homogenate sample (to the nearest 0.01 g) into a beaker. Transfer the sample to a 100 mL volumetric flask using oxalic acid solution, dilute to the mark, mix thoroughly, and filter. Accurately pipette 10 mL of the filtrate into a 50 mL conical flask. Titrate with a calibrated 2,6-dichlorophenolindophenol solution until the solution turns pink and remains so for 15 s. Conduct a blank test simultaneously. Perform the entire procedure under light-shielded conditions.
In the results, lycopene and vitamin C content are expressed in milligrams per 100 g of fresh weight (mg/100 g FW); soluble solids, total sugars, and total acidity are expressed as percentages of fresh weight (%, w/w, i.e., g/100 g FW).
2.4. Data Processing and Analysis
Multivariate statistical analysis methods were employed. Data processing was performed using Excel 2019, correlation analysis and principal component analysis were conducted with SPSS 26.0, and cluster analysis was carried out using Origin 2024 to generate a circular cluster diagram. SPSS 26.0 was also used to test whether the clusters obtained from the cluster analysis differed significantly in each quality trait: with cluster as the fixed factor, a one-way ANOVA was conducted for each quality trait. When the ANOVA indicated a significant cluster effect (p < 0.05), post hoc comparisons were performed using the Waller–Duncan test (k-ratio = 100), with the significance level set at α = 0.05.
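The per-trait test described here reduces to a between-cluster versus within-cluster variance ratio. The sketch below computes the one-way ANOVA F statistic in plain Python for illustration; it is not the SPSS procedure used in the study, the function name and toy data are hypothetical, and neither the p-value nor the Waller–Duncan post hoc step is implemented.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    k = len(groups)                       # number of clusters
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: group size times squared mean deviation
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical trait values for three clusters, for illustration only
toy_clusters = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(one_way_anova_f(toy_clusters))
```

The resulting F value would then be compared against the F distribution with (k − 1, n − k) degrees of freedom to obtain the p-value.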
The coefficient of variation (CV) for each trait was calculated based on the following formula:
$$\mathrm{CV} = \frac{\sigma}{\mu} \times 100\% \qquad (1)$$
In the formula, σ represents the standard deviation and μ represents the mean.
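As a small illustration of the CV, the sketch below divides the standard deviation by the mean and expresses the result as a percentage. Using the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which form was used; the function name and data are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV (%) as standard deviation divided by mean, times 100."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample SD (n - 1); an assumption
    return sigma / mu * 100.0


# Hypothetical soluble solids values (%), for illustration only
toy_values = [4.8, 5.1, 5.6, 6.2, 5.3]
print(f"CV = {coefficient_of_variation(toy_values):.2f}%")
```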
The genetic diversity index (H′) for quantitative traits was calculated with reference to the method of Zhang et al. [23]. First, the data for each trait across all materials were divided into 10 grades based on the mean value and standard deviation, with each grade spanning 0.5σ (where σ is the standard deviation). The index was then calculated according to the following formula:
$$H' = -\sum_{i=1}^{10} P_i \ln P_i \qquad (2)$$
In the formula, Pi represents the proportion of materials in the i-th grade for a given trait relative to the total number of materials.
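The grading and index calculation can be sketched as follows, assuming the common ten-grade scheme in which grade 1 covers values below μ − 2σ, grade 10 covers values at or above μ + 2σ, and the eight interior grades each span 0.5σ, combined with the Shannon–Weaver form H′ = −Σ Pi ln Pi. The exact grade boundaries and the use of the sample standard deviation are assumptions, and the function name is hypothetical.

```python
import math
import statistics


def diversity_index(values):
    """Return H' for one trait using ten grades of width 0.5 sigma."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)  # sample SD; an assumption
    if sigma == 0:
        raise ValueError("all values are identical; grades are undefined")
    counts = [0] * 10
    for x in values:
        # Position in units of 0.5 sigma, shifted so mu - 2 sigma maps to 1
        grade = math.floor((x - mu) / (0.5 * sigma)) + 5
        grade = min(max(grade, 0), 9)  # open-ended first and last grades
        counts[grade] += 1
    n = len(values)
    # Empty grades contribute nothing to the sum
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)
```

With ten grades the index is bounded above by ln 10 ≈ 2.303, which is reached only when materials are spread evenly across all grades.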
The comprehensive evaluation of quality traits in processing tomato germplasm resources was conducted using the membership function value method from fuzzy mathematics [24].
The calculation formula for the membership function value in fuzzy mathematics is
$$u(X_j) = \frac{X_j - X_{\min}}{X_{\max} - X_{\min}} \qquad (3)$$
In the formula, Xj represents the score of a given material on the j-th principal component, while Xmin and Xmax denote the minimum and maximum scores on that principal component obtained from the principal component analysis.
The weight calculation formula is as follows:
$$W_j = \frac{P_j}{\sum_{j=1}^{n} P_j} \qquad (4)$$
In the formula, Wj represents the importance (i.e., weight) of the j-th comprehensive indicator among all evaluation indicators; Pj denotes the contribution rate of the j-th comprehensive indicator obtained through principal component analysis; and n is the number of comprehensive indicators (retained principal components).
The comprehensive evaluation D-value calculation formula is as follows:
$$D = \sum_{j=1}^{n} \left[ u(X_j) \times W_j \right] \qquad (5)$$
In the formula, the D-value represents the comprehensive evaluation score obtained for each tomato germplasm resource from the n composite indicators (retained principal components) derived from the five quality traits [25].
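The three steps (membership values, weights, and D-values) can be sketched together as below. The contribution rates are those reported for the three retained principal components in this study; the principal component scores are hypothetical toy values, and all function names are illustrative rather than part of the original analysis.

```python
def membership(scores):
    """Min-max scale one component's scores across materials to [0, 1]."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def weights(contribution_rates):
    """Normalize contribution rates so that the weights sum to 1."""
    total = sum(contribution_rates)
    return [p / total for p in contribution_rates]


def d_values(pc_scores, contribution_rates):
    """pc_scores holds one list per component, one entry per material."""
    w = weights(contribution_rates)
    u = [membership(component) for component in pc_scores]
    n_materials = len(pc_scores[0])
    return [sum(u[j][i] * w[j] for j in range(len(w)))
            for i in range(n_materials)]


# Contribution rates (%) of PC1-PC3 as reported in the text
rates = [36.257, 23.503, 17.675]
# Hypothetical scores for three materials, for illustration only
toy_scores = [[-1.0, 0.0, 2.0], [-0.5, 0.5, 1.5], [0.0, 1.0, 3.0]]
print([round(w, 3) for w in weights(rates)])
print([round(d, 3) for d in d_values(toy_scores, rates)])
```

A material that has the highest score on every component receives D = 1, and one with the lowest score on every component receives D = 0, so D-values are directly comparable across materials.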
3. Results
3.1. Genetic Diversity Analysis of Quality Traits in Processing Tomato
The five fruit quality traits of the tested germplasm resources all exhibited substantial genetic variation (Table 2). The range of variation differed among traits, with coefficients of variation (CVs) ranging from 12.21% to 39.04% and an average of 26.55%. Total sugar (39.04%), vitamin C (32.02%), and lycopene (31.27%) showed higher coefficients of variation, indicating greater variability potential and enhanced prospects for genetic improvement within the tested population. In contrast, soluble solids (12.21%) and total acid (18.22%) demonstrated relatively lower coefficients of variation, suggesting more stable genetic expression.
The genetic diversity index (H′) results revealed rich genetic diversity across all five quality traits, with H′ values ranging from 1.899 to 2.064 and a mean of 1.996. Both soluble solids and total sugar displayed H′ values exceeding 2.000, while lycopene, vitamin C, and total acid also approached H′ = 2.000, further confirming the substantial genetic diversity present in these traits.
The F-values for all quality traits were extremely significant (p < 0.01), indicating considerable genetic divergence among the tested materials. This provides a reliable genetic foundation for subsequent selection of superior parental lines and targeted trait improvement.
3.2. Correlation Analysis of Quality Traits in Processing Tomato Germplasm Resources
To investigate the degree of correlation and mutual influence among quality traits, a correlation analysis was conducted. The results are presented in Table 3. Lycopene showed highly significant positive correlations with both soluble solids and total sugar (r = 0.340 and r = 0.372, respectively; p < 0.01). Soluble solids also demonstrated highly significant positive correlations with total sugar and total acid (r = 0.400 and r = 0.227, respectively; p < 0.01). Vitamin C exhibited weak correlations with other traits but displayed a highly significant negative correlation with total sugar (r = −0.234, p < 0.01). The correlation patterns for total acid were relatively limited, showing significant associations only with soluble solids.
3.3. Principal Component Analysis
Principal component analysis was conducted on five quality traits of 113 processing tomato germplasm resources. The results are shown in Table 4. The eigenvalues of the first three principal components were 1.813, 1.175, and 0.884, respectively. Although the eigenvalue of the third principal component (PC3) was slightly less than 1, it was retained because the subsequent eigenvalues leveled off and this component exhibited a high loading on total acidity. With PC3 included, the cumulative contribution rate reached 77.435%. Thus, the first three principal components represent the majority of the information on the measured quality traits in the test materials.
The first principal component (PC1) contributed 36.257% of the variance, with soluble solids exhibiting the highest loading at 0.803, while lycopene and total sugar also showed relatively high loadings. The second principal component (PC2) contributed 23.503%, with vitamin C exhibiting the highest loading value of 0.816. PC3 contributed 17.675%, primarily influenced by the total acidity trait, which had a loading value of −0.718.
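The reported contribution rates can be cross-checked from the eigenvalues. The sketch below assumes the PCA was run on standardized traits (a correlation matrix, the usual SPSS default), so that total variance equals the number of traits; that assumption is not stated in the text, and small differences from the reported figures reflect rounding of the eigenvalues.

```python
# Eigenvalues of PC1-PC3 as reported in the text
eigenvalues = [1.813, 1.175, 0.884]
n_traits = 5  # total variance of five standardized traits (assumption)

# Contribution rate (%) of each component and their cumulative total
rates = [ev / n_traits * 100 for ev in eigenvalues]
cumulative = sum(rates)

for i, rate in enumerate(rates, start=1):
    print(f"PC{i}: {rate:.2f}%")
print(f"Cumulative: {cumulative:.2f}%")
```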
3.4. Cluster Analysis
A hierarchical cluster analysis was conducted on five quality traits of the germplasm resources. Materials with similar traits were grouped together, and the tested materials could be classified into six major groups at a Euclidean distance of 15 (Figure 1). One-way ANOVA indicated that these six major clusters exhibited extremely significant differences in all quality traits (p < 0.001; specific F-values are shown in Table 5). Post hoc multiple comparisons using the Waller–Duncan test clarified the quality trait characteristics of each cluster (Table 5).
Group I comprised 53 germplasm materials, accounting for 46.90% of the total. The lycopene content (10.28 ± 2.78 mg/100 g FW) in this group was significantly lower than that in Groups II and III. The soluble solids content (5.08 ± 0.48%) was significantly lower than that in Group IV but significantly higher than in Group VI, both falling within the medium range. Total sugar (2.70 ± 0.99%) and total acid (0.36 ± 0.05%) were well-balanced, with a sugar–acid ratio of 7.81 ± 3.19, indicating harmonious flavor. The vitamin C content (14.75 mg/100 g FW) was significantly higher than in Groups V and VI.
Group II consisted of 43 germplasm materials, representing 38.05% of the total. This group exhibited a total sugar content (4.47 ± 0.84%) significantly higher than that in Groups I, V, and VI. Its lycopene content (15.39 ± 3.17 mg/100 g FW) was the second highest, only surpassed by Group III. The soluble solids content (5.45 ± 0.49%) was moderate and significantly higher than in Group VI. However, the total acid content (0.36 ± 0.05%) was significantly lower than in Groups IV and V, while the vitamin C content (10.41 ± 3.47 mg/100 g FW) was significantly lower than in Groups III and IV.
Group III included eight germplasm materials, accounting for 7.08% of the total. This group demonstrated a lycopene content (18.01 ± 2.04 mg/100 g FW) significantly higher than in Groups I, IV, and VI, and a vitamin C content (18.48 ± 3.64 mg/100 g FW) significantly higher than in Groups II, V, and VI. Additionally, both soluble solids (5.61 ± 0.87%) and total sugar (2.89 ± 0.87%) contents were significantly higher than in Group VI. The acidity level (0.43 ± 0.06%) was significantly lower than in Group V but higher than in Group VI.
Group IV contained five germplasm materials, representing 4.42% of the total. This group was characterized by significantly high soluble solids, high acidity, high sugar, high vitamin C, and low lycopene. Its soluble solids content (6.24 ± 0.47%) was significantly higher than in Groups I and VI; total sugar (3.68 ± 1.41%) was significantly higher than in Groups V and VI; total acid (0.51 ± 0.05%) was significantly higher than in Groups I, II, and VI; and vitamin C (17.51 ± 5.14 mg/100 g FW) was significantly higher than in Groups II, V, and VI. These traits are crucial for improving tomato paste yield and developing a strong flavor in ketchup production. However, the lycopene content (10.17 ± 1.91 mg/100 g FW) was significantly lower than in Groups II and III.
Group V consisted of two germplasm materials, accounting for 1.77% of the total. This group was characterized by low sugar and high acidity. The total sugar content (1.90 ± 1.08%) was significantly lower than in Groups II and IV, while the total acid content (0.55 ± 0.05%) was significantly higher than in Groups I, II, III, and VI, which may result in an excessively sour flavor. The lycopene content (14.39 ± 0.27 mg/100 g FW) was significantly higher than in Group VI, but the vitamin C content (7.76 ± 1.12 mg/100 g FW) was significantly lower than in Groups I, III, and IV.
Group VI included two germplasm materials, representing 1.77% of the total. This group showed significantly low values across all quality traits. The lycopene content (7.95 ± 0.06 mg/100 g FW) was significantly lower than in Groups II, III, and V; soluble solids (3.55 ± 0.21%) were significantly lower than all other groups; total sugar (0.61 ± 0.01%) was significantly lower than in Groups I, II, III, and IV; total acid (0.31 ± 0.05%) was significantly lower than in Groups III, IV, and V; and vitamin C (8.22 ± 0.47 mg/100 g FW) was also significantly lower than in Groups I, III, and IV.
3.5. Comprehensive Evaluation of Tested Germplasm Resources Using the Membership Function Method
The weight coefficient of each trait within each principal component was obtained by dividing its loading by the square root of the eigenvalue of the corresponding principal component. The scores of the three principal components (F1, F2, F3) for each processing tomato germplasm resource were then calculated as the sum, over the five traits, of the weight coefficient multiplied by the standardized quality data.
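The score calculation described here can be sketched as follows: each loading is divided by the square root of its component's eigenvalue to give a weight coefficient, and the score is the sum of coefficient times standardized trait value. The loadings, eigenvalue, and standardized values in the example are hypothetical toy numbers chosen for easy checking, not values from this study.

```python
import math


def pc_score(loadings, eigenvalue, z_values):
    """Return one material's score on one principal component."""
    # Weight coefficient of each trait: loading / sqrt(eigenvalue)
    coefficients = [a / math.sqrt(eigenvalue) for a in loadings]
    # Score: sum of coefficient times standardized trait value
    return sum(c * z for c, z in zip(coefficients, z_values))


# Hypothetical two-trait example, for illustration only
toy_loadings = [0.8, 0.6]
toy_eigenvalue = 4.0
toy_z = [1.0, -0.5]  # standardized (z-score) trait values
print(pc_score(toy_loadings, toy_eigenvalue, toy_z))
```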
Using Equations (3)–(5), the membership function values [u(Xj)] for the three principal component composite indices of each material, the weights (Wj) for each composite index, and the composite D-values for each material were calculated.
Based on the obtained comprehensive evaluation D-values, the materials were ranked in descending order. A higher D-value indicates better overall performance of the germplasm resource. The top 10 processing tomato germplasm resources are listed in Table 6. Among them, material No. 104 achieved the highest D-value (0.89), indicating that it exhibited the best overall performance among the 113 germplasm resources.
Based on the cluster analysis results, the most outstanding representative materials from each group were selected. In Group I, material No. 4 achieved the highest D-value (0.67). In Group II, material No. 16 recorded the highest D-value (0.81). In Group III, material No. 6 obtained the highest D-value (0.88). In Group IV, material No. 104 showed the highest D-value (0.89). In Group V, material No. 9 attained the highest D-value (0.55). In Group VI, material No. 97 demonstrated the highest D-value (0.18).
4. Discussion
Abundant germplasm resources form the foundation for breeding new varieties. The evaluation and identification of these resources enable the targeted selection of parental lines for new cultivar development, thereby enhancing breeding efficiency. Analysis of phenotypic variation is an effective approach for revealing the genetic diversity of species. Therefore, a systematic assessment of the phenotypic characteristics and genetic variation in processing tomato germplasm is crucial for exploiting and utilizing their genetic potential [26]. This study conducted a genetic diversity analysis of five quality traits across 113 processing tomato germplasm resources. The results showed that the coefficients of variation (CVs) ranged from 12.21% to 39.04%. The CV reflects the genetic diversity of traits: a higher CV indicates greater potential for genetic improvement. When the CV exceeds 20%, the trait is considered highly variable [27]. In this study, lycopene, total sugar, and vitamin C all exhibited CVs greater than 20%. This finding is consistent with the research results of Ma et al. [28]. These traits offer a broad range of selection possibilities during hybrid breeding and represent key targets for breeding improvement. In contrast, soluble solids and total acid had relatively low CVs, indicating more stable genetic expression and less susceptibility to external factors. The CV values for lycopene and soluble solids in this study were slightly higher than those reported by Tian et al. [29] using 30 major processing tomato varieties (lines) from Xinjiang as test materials. This discrepancy may be attributed to differences in the number of test materials and the genetic backgrounds of the varieties. The genetic diversity index (H′) ranged from 1.899 to 2.064, with all values being close to or above 2.000. Specifically, the H′ values for lycopene (1.987), soluble solids (2.043), and total sugar (2.064) were higher than those reported by Li [30]: lycopene (1.94), soluble solids (1.67), and total sugar (1.98). However, the H′ for total acid (1.899) was slightly lower than that in Li’s study (2.07), while the H′ for vitamin C (1.988) was higher than that reported by Lu et al. [31] for vitamin C (1.38). These discrepancies may be attributed to differences in the genetic background and number of tested varieties. Overall, the processing tomato germplasm resources in this study exhibited substantial diversity in quality traits, with considerable variation among traits, indicating strong potential for selective breeding.
Correlation analysis is a crucial step in breeding, enabling indirect selection of target traits by studying associations between traits. Lycopene, soluble solids, total sugar, total acid, and vitamin C content are all key indicators influencing tomato fruit quality. This study revealed a highly significant positive correlation between total sugar content and soluble solids content, consistent with previous research findings [28,32,33,34]. At the same time, a highly significant positive correlation was observed between total sugar content and lycopene content, aligning with the results reported by Li et al. [35]. This indicates that materials with higher sugar content tend to exhibit deeper fruit coloration, higher nutritional levels, and greater concentrations of overall flavor compounds. In contrast, total sugar content showed a highly significant negative correlation with vitamin C content. Although Li et al. [35] also observed a negative trend between these traits, it did not reach statistical significance in their study. Furthermore, soluble solids content showed highly significant positive correlations with both lycopene and total acid content, consistent with findings by Yang et al. [36]. In Zhao’s [37] study, soluble solids content showed a highly significant positive correlation with lycopene content but a highly significant negative correlation with total acidity. This discrepancy may be attributed to differences in genetic backgrounds among varieties and variations in harvest maturity. According to the established classification of correlation coefficient absolute values [38], namely 0.00–0.19 (very weak), 0.20–0.39 (weak), 0.40–0.59 (moderate), 0.60–0.79 (strong), and 0.80–1.00 (very strong), the correlations identified in this study predominantly ranged from weak to moderate intensity. This pattern aligns with the general characteristics of complex quantitative traits, indicating that these five quality traits are regulated by multiple factors (such as genetic background and environmental conditions) rather than simple linear dependencies. Consequently, in breeding practice, it is not feasible to effectively improve correlated traits through simple selection for a single characteristic.
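The classification of correlation strength can be applied programmatically to the coefficients reported in Section 3.2. This is a minimal sketch; treating each band as a half-open interval (for example, |r| below 0.20 as very weak) is an assumption about how values between the printed two-decimal bounds are handled, and the function name is hypothetical.

```python
def correlation_strength(r: float) -> str:
    """Classify |r| into the five strength bands cited in the text."""
    a = abs(r)
    if a > 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    if a < 0.20:
        return "very weak"
    if a < 0.40:
        return "weak"
    if a < 0.60:
        return "moderate"
    if a < 0.80:
        return "strong"
    return "very strong"


# Correlation coefficients reported in Section 3.2
reported = {
    "lycopene vs soluble solids": 0.340,
    "lycopene vs total sugar": 0.372,
    "soluble solids vs total sugar": 0.400,
    "soluble solids vs total acid": 0.227,
    "vitamin C vs total sugar": -0.234,
}
for pair, r in reported.items():
    print(f"{pair}: r = {r:+.3f} ({correlation_strength(r)})")
```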
Principal component analysis (PCA) serves as a core analytical method in germplasm resource research. By applying PCA for dimensionality reduction, overlapping information among indicators can be eliminated, effectively reducing data redundancy and thereby simplifying the indicator system [39]. In this study, PCA was employed to extract three principal components from five fruit quality indicators of processing tomatoes, with a cumulative contribution rate of 77.435%. Among them, PC1 had an eigenvalue of 1.813 and a contribution rate of 36.257%. This component showed high positive loadings on lycopene (0.679), soluble solids (0.803), and total sugar (0.714), indicating that PC1 primarily integrates information on tomato fruit color and flavor compounds. In practical breeding, selecting tomato materials with high PC1 scores holds promise for simultaneously enhancing fruit sugar content, color, and flavor intensity. This aligns with the correlation analysis results in this study and the core evaluation criteria for raw materials used by enterprises. However, this differs from the findings of Li et al. [40], where the four extracted principal components corresponded to plant height, soluble solids, vitamin C, and soluble sugars. This discrepancy likely stems from variations in the genetic backgrounds of the study materials and the specific traits measured. PC2 had an eigenvalue of 1.175 and a contribution rate of 23.503%. This component exhibited a high positive loading on vitamin C (0.816), while displaying a moderate negative loading on total sugar (−0.367). This indicates that PC2 mainly represents information pertaining to the nutritional value and sweet–sour flavor profile of tomato fruits. PC3 had an eigenvalue of 0.884 and a contribution rate of 17.675%. This component showed its highest loading on total acid (−0.718), indicating a negative correlation between total acid and PC3. Tomato materials with higher PC3 scores exhibit lower total acid content. Excessively high total acidity not only renders tomato products overly sour but also necessitates additional sugar during processing to adjust the sugar-to-acid ratio, thereby increasing investment costs. Consequently, the PC3 score serves as an effective indicator for selecting “low-acid” materials. In breeding practice, germplasm resources exhibiting outstanding performance in specific principal components can be selectively utilized according to different market demands. This also indicates that the weights and D-value rankings derived from combining fuzzy membership functions with principal component analysis in this study exhibit high consistency with priority judgments in breeding practice.
Through cluster analysis, the tested germplasm resources were classified into six distinct groups with characteristic quality traits at a Euclidean distance of 15. Unlike most studies employing 10–15 agronomic and quality traits [18,35,41], this research clustered the materials on five indicators directly related to the quality of processed tomato products. This approach enables the clustering results to more directly reflect the processing suitability of germplasm resources. Among these, Group III and Group IV demonstrated superior comprehensive nutritional quality, making them suitable candidate parental materials for breeding new high-quality processing tomato varieties. Group IV exhibited the highest soluble solids content (average 6.24%). According to the industry standard “Processing Tomatoes” (NY/T 1517-2007) [42], premium-grade raw materials typically require a soluble solids content of no less than 5%. Materials in this group exceed this benchmark. In industrial processing, based on a soluble solids content of 5%, each 1% increase in soluble solids content in the raw material can boost ketchup yield by over 20% [43] while significantly reducing energy consumption and evaporation costs. Therefore, Group IV materials serve as ideal parental sources for breeding high-yield, low-energy-consumption processing varieties. Group III exhibited the highest lycopene content (average 18.01 mg/100 g FW). Lycopene is not only a vital nutrient but also a key indicator determining ketchup color ratings. The market demonstrates a clear preference for products with high lycopene content. Thus, breeding with Group III materials can effectively enhance the nutritional profile and market competitiveness of tomato products. Group II exhibited high sugar and lycopene content, while Group V, characterized by low sugar, low vitamin C, and high acidity, requires targeted utilization for specific breeding objectives. All five quality traits in Group VI showed significantly low values. These results provide both material selection and crossing strategies for the targeted parental combination and specialized breeding of new processing tomato varieties.
Fuzzy membership functions and principal component analysis have been widely applied in analyzing the genetic diversity of plant germplasm resources [36,44]. These methods provide a comprehensive, objective, and reliable assessment of germplasm quality. Converting the “performance value” of each material for each specific trait and for the comprehensive evaluation into membership degrees between 0 and 1, while utilizing the contribution rates of each principal component extracted by PCA to derive the weights of the comprehensive evaluation value (D-value), enhances the scientific rigor and reproducibility of the comprehensive evaluation results. In this study, an integrated approach combining principal component analysis and the fuzzy membership function was employed to comprehensively evaluate five quality traits across 113 processing tomato germplasm resources. This led to the identification of ten elite advanced inbred lines with the highest scores: Nos. 104, 6, 39, 16, 107, 23, 2, 105, 3, and 46. Furthermore, by integrating the results of cluster analysis, the most outstanding representative materials from each group were selected.
In summary, this study conducted a comprehensive evaluation of processing tomato germplasm resources based on phenotypic traits, offering a simple, low-cost approach with intuitive results. The expression of crop traits results from the combined effects of genotype (G), environment (E), and their interaction (G × E), a principle widely observed across various crops [45,46]. When evaluating germplasm resources, the potential influence of the environment must typically be considered. In recent years, genetic evaluations based on molecular markers (such as SSR [47] and SNP [48]) have helped reveal genetic differences and phylogenetic relationships among germplasm at the DNA level. These methods offer the advantages of high stability and independence from environmental interference, holding promise for early, supplemental selection of target traits. Therefore, germplasm innovation efforts should focus on integrating comprehensive phenotypic evaluation systems with molecular-marker-based genetic evaluation systems. This approach will accelerate the breeding process for new breakthrough varieties that aggregate multiple desirable traits.