Genome-Wide Association Studies of Fiber Content in Sugarcane
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This study looks into the genetic factors behind fiber content in sugarcane, using a GWAS approach across five different environments. The aims of the study are clear and the results are interesting. I think finding QTLs that stay consistent across environments is a meaningful contribution. That said, there are still a few scientific points that need to be clarified before the paper’s ready for publication.
My major concern is that fiber in plants is not a single pure material but is made up of cellulose, hemicellulose, and lignin, each of which has a different pathway and role when it comes to processing and agronomic traits. The authors analyzed the fiber to show the phenotypic variation in the first part of the results. However, it is hard to link the phenotypes to the candidate genes. At the least, this limitation should be called out more clearly, and the authors should try to connect their results to earlier studies that looked at the individual components.
Second, the authors grew sugarcane for phenotypic analysis at Zhanjiang for three years and at Wengyuan for two years, and they describe this as “five environments”. The environmental design is somewhat narrow and may easily confuse readers. In the results, the three values from Zhanjiang (13.60% (ZJ20), 11.34% (ZJ21), and 12.00% (ZJ22)) are quite divergent. However, no climate factors are shown or correlated in this part. The authors need to work out how climate conditions, including rainfall and temperature, differed. A more explicit discussion of how local conditions, such as rainfall and soil characteristics, may have influenced the results would make the findings easier to interpret.
Third, the GWAS statistics need more detail in the materials and methods section. The authors state the significance threshold they chose (P = 1.5e-6), but it is not explained. This leaves readers who do not work on GWAS wondering whether any correction for multiple testing was applied. A clearer description would help such readers.
Fourth, assigning candidates based on a 20 kb window is acceptable as a starting point, but in a polyploid genome like sugarcane, especially when considering different species such as S. officinarum, S. spontaneum, and cultivated hybrids, this approach is rather difficult. The authors should be more transparent about this limitation. It would also strengthen the paper if they linked their candidate genes to available expression data or functional studies in related crops. Even without conducting new experiments, adding this kind of context would make the gene discussion more compelling.
Fifth, the breeding application section comes across as too general. Simply stating that high-fiber markers can be used for marker-assisted selection overlooks the well-known trade-off between fiber and sugar yield, which needs to be addressed directly. It’s also important to consider that sugarcane’s photosynthetic products mainly consist of fiber and sugar. Since the study focuses on fiber alone, without reporting total biomass, it’s hard to tell whether the increase in fiber reflects a shift in allocation or an overall rise in productivity. This is especially relevant across different sugarcane species like S. officinarum, S. spontaneum, and cultivated hybrids, where physiological traits and breeding priorities may differ. Without this context, the practical relevance of the findings remains limited.
Minor suggestions
1. The terminology for plant materials alternates between “clones” and “cultivars”. It would be clearer to choose one term and apply it consistently throughout the manuscript.
2. The authors describe the panel as “globally sourced (Line 16 and 307),” but since most of the materials come from Chinese breeding programs, that claim feels a bit overstated and should be phrased more carefully.
3. The figures and tables are mostly clear, but it would be helpful to show more detail on how fiber content varies across cultivars, for example in Figure 1, Figure 2, and Table 1. A histogram with thresholds for high- and low-fiber groups would make the data easier to interpret.
4. In Discussion 4.2, the references to previous studies, especially those based on U.S. GWAS panels, could be expanded. The authors point out that the samples in prior fiber studies come almost entirely from Louisiana or Florida breeding programs and do not include the samples used in this study. However, it is a pity that this point is not discussed further. More comparison would help highlight what’s new or unique about the Chinese panel used in this study.
5. Line 109. The geographic coordinates for the Zhanjiang site include 110°25’ 89” E. Seconds should generally range from 0 to 59, although I know that different systems are used in different areas. Please verify whether this value is correct.
6. Lines 175–176 and Table 1. The mean fiber content for ZJ22 is reported as 12.00% in Table 1 but described as 12.50% in the text. Inconsistencies in numerical values should be corrected. Please check which one is correct.
7. Line 162. The figure legend uses “(wilcoxon-test)”, while the methods section states “Wilcoxon rank-sum tests” (line 127 and 154). The description of statistical testing is inconsistent. Please standardize the phrasing to “Wilcoxon rank-sum test(s)”.
8. Lines 213-214. In Figure 4, the panel legends for the Q-Q plots are inconsistent: panels (a–d) use “plot” (singular) whereas panel (e) uses “plots” (plural). Please ensure consistency.
9. Line 319 “The gene Sspon.02G0041160-2C encodes CESA, a critical enzyme in the cellulose synthesis pathway [47–51].” Here “CESA” refers to a protein and should not be italicized.
10. Lines 345-349 “Although many F-box proteins remain functionally uncharacterized, some, such as AtFBX92, have been identified as regulators of vegetative growth. AtFBX92 negatively regulates plant growth, not by directly influencing cell cycle genes, but by modulating hormone signaling pathways [19,52].” Both instances of “AtFBX92” refer to the protein and should not be italicized.
11. Lines 352-354 “In Arabidopsis thaliana, MYB46 and MYB83 are functionally redundant transcription factors that serve as master regulators of secondary cell wall biosynthesis.” Here “MYB46” and “MYB83” refer to proteins and should not be italicized.
The manuscript is understandable overall, but the English expression requires significant improvement. The manuscript is written in reasonably clear English, but there are several issues that reduce fluency, consistency, and professional polish. Having the manuscript polished by a native English speaker would help improve its quality. One recurring problem is the inconsistent use of tense. Results are sometimes reported in the present tense, sometimes in the past, which makes the text feel uneven. Results should be consistently written in the past tense, while general statements can remain in the present.
Articles are often missing or misused. For instance, the phrase “utilized to investigate fiber content” should be “utilized to investigate the fiber content.” There are also occasional typographical errors such as “in in China” that must be corrected. Such issues give the impression of insufficient proofreading.
Sentence structure is another weakness. Several sentences are unnecessarily long and packed with clauses. The Abstract in particular contains sentences that would be much clearer if split into two. Repetition is also common: terms like “fiber content” and “sugarcane” appear multiple times in quick succession. Using pronouns or rephrasing would make the text less redundant.
The style is not fully consistent. Gene names such as Sspon.02G0041160-2C are not always italicized, and numerical precision varies (some values with one decimal place, others with two). Abbreviations such as GWAS, QTL, and SNP are not handled consistently either; once defined, they should be used in abbreviated form throughout.
Author Response
Comments 1: My major concern is that fiber in plants is not a single pure material but is made up of cellulose, hemicellulose, and lignin, each of which has a different pathway and role when it comes to processing and agronomic traits. The authors analyzed the fiber to show the phenotypic variation in the first part of the results. However, it is hard to link the phenotypes to the candidate genes. At the least, this limitation should be called out more clearly, and the authors should try to connect their results to earlier studies that looked at the individual components.
Response 1: Agree. We have discussed the reason for the limitation in linking the phenotypes to the candidate genes, and compared the results of this study with those of earlier studies. These changes can be found at Lines 295-320 (the reason) and Lines 426-433 (the comparison) in the revised manuscript.
Comments 2: Second, the authors grew sugarcane for phenotypic analysis at Zhanjiang for three years and at Wengyuan for two years, and they describe this as “five environments”. The environmental design is somewhat narrow and may easily confuse readers.
Response 2: Agree. We have revised the title to “Genome-wide association studies of fiber content in sugarcane”. These changes can be found in the title and at Lines 295-320 (the explanation) in the revised manuscript.
Comments 3: In the results, the three values from Zhanjiang (13.60% (ZJ20), 11.34% (ZJ21), and 12.00% (ZJ22)) are quite divergent. However, no climate factors are shown or correlated in this part. The authors need to work out how climate conditions, including rainfall and temperature, differed. A more explicit discussion of how local conditions, such as rainfall and soil characteristics, may have influenced the results would make the findings easier to interpret.
Response 3: Agree. We have given a more explicit discussion of how climate factors, including rainfall and temperature, might have influenced fiber content. These changes can be found at Lines 283-293 in the revised manuscript.
Comments 4: Third, the GWAS statistics need more detail in the materials and methods section. The authors state the significance threshold they chose (P = 1.5e-6), but it is not explained. This leaves readers who do not work on GWAS wondering whether any correction for multiple testing was applied. A clearer description would help such readers.
Response 4: Agree. We have given a clearer explanation of the significance threshold (P = 1.5e-6) in the materials and methods section. These changes can be found at Lines 149-152 in the revised manuscript.
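For readers unfamiliar with how a genome-wide threshold such as P = 1.5e-6 can arise, the minimal Python sketch below shows a Bonferroni-style calculation (threshold = alpha / number of tests). This record does not state which correction the authors applied or how many markers were tested, so the function names and the back-calculated marker counts are illustrative assumptions, not values from the manuscript.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_n_tests(alpha: float, threshold: float) -> int:
    """Number of tests for which alpha / n_tests is closest to the threshold."""
    return round(alpha / threshold)


# Threshold quoted in the review record.
reported = 1.5e-6

# Hypothetical back-calculations (assumptions, not reported marker counts):
# how many tests would give this cutoff under two common conventions.
n_if_alpha_005 = implied_n_tests(0.05, reported)  # alpha = 0.05
n_if_alpha_1 = implied_n_tests(1.0, reported)     # the "1 / n" convention

# Height of the corresponding significance line on a Manhattan plot.
minus_log10 = -math.log10(reported)

print(n_if_alpha_005, n_if_alpha_1, round(minus_log10, 2))
```

Either convention could in principle produce the quoted cutoff, which is why an explicit statement in the methods of the correction used and the number of markers tested resolves the reviewer's question.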
Comments 5: Fourth, assigning candidates based on a 20 kb window is acceptable as a starting point, but in a polyploid genome like sugarcane, especially when considering different species such as S. officinarum, S. spontaneum, and cultivated hybrids, this approach is rather difficult. The authors should be more transparent about this limitation. It would also strengthen the paper if they linked their candidate genes to available expression data or functional studies in related crops. Even without conducting new experiments, adding this kind of context would make the gene discussion more compelling.
Response 5: Agree. We have given a clearer explanation of the 20 kb window in the materials and methods section; this change can be found at Lines 155-157 in the revised manuscript. We have also added a discussion relating the candidate genes to available expression data; this change can be found at Lines 434-437 in the revised manuscript.
Comments 6: Fifth, the breeding application section comes across as too general. Simply stating that high-fiber markers can be used for marker-assisted selection overlooks the well-known trade-off between fiber and sugar yield, which needs to be addressed directly. It’s also important to consider that sugarcane’s photosynthetic products mainly consist of fiber and sugar. Since the study focuses on fiber alone, without reporting total biomass, it’s hard to tell whether the increase in fiber reflects a shift in allocation or an overall rise in productivity. This is especially relevant across different sugarcane species like S. officinarum, S. spontaneum, and cultivated hybrids, where physiological traits and breeding priorities may differ. Without this context, the practical relevance of the findings remains limited.
Response 6: Agree. We have reconsidered the well-known trade-off between fiber and sugar yield, made the corresponding revisions, and given a clearer description of the breeding application of these fiber markers. These changes can be found in the Abstract and at Lines 65-68, Lines 76-78, Lines 315-320, and Lines 448-449 in the revised manuscript. Another relevant reference has been added.
Minor suggestions
Comments 7: 1. The terminology for plant materials alternates between “clones” and “cultivars”. It would be clearer to choose one term and apply it consistently throughout the manuscript.
Response 7: Agree. We have revised “cultivars” to “clones” for plant materials. The corresponding changes can be found in the revised manuscript.
Comments 8: 2. The authors describe the panel as “globally sourced (Line 16 and 307),” but since most of the materials come from Chinese breeding programs, that claim feels a bit overstated and should be phrased more carefully.
Response 8: Agree. We have deleted “globally sourced” (Line 17) and revised “globally sourced” to “derived from 11 countries involved in sugarcane cultivation and breeding” (Lines 360 and 361). The corresponding changes can be found in the revised manuscript.
Comments 9: 3. The figures and tables are mostly clear, but it would be helpful to show more detail on how fiber content varies across cultivars, for example in Figure 1, Figure 2, and Table 1. A histogram with thresholds for high- and low-fiber groups would make the data easier to interpret.
Response 9: Agree. We need more time to carry out further analysis of how fiber content varies across cultivars. The deadline for the revised manuscript is 12 September 2025.
Comments 10: 4. In Discussion 4.2, the references to previous studies, especially those based on U.S. GWAS panels, could be expanded. The authors point out that the samples in prior fiber studies come almost entirely from Louisiana or Florida breeding programs and do not include the samples used in this study. However, it is a pity that this point is not discussed further. More comparison would help highlight what’s new or unique about the Chinese panel used in this study.
Response 10: Agree. We cannot obtain details about the samples used in prior fiber studies, but none of the materials in the Chinese panel used in this study are from Louisiana or Florida breeding programs.
Comments 11: 5. Line 109. The geographic coordinates for the Zhanjiang site include 110°25’ 89” E. Seconds should generally range from 0 to 59, although I know that different systems are used in different areas. Please verify whether this value is correct.
Response 11: Agree. We have revised the geographic coordinates for both sites. This change can be found at Lines 118-119 in the revised manuscript.
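The range check behind this comment can be sketched in a few lines of Python. The helper names below are hypothetical, and the decimal-minutes reading (25.89') is only one possible interpretation of the original value; the record does not say which notation the authors intended or what the corrected coordinates are.

```python
def dms_to_decimal(degrees: int, minutes: int, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds to decimal degrees, rejecting out-of-range parts."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    if not 0 <= minutes < 60:
        raise ValueError(f"minutes out of range: {minutes}")
    if not 0 <= seconds < 60:
        raise ValueError(f"seconds out of range: {seconds}")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


def dm_to_decimal(degrees: int, decimal_minutes: float, hemisphere: str) -> float:
    """Convert degrees and decimal minutes (e.g. 25.89') to decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    if not 0 <= decimal_minutes < 60:
        raise ValueError(f"minutes out of range: {decimal_minutes}")
    value = degrees + decimal_minutes / 60
    return -value if hemisphere in ("S", "W") else value


# The value flagged by the reviewer fails a strict DMS check because 89 >= 60.
try:
    dms_to_decimal(110, 25, 89, "E")
    flagged_value_is_valid_dms = True
except ValueError:
    flagged_value_is_valid_dms = False

# Assumed alternative reading: 110 degrees, 25.89 decimal minutes East.
as_decimal_minutes = dm_to_decimal(110, 25.89, "E")

print(flagged_value_is_valid_dms, round(as_decimal_minutes, 4))
```

A check of this kind distinguishes a genuine typographical error from a notation mismatch, which is the verification the reviewer asks for.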
Comments 12: 6. Lines 175–176 and Table 1. The mean fiber content for ZJ22 is reported as 12.00% in Table 1 but described as 12.50% in the text. Inconsistencies in numerical values should be corrected. Please check which one is correct.
Response 12: Agree. We have revised the mean fiber content for ZJ22 reported in the text from 12.50% to 12.00%. This change can be found at Line 187 in the revised manuscript.
Comments 13: 7. Line 162. The figure legend uses “(wilcoxon-test)”, while the methods section states “Wilcoxon rank-sum tests” (line 127 and 154). The description of statistical testing is inconsistent. Please standardize the phrasing to “Wilcoxon rank-sum test(s)”.
Response 13: Agree. We have revised “(wilcoxon-test)” in the figure legend to “(Wilcoxon rank-sum tests)”. This change can be found at Line 174 in the revised manuscript.
Comments 14: 8. Lines 213-214. In Figure 4, the panel legends for the Q-Q plots are inconsistent: panels (a–d) use “plot” (singular) whereas panel (e) uses “plots” (plural). Please ensure consistency.
Response 14: Agree. We have revised “plots” (plural) in panel (e) to “plot” (singular). This change can be found at Line 2125, in Figure 4, in the revised manuscript.
Comments 15: 9. Line 319 “The gene Sspon.02G0041160-2C encodes CESA, a critical enzyme in the cellulose synthesis pathway [47–51].” Here “CESA” refers to a protein and should not be italicized.
Response 15: Agree. We have changed “CESA” to non-italic type. This change can be found at Line 376 “The gene Sspon.02G0041160-2C encodes CESA, a critical enzyme in the cellulose synthesis pathway [47–51].” in the revised manuscript.
Comments 16: 10. Lines 345-349 “Although many F-box proteins remain functionally uncharacterized, some, such as AtFBX92, have been identified as regulators of vegetative growth. AtFBX92 negatively regulates plant growth, not by directly influencing cell cycle genes, but by modulating hormone signaling pathways [19,52].” Both instances of “AtFBX92” refer to the protein and should not be italicized.
Response 16: Agree. We have changed both instances of “AtFBX92” to non-italic type. This change can be found at Lines 403-404 “Although many F-box proteins remain functionally uncharacterized, some, such as AtFBX92, have been identified as regulators of vegetative growth. AtFBX92 negatively regulates plant growth, not by directly influencing cell cycle genes, but by modulating hormone signaling pathways [19,52].” in the revised manuscript.
Comments 17: 11. Lines 352-354 “In Arabidopsis thaliana, MYB46 and MYB83 are functionally redundant transcription factors that serve as master regulators of secondary cell wall biosynthesis.” Here “MYB46” and “MYB83” refer to proteins and should not be italicized.
Response 17: Agree. We have changed “MYB46” and “MYB83” to non-italic type. This change can be found at Line 410 “In Arabidopsis thaliana, MYB46 and MYB83 are functionally redundant transcription factors that serve as master regulators of secondary cell wall biosynthesis.” in the revised manuscript.
Comments on the Quality of English Language
Comments 18: The manuscript is understandable overall, but the English expression requires significant improvement. The manuscript is written in reasonably clear English, but there are several issues that reduce fluency, consistency, and professional polish. Having the manuscript polished by a native English speaker would help improve its quality. One recurring problem is the inconsistent use of tense. Results are sometimes reported in the present tense, sometimes in the past, which makes the text feel uneven. Results should be consistently written in the past tense, while general statements can remain in the present.
Response 18: Agree. The manuscript had already been polished by an Australian sugarcane expert, Phillip A. Jackson. In the revised manuscript, results are consistently written in the past tense, and general statements remain in the present.
Comments 19: Articles are often missing or misused. For instance, the phrase “utilized to investigate fiber content” should be “utilized to investigate the fiber content.” There are also occasional typographical errors such as “in in China” that must be corrected. Such issues give the impression of insufficient proofreading.
Response 19: Agree. In the revised manuscript, the phrase “utilized to investigate fiber content” has been corrected to “utilized to investigate the fiber content.” The typographical error “in in China” has also been corrected.
Comments 20: Sentence structure is another weakness. Several sentences are unnecessarily long and packed with clauses. The Abstract in particular contains sentences that would be much clearer if split into two.
Response 20: Agree. The corresponding changes can be found in the revised manuscript.
Comments 21: Repetition is also common: terms like “fiber content” and “sugarcane” appear multiple times in quick succession. Using pronouns or rephrasing would make the text less redundant.
Response 21: Agree. The corresponding changes can be found in the revised manuscript.
Comments 22: The style is not fully consistent. Gene names such as Sspon.02G0041160-2C are not always italicized, and numerical precision varies (some values with one decimal place, others with two).
Response 22: Agree. In the revised manuscript, gene names such as Sspon.02G0041160-2C are always italicized, and numerical precision is consistent (all values are given to two decimal places).
Comments 23: Abbreviations such as GWAS, QTL, and SNP are not handled consistently either; once defined, they should be used in abbreviated form throughout.
Response 23: Agree. In the revised manuscript, abbreviations such as GWAS, QTL, and SNP are handled consistently; in line with the requirements of Agronomy, these abbreviations are defined separately in the Abstract and in the Main Text and, once defined, are used in abbreviated form throughout.
Reviewer 2 Report
Comments and Suggestions for Authors
Materials and Methods, lines 110-111. The correct name of the experimental design is: randomized complete block design.
Results. Apparently, the authors made a severe cutoff to the list of candidate genes, leaving only six remaining. It is well known that the formation of the cell wall involves a multiplicity of genes with complex interactions; therefore, please consider the possibility that all genes with significant association and annotated function should be considered as candidate genes and disregard previous research results to have a truly original contribution.
Results, lines 369-370 (last paragraph of that section). The authors mention that additional studies are needed to verify the biological function of the detected candidate genes. It is suggested that the CRISPR/Cas-9 system be specifically mentioned as a straightforward strategy for silencing genes and thus inferring their function.
Conclusion. It is suggested that the first three lines (372-374) be deleted, as the information does not conform to conclusions but to materials and methods.
Author Response
Comments 1: Materials and Methods, lines 110-111. The correct name of the experimental design is: randomized complete block design.
Response 1: Agree. We have corrected the name “a completely random design”. This change can be found at Line 119 in the revised manuscript.
Comments 2: Results. Apparently, the authors made a severe cutoff to the list of candidate genes, leaving only six remaining. It is well known that the formation of the cell wall involves a multiplicity of genes with complex interactions; therefore, please consider the possibility that all genes with significant association and annotated function should be considered as candidate genes and disregard previous research results to have a truly original contribution.
Response 2: Agree. We have given a clearer explanation of the significance threshold (P = 1.5e-6) in the materials and methods section; these changes can be found at Lines 149-152 in the revised manuscript. We have also given a clearer explanation of the 20 kb window in the materials and methods section; this change can be found at Lines 155-157 in the revised manuscript. We have stated that all genes with significant association and annotated function should be considered as candidate genes. Furthermore, based on previous studies, five of these candidate genes, located in four QTL regions, were proposed as more critical candidates. The corresponding changes can be found in the revised manuscript.
Comments 3: Results, lines 369-370 (last paragraph of that section). The authors mention that additional studies are needed to verify the biological function of the detected candidate genes. It is suggested that the CRISPR/Cas-9 system be specifically mentioned as a straightforward strategy for silencing genes and thus inferring their function.
Response 3: Agree. We need more time to verify the biological function of these candidate genes. The deadline for the revised manuscript is 12 September 2025. In the revised manuscript, the statement that additional studies are needed to verify the biological function of the detected candidate genes has been deleted.
Comments 4: Conclusion. It is suggested that the first three lines (372-374) be deleted, as the information does not conform to conclusions but to materials and methods.
Response 4: Agree. The revised manuscript no longer contains this text.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have revised the manuscript in line with most of my previous suggestions, and the overall presentation is now much clearer. I only have two remaining minor suggestions:
While the manuscript now mentions fiber components (cellulose, hemicellulose, lignin), the treatment remains largely descriptive. It would strengthen the discussion to include a more explicit comparison between the present “total fiber” GWAS results and previous GWAS focusing on individual components such as cellulose or lignin, with direct references to those studies.
Although the candidate gene section is enriched with citations, the manuscript does not clearly acknowledge the methodological limitation of applying a 20 kb window in the context of a highly polyploid sugarcane genome. A brief statement on this constraint would provide a more balanced and realistic perspective.
Author Response
Comments 1: While the manuscript now mentions fiber components (cellulose, hemicellulose, lignin), the treatment remains largely descriptive. It would strengthen the discussion to include a more explicit comparison between the present “total fiber” GWAS results and previous GWAS focusing on individual components such as cellulose or lignin, with direct references to those studies.
Response 1: Agree. We have given a more explicit comparison between the present “total fiber” GWAS results and previous GWAS focusing on individual components. This change can be found at Lines 426-467 in the revised manuscript.
Comments 2: Although the candidate gene section is enriched with citations, the manuscript does not clearly acknowledge the methodological limitation of applying a 20 kb window in the context of a highly polyploid sugarcane genome. A brief statement on this constraint would provide a more balanced and realistic perspective.
Response 2: Agree. We have given a brief statement on this constraint. This change can be found at Lines 468-481 in the revised manuscript.
