Article
Peer-Review Record

Functional Analysis of the HbREF1 Promoter from Hevea brasiliensis and Its Response to Phytohormones

Forests 2024, 15(2), 276; https://doi.org/10.3390/f15020276
by Lin-Tao Chen 1,2,†, Dong Guo 2,3,†, Jia-Hong Zhu 2,3, Ying Wang 2,3, Hui-Liang Li 2,3, Feng An 4, Yan-Qiong Tang 1,* and Shi-Qing Peng 1,2,3,*
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 6 December 2023 / Revised: 27 January 2024 / Accepted: 29 January 2024 / Published: 1 February 2024
(This article belongs to the Special Issue Stress Resistance of Rubber Trees: From Genetics to Ecosystem)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Because of its excellent properties, natural rubber still has distinct uses and cannot be replaced by synthetic rubber. Here, the authors unveiled the functional properties of Rubber Elongation Factor (REF) in Hevea brasiliensis. Most importantly, they showed that the HbREF1 promoter contains various potential cis-acting elements that are involved in regulation by phytohormones. This study is quite interesting and the science is strong. However, the following minor issues need to be addressed before the manuscript is considered further.

1. The authors are advised to use the word "functional" instead of "function" in the title.

2. In the abstract, line 29 (“these……predicted outcomes…..”): this sentence should be rewritten.

3. In the discussion, the authors are advised to describe some specific roles of phytohormone-regulated cis-elements in rubber quality.

Comments on the Quality of English Language

The authors are advised to check the manuscript for any grammatical errors.

Author Response

Review 1

Comments and Suggestions for Authors

Because of its excellent properties, natural rubber still has distinct uses and cannot be replaced by synthetic rubber. Here, the authors unveiled the functional properties of Rubber Elongation Factor (REF) in Hevea brasiliensis. Most importantly, they showed that the HbREF1 promoter contains various potential cis-acting elements that are involved in regulation by phytohormones. This study is quite interesting and the science is strong. However, the following minor issues need to be addressed before the manuscript is considered further.

 

  1. The authors are advised to use the word "functional" instead of "function" in the title.

Response: "Function" has been changed to "functional".

  2. In the abstract, line 29 (“these……predicted outcomes…..”): this sentence should be rewritten.

Response: Thank you for your valuable suggestion. We have rewritten this sentence.

  3. In the discussion, the authors are advised to describe some specific roles of phytohormone-regulated cis-elements in rubber quality.

Response: Thank you very much for your suggestion. We have added relevant content to our discussion.

 

Reviewer 2 Report

Comments and Suggestions for Authors

Wet-lab experiments are very time- and money-consuming, whereas bioinformatics analysis is substantially cheaper and faster. For that reason, a bioinformatics analysis should at least use modern source data, such as motif libraries for potential transcription factor binding sites derived from modern whole-genome sequencing approaches like ChIP-seq or DAP-seq, and modern analysis tools should be applied. All of this should be done even if you analyze a short piece of DNA from one gene.

 

55, 263

In vitro, in vivo -> italic

 

74-75

analyzed by bioinformatics -> analyzed by bioinformatics approach

the function was identified -> their function was identified

75-76

… the regulation of transcriptional activity was studied after treatment with abscisic acid (ABA), ethylene  (ET), methyl jasmonate (MeJA), and salicylic acid (SA)…

The term ‘hormone’ is entirely missing from the introduction; hence, the aim of the study comes out of nowhere. What are hormones? Which of them are suspected to be important, and why?

270

…Plant hormones were a group of disparate small molecule that acted as chemical messengers to coordinate the activities of plant cells [27].

Part of this paragraph should be moved to the Introduction. The Introduction should present the state of the art before your study, i.e. what others have done in your field. The Discussion should present your results in connection with this state of the art.

 

Table 1. Predictions of main cis-elements cis-acting element in the HbREF1 promoter

Note that you used 6-mers and 8-mers. Assuming equal content of A, C, G and T nucleotides, these are expected by chance once per about 4 kb and 64 kb, respectively, so any given 6-mer is expected to occur about once in any 2 kb promoter when both strands are scanned.
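
The chance-occurrence arithmetic in this comment can be checked with a short calculation. The Python sketch below assumes independent positions with equal A, C, G and T frequencies, as the comment does, and ignores corrections for self-overlapping or palindromic words. The function name expected_kmer_hits is hypothetical and is not taken from the manuscript or the review.

```python
def expected_kmer_hits(k: int, seq_len: int, both_strands: bool = True) -> float:
    """Expected chance occurrences of one fixed k-mer in a random sequence.

    Assumes equal, independent A/C/G/T frequencies (probability 1/4 each).
    """
    positions = max(seq_len - k + 1, 0)   # number of windows of width k
    strands = 2 if both_strands else 1    # scan forward and reverse strands
    return strands * positions / 4 ** k


for k in (6, 8):
    # 4**k is the average spacing (in bp) between chance hits on one strand
    print(k, 4 ** k, round(expected_kmer_hits(k, 2000), 3))
```

Under these assumptions a 6-mer has 4^6 = 4096 possible words and an 8-mer 4^8 = 65536, which matches the "4 kb" and "64 kb" figures quoted above.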

 

 

These databases are too old,

 

https://bioinformatics.psb.ugent.be/webtools/plantcare/html/

https://www.dna.affrc.go.jp/PLACE/?action=newplace

Lescot, Magali et al. “PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences.” Nucleic Acids Research vol. 30,1 (2002): 325-7.

Higo, K., Ugawa, Y., Iwamoto, M. and Korenaga, T. Plant cis-acting regulatory DNA elements (PLACE) database: 1999. Nucleic Acids Research, Volume 27, Issue 1, 1999, Pages 297-300.

Why are these references missing, and why does the Methods section lack a description of the bioinformatics analysis?

 

Even the human genome had not been properly sequenced 20 years ago, and ChIP-seq/DAP-seq technologies did not exist either. Motifs for TF binding sites 20 years ago were deduced from individual genes and were too scarce; e.g. for ARF TFs, Ulmasov et al. (1999) found the TGTCTC motif, but O’Malley et al. (2016) and Galli et al. (2018) have since shown with DAP-seq that TGTCGG/TGTCCC are stronger sites for ARF TFs.

Ulmasov, T., Hagen, G., & Guilfoyle, T. J. (1999). Dimerization and DNA binding of auxin response factors. The Plant Journal: for cell and molecular biology, 19(3), 309–319.

O'Malley, R.C., Huang, S.C., Song, L., Lewsey, M.G., Bartlett, A., Nery, J.R., Galli, M., Gallavotti, A., Ecker, J.R. (2016) Cistrome and epicistrome features shape the regulatory DNA landscape. Cell 165, 1280–1292.

Galli, M., Khakhar, A., Lu, Z., Chen, Z., Sen, S., Joshi, T., Nemhauser, J. L., Schmitz, R. J., & Gallavotti, A. (2018). The DNA binding landscape of the maize AUXIN RESPONSE FACTOR family. Nature communications, 9(1), 4526. https://doi.org/10.1038/s41467-018-06977-6

 

Use these sources of modern TFBS motifs for plants

Plant Cistrome Database               http://neomorph.salk.edu/PlantCistromeDB

JASPAR2024       https://jaspar.elixir.no/

Cis-BP   http://cisbp.ccbr.utoronto.ca/index.php

You should use position frequency matrices (PFMs), derive position weight matrices (PWMs), set thresholds with a whole-genome promoter set (available at https://epd.expasy.org/epd) and predict potential sites in your short promoter fragment. I hope that, as people with an experimental background, you appreciate that a PWM-based motif prediction should not produce hits in the promoters of all genes of a genome. You may use 500, 1000 or 2000 bp of each gene, as you wish, and hits should then be predicted in about 5, 10 or 20% of all genes, respectively, i.e. about one site per 5000-10000 bp.

Note that you can use Arabidopsis, rice or any other plant species to obtain PFMs and set the PWM thresholds. You may use different services, such as http://cisbp.ccbr.utoronto.ca/TFTools.php or https://meme-suite.org/meme/tools/fimo, or write your own code.
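
For readers who take the "write your own code" route, the PFM-to-PWM step and the scan can be sketched as follows. This is a minimal illustration and not the method of any of the tools named above: the count matrix toy_pfm is invented, the pseudocount of 0.5 and the uniform background of 0.25 are assumptions, and the threshold of 90% of the maximum score is arbitrary. In a real analysis the threshold would be calibrated on a genome-wide promoter set, as the comment requests.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pfm_to_pwm(pfm, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    length = len(pfm["A"])
    pwm = []
    for i in range(length):
        total = sum(pfm[b][i] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((pfm[b][i] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def scan(seq, pwm, threshold):
    """Return (forward-strand start, strand, score) for windows >= threshold."""
    w = len(pwm)
    hits = []
    reverse = seq.translate(COMPLEMENT)[::-1]
    for strand, s in (("+", seq), ("-", reverse)):
        for i in range(len(s) - w + 1):
            window = s[i:i + w]
            if any(c not in BASES for c in window):
                continue  # skip windows with N or other ambiguity codes
            score = sum(pwm[j][c] for j, c in enumerate(window))
            if score >= threshold:
                start = i if strand == "+" else len(seq) - w - i
                hits.append((start, strand, score))
    return hits


# Hypothetical 6-bp count matrix built from 20 sites (illustration only).
toy_pfm = {
    "A": [0, 0, 0, 0, 2, 1],
    "C": [0, 0, 0, 20, 3, 2],
    "G": [0, 20, 0, 0, 14, 16],
    "T": [20, 0, 20, 0, 1, 1],
}
pwm = pfm_to_pwm(toy_pfm)
max_score = sum(max(col.values()) for col in pwm)
hits = scan("AAATGTCGGAAA", pwm, threshold=0.9 * max_score)
print(hits)
```

The scan reports positions on the forward strand for both orientations, so hits from the two strands can be listed in one table.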

 

163

TATA-box (-150 bp),

It is well known that TBP binds, directly or indirectly, near positions -30 to -35 relative to the transcription start site (TSS). You should propose multiple TSSs, or some other explanation, to account for TATA-boxes located away from approximately -30.

The column has an ambiguous meaning: MYB is a TF, ABRE is a responsive element, etc. All your sites must be assigned potential transcription factors; terms like CAAT-box were acceptable 20 years ago.

234

…Promoter sequences are located upstream of the 5' end of the structural gene, which  can be recognized and bound by RNA polymerase and initiate gene transcription…

You should mention transcription factors here

Strader, L., Weijers, D., & Wagner, D. (2022). Plant transcription factors - being in the right place with the right company. Current Opinion in Plant Biology, 65, 102136. https://doi.org/10.1016/j.pbi.2021.102136

 

244-246

…In addition, the promoter also contains a large number of light-regulated elements, hormone response elements, mechanical damage response elements, stress response elements, and it is speculated that the transcription of HbREF1 in rubber trees may be regulated by multiple factors.

->

In addition, the promoter also contains a large number of potential light-regulated elements, hormone response elements, mechanical damage response elements, stress response elements, and it is speculated that the transcription of HbREF1 in rubber trees may be regulated by multiple transcription factors…

Any gene in a genome of any eukaryotic species is regulated by multiple transcription factors.

 

273

All abbreviations like JA, ABA, SA should be deciphered on first appearance.

 

285

P-1758, P-1300 and P-718 were strongly activated to induce the LUC gene expression, but P-583 and P-200   -

->

The promoter fragments P-1758, P-1300 and P-718 were strongly activated to induce the LUC gene expression, but the promoter fragments P-583 and P-200

Comments on the Quality of English Language

English is not excellent, but it is satisfactory

Author Response

Review 2:

Comments and Suggestions for Authors

Wet-lab experiments are very time- and money-consuming, whereas bioinformatics analysis is substantially cheaper and faster. For that reason, a bioinformatics analysis should at least use modern source data, such as motif libraries for potential transcription factor binding sites derived from modern whole-genome sequencing approaches like ChIP-seq or DAP-seq, and modern analysis tools should be applied. All of this should be done even if you analyze a short piece of DNA from one gene.

 55, 263

In vitro, in vivo -> italic

Response: "In vitro" and "in vivo" have been set in italics.

74-75

analyzed by bioinformatics -> analyzed by bioinformatics approach

the function was identified -> their function was identified

Response: We have now changed these two sentences.

75-76

… the regulation of transcriptional activity was studied after treatment with abscisic acid (ABA), ethylene  (ET), methyl jasmonate (MeJA), and salicylic acid (SA)…

The term ‘hormone’ is entirely missing from the introduction; hence, the aim of the study comes out of nowhere. What are hormones? Which of them are suspected to be important, and why?

Response: We very much appreciate this good suggestion. We have added content on the effect of hormones on natural rubber yield to the Introduction.

270

…Plant hormones were a group of disparate small molecule that acted as chemical messengers to coordinate the activities of plant cells [27].

Part of this paragraph should be moved to the Introduction. The Introduction should present the state of the art before your study, i.e. what others have done in your field. The Discussion should present your results in connection with this state of the art.

Response: Thank you very much for your suggestion. We have moved this part to the Introduction.

 Table 1. Predictions of main cis-elements cis-acting element in the HbREF1 promoter

Note that you used 6-mers and 8-mers. Assuming equal content of A, C, G and T nucleotides, these are expected by chance once per about 4 kb and 64 kb, respectively, so any given 6-mer is expected to occur about once in any 2 kb promoter when both strands are scanned.

These databases are too old,

 https://bioinformatics.psb.ugent.be/webtools/plantcare/html/

https://www.dna.affrc.go.jp/PLACE/?action=newplace

Lescot, Magali et al. “PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences.” Nucleic Acids Research vol. 30,1 (2002): 325-7.

Higo, K., Ugawa, Y., Iwamoto, M. and Korenaga, T. Plant cis-acting regulatory DNA elements (PLACE) database: 1999. Nucleic Acids Research, Volume 27, Issue 1, 1999, Pages 297-300.

Why are these references missing, and why does the Methods section lack a description of the bioinformatics analysis?

Response: These two references have been added to the manuscript.

Even the human genome had not been properly sequenced 20 years ago, and ChIP-seq/DAP-seq technologies did not exist either. Motifs for TF binding sites 20 years ago were deduced from individual genes and were too scarce; e.g. for ARF TFs, Ulmasov et al. (1999) found the TGTCTC motif, but O’Malley et al. (2016) and Galli et al. (2018) have since shown with DAP-seq that TGTCGG/TGTCCC are stronger sites for ARF TFs.

Ulmasov, T., Hagen, G., & Guilfoyle, T. J. (1999). Dimerization and DNA binding of auxin response factors. The Plant Journal: for cell and molecular biology, 19(3), 309–319.

O'Malley, R.C., Huang, S.C., Song, L., Lewsey, M.G., Bartlett, A., Nery, J.R., Galli, M., Gallavotti, A., Ecker, J.R. (2016) Cistrome and epicistrome features shape the regulatory DNA landscape. Cell 165, 1280–1292.

Galli, M., Khakhar, A., Lu, Z., Chen, Z., Sen, S., Joshi, T., Nemhauser, J. L., Schmitz, R. J., & Gallavotti, A. (2018). The DNA binding landscape of the maize AUXIN RESPONSE FACTOR family. Nature communications, 9(1), 4526. https://doi.org/10.1038/s41467-018-06977-6

Use these sources of modern TFBS motifs for plants

Plant Cistrome Database               http://neomorph.salk.edu/PlantCistromeDB

JASPAR2024       https://jaspar.elixir.no/

Cis-BP   http://cisbp.ccbr.utoronto.ca/index.php

You should use position frequency matrices (PFMs), derive position weight matrices (PWMs), set thresholds with a whole-genome promoter set (available at https://epd.expasy.org/epd) and predict potential sites in your short promoter fragment. I hope that, as people with an experimental background, you appreciate that a PWM-based motif prediction should not produce hits in the promoters of all genes of a genome. You may use 500, 1000 or 2000 bp of each gene, as you wish, and hits should then be predicted in about 5, 10 or 20% of all genes, respectively, i.e. about one site per 5000-10000 bp.

Note that you can use Arabidopsis, rice or any other plant species to obtain PFMs and set the PWM thresholds. You may use different services, such as http://cisbp.ccbr.utoronto.ca/TFTools.php or https://meme-suite.org/meme/tools/fimo, or write your own code.

Response: Thank you very much for the good suggestions, which help to improve the quality of our manuscript. We used PlantCARE and PLACE to predict cis-acting elements in the promoter, focusing on hormone-regulated elements, and then conducted hormone induction analysis with the aim of identifying such elements. We also tried the tools recommended by the reviewer, such as JASPAR2024, and predicted many transcription factors that can bind to the promoter. Many different transcription factors can also bind to the same site. In subsequent experiments, we will use yeast one-hybrid technology to screen for upstream transcription factors regulating the transcription of HbREF1.

163

TATA-box (-150 bp),

It is well known that TBP binds, directly or indirectly, near positions -30 to -35 relative to the transcription start site (TSS). You should propose multiple TSSs, or some other explanation, to account for TATA-boxes located away from approximately -30.

Response: We have added the predicted transcription start site (-118 bp) to the manuscript.

The column has an ambiguous meaning: MYB is a TF, ABRE is a responsive element, etc. All your sites must be assigned potential transcription factors; terms like CAAT-box were acceptable 20 years ago.

Response: MYB has been corrected to MBS, and MYC has been corrected to G-box.

234

…Promoter sequences are located upstream of the 5' end of the structural gene, which  can be recognized and bound by RNA polymerase and initiate gene transcription…

You should mention transcription factors here

Strader, L., Weijers, D., & Wagner, D. (2022). Plant transcription factors - being in the right place with the right company. Current Opinion in Plant Biology, 65, 102136. https://doi.org/10.1016/j.pbi.2021.102136

Response: Thank you for your suggestion. We have rewritten this part according to the Reviewer’s suggestion.

244-246

In addition, the promoter also contains a large number of light-regulated elements, hormone response elements, mechanical damage response elements, stress response elements, and it is speculated that the transcription of HbREF1 in rubber trees may be regulated by multiple factors.

->

In addition, the promoter also contains a large number of potential light-regulated elements, hormone response elements, mechanical damage response elements, stress response elements, and it is speculated that the transcription of HbREF1 in rubber trees may be regulated by multiple transcription factors…

Any gene in a genome of any eukaryotic species is regulated by multiple transcription factors.

Response: We have re-written this part according to Reviewer’s suggestion.

273

All abbreviations like JA, ABA, SA should be deciphered on first appearance.(76-77)

Response: Thank you for the comment. Each abbreviation is already defined where it first appears.

285

P-1758, P-1300 and P-718 were strongly activated to induce the LUC gene expression, but P-583 and P-200   -

->

The promoter fragments P-1758, P-1300 and P-718 were strongly activated to induce the LUC gene expression, but the promoter fragments P-583 and P-200

Response: We have re-written this part according to Reviewer’s suggestion.

 

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

I cannot agree to the publication of this manuscript without a proper analysis of potential cis-elements in the gene promoter; see Table 1.

Major

PlantCARE and New PLACE are poor sources of potential cis-elements. The authors ignored my main comments concerning the prediction of potential TF binding sites. I repeat that without the application of modern tools, such as position weight matrices prepared from motifs (position frequency matrices) from JASPAR2024, CISBP and Plant Cistrome (https://jaspar.elixir.no/ http://cisbp.ccbr.utoronto.ca/index.php https://neomorph.salk.edu/PlantCistromeDB), I will recommend rejecting the manuscript.

Table 1 should be completely redesigned.

You should use tools like FIMO https://meme-suite.org/meme/tools/fimo

OR perform it manually with your own scripts, read this to construct them Wasserman, W., Sandelin, A. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet 5, 276–287 (2004). https://doi.org/10.1038/nrg1315

 

Note that you should take care about the recognition threshold,

e.g. see advanced options in FIMO, How should matches be filtered before output? Match p-value

 

Specifically, I can also recommend compiling a short list of potentially involved TFs related to hormone-mediated regulation or otherwise related to your gene. Then you can check the known motifs for these TFs and predict potential TFBSs.

 

 

Minor

The paragraph in lines 70-79 seems out of place relative to the text above and below it. Try to be more consistent: each passage should be related to the preceding one. E.g. many hormones are not introduced as hormones, and ABA and other hormones are introduced twice (lines 76-77 and 86-87). The logical connection between lines 69 and 70 is missing.

175

...The comprehensive analysis results show that HbREF1 promoter sequence not only has the basic elements of a typical eukaryotic promoter, such as enhancer element CAAT-box (-221 bp) and promoter core sequence TATA-box (-150 bp),

Only the TATA-box, as the TBP binding site, is a universal eukaryotic motif with conserved function and positioning in gene promoters; the CAAT-box is neither positionally nor functionally universal, see Gnesutta N, Chiara M, Bernardini A, Balestra M, Horner DS, Mantovani R. The Plant NF-Y DNA Matrix In Vitro and In Vivo. Plants. 2019; 8(10):406. https://doi.org/10.3390/plants8100406

 

261 263

…The analysis of cis-acting elements shows that the core elements such as TATA-box and CAAT-box, which are necessary for transcription, exist at -150 bp and -221 bp upstream of the translation initiation site, indicating that the sequence conforms to the typical characteristics of eukaryotic promoters

The CAAT-box is not as universal as the TATA-box; moreover, you do not even know which TF binds it. So delete the mention of the CAAT-box, or provide an adequate reference for the TATA-box / TBP, e.g. Vo Ngoc, L., Wang, Y. L., Kassavetis, G. A., & Kadonaga, J. T. (2017). The punctilious RNA polymerase II core promoter. Genes & development, 31(13), 1289–1301. https://doi.org/10.1101/gad.303149.117

 

264

In addition, the promoter also contains a large -> In addition, the promoter also may contain a large

 

Comments on the Quality of English Language

satisfactory

Author Response

Major

PlantCARE and New PLACE are poor sources of potential cis-elements. The authors ignored my main comments concerning the prediction of potential TF binding sites. I repeat that without the application of modern tools, such as position weight matrices prepared from motifs (position frequency matrices) from JASPAR2024, CISBP and Plant Cistrome (https://jaspar.elixir.no/ http://cisbp.ccbr.utoronto.ca/index.php https://neomorph.salk.edu/PlantCistromeDB), I will recommend rejecting the manuscript.

Table 1 should be completely redesigned.

You should use tools like FIMO https://meme-suite.org/meme/tools/fimo

OR perform it manually with your own scripts, read this to construct them Wasserman, W., Sandelin, A. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet 5, 276–287 (2004). https://doi.org/10.1038/nrg1315

Note that you should take care about the recognition threshold,

e.g. see advanced options in FIMO, How should matches be filtered before output? Match p-value

Specifically, I can also recommend compiling a short list of potentially involved TFs related to hormone-mediated regulation or otherwise related to your gene. Then you can check the known motifs for these TFs and predict potential TFBSs.

Response: Thank you once again. We recognize the importance of employing modern tools for predicting TFBSs, and we have analyzed the TFBSs in the HbREF1 promoter using JASPAR2024. Table 1 has been completely redesigned to better meet standards and expectations. Your guidance is invaluable, and we appreciate the opportunity to improve the quality of our work.

Minor

The paragraph in lines 70-79 seems out of place relative to the text above and below it. Try to be more consistent: each passage should be related to the preceding one. E.g. many hormones are not introduced as hormones, and ABA and other hormones are introduced twice (lines 76-77 and 86-87). The logical connection between lines 69 and 70 is missing.

Response: In the manuscript, we added "The use of plant hormones and other stimuli to increase NR yield by chemical stimulation has always been an important content of rubber production and theoretical research." to make the context logically coherent. We have also eliminated unnecessary repetition of hormones in the text.

175

...The comprehensive analysis results show that HbREF1 promoter sequence not only has the basic elements of a typical eukaryotic promoter, such as enhancer element CAAT-box (-221 bp) and promoter core sequence TATA-box (-150 bp),

Only the TATA-box, as the TBP binding site, is a universal eukaryotic motif with conserved function and positioning in gene promoters; the CAAT-box is neither positionally nor functionally universal, see Gnesutta N, Chiara M, Bernardini A, Balestra M, Horner DS, Mantovani R. The Plant NF-Y DNA Matrix In Vitro and In Vivo. Plants. 2019; 8(10):406. https://doi.org/10.3390/plants8100406

261 263

…The analysis of cis-acting elements shows that the core elements such as TATA-box and CAAT-box, which are necessary for transcription, exist at -150 bp and -221 bp upstream of the translation initiation site, indicating that the sequence conforms to the typical characteristics of eukaryotic promoters

The CAAT-box is not as universal as the TATA-box; moreover, you do not even know which TF binds it. So delete the mention of the CAAT-box, or provide an adequate reference for the TATA-box / TBP, e.g. Vo Ngoc, L., Wang, Y. L., Kassavetis, G. A., & Kadonaga, J. T. (2017). The punctilious RNA polymerase II core promoter. Genes & development, 31(13), 1289–1301. https://doi.org/10.1101/gad.303149.117

Response: We have deleted the mention of CAAT-box in the manuscript.

264

In addition, the promoter also contains a large -> In addition, the promoter also may contain a large

Response: We have made corrections to improve accuracy.

 

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

I appreciate the update of the motif recognition in Table 1; JASPAR matrices are quite good. But PWMs, like any other TFBS recognition model, require proper threshold setting, so your columns ‘Score’ and even ‘Relative score’ do not by themselves support the reliability of your hits. I used the software package MCOT (see below) to test all JASPAR2024 motifs for plants. Among 805 JASPAR2023 matrices for plants I found 33 that have expected frequencies of hits (the best hits of each motif) less than 1 per 1000 nt; here is the list of them:

MA2394.1, MA0064.1, MA1272.3, MA1071.2, MA1016.2, MA1017.2, MA0983.2, MA0984.2, MA0020.2, MA0987.2, MA0989.2, MA1088.2, MA1736.2, MA1022.2, MA2062.1, MA2056.1, MA2069.1, MA2058.1, MA2085.1, MA2070.1, MA2073.1, MA2081.1, MA2082.1, MA2084.1, MA1755.2, MA0053.1, MA0981.2, MA0982.2, MA2373.1, MA2356.1, MA2379.1, MA2381.1, MA2383.1,

See an example https://jaspar.elixir.no/matrix/MA2394.1/

In this short list of the worst motifs, I found one of yours https://jaspar.elixir.no/matrix/MA2082.1/, so you certainly will find its best hits (Relative score =1) in any promoter of length about 1000 nt. At least for this motif your analysis is not correct.

But this refers only to the best scores of the matrices (Relative score = 1); every motif may give false results at certain thresholds in the range of Relative scores < 1. So any of your motifs with Relative score < 1 may also correspond to false-positive results.

See details for this issue here

Touzet, H., Varré, JS. Efficient and accurate P-value computation for Position Weight Matrices. Algorithms Mol Biol 2, 15 (2007). https://doi.org/10.1186/1748-7188-2-15 https://github.com/ge11232002/TFMPvalue

This paper proposes an approach for computing a P-value for each PWM score. This P-value is “… P-value of a score, which is the probability that the background model can achieve a score larger than or equal to the observed value…”, i.e. the ratio between the number of words with an equal or higher PWM score and the total number of words; the latter is 4 for a PWM of length 1 bp, 16 for 2 bp, 64 for 3 bp, etc. This approach is implemented in Hocomoco https://hocomoco12.autosome.org/; it checks all possible words (k-mers) of length equal to the motif length.
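
The word-counting definition above can be illustrated by brute force. The sketch below assumes a uniform background, so that every word is equally likely, and enumerates all 4^w words, which is practical only for short motifs; it is not the efficient algorithm of the cited paper and not the Hocomoco implementation. The 3-bp matrix toy_pwm and the function name pwm_score_pvalue are hypothetical.

```python
from itertools import product


def pwm_score_pvalue(pwm, score):
    """Fraction of all 4**w words scoring >= score (uniform background)."""
    w = len(pwm)
    n_at_least = sum(
        1
        for word in product("ACGT", repeat=w)
        if sum(pwm[j][c] for j, c in enumerate(word)) >= score
    )
    return n_at_least / 4 ** w


# Hypothetical 3-bp weight matrix with integer weights (illustration only).
toy_pwm = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": 0},
]

print(pwm_score_pvalue(toy_pwm, 6))  # only "ACG" reaches the maximum score
print(pwm_score_pvalue(toy_pwm, 4))  # "ACG" and "ACT" score 4 or more
```

With 64 possible 3-bp words, the best score here has a P-value of 1/64, which shows why a maximal relative score on a short or degenerate motif is weak evidence on its own.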

The first option is to apply this package to deduce a P-value for each of your scores.

The second option, which I mentioned earlier, is FIMO with its p-value. But it does not provide a straightforward solution in the form of a dependence Score(P-value).

The third option, and the best of the three, is to use the MCOT package (Levitsky, V., Zemlyanskaya, E., Oshchepkov, D., Podkolodnaya, O., Ignatieva, E., Grosse, I., Mironova, V., & Merkulova, T. (2019). A single ChIP-seq dataset is sufficient for comprehensive analysis of motifs co-occurrence with MCOT package. Nucleic Acids Res, 47(21), e139. https://doi.org/10.1093/nar/gkz800).

The recent source code https://github.com/parthian-sterlet/mcot-kernel/ and the web server https://webmcot.sysbio.cytogen.ru/.

In its preliminary step, for the given motifs (PFMs, position frequency matrices), MCOT takes a range of expected recognition rates (ERRs) and computes for each of them the respective PWM scores (relative ones). ERR is the recognition rate of a motif for the whole-genome set of promoters of protein-coding genes. ERR = 0.001 means that one motif hit is found per 1000 nt. Thus, the term P-value is replaced here by the term ERR, the motif frequency in all promoters. This approach is closer to the data potentially being analyzed (real sequences) than the approach of Touzet and Varré.
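
The idea of tying a score threshold to a hit rate on a background set can be sketched as below. This is a simplified illustration and not the MCOT implementation: it scans one strand only, defines the rate as hits per scanned window, and uses seeded random sequences as a stand-in for a real whole-genome promoter set. The names window_scores and threshold_for_err and the matrix toy_pwm are hypothetical.

```python
import random


def window_scores(seq, pwm):
    """PWM score of every window of the motif width (forward strand only)."""
    w = len(pwm)
    return [
        sum(pwm[j][c] for j, c in enumerate(seq[i:i + w]))
        for i in range(len(seq) - w + 1)
    ]


def threshold_for_err(background, pwm, target_err):
    """Lowest score threshold whose background hit rate is <= target_err.

    Returns None if even the best-scoring windows are too frequent.
    """
    scores = sorted(
        (s for seq in background for s in window_scores(seq, pwm)),
        reverse=True,
    )
    max_hits = int(target_err * len(scores))
    if max_hits == 0:
        return None
    thr = scores[max_hits - 1]
    if max_hits < len(scores) and scores[max_hits] == thr:
        # Ties would push the hit count over the limit: step up a score level.
        higher = [s for s in scores[:max_hits] if s > thr]
        if not higher:
            return None
        thr = higher[-1]
    return thr


# Hypothetical 3-bp weight matrix (illustration only).
toy_pwm = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": 0},
]

# Stand-in background: 200 random 100-nt sequences. A real analysis would
# substitute promoter sequences of protein-coding genes.
rng = random.Random(0)
background = ["".join(rng.choices("ACGT", k=100)) for _ in range(200)]

thr = threshold_for_err(background, toy_pwm, target_err=0.05)
print(thr)
```

Because the threshold is derived from observed sequences rather than from all possible words, the resulting rate reflects the base composition of the chosen background.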

In the MCOT web server, you can enter a set of DNA sequences (a set of peaks from ChIP-seq, but you can use anything; the tool requires hundreds of sequences, to which you add your single input sequence, your promoter), set the ‘One Partner’ option, enter two motifs and look at the output results.

PWM predictions you can find in supplemental file (link ‘Download additional data’), recognition profiles are *_thr5 files. Another output file is ‘Thresholds vs ERRs’ err*.txt, see its format here  https://github.com/parthian-sterlet/mcot-kernel/blob/master/examples/pro/GSM2827249_CREB1_hg38_pwm.dist  = a table of Relative scores (from 1 and below) and their ERR values.

Unfortunately, there are no unified rules for the PFM (nucleotide frequencies) -> PWM (nucleotide weights) transformation, so you can apply a supplementary program from the MCOT GitHub site, https://github.com/parthian-sterlet/mcot-kernel/blob/master/readme.md#generation-of-partner-library, which can compute a ‘Thresholds vs ERRs’ file for your PWMs. This program has a PFM parameter, but it is not used for the computation, so the dependence of ERR on the PWM score is deduced from the weights (PWM file). You can apply the promoters of Arabidopsis; they are provided there too.

I propose to take an ERR threshold, e.g. 5E-4, and to keep in Table 1 only the hits with lower ERR values.

 

Replace bZIP(A), bZIP(D) and bZIP(I) with the plain bZIP notation. I guess the letters A, D and I refer to families, but they are not important for your study.

Comments on the Quality of English Language

OK

Author Response

Dear Editor and Reviewer,

Thank you for providing detailed suggestions on addressing the shortcomings in our TFBS analysis. We highly value your feedback and have engaged in extensive discussions to implement the proposed solutions.

Firstly, we utilized the Hocomoco web tool (https://hocomoco12.autosome.org/) for online analysis. However, because it focuses primarily on human and mouse transcription factors and lacks specific data on plant transcription factors, we identified only five TFBSs that match JASPAR2024.

Subsequently, we employed WebMCOT (https://webmcot.sysbio.cytogen.ru/app) for online analysis, which also predominantly features human and mouse transcription factors. Unfortunately, none could be found using the Arabidopsis database.

Finally, we used FIMO, as recommended by the reviewer, performing combined site predictions with a threshold p-value of ≤1e-5 on PlantRegMap (http://plantregmap.gao-lab.org/bindingsiteprediction.php). In total, we identified 13 TFBSs matching JASPAR2024 (Table 1).

PlantRegMap aggregates binding motifs from PlantCistromeDB, CIS-BP, JASPAR, UniPROBE, TRANSFAC (public 7.0), as well as experimentally derived motifs from the literature and MEME-ChIP analyses of ChIP-seq peaks. For TFs with multiple motifs, priority was given to manually selecting the best match determined in vivo, showing greater similarity to other motifs of the same TF. Its curators filtered out low-quality motifs (those with information content less than 4.5), aiming to enhance the credibility of the results.

Modification of bZIP annotations: We agree with your suggestion to replace bZIP(A), bZIP(D), and bZIP(I) with a simpler bZIP notation. Indeed, this will enhance the clarity of the table, especially for readers less interested in family information.

We appreciate the reviewer's insightful comments and thorough review. We will promptly incorporate your suggestions to ensure the reliability and persuasiveness of the identified motifs in the revised manuscript.

Sincerely,

Dong Guo
