Minimizing Cohort Discrepancies: A Comparative Analysis of Data Normalization Approaches in Biomarker Research
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This is an interesting study of great importance to metabolomics. Normalization plays an important role in metabolomics and needs to be done very carefully. I have a few points for the authors to clarify before recommending this manuscript for publication in this journal.
1. How can one reach a conclusion about the best method for a particular type of biological sample, given that several factors can change the pathways depending on the normalization?
2. How should one choose the most appropriate method for a metabolomics study? Expand the conclusion section to highlight this.
3. Which method is most commonly used in metabolomics? Cite the most recent and prominent articles.
4. Make a comparative table of the discussed methods with advantages and disadvantages.
Author Response
This is an interesting study of great importance to metabolomics. Normalization plays an important role in metabolomics and needs to be done very carefully. I have a few points for the authors to clarify before recommending this manuscript for publication in this journal.
- How can one reach a conclusion about the best method for a particular type of biological sample, given that several factors can change the pathways depending on the normalization?
Answer: The following text was added to the Limitations, future prospects and suggestions section: “Determining the best method for a particular type of biological sample can be a complex process that requires careful consideration of various factors that can influence the experimental outcomes. When it comes to normalization methods in biological sample analysis, several key factors should be taken into account to reach a reliable conclusion: sample features, analytical technique, biological variability, infrastructure and resources, and statistical considerations. In particular, the statistical assumptions underlying different normalization methods should be carefully evaluated. Some methods may introduce bias or distort the data if they are applied incorrectly or if the data do not meet the method's assumptions. To come to a conclusion about the best method for a particular type of biological sample, it is essential to conduct a thorough evaluation of these factors and perform a meta-analysis to assess the performance of different normalization approaches. Ultimately, the choice of normalization method should be driven by the specific characteristics of the samples and the research objectives to ensure accurate and reliable results.”
- How should one choose the most appropriate method for a metabolomics study? Expand the conclusion section to highlight this.
Answer: Thank you for your suggestion. We formulated three key points (the quality of the resulting model, the robustness of the model across datasets, and the adequacy of the change in the biomarker pattern) and rated the VSN, PQN and MRN methods against them. The conclusion section was changed accordingly: “Evaluating the quality of normalization methods in biological sample analysis involves considering several key points. Firstly, the quality of the created model is crucial, with VSN normalization showing promise by generating the best model based on OPLS-model metrics. Secondly, the robustness of the model across datasets is essential for reliability, and methods such as VSN, PQN, and MRN demonstrate the ability to create reliable models across different datasets. Finally, the adequacy in capturing changes in biological marker patterns is a significant factor to consider. VSN normalization has been shown to significantly alter the distribution of biomarker importance, potentially revealing enriched pathways associated with specific conditions like HIE.
In comparison, PQN and MRN methods maintain a closer alignment in biomarker distribution between themselves and the raw distribution, highlighting their consistency in handling biomarker patterns. In conclusion, VSN normalization emerges as an attractive choice for routine use in reducing between-study biological variation due to its ability to create robust models, significant impact on biomarker importance distribution, and potential for uncovering relevant biological pathways associated with specific conditions like HIE.”
- Which method is most commonly used in metabolomics? Cite the most recent and prominent articles.
Answer: The following text was added to the Introduction section: “Based on recent comparative studies conducted by Brix F. et al. [10] and Chua A E et al. [11], it is evident that PQN, VSN, and quantile normalization are three commonly employed methods in the field of metabolomics. These normalization techniques have been extensively utilized in various research studies to enhance the accuracy and reliability of metabolomic data analysis.”
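For readers unfamiliar with PQN (probabilistic quotient normalization), one of the methods named above, the following is a minimal sketch of the general technique and not the authors' implementation. It assumes a samples-by-features intensity matrix and uses the feature-wise median across samples as the reference profile; the function name `pqn_normalize` and that choice of reference are illustrative assumptions.

```python
import numpy as np

def pqn_normalize(X):
    """Sketch of probabilistic quotient normalization (PQN).

    X: 2-D array, rows are samples and columns are features (intensities).
    Returns the normalized matrix and the per-sample dilution factors.
    Assumption: the reference profile is the feature-wise median of all samples.
    """
    X = np.asarray(X, dtype=float)

    # Reference profile: median intensity of each feature across samples.
    reference = np.median(X, axis=0)

    # Skip features with a non-positive reference to avoid division by zero.
    valid = reference > 0

    # Quotient of every sample's features against the reference profile.
    quotients = X[:, valid] / reference[valid]

    # The median quotient per sample estimates that sample's dilution factor.
    factors = np.median(quotients, axis=1)

    # Divide each sample (row) by its estimated dilution factor.
    return X / factors[:, np.newaxis], factors
```

Under these assumptions, samples that differ only by an overall dilution are mapped onto the same profile, which is the property PQN relies on; real data would typically need prior handling of missing values and zeros.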
- Make a comparative table of the discussed methods with advantages and disadvantages.
Answer: Thank you for your suggestion. A comparative table of the discussed methods, with their advantages and disadvantages, was added as Table 2 in the Discussion section.
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have presented data-driven normalization methods for investigation. Overall, the manuscript is good, but it needs minor corrections after acceptance.
1. Prepare a list of abbreviations and put them at the beginning of the article.
2. Provide a table of the methods that have been presented and compared and mention the advantages and disadvantages of each.
3. Please provide a section titled limitations, future prospects and suggestions before the conclusion section.
Author Response
The authors have presented data-driven normalization methods for investigation. Overall, the manuscript is good, but it needs minor corrections after acceptance.
- Prepare a list of abbreviations and put them at the beginning of the article.
Answer: Thank you for your recommendation. A list of abbreviations was added before the Introduction section.
- Provide a table of the methods that have been presented and compared and mention the advantages and disadvantages of each.
Answer: The table of the methods, with their advantages and disadvantages, was added as Table 2 in the Discussion section.
- Please provide a section titled limitations, future prospects and suggestions before the conclusion section.
Answer: Thank you for your suggestion. The section was added: “This study is subject to certain limitations that warrant consideration. Firstly, it focused solely on a single task (HIE/health), utilizing only two distinct datasets for analysis. This narrow scope may restrict the generalizability of the findings and the applicability of the normalization methods across diverse biological contexts. Furthermore, the balance of "case/control" samples is a critical factor in ensuring the representativeness and reliability of the data. It is imperative to acknowledge the potential impact of sample proportion imbalance on the quality of normalization procedures. Addressing these limitations and conducting further studies to evaluate the effects of sample proportions on normalization quality are crucial steps toward enhancing the robustness and reliability of data normalization practices in routine research and clinical applications.
Determining the best method for a particular type of biological sample can be a complex process that requires careful consideration of various factors that can influence the experimental outcomes. When it comes to normalization methods in biological sample analysis, several key factors should be taken into account to reach a reliable conclusion: sample features, analytical technique, biological variability, infrastructure and resources, and statistical considerations. In particular, the statistical assumptions underlying different normalization methods should be carefully evaluated. Some methods may introduce bias or distort the data if they are applied incorrectly or if the data do not meet the method's assumptions. To come to a conclusion about the best method for a particular type of biological sample, it is essential to conduct a thorough evaluation of these factors and perform a meta-analysis to assess the performance of different normalization approaches. Ultimately, the choice of normalization method should be driven by the specific characteristics of the samples and the research objectives to ensure accurate and reliable results.”