Differential Expression of Hypothalamic Genes in Laying Hens Housed in Caged and Cage-Free Systems Under Commercial Conditions in the Tropics
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
1. Overall evaluation
This manuscript compares the hypothalamic transcriptomes of laying hens reared in conventional cages (CC) and in cage-free (CF) systems under commercial tropical conditions, aiming to investigate how different housing systems affect neuroendocrine and welfare-related pathways in the hypothalamus. The topic aligns well with the scope of Animals, which emphasizes the intersection of production systems, animal welfare, and performance. The overall study design is clear, the anatomical landmarks used for hypothalamic sampling are described in a reasonably standardized way, and the RNA-seq analysis pipeline is generally appropriate.
However, there are several key issues that need to be addressed: (1) the RNA-seq sample size is small (only three birds per group), which substantially limits the reliability of differential expression and pathway enrichment analyses; (2) the reported log2FoldChange and padj values in the differentially expressed gene (DEG) table appear clearly unreasonable.
Taken together, I recommend that the editor consider a decision of Major Revision.
2. Major comments
2.1 Study design, sample size, and scope of inference
The central question of this study is whether different housing systems (CC vs. CF) under tropical commercial conditions lead to differences in the hypothalamic expression profiles of laying hens. This is a very meaningful question. However, for the RNA-seq part, each treatment group includes only three hens (CC: n = 3, CF: n = 3).
In a real production setting, this sample size is understandable, but from a statistical perspective, n = 3 per group severely limits the power and robustness of differential expression and pathway enrichment analyses. The authors briefly mention in the Results that “due to the small number of biological replicates, we focus only on large-effect differences and treat DEGs as candidate genes,” but this statement is not emphasized strongly enough and is not consistently reflected in the way the results are interpreted.
Suggestions:
- In the Abstract, Results, and Discussion, clearly position this work as exploratory/hypothesis-generating rather than confirmatory.
- Add a dedicated “Limitations” paragraph at the end of the Discussion or as a separate subsection, specifically explaining:
  - How n = 3 per group affects the detection power for DEGs, the stability of GO/KEGG enrichment, and the generalizability of the conclusions;
  - That the current results are more appropriate to be regarded as candidate genes and candidate pathways, which require validation in larger sample sizes and over longer time frames.
- Soften some of the wording throughout the manuscript, changing phrases such as “the housing system significantly altered …” to more cautious statements such as “our results suggest that the housing system may influence …”.
2.2 Misuse and confusion between FST and FSH
In most parts of the manuscript, the authors correctly describe FST (the follistatin gene) and cite literature on its role in regulating activin and FSH. However, in the Discussion there is a sentence:
“the increased expression of FSH in the CF group …” (Line 369)
Given the context, this should clearly refer to FST (gene expression) rather than FSH (hormone).
Suggestions:
- Perform a systematic search for all occurrences of “FSH” in the manuscript and carefully check which ones are intended to refer to the measured gene (FST) and which ones truly refer to the hormone FSH.
- Correct all misused instances to “FST” and, at the first mention, clarify that “the FST gene encodes follistatin, which can modulate the activin/FSH axis.”
- If the authors wish to discuss peripheral FSH hormone levels, they should explicitly state that FSH was not directly measured in this study and that any statements about FSH are indirect inferences based on gene expression and literature.
3. Minor comments
3.1 Redundancy between Discussion and Conclusions
The final paragraph of the Discussion and the Conclusions section are highly repetitive, with several sentences nearly duplicated.
Suggestion:
- Condense the Conclusions section to 3–4 concise sentences that summarize the main findings and implications.
- Keep the more detailed synthesis and future perspectives in the closing part of the Discussion to avoid redundancy between the two sections.
3.2 Figures and figure legends
- It is recommended to explicitly indicate the sample size per group (n = 3) in the figure legends so that readers can quickly understand the experimental design.
- In the volcano plots, it would be helpful to draw dashed lines to indicate the padj and log2FC thresholds, and to describe these thresholds clearly in the legend.
3.3 Reference checking
Please carefully check that all in-text citation numbers correspond correctly to the entries in the reference list, and avoid incorrect or missing citations. For example, Reference 1 appears to be empty in the reference list (line 477), which should be corrected.
Author Response
Reviewers responses.
Reviewer # 1
The manuscript should be revised in accordance with the suggestions provided below.
Comments 1: Major comments. Study design, sample size, and scope of inference
The central question of this study is whether different housing systems (CC vs. CF) under tropical commercial conditions lead to differences in the hypothalamic expression profiles of laying hens. This is a very meaningful question. However, for the RNA-seq part, each treatment group includes only three hens (CC: n = 3, CF: n = 3).
Response 1: The results form part of a holistic investigation into the potential effects of different production systems on bird performance, egg quality, and gene expression in laying hens housed in two commercial-scale egg production systems. Several articles arising from this research have already been published, reporting differences in productivity, egg quality, and gene expression across various avian tissues. In the present study, the sample size was limited to meet minimum statistical requirements. Accordingly, we have reframed the transcriptomic work as exploratory and hypothesis-generating, and applied conservative analytical and interpretive criteria. Additionally, previously published studies have demonstrated that next-generation sequencing (NGS) read depth enables the generation of high-quality data even with constrained sample numbers. Moreover, as this research focuses on animal welfare, all procedures were designed in accordance with the 3Rs principles.
Suggestions:
Comments 2: In the Abstract, Results, and Discussion, clearly position this work as exploratory/hypothesis-generating rather than confirmatory.
Responses 2: Thank you for your suggestion. The exploratory nature of this work has been explicitly included in the Abstract, Results, and Discussion sections.
Comments 3: Add a dedicated “Limitations” paragraph at the end of the Discussion or as a separate subsection, specifically explaining:
Responses 3: The paragraph was included in the final part of the discussion with the following text:
A major limitation of this study is the relatively small sample size (n = 3 per group), which reduces the statistical power and could limit the detection of differentially expressed genes with moderate effect sizes. The small number of biological replicates may also affect the stability of the dispersion estimates and the robustness of the GO and KEGG enrichment analyses. Therefore, the transcriptomic findings should be interpreted as exploratory and hypothesis-generating, rather than as definitive evidence of causal biological mechanisms.
Comments 4: Misuse and confusion between FST and FSH
In most parts of the manuscript, the authors correctly describe FST (the follistatin gene) and cite literature on its role in regulating activin and FSH. However, in the Discussion there is a sentence: “the increased expression of FSH in the CF group …” (Line 369). Given the context, this should clearly refer to FST (gene expression) rather than FSH (hormone).
Responses 4: Thanks for the suggestion. It was a mistake, and the text has been adjusted.
Comments 5: Perform a systematic search for all occurrences of “FSH” in the manuscript and carefully check which ones are intended to refer to the measured gene (FST) and which ones truly refer to the hormone FSH, Correct all misused instances to “FST” and, at the first mention, clarify that “the FST gene encodes follistatin, which can modulate the activin/FSH axis.” If the authors wish to discuss peripheral FSH hormone levels, they should explicitly state that FSH was not directly measured in this study and that any statements about FSH are indirect inferences based on gene expression and literature.
Responses 5: Given that follistatin (FST) is a regulatory glycoprotein involved in the modulation of follicle-stimulating hormone (FSH), it is necessary to discuss its potential effect in a species such as the laying hen. However, the document was adjusted for greater clarity for the reader.
Comments 6: Condense the Conclusions section to 3–4 concise sentences that summarize the main findings and implications.
- Keep the more detailed synthesis and future perspectives in the closing part of the Discussion to avoid redundancy between the two sections.
Responses 6: The text was adjusted according to the reviewer's suggestion.
3.2 Figures and figure legends
- Comments 7: It is recommended to explicitly indicate the sample size per group (n = 3) in the figure legends so that readers can quickly understand the experimental design.
- In the volcano plots, it would be helpful to draw dashed lines to indicate the padj and log2FC thresholds, and to describe these thresholds clearly in the legend.
Responses 7: The figure legends were adjusted according to the reviewer's suggestion to indicate the sample size per group (n = 3).
3.3 Reference checking
Comments 8: Please carefully check that all in-text citation numbers correspond correctly to the entries in the reference list, and avoid incorrect or missing citations. For example, Reference 1 appears to be empty in the reference list (line 477), which should be corrected.
Responses 8: The references were reviewed and adjusted.
Reviewer 2 Report
Comments and Suggestions for Authors
I think the serious issue with this manuscript is the very small number of replicates (n=3 per system). Sampling only three hens per system is insufficient to represent a large population, so these results should be used solely for preliminary screening before conducting actual experiments. Please address all the issues listed below.
General Comments:
- The details of housing should be added. For example, the CC and CF environments, such as ventilation, stocking density, building type, and potentially unmanaged factors, attributing transcriptomic differences to “welfare”.
- The DE results (Table 2) showed very high log2FC. The value seems impossible. Please carefully make interpretation and discussion with support references.
- Welfare conclusions are overstated from very small sample size without supported phenotypes. Biological interpretation should be substantially more cautious.
- Methods for functional analysis need details for reproducibility and clarity (normalization choices, thresholds, sequence editing/duplication, mathematical formula, etc.).
Specific Comments:
- (Page 3, lines 98–108). The author mentioned managing a large number of animals to justify proposing a large-scale study. However, only three animals were actually included in each group, which raises concerns about possible misinterpretation of the data. The discussion should focus more directly on the real sample size and compare the effects of caged versus cage-free rearing conditions.
- (Page 3, line 115). Give more details on sample selection of “three clinically normal hens per system”. What are the criteria for clinically normal, and how did the authors select 6 animals from 60,000 hens across housing?
- (Page 4, lines 166–167; Page 7, lines 223–224). Contradiction between DEG criteria and presented statistics. The authors define DEGs as padj <0.05 and |log2FC|>1, but Table 2 shows adjusted p-values inconsistent with that.
- (Page 7, lines 223–224) Table 2 appears biologically impossible log2FC; impossible p/padj values. Please check calculation methodology.
- (Page 5, Figure 2; Page 8, Figure 4). Correlations and heatmap are overinterpretation based on n=3 without any validation.
- (Page 12, line 511). The authors mention "increased expression of FSH", but the gene discussed and listed in Table 2 is FST (Follistatin). These are different molecules. Please correct this.
- (Page 12, lines 521-527). The discussion on LHX4 mutations and combined pituitary hormone deficiency in other species is too long. Please focus on how its downregulation in CF or upregulation in CC relates to hen welfare.
Author Response
Reviewer # 2
The manuscript should be revised in accordance with the suggestions provided below.
General Comments:
Comments 1: The details of housing should be added. For example, the CC and CF environments, such as ventilation, stocking density, building type, and potentially unmanaged factors, attributing transcriptomic differences to “welfare”.
Responses 1: Thank you for your suggestion. The manuscript (Materials and Methods) already includes information on stocking density per bird per system, ventilation type, and general characteristics of the evaluated systems. Nevertheless, we have added information on temperature, humidity, and lighting to supplement the data: “The environmental temperature and humidity were recorded hourly using a Hobo data logger (Onset Corp., Cape Cod, MA, USA) for 26 wks (24 h/7 days), obtaining 10,145 data points. Temperatures varied slightly, with a CF mean and SD of 24.45 ± 2.80 °C and CC mean of 24.7 ± 2.81 °C. The humidity varied by 7% between production systems throughout the study. The lighting program followed the Hy-Line Brown Commercial Management Guide, using cool white LED fixtures with an intensity of 15–20 lux at bird head level, maintaining a 14L:10D photoperiod during the production phase.”
Comments 2: The DE results (Table 2) showed very high log2FC. The value seems impossible. Please carefully make interpretation and discussion with support references.
Responses 2: Thank you for the suggestions. We agree and have corrected this. Table 2 and the Supplementary DEG tables have been re-generated directly from R outputs, and the displayed genes now consistently meet the stated criteria (BH-adjusted p-value threshold and log2FC threshold). Any previously inconsistent entries were due to the same formatting/export issue addressed above.
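For readers unfamiliar with the BH adjustment mentioned in this response, the following is a minimal sketch of the Benjamini–Hochberg procedure in plain Python. It is an illustration only, not the authors' R/DESeq2 code, and the raw p-values are hypothetical toy numbers chosen for the example.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values (not taken from the manuscript)
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
padj = benjamini_hochberg(raw)
print(padj)
```

In this toy case the third and fourth genes have raw p-values below 0.05 but adjusted values slightly above it, which shows why raw and adjusted p-values in a DEG table must not be mixed up.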
Comments 3: Welfare conclusions are overstated from very small sample sizes without supported phenotypes. Biological interpretation should be substantially more cautious.
Responses 3: The conclusion was adjusted to avoid overstatement. However, we want to clarify that these results form part of a holistic investigation into the potential effects of different production systems on bird performance, egg quality, and gene expression in laying hens housed in two commercial-scale egg production systems, for which we already have published phenotype results.
Comments 4: Methods for functional analysis need details for reproducibility and clarity (normalization choices, thresholds, sequence editing/duplication, mathematical formula, etc.).
Responses 4: Resolved as above. Differential expression statistics were recomputed in R using DESeq2 and re-exported in a robust format (CSV) to avoid spreadsheet parsing artifacts. We additionally report shrinkage-stabilized log2FC for interpretability. Table 2 and the supplements were fully replaced, and the document and tables were adjusted for better data clarity.
Specific Comments:
Comments 5: (Page 3, lines 98–108). The author mentioned managing a large number of animals to justify proposing a large-scale study. However, only three animals were actually included in each group, which raises concerns about possible misinterpretation of the data. The discussion should focus more directly on the real sample size and compare the effects of caged versus cage-free rearing conditions.
Responses 5: Thank you for your comment. This study is derived from a larger, holistic, commercial-scale investigation that evaluated multiple parameters, including egg production, feed intake, feed conversion ratio, hens housed (HH), eggs per hen, egg size, and mortality in caged and cage-free egg systems. Data were collected daily, and hen weight was measured weekly from week 22 to week 82. Similarly, temperature and humidity were measured hourly for 60 weeks, serum corticosterone levels were assessed, and welfare was evaluated according to the Welfare Quality® protocol. These data are currently under review for another journal. In addition, egg quality, Salmonella status, and fatty acid profile were evaluated. These data have already been published in this same journal (Animals) with a sample of more than 3,900 eggs during the 60 weeks of the commercial production cycle (Rodríguez-Hernández, R.; Rondón-Barragán, I.S.; Oviedo-Rondón, E.O. Egg quality and fatty acid profiles of egg yolks from laying hens housed in conventional caged and cage-free production systems in the Andean tropics. Animals 2024, 14, 168. https://doi.org/10.3390/ani14010168).
Within the same holistic research, further investigations were derived from bird tissues of interest for production and welfare. Since this was a performance and welfare study, we applied the principles of the three Rs to reduce the impact on the animals. Likewise, other publications derived from the main study have been produced with organs of productive interest such as the liver and intestine: Herrera-Sánchez, M. P., Rodríguez-Hernández, R., and Rondón-Barragán, I. S. (2025). Comparative Transcriptome Analysis of Hens’ Livers in Conventional Cage vs. Cage-Free Egg Production Systems. Veterinary Medicine International, 2025(1), 3041-254, and Lozano-Villegas, K. J., Herrera-Sánchez, M. P., Rondón-Barragán, I. S., and Rodríguez-Hernández, R. (2025). Gene Expression of Feed Intake-Regulating Peptides in the Gut-Brain Axis of Laying Hens Housed Under Two Different Egg Production Systems. Animals, 15(21), 31-27. This work is one of the few that compare these two egg production systems at a commercial level: conventional cage vs. a cage-free system, which are the most widely used in Colombia.
Comments 6: (Page 3, line 115). Give more details on sample selection of “three clinically normal hens per system”. What are the criteria for clinically normal, and how did the authors select 6 animals from 60,000 hens across housing?
Responses 6: As this was a commercial-scale project, the birds were randomly distributed within each system. In the CC (conventional cage) system, groups of 12 cages formed each replicate (15 replicates), for 180 cages in total, and each replicate was distributed throughout the pyramidal system. In the CF (cage-free) system, the birds were randomly distributed across 15 rooms (replicates) in two houses. For our study, clinically healthy hens were those that showed no visible clinical signs of disease: no changes in comb coloration, eyelids and eyes without inflammation or discharge, nostrils without discharge, and no sneezing or panting, among other common signs of disease in birds. Hens were selected completely at random: random number generators in Excel were used to select three replicates per system, and one bird was then chosen at random from each selected replicate.
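The two-stage selection described in this response (three of the 15 replicates per system, then one bird per selected replicate) can be sketched as follows. This is a hypothetical illustration using Python's `random` module in place of the Excel random number generators the authors report. The function name and `birds_per_replicate` are assumptions, since the number of birds per replicate is not given here.

```python
import random


def select_hens(n_replicates=15, n_selected=3, birds_per_replicate=100, seed=None):
    """Pick n_selected distinct replicates, then one bird index within each.

    birds_per_replicate is a placeholder value (assumption), not a figure
    reported in the manuscript or the responses.
    """
    rng = random.Random(seed)
    # Stage 1: sample replicates without replacement
    replicates = rng.sample(range(1, n_replicates + 1), n_selected)
    # Stage 2: one randomly chosen bird from each selected replicate
    return [(rep, rng.randrange(1, birds_per_replicate + 1)) for rep in replicates]


# A fixed seed makes the draw reproducible and auditable
cc = select_hens(seed=1)  # conventional cage system
cf = select_hens(seed=2)  # cage-free system
print("CC (replicate, bird):", cc)
print("CF (replicate, bird):", cf)
```

Recording the seed alongside the drawn replicate and bird numbers would let a reader reproduce the selection exactly.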
Comments 7: (Page 4, lines 166–167; Page 7, lines 223–224). Contradiction between DEG criteria and presented statistics. The authors define DEGs as padj <0.05 and |log2FC|>1, but Table 2 shows adjusted p-values inconsistent with that.
Responses 7: Thank you for your comments; the table has been reviewed and the typographical errors corrected.
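The consistency that the reviewer asks for can be checked mechanically. Below is a minimal sketch, assuming the DEG criteria quoted in the comment (padj < 0.05 and |log2FC| > 1); the gene names and values are hypothetical and are not taken from Table 2.

```python
PADJ_MAX = 0.05       # adjusted p-value threshold quoted by the reviewer
ABS_LOG2FC_MIN = 1.0  # absolute log2 fold-change threshold quoted by the reviewer


def violations(rows, padj_max=PADJ_MAX, lfc_min=ABS_LOG2FC_MIN):
    """Return the genes in a DEG table that do not meet the stated criteria."""
    bad = []
    for row in rows:
        # An adjusted p-value must lie in [0, 1] and be below the cut-off
        padj_ok = 0.0 <= row["padj"] < padj_max
        lfc_ok = abs(row["log2FC"]) > lfc_min
        if not (padj_ok and lfc_ok):
            bad.append(row["gene"])
    return bad


# Hypothetical rows for illustration only
table = [
    {"gene": "geneA", "log2FC": 2.3, "padj": 0.004},
    {"gene": "geneB", "log2FC": -1.7, "padj": 0.012},
    {"gene": "geneC", "log2FC": 0.6, "padj": 0.001},  # fold change too small
    {"gene": "geneD", "log2FC": 3.1, "padj": 0.21},   # not significant
]
flagged = violations(table)
print(flagged)
```

Running such a check on the exported CSV before typesetting would catch rows that were altered by spreadsheet formatting.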
Comments 8: (Page 7, lines 223–224) Table 2 appears biologically impossible log2FC; impossible p/padj values. Please check calculation methodology.
Responses 8: Thank you for your comments; it was an error in the table, which has now been adjusted.
Comments 9: (Page 5, Figure 2; Page 8, Figure 4). Correlations and heatmap are overinterpretation based on n=3 without any validation.
Responses 9: The maps and correlations are based on the results obtained from the samples, and the number of samples is always reported. No overinterpretation is made. The revised document makes the limitations of the study due to the number of samples clear to the reader (this was included in the manuscript corrections). Studies have used a minimum of three samples for statistical purposes because RNA-seq has a large read depth; the only drawback they report is that detection of DEGs can decrease, but the approach remains valid (Schurch, N. J., Schofield, P., Gierliński, M., Cole, C., Sherstnev, A., Singh, V., ... and Barton, G. J. (2016). How many biological replicates are needed in an RNA sequencing experiment and which differential expression tool should be used? RNA, 22(6), 839-851). RNA-Seq experiments are often limited to a small number of biological replicates. Likewise, our team has already published articles derived from the holistic investigation of the two systems, in which RNA-seq was performed on liver tissue with N=3 (Herrera-Sánchez, M. P., Rodríguez-Hernández, R., & Rondón-Barragán, I. S. (2025). Comparative Transcriptome Analysis of Hens’ Livers in Conventional Cage vs. Cage-Free Egg Production Systems. Veterinary Medicine International, 2025(1), 3041-254).
Comments 10: (Page 12, line 511). The authors mention "increased expression of FSH", but the gene discussed and listed in Table 2 is FST (Follistatin). These are different molecules. Please correct this.
Responses 10: Thank you for your comment. The error has been corrected in the text.
Comments 11: (Page 12, lines 521-527). The discussion on LHX4 mutations and combined pituitary hormone deficiency in other species is too long. Please focus on how its downregulation in CF or upregulation in CC relates to hen welfare.
Responses 11: We believe that the text gives the reader a better understanding of the importance of LHX4 and its effect on stress response, production, and growth in birds. Therefore, an additional line was included to improve the context of the result.
Reviewer 3 Report
Comments and Suggestions for Authors
Highlight the gap in the research field and the objective of the study. Also, the authors may expand the introduction.
Lines 89-93: Provide details on environmental conditions (light, humidity)
Lines 114-124: It is unclear how hens were randomly selected from different cages and how many samples were euthanized
Lines 126-128: Small sample size, which may limit statistical power
Lines 168-175: Provide details on gene analysis and statistical analysis
Lines 261-287: Link these behavioral and physiological observations with the transcriptomic analyses
Lines 284-287: This statement is general
Lines 391-397: The mention of lncRNAs is brief and non-specific
Lines 419-427: Consider emphasizing how current transcriptomic results could guide future multi-omics studies
In conclusion, provide study limitations, discuss how these findings could inform welfare practices and suggest future directions in the research field
Comments on the Quality of English Language
The manuscript is well written and understandable, but some sentences are long and contain minor grammatical errors.
Author Response
Reviewer #3
Comments 1: Highlight the gap in research field and the objective of the study. Also, authors may expand the introduction
Response 1: Thanks for the suggestion; the introduction was slightly adjusted.
Comments 2: Lines 89-93: Provide details on environmental conditions (light, humidity)
Response 2: Thank you for your suggestion. We have added information on temperature, humidity, and lighting to supplement the data: “The environmental temperature and humidity were recorded hourly using a Hobo data logger (Onset Corp., Cape Cod, MA, USA) for 26 wks (24 h/7 days), obtaining 10,145 data points. Temperatures varied slightly, with a CF mean and SD of 24.45 ± 2.80 °C and CC mean of 24.7 ± 2.81 °C. The humidity varied by 7% between production systems throughout the study. The lighting program followed the Hy-Line Brown Commercial Management Guide, using cool white LED fixtures with an intensity of 15–20 lux at bird head level, maintaining a 14L:10D photoperiod during the production phase.”
Comment 3: Lines 114-124: It is unclear how hens were randomly selected from different cages and how many samples were euthanized
Response 3: Hens were selected completely at random: random number generators in Excel were used to select three replicates per system, and one bird was then chosen at random from each selected replicate. The manuscript was adjusted to include this information on hen selection.
Comment 4: Lines 126-128: Small sample size and may limit statistical power
Response 4: We agree that n=3 per group limits statistical power and the scope of inference. However, because this was a welfare study, we applied the principles of the three Rs to reduce the impact on the animals. Additionally, studies have used a minimum of three samples for statistical purposes because RNA-seq has a large read depth (Schurch, N. J., Schofield, P., Gierliński, M., Cole, C., Sherstnev, A., Singh, V., ... and Barton, G. J. (2016). How many biological replicates are needed in an RNA sequencing experiment and which differential expression tool should be used? RNA, 22(6), 839-851).
Comment 5: Lines 168-175: Provide details on gene analysis and statistical analysis
Response 5: Implemented. We expanded the RNA-seq bioinformatics and statistical methods (Sections 2.4–2.5) to specify the reference genome and annotation; read-level QC using FastQC with aggregated reporting via MultiQC; adaptor/quality trimming parameters; alignment/quantification and gene-level count generation; and the low-abundance filtering rule. Differential expression is now described in sufficient detail to replicate the analysis, including DESeq2 normalisation and modelling, the hypothesis testing approach, Benjamini–Hochberg FDR adjustment, shrinkage-based log2 fold changes, and the exact DEG thresholds applied. We also clarified the functional enrichment workflow, including the gene universe/background, significance cut-offs, and multiple-testing correction.
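As context for the DESeq2 normalisation mentioned in this response, the following is a minimal Python sketch of the median-of-ratios size-factor idea that DESeq2 uses by default. It is not the authors' pipeline (which runs in R), and the count matrix is a hypothetical toy example in which sample 2 is sequenced twice as deeply as sample 1 and sample 3 half as deeply.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one value per sample).
    """
    n_samples = len(counts[0])
    # Use only genes with a non-zero count in every sample
    usable = [gene for gene in counts if all(c > 0 for c in gene)]
    # Log of the geometric mean across samples, per gene
    log_geo_means = [sum(math.log(c) for c in gene) / n_samples for gene in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene's count to its geometric mean in sample j
        log_ratios = [math.log(gene[j]) - lg for gene, lg in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical toy counts: rows are genes, columns are three samples
counts = [
    [100, 200, 50],
    [10, 20, 5],
    [30, 60, 15],
    [0, 4, 1],  # excluded from the calculation because of the zero count
]
sf = size_factors(counts)
normalised = [[c / f for c, f in zip(gene, sf)] for gene in counts]
print(sf)
```

Dividing each sample's counts by its size factor puts the samples on a common scale before dispersion estimation and testing.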
Comment 6: lines 261-287: Link these behavioral and physiological observations with the transcriptomic analyses.
Response 6: Thank you for your comment. This study is derived from a larger, holistic, commercial-scale investigation that evaluated multiple parameters, including egg production, feed intake, feed conversion ratio, hens housed (HH), eggs per hen, egg size, and mortality in caged and cage-free egg systems. Data were collected daily, and hen weight was measured weekly from week 22 to week 82. Similarly, temperature and humidity were measured hourly for 60 weeks, serum corticosterone levels were assessed, and welfare was evaluated according to the Welfare Quality® protocol. These data are currently under review for another journal. In addition, egg quality, Salmonella status, and fatty acid profile were evaluated. These data have already been published in this same journal (Animals) with a sample of more than 3,900 eggs during the 60 weeks of the commercial production cycle (Rodríguez-Hernández, R.; Rondón-Barragán, I.S.; Oviedo-Rondón, E.O. Egg quality and fatty acid profiles of egg yolks from laying hens housed in conventional caged and cage-free production systems in the Andean tropics. Animals 2024, 14, 168. https://doi.org/10.3390/ani14010168).
Comment 7: Lines 284-287: This statement is general
Response 7: We tightened this statement by specifying the exact candidate pathways (GO/KEGG terms) and by clarifying that enrichment results are exploratory given limited replication and potential environmental confounding.
Comment 8: Lines 391-397: The mention of lncRNAs is brief and non-specific
Response 8: We slightly expanded the discussion on lncRNA to provide context; however, further study of this type of long non-coding RNA is needed in birds and in the hypothalamus.
Comment 9: Lines 419-427: Consider emphasizing how current transcriptomic results could guide future multi-omics studies.
Response 9: We strengthened the forward-looking section of the manuscript to position the candidate genes/pathways as prioritisation inputs for future multi-omics designs (e.g., epigenomics, proteomics, metabolomics) integrated with welfare phenotyping under commercial microclimate monitoring.
Comment 10: In conclusion, provide study limitations, discuss how these findings could inform welfare practices and suggest future directions in the research field
Response 10: Thank you for the suggestion; a paragraph outlining the study's limitations has been added, and the conclusion has been adjusted.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have revised the manuscript according to my suggestions, and I recommend accepting it.
Reviewer 3 Report
Comments and Suggestions for Authors
The authors have addressed the revisions.

