Article
Peer-Review Record

Multi-Platform Expression Analyses Reveal a Putative INHBA-SERPINE2-SDF2L1 Co-Regulated Module in the Bovine Cumulus–Oocyte Complex

Appl. Biosci. 2026, 5(2), 26; https://doi.org/10.3390/applbiosci5020026
by Beatriz Elena Castro-Valenzuela 1, Tannia Janeth Vega-Montoya 2, Blanca Sánchez-Ramírez 2, Álvaro Vargas-Cázares 1, Moisés Armides Franco-Molina 3 and M.Eduviges Burrola-Barraza 2,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 28 January 2026 / Revised: 11 March 2026 / Accepted: 27 March 2026 / Published: 2 April 2026

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Manuscript Title: Secretome Screening and Multi-Level Expression Analyses Reveal a Coordinated INHBA-SERPINE2-SDF2L1 Axis within the Bovine Cumulus–Oocyte Complex

Journal: Applied Biosciences

Type of Manuscript: Article

 

Overall Evaluation

This manuscript aims to identify candidate paracrine mediators within the bovine cumulus–oocyte complex (COC) through an integrated strategy combining in silico secretome screening, validation using public RNA-seq datasets, multi-tissue RT-qPCR profiling, correlation/network analysis, and promoter motif prediction. The authors propose a coordinated INHBA–SERPINE2–SDF2L1 regulatory module potentially linking endocrine (FSH), paracrine (activin/SMAD), extracellular matrix remodeling, and oocyte ER proteostasis during folliculogenesis.

The study has several strengths:

(1) Integration of multiple data layers (EST screening → RNA-seq validation → tissue expression profiling → network analysis → promoter motif scanning).

(2) Cross-validation using two independent GEO RNA-seq datasets.

(3) A biologically coherent narrative linking endocrine and paracrine signaling.

(4) Identification of a plausible transcriptional co-regulation module for further functional exploration.

However, in its current form, the manuscript presents several methodological limitations, statistical weaknesses, and instances of overinterpretation of computational findings. In particular, the proposed “coordinated axis” lacks functional validation and relies heavily on correlation and motif prediction.

For these reasons, I recommend Major Revision.

Major Comments

  1. Secretome Screening Strategy Is Outdated and Potentially Non-Reproducible

The study relies on the now-retired NCBI EST database as the initial source for candidate genes. EST-based screening is considerably outdated compared to current full-transcriptome and annotated genome resources.

Concerns include:

  • Limited completeness and representativeness of EST data.
  • Reduced reproducibility due to database retirement.
  • Lack of validation using current Bos taurus genome annotations.

Suggestions:

  • Provide justification for using ESTs instead of genome-wide annotation-based secretome prediction.
  • Perform at least a validation analysis using the current Bos taurus reference genome.
  • Provide a comprehensive supplementary table listing all screened genes and filtering steps.
  2. Conceptual Inconsistency: SDF2L1 Is Not a Secreted Protein

The manuscript acknowledges that:

SDF2L1 is predominantly an ER-resident protein retained via an HDEL motif and is not generally secreted under physiological conditions.

This creates a conceptual issue, since the study is framed as a “secretome screening” and proposes a coordinated paracrine axis including SDF2L1.

If one of the three core genes is not secreted, the model is no longer strictly a secreted-factor axis but rather a co-regulated transcriptional module.

Suggestions:

  • Modify the title and abstract to avoid implying that all three genes encode secreted factors.
  • Reframe SDF2L1 as a “secretory-pathway-associated regulator” rather than a secreted protein.
  • Clarify that the proposed axis reflects co-regulation rather than purely paracrine signaling.
  3. RNA-seq Statistical Analysis Is Oversimplified

The authors:

  • Use log2(FPKM + 0.5).
  • Apply unpaired Welch’s t-tests.
  • Use P < 0.05 without multiple testing correction.
  • Do not report effect sizes.

Major issues:

  • FPKM is not ideal for statistical comparison across samples.
  • No use of DESeq2 or edgeR frameworks.
  • No false discovery rate (FDR) correction.
  • No log2 fold change reporting.

Suggestions:

  • Reanalyze raw counts using DESeq2 or edgeR.
  • Report adjusted P-values (FDR).
  • Provide effect sizes and confidence intervals.
  • Include full statistical tables in supplementary materials.
  4. RT-qPCR Tissue Profiling Is Qualitative Rather Than Quantitative

Expression was defined as “present” (Ct < 35) or “absent” (Ct ≥ 35).

Limitations:

  • No ΔCt or ΔΔCt quantification.
  • 18S used only for integrity, not normalization.
  • No biological replicate statistics reported.

Presence/absence data do not support robust PCA and correlation analyses.

Suggestions:

  • Provide normalized quantitative expression values.
  • Clearly state biological replicate numbers.
  • Recalculate correlation matrices using quantitative expression data.
  5. Network Analysis Lacks Statistical Definition

The Cytoscape network:

  • Does not define correlation thresholds.
  • Does not report statistical significance thresholds.
  • Does not describe correction for multiple testing.
  • Labels INHBA as a “hub” without reporting network metrics.

Suggestions:

  • Define correlation cutoffs (e.g., r > X and adjusted p < Y).
  • Provide network topology metrics (degree, betweenness centrality).
  • Avoid strong “hub” terminology unless quantitatively supported.

6. Promoter Motif Analysis Is Overinterpreted

FIMO motif calls were made at P ≤ 1×10⁻³.

Concerns:

  • High false-positive rate of motif scanning.
  • No background enrichment comparison.
  • No random promoter controls.
  • No ChIP-based validation.

Current data demonstrate the presence of potential motifs but do not establish shared regulatory control.
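
The false-positive concern can be made concrete with a rough expectation. The sketch below assumes that every scanned position is an independent test at the stated P threshold, which is a simplification. The promoter length, motif width, and number of motifs are hypothetical placeholders, not values taken from the manuscript.

```python
def expected_chance_hits(promoter_len, motif_width, n_motifs,
                         p_threshold, both_strands=True):
    """Rough expected number of motif calls arising by chance alone.

    Assumes each window on each strand is an independent test at p_threshold.
    All inputs used in the demo below are hypothetical illustration values.
    """
    # Number of windows of width motif_width that fit in the promoter.
    positions = max(promoter_len - motif_width + 1, 0)
    strands = 2 if both_strands else 1
    return positions * strands * n_motifs * p_threshold


if __name__ == "__main__":
    # Hypothetical: 2000 bp promoter, 10 bp motif, 20 motifs, P <= 1e-3.
    print(f"{expected_chance_hits(2000, 10, 20, 1e-3):.1f}")
```

Under these assumed values, roughly 80 calls per promoter would be expected by chance, which is why site-level calls need a background or shuffled-sequence comparison before they are interpreted.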

Suggestions:

  • Perform motif enrichment analysis (e.g., AME or HOMER).
  • Include random promoter controls.
  • Soften mechanistic language throughout discussion and conclusions.

7. Lack of Functional Validation

The proposed INHBA–SERPINE2–SDF2L1 axis is based solely on:

  • Co-expression patterns.
  • Promoter motif co-occurrence.
  • Literature-supported pathway plausibility.

The study lacks:

  • Knockdown or inhibition experiments.
  • FSH stimulation assays.
  • Activin treatment experiments.
  • In vitro maturation (IVM) functional assays.

Therefore, the term “coordinated axis” should be replaced with “putative co-regulated module” unless experimental validation is provided.

 

Minor Comments

  • Section 3.2 contains duplicated text.
  • Inconsistent gene naming (SERPINE vs SERPINE2).
  • Minor formatting inconsistencies in Table 2 chromosomal coordinates.
  • Heatmap lacks clear correlation value scale.
  • Statistical software versions not consistently reported.

  • Some repetitive sentences in the Discussion (e.g., duplicated interpretation of motif findings).

Author Response

We thank Reviewer 1 for the careful evaluation and constructive comments. The suggestions improved the clarity, rigor, and presentation of our study. In response, we thoroughly revised the manuscript, refined terminology and interpretation, expanded methodological details to enhance reproducibility, and added supporting information in the Supplementary Materials to address the reviewer’s specific points.

  1. Secretome Screening Strategy Is Outdated and Potentially Non-Reproducible:
  • Provide justification for using ESTs instead of genome-wide annotation-based secretome prediction.

We thank the reviewer for this valuable comment. When we started this project, the NCBI dbEST database was the primary public source of transcript sequences from bovine cumulus–oocyte complexes (COCs). Our goal was to focus on candidates expressed in COCs, not just a general list of predicted secreted proteins. Using an EST-first approach, we begin with transcripts observed in the correct biological context and then apply secretion prediction methods to that set. We want to clarify that although the EST database is no longer maintained as a separate resource within NCBI, the sequences and their associated metadata have not been lost. Instead, this information was integrated into other NCBI repositories, particularly BioSample (in the Nucleotide section). Therefore, the EST data used in this study remains publicly available and accessible to the scientific community. This ensures transparency and reproducibility, as all sequences can still be retrieved, verified, and reanalyzed by other researchers using their corresponding accession numbers in BioSample records.

Moreover, to make our results transparent and reproducible, we did not stop with dbEST. We mapped and checked every EST-derived candidate against the current Bos taurus reference genome and annotation at NCBI (ARS-UCD2.0). Each candidate was confirmed as an annotated gene in the latest genome build, and these updated identifiers are listed in Table II, which shows the exact gene matches for all secreted-protein candidates. So, while we started with ESTs because they were available and specific to COCs, our final candidate list is based on the current reference genome and can be reproduced using the gene identifiers and sequences in Table II.

We have clarified this point in the revised manuscript. We added a new paragraph in the Materials and Methods (Section 2.1, Bioinformatic analysis) to explain why we adopted an EST-first strategy at the outset of the study. We also describe how all EST-derived candidates were later mapped and validated against the current Bos taurus reference genome and annotation (ARS-UCD2.0), and we report the updated genome-anchored identifiers in Table II. This addition is highlighted in yellow at lines 112–122 of the revised version.

  • Perform at least a validation analysis using the current Bos taurus reference genome.

Thank you for suggesting we validate our results against the current Bos taurus reference genome. In our study, we mapped and verified candidate proteins using the latest NCBI Bos taurus assembly and annotation (ARS-UCD2.0) to confirm that they matched annotated gene models. To clarify, we have updated the Materials and Methods section to describe this step, and Table II lists the genome-anchored identifiers and validation details for each candidate.

  • Provide a comprehensive supplementary table listing all screened genes and the filtering steps used.

Thank you for suggesting that we provide a comprehensive supplementary table listing all screened genes and filtering steps. We have now added Table S1 to the Supplementary Material. This table includes the protein sequences for all gene-derived candidates we screened and shows the results from each secretion-targeting prediction tool in our pipeline (SignalP, SecretomeP, TargetP, and TMHMM). Readers can follow how each sequence was kept or removed at each filtering stage.

  2. Conceptual Inconsistency: SDF2L1 Is Not a Secreted Protein
  • Modify the title and abstract to avoid implying that all three genes encode secreted factors.

Thank you for this valuable suggestion. To make sure it is clear that not all three genes encode soluble secreted factors, we have updated both the title and the Abstract. In the revised Abstract, we now say the screening focused on “genes related to secretion and the secretory pathway.” We also clarify the unique role of SDF2L1 by stating that “INHBA and SERPINE2 protein products are secreted, whereas SDF2L1 is an endoplasmic reticulum resident chaperone that supports the secretory pathway.”

  • Reframe SDF2L1 as a “secretory-pathway-associated regulator” rather than a secreted protein.

Thank you for this helpful suggestion. We have updated the manuscript to describe SDF2L1 as a secretory-pathway-associated regulator (ER-resident chaperone) instead of a soluble secreted protein. These changes are reflected in the revised Abstract (lines 25–27), in Results Section 3.1 (Bioinformatics analysis), where we now refer to the final prioritized set as “secreted or secretory-pathway associated candidates” (lines 301–302), and in the Discussion, where SDF2L1 is described as a “secretory-pathway-associated, endoplasmic-reticulum–resident chaperone.”

  • Clarify that the proposed axis reflects co-regulation rather than purely paracrine signaling.

Thank you for this helpful suggestion. We have revised the manuscript to clarify that not all prioritized candidates are soluble secreted proteins. Instead, we now describe the candidate set as including both secreted proteins and those associated with the secretory pathway, since our screening and analyses identify both extracellular factors and secretory pathway components. We updated this terminology throughout the title, abstract, Methods, Results, figure legends, and Discussion.

  3. RNA-seq Statistical Analysis Is Oversimplified
  • Reanalyze raw counts using DESeq2 or edgeR.
  • Report adjusted P-values (FDR).
  • Provide effect sizes and confidence intervals.
  • Include full statistical tables in supplementary materials.

We appreciate the reviewer’s suggestion. We obtained raw sequencing reads from the NCBI Sequence Read Archive (SRA) using the run accessions linked to the GSE99678 GEO record and reprocessed them with a standard raw-count approach. We measured transcript abundance with Salmon and used tximport to import the estimates for gene-level differential expression testing in DESeq2 (CC vs oocyte). We applied the Benjamini–Hochberg procedure for multiple testing correction and report adjusted P-values as FDR (padj). Effect sizes are shown as log2 fold changes (log2FC), and 95% confidence intervals were calculated from the DESeq2 model-based standard error (log2FC ± 1.96 × lfcSE). During reprocessing, we updated the quantification reference from the older UMD3.1-based annotation (used for the original FPKM calculations) to the Ensembl ARS-UCD1.2 transcriptome, which was used to build the Salmon index. With this updated reference, SRPX transcripts are not present or quantifiable in the indexed transcriptome, so no raw counts were generated for SRPX and it could not be tested in the DESeq2 framework. Because of this annotation difference between genome assemblies, we removed SRPX from the manuscript to keep our methods consistent and transparent. We have included complete statistical result tables (all genes) as Supporting Information, as well as a focused table for the genes highlighted in the manuscript.
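
The two calculations named above, the 95% confidence interval (log2FC ± 1.96 × lfcSE) and the Benjamini–Hochberg adjustment, can be sketched in a few lines. This is a minimal standalone illustration and not the DESeq2 implementation: DESeq2 also applies independent filtering before adjustment, so its padj values can differ from plain Benjamini–Hochberg. The function names and example numbers are hypothetical.

```python
from typing import List, Tuple


def wald_ci(log2fc: float, lfc_se: float, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% confidence interval: log2FC +/- z * lfcSE."""
    return (log2fc - z * lfc_se, log2fc + z * lfc_se)


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Plain Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(wald_ci(2.0, 0.5))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

An interval that excludes zero together with a small adjusted P-value is the usual basis for calling a gene differentially expressed between cumulus cells and oocytes.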

For dataset GSE199210, although the raw data are available in the SRA repository, we faced ongoing technical issues when trying to download and process the full FASTQ files locally because of their large size and the high computational demands. We tried several times to use the SRA Toolkit to download and convert the .sra files, but we could not process the full FASTQ datasets with our available hardware. Because of this, we used the FPKM-based values from the GEO repository for this dataset as supporting information for our findings in the supplementary material (Figure S2). All analyses from these data were done transparently and are clearly described in the Methods section.

 

  4. RT-qPCR Tissue Profiling Is Qualitative Rather Than Quantitative
  • Presence/absence data do not support robust PCA and correlation analyses.

We thank the reviewer for this thoughtful and important comment. We respectfully wish to clarify that the objective of our RT-qPCR tissue profiling was not to quantify relative expression levels, but to determine tissue-specific expression occurrence in order to evaluate co-expression patterns among the studied genes.

Our analytical approach was intentionally designed to assess whether gene expression events co-occur across tissues, rather than to compare expression magnitudes. In this context, the presence/absence classification constitutes biologically meaningful information for identifying coordinated expression patterns and potential functional associations. Therefore, the correlation and PCA analyses were performed to explore relationships based on expression distribution across tissues, which is consistent with the qualitative objective of this study.

We agree that quantitative normalization methods such as ΔCt or ΔΔCt are appropriate when the goal is to compare expression levels or fold changes. However, applying those quantitative metrics in this case would address a different biological question and would shift the focus of the study from expression association to expression level quantification, which is beyond the intended scope of our work.

We would like to clarify that the 18S ribosomal RNA gene was not used solely to assess RNA integrity, but also served as the endogenous reference gene for normalization purposes. Due to its stable and constitutive expression across different tissues, 18S is widely accepted as a housekeeping gene for RT-qPCR normalization. Examples include the following papers:

  • 10.1262/jrd.2025-017
  • 10.3347/kjp.2016.54.1.39
  • 10.1371/journal.pone.0118458

We hope this clarification resolves the reviewer’s concern and improves the transparency of our methodological approach.

 

  • Provide normalized quantitative expression values.

Thank you for this helpful suggestion. Our main goal in the tissue profiling experiment was to screen for detectability across tissues and compartments, focusing on whether targets were present or absent to help prioritize secretome candidates. We agree that including normalized quantitative values adds clarity. We have now added normalized RT-qPCR quantification in the Supplementary Information, showing relative expression values after normalization to 18S rRNA (ΔCt) and using the 2^−ΔΔCt method (Livak and Schmittgen). These extra data show expression trends beyond the simple detection results in the main text.
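
For readers unfamiliar with the 2^−ΔΔCt calculation, a minimal sketch follows. It assumes near-100% amplification efficiency for both target and reference, as the method requires. The calibrator tissue and all Ct values are hypothetical, since the response does not state which sample served as calibrator.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Delta Ct: target Ct normalized to the reference gene (here 18S rRNA)."""
    return ct_target - ct_reference


def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Fold change by the 2^-ddCt method, relative to a calibrator sample."""
    ddct = (delta_ct(ct_target_sample, ct_ref_sample)
            - delta_ct(ct_target_calibrator, ct_ref_calibrator))
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values: target gene and 18S in a tissue of interest
    # versus a hypothetical calibrator tissue.
    print(relative_expression(24.0, 10.0, 26.0, 10.0))
```

A target that crosses threshold two cycles earlier than in the calibrator, with equal reference Ct values, corresponds to a fourfold higher relative expression.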

  • Clearly state biological replicate numbers.

Thank you for your comment. We have updated the Methods and the relevant Results and figure legends to clearly describe the replicate structure. RT-qPCR measurements were taken from three independent biological replicates for each tissue or compartment, and each biological replicate was analyzed in three technical replicates. The revised text now clearly separates biological and technical replication.

  • Recalculate correlation matrices using quantitative expression data.

Thank you for your thoughtful suggestion. We agree that correlation matrices using normalized quantitative RT-qPCR values are useful when the goal is to study how transcript abundance varies across samples. In our study, though, the tissue profiling was designed as a detection-based screen. Our correlation and PCA analyses were meant to answer a different question: do candidate transcripts tend to be detected together across tissues or compartments, no matter their abundance? That is why we used a binary matrix (present = 1; absent = 0) based on a set Ct threshold, which shows co-occurrence rather than changes in abundance. If we recalculated the correlations using 2^−ΔΔCt values, the analysis would focus on abundance and would not match the original goal of the screening step. To address your concern and stay true to our study’s intent, we have (i) clarified in the manuscript that the PCA and correlation analyses reflect co-detection or co-occurrence, not quantitative co-expression, and (ii) included the normalized quantitative RT-qPCR values as Supplementary Information for transparency.
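
The co-detection analysis described above can be illustrated with a short sketch. It assumes the Ct < 35 detection rule quoted by the reviewer and uses the phi coefficient, which is what the Pearson correlation reduces to for two binary vectors. Representing an undetermined Ct as `None` is a convention chosen for this example, and the Ct values are hypothetical.

```python
from math import sqrt

CT_CUTOFF = 35.0  # detection rule quoted in the review: present if Ct < 35


def detection_vector(ct_values):
    """Convert Ct values to 1 (present) or 0 (absent); None means undetermined."""
    return [1 if ct is not None and ct < CT_CUTOFF else 0 for ct in ct_values]


def phi(x, y):
    """Phi coefficient (Pearson correlation of two binary vectors).

    Returns None when either vector is constant, because the correlation
    is undefined in that case.
    """
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return None
    return (n11 * n00 - n10 * n01) / denom


if __name__ == "__main__":
    # Hypothetical Ct values for two genes across four tissues.
    gene_a = detection_vector([22.4, 30.1, 36.2, None])
    gene_b = detection_vector([25.0, 33.8, 35.0, 38.5])
    print(gene_a, gene_b, phi(gene_a, gene_b))
```

A gene detected in every tissue yields an undefined coefficient, so ubiquitously detected transcripts carry no co-occurrence information in this kind of matrix.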

 

  5. Network Analysis Lacks Statistical Definition
  • Define correlation cutoffs (e.g., r > X and adjusted p < Y).

We reviewed the data in the Cytoscape Edge Table, focusing on the Corr.pcor, Corr.pval, and Corr.adj.pval columns. The partial correlation network was built using Corr.pcor values, and we considered associations with |Corr.pcor| ≥ 0.30 as moderate to strong for interpretation. In the Edge Table, three edges in the INHBA, SDF2L1, and SERPINE2 module showed nominal statistical significance, with the lowest Corr.pval at 0.0195. However, after correcting for multiple testing, none of the correlations remained significant, with the lowest Corr.adj.pval at 0.682. As a result, these associations should be seen as exploratory, but their strong and consistent correlation pattern in this module suggests they may have biological relevance.

  • Provide network topology metrics (degree, betweenness centrality).

We reviewed the network topology metrics in Cytoscape (Node Table / Network Analyzer). The network is a fully connected graph of 15 nodes: every node has degree = 14 and betweenness centrality = 0, so both metrics are uniform across nodes. Consequently, no node met the criteria for hub status based on topological metrics.
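
The uniform metrics follow directly from the graph being complete, which a short standalone check makes explicit. The sketch below does not use Cytoscape. It builds a complete graph on 15 nodes and computes degree and unnormalized betweenness with Brandes' algorithm, under the assumption that every pair of genes is joined by an unweighted edge.

```python
from collections import deque
from itertools import combinations


def complete_graph(n):
    """Adjacency sets for an undirected complete graph on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes) for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


if __name__ == "__main__":
    g = complete_graph(15)
    print({len(nbrs) for nbrs in g.values()})   # distinct degree values
    print(set(betweenness(g).values()))         # distinct betweenness values
```

Because every shortest path in a complete graph is a single edge, no node ever lies between two others, so degree and betweenness cannot distinguish hubs unless edges are first filtered by a correlation or significance threshold.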

  • Avoid strong “hub” terminology unless quantitatively supported.

We thank the reviewer for this important observation. Following quantitative network topology analysis (degree and betweenness centrality), we confirmed that the network represents a fully connected graph in which all nodes display identical degree (14) and zero betweenness centrality. Therefore, no node meets objective criteria for hub designation based on topological metrics. In light of these findings, we have removed the term ‘hub’ throughout the manuscript and now describe the INHBA–SDF2L1–SERPINE2 module as a strongly correlated module rather than a topological hub.

  6. Promoter Motif Analysis Is Overinterpreted
  • Perform motif enrichment analysis (e.g., AME or HOMER).
  • Include random promoter controls.
  • Soften mechanistic language throughout discussion and conclusions.

Thank you for this important comment regarding potential overinterpretation of promoter motif scanning. In response, we complemented the FIMO site-level motif calls with a formal motif enrichment analysis using AME (MEME Suite), testing the INHBA, SERPINE2, and SDF2L1 promoter sequences as a foreground set against dinucleotide-preserving shuffled control sequences and the same curated JASPAR PWM libraries (FSH/cAMP-related factors and SMAD/SP/TBP families). Under this enrichment framework, no motif remained significantly enriched after multiple-testing correction, indicating that while putative motif occurrences can be detected, the current data do not provide robust statistical support for shared promoter enrichment across this limited promoter set. We have therefore revised the Results, Discussion, and Conclusions to describe these sites as putative and hypothesis-generating, and to clarify that promoter motif analyses are consistent with potential upstream inputs rather than demonstrating shared regulatory control or a validated mechanistic pathway. We also note that direct experimental validation (e.g., ChIP-based assays and/or functional reporter tests) will be required in future work. Finally, to ensure transparency and reproducibility, we provide the complete AME enrichment outputs in Supplementary Table S6.

 

  7. Lack of Functional Validation

Thank you for your helpful comment about the lack of functional validation. We agree that our evidence, including co-expression patterns, promoter motif co-occurrence, and support from the literature, suggests coordinated regulation but does not prove causality. In response, we have used less mechanistic language throughout the manuscript and replaced the term “coordinated axis” with more cautious wording, now referring to a “putative INHBA–SERPINE2–SDF2L1 co-regulated module.” We made these changes in the title, Abstract, Discussion, and Conclusions to keep our interpretation descriptive and clearly present it as a hypothesis-generating model that will need future experimental validation.

Minor Comments

  • Section 3.2 contains duplicated text.
  • Inconsistent gene naming (SERPINE vs SERPINE2).
  • Minor formatting inconsistencies in Table 2 chromosomal coordinates.
  • Heatmap lacks clear correlation value scale.
  • Statistical software versions not consistently reported.

Thank you for your helpful suggestion. We reviewed the entire manuscript to remove minor inconsistencies, such as duplicated text, gene names, and formatting issues in the tables. We also updated the heatmap by adding a clear color key, so the Pearson correlation coefficient range is easy to see and interpret. Finally, we revised the Discussion to remove repeated sentences and make it clearer and easier to read.

 

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors

The paper, applbiosci-4149620, entitled “Secretome Screening and Multi-Level Expression Analyses Reveal a Coordinated INHBA-SERPINE2-SDF2L1 Axis within the bovine cumulus-oocyte complex”, uses in silico tools combined with expression to identify an INHBA-SERPINE2-SDF2L1 axis with potential roles in the regulation of cumulus-oocyte interplay. Despite the quality of the results, the study provided RNA-seq of the candidates in only one stage of COCs' development and gene expression across different body tissue types, rather than across the different COCs developmental stages. This does not detract from the study, but an approach involving different developmental stages, e.g., before and after in vitro maturation, would be more enlightening. Further, the manuscript is mostly well-written; however, it may be improved in some points.

 

Following is a point-by-point analysis:

 

  • Line 17: Please replace “work” with “study”. The same change applies in lines 81 and 162.
  • Line 43: The term “works” could be a better choice instead of “function”.
  • Line 49: “further studies are required to elucidate the extracellular signalling landscape” could be replaced with “further work is required to delineate the extracellular signalling landscape”.
  • Lines 51 – 52: My suggestion is to rewrite “A significant reason this communication is so effective is that it is mediated through multiple complementary mechanisms” as follows: “One of the main reasons for the effectiveness of this communication is that it is mediated by multiple complementary mechanisms.”
  • Introduction: The introduction is good, but my suggestion is to divide the first paragraph into three, with new paragraphs beginning at line 53, starting with “Direct contact”, and at line 66, starting with “As follicles advance”.
  • Line 111: Please check whether the symbol “≤” is correct. Based on Figure 1, the symbol here should be “<”.
  • Table 1:
    • In the column “Source,” you started and finished the cell type as the source, but in the row for LIBEST_028120, you used the tissue as a source for the biological information. My suggestion is to keep the cell type as the source, in this case “immature/matured oocyte, 2-4-8-16-cell embryo, blastocyst” or “oocyte, preimplantation embryo.”
  • How did you determine that the accession number LIBEST_015406 corresponds to the oocyte?
  • Line 144: How did the authors classify outer and inner cumulus cells?
  • Line 156:
    • Why did the authors analyze all cumulus cells and all the oocytes together regardless of their developmental stage?
    • What do the authors think the result would be if the analysis were done separately for each cell type (cumulus cell and oocyte) in immature COCs and after in vitro maturation of COCs? I know that speculating about this is not ideal, but are you considering an experiment along these lines?
  • Figure 2:
    • The use of the patterns makes it difficult to read what is over each pattern. Consider using different shades of gray instead of the patterns, or writing it in textboxes outside the pie chart.
  • Table 2:
    • The NCBI/protein search for NP_001071494.1 recovered only an obsolete version of this protein. Please revise the accession number and consider excluding the data for this gene/protein.
  • Figure 3:
    • The figure shows, in the first 5 graphics, the most expressed genes in cumulus cells. My suggestion, to make it easy to visualize, is dividing this into two subsets, A and B, with A being a column with the five most expressed genes in cumulus cells, and with B two columns with the other graphics, with five graphics in each column.
    • The names of genes, values of p and cell types should be written bigger and darker than they currently are.
  • Line 603: Please replace “pb” with “bp”. The authors are referring to base pairs, right?

 

Regards

Author Response

 

The paper, applbiosci-4149620, entitled “Secretome Screening and Multi-Level Expression Analyses Reveal a Coordinated INHBA-SERPINE2-SDF2L1 Axis within the bovine cumulus-oocyte complex”, uses in silico tools combined with expression to identify an INHBA-SERPINE2-SDF2L1 axis with potential roles in the regulation of cumulus-oocyte interplay. Despite the quality of the results, the study provided RNA-seq of the candidates in only one stage of COCs' development and gene expression across different body tissue types, rather than across the different COCs developmental stages. This does not detract from the study, but an approach involving different developmental stages, e.g., before and after in vitro maturation, would be more enlightening. Further, the manuscript is mostly well-written; however, it may be improved in some points.

Thank you for your valuable and helpful comment. We agree that studying different COC developmental stages, such as before and after in vitro maturation, would provide more insight and help us further test and improve the proposed INHBA, SERPINE2, and SDF2L1 axis. This is an excellent suggestion and is the next step in our research. In this study, our main goal was to carry out a broad screening to find secreted or secretory-pathway proteins that might be involved in cumulus–oocyte communication and COC development. We used in silico integration with available expression data as a starting point to select candidates for future stage-specific studies and experimental validation.

Following is a point-by-point analysis:

  • Line 17: Please replace “work” with “study”. The same change applies in lines 81 and 162.

Thank you for your careful suggestion. We have updated the manuscript by replacing “work” with “study” throughout the text to keep the terminology consistent.

  • Line 43: The term “works” could be a better choice instead of “function”

Thank you for your helpful suggestion. We have updated the text by changing “function” to “works” to better capture the intended meaning.

  • Line 49: “further studies are required to elucidate the extracellular signalling landscape” could be replaced with “further work is required to delineate the extracellular signalling landscape”.

We thank the reviewer for this valuable wording suggestion. We have revised the sentence accordingly, replacing “further studies are required to elucidate the extracellular signalling landscape” with “further work is required to delineate the extracellular signalling landscape.”

  • Lines 51 – 52: My suggestion is to rewrite “A significant reason this communication is so effective is that it is mediated through multiple complementary mechanisms” as follows: “One of the main reasons for the effectiveness of this communication is that it is mediated by multiple complementary mechanisms.”

We thank the reviewer for this clear and helpful rewording suggestion. We have revised the sentence accordingly, adopting the proposed phrasing to improve clarity and readability.

  • Introduction: The introduction is good, but I suggest dividing the first paragraph into three new ones as follows: at line 53, starting with “Direct contact”, and at line 66, starting with “As follicles advance”.

Thank you for your helpful suggestion to improve readability. We revised the Introduction by dividing the original first paragraph into three shorter ones, as you recommended.

  • Line 111: Please check whether the symbol “≤” is correct. Based on Figure 1, the symbol here should be “<”.

Thank you for pointing this out. We have changed the symbol from “≤” to “<” so it matches Figure 1.

  • Table 1:
    • In the column “Source,” you started and finished the cell type as the source, but in the row for LIBEST_028120, you used the tissue as a source for the biological information. My suggestion is to keep the cell type as the source, in this case “immature/matured oocyte, 2-4-8-16-cell embryo, blastocyst” or “oocyte, preimplantation embryo.”

We thank the reviewer for this important observation and constructive suggestion. We agree that maintaining consistency in the description of the biological source is essential. Following the reviewer’s recommendation, the source for the library LIBEST_028120 has been revised to reflect the corresponding cell types rather than the tissue.

The term has been corrected to:

“Oocyte and preimplantation embryo”

This change has been incorporated into the revised version of Table 1 to ensure consistency and improve clarity.

 

  • How do you determine that the access number LIBEST_015406 corresponds to the oocyte?

We thank the reviewer for this important comment. After revisiting the original database record, we confirmed that the library LIBEST_015406 was generated from multiple cell types associated with the bovine female reproductive tract and early development, including oocytes, embryos, placental cells, and other reproductive cell types. Following the reviewer’s recommendation to maintain the cell type as the biological source, we have revised Table 1 and updated the description to:

“Oocyte, embryonic, placental, and reproductive tract cell types”

This modification ensures accuracy and consistency with the original database annotation and the reviewer´s suggestion. 

 

  • Line 144: How did the authors classify outer and inner cumulus cells?

We appreciate the reviewer’s request for clarification. The classification of outer and inner cumulus cells was not conducted by our group. This separation was originally performed by Biase et al. (2018) as part of their experimental design. We utilized their publicly available dataset, which is deposited in NCBI GEO under accession GSE99678. In summary, Biase et al. (2018) chemically dissected each cumulus-oocyte complex (COC) by initially exposing it to trypsin and removing the outer cumulus cell layer through gentle pipetting. The remaining COC, consisting of the oocyte and the cumulus layers adjacent to the zona pellucida, was washed and subjected to a second trypsin treatment. The remaining cumulus layers were then removed by gentle pipetting to isolate the inner cumulus cells.

Biase, F.H.; Kimble, K.M. Functional Signaling and Gene Regulatory Networks between the Oocyte and the Surrounding Cumulus Cells. BMC Genomics 2018, 19, doi:10.1186/s12864-018-4738-2.

  • Line 156:
    • Why did the authors analyze all cumulus cells and all the oocytes together regardless of their developmental stage?

We appreciate the reviewer’s insightful question. In this study, we pooled cumulus cell and oocyte samples across developmental competence groups to conduct an initial, targeted screening of candidate secreted and secretory-pathway factors. Our aim was to determine whether the proposed INHBA–SERPINE2–SDF2L1 axis exhibits a robust compartment-specific expression pattern (cumulus versus oocyte) across two independent public datasets (GSE99678 and GSE199210). For GSE99678, we combined inner and outer cumulus fractions into a single “cumulus cells” group to facilitate a two-compartment comparison with oocytes. In GSE199210, we merged BCB+ and BCB− samples to enable a compartment-level analysis independent of developmental competence, thereby increasing statistical power and ensuring a consistent cross-study framework. We acknowledge that stratified analyses, such as comparing immature versus post-IVM COCs or analyzing BCB+ and BCB− samples separately, would provide greater biological resolution. We consider this an important direction for future analyses and experimental validation.

    • What do the authors think the result would be if the analysis were done in each cell type (cumulus cell and oocyte) of immature COCs, and after in vitro maturation of COCs? I know that speculating about this is not ideal, but are you considering an experiment along these lines?

We appreciate the reviewer’s question. We concur that implementing a stage-stratified design (immature versus post–in vitro maturation COCs), with separate analyses of cumulus cells and oocytes, would enhance biological resolution and may reveal maturation-dependent modulation of the proposed INHBA–SERPINE2–SDF2L1 module. Within the mechanistic framework outlined in the manuscript, one plausible scenario is that, in immature COCs, this axis favors coordinated cumulus–oocyte support and maintenance of a stable extracellular environment. For example, INHBA/activin-driven signaling in cumulus cells may occur alongside a relatively higher level of SERPINE2-mediated restraint of pericellular proteolysis. In contrast, following IVM, the module may transition to a remodeling-permissive state that facilitates cumulus expansion, potentially through reduced SERPINE2 constraint, while the oocyte increases its need for proteostasis and ER-associated support during the acquisition of developmental competence, potentially involving higher SDF2L1. We emphasize that these are testable hypotheses rather than definitive conclusions, and we are considering follow-up studies to address these questions.

  • Figure 2:
    • The use of the patterns makes it difficult to read what is over each pattern. Consider using different shades of gray instead of the patterns, or writing it in textboxes outside the pie chart.

Thank you for your helpful suggestion to improve the readability of Figure 2. We have updated the figure by using a clearer color scheme instead of patterned fills. This change makes the overlaid text easier to read and helps distinguish the categories more clearly.

  • Table 2:
    • The  NCBI/protein search for NP_001071494.1 recovered only an obsolete version of this protein. Please revise the accession number and consider excluding the data for this gene/protein.

Thank you for pointing out this issue. We checked accession NP_001071494.1 and have removed this gene/protein from Table 2. It has also been excluded from all later analyses and interpretations in the manuscript.

  • Figure 3:
    • The figure shows, in the first 5 graphics, the most expressed genes in cumulus cells. My suggestion, to make it easy to visualize, is dividing this into two subsets, A and B, with A being a column with the five most expressed genes in cumulus cells, and with B two columns with the other graphics, with five graphics in each column.
    • The names of genes, values of p and cell types should be written bigger and darker than they currently are.

Thank you for your helpful suggestion to improve the visualization of Figure 3. We have reorganized the figure into two clear panels, A and B, and increased the font size and contrast of gene names, values, and cell-type labels to improve readability.

  • Line 603: Please replace “pb” with “bp”. The authors are referring to base pairs, right?

Thank you for pointing this out. We are referring to base pairs and have updated the text by changing “pb” to “bp”.

 

 

 

Reviewer 3 Report

Comments and Suggestions for Authors
  1. Authors must standardize the RNA-seq processing. Re-process GSE99678 and GSE199210 raw data using a uniform processing pipeline against the latest ARS-UCD2.0 genome (which is already used in the promoter analysis). Relying on FPKM from one study and CPM from another, mapped to different assemblies, is not scientifically rigorous for a comparative computational analysis. Also, use TPM/CPM normalization instead of FPKM. FPKM fails to maintain a constant sum across samples and is not recommended by the bioinformatics community. 
  2. Currently, authors perform a targeted analysis of 15 candidate genes (Figure 3) using a public RNA-seq dataset and then apply a simple threshold of p < 0.05 for significance. In the context of high-throughput transcriptomic data, this introduces a selection bias and does not rule out the possibility that the expression differences among these genes are due to random chance. Authors must perform global DE analysis (using  DESeq2 or edgeR) on the reprocessed datasets and verify that the adjusted p-values (FDR) are significant for these genes. 
    1. Also, using an unpaired Welch t-test is inappropriate for matched-sample designs. Given a small sample size, using paired non-parametric testing is recommended.
  3. The correlation network analysis uses presence/absence data. This ignores the actual expression values and could lead to weak/misleading correlations. For a more rigorous analysis, consider network analysis using relative fold changes for each gene. 
  4. Computational motif analysis often has a high false-positive rate. Without functional assays, the conclusion statement “These endocrine and paracrine signals converge on a shared upstream transcriptional control module involving INHBA, SERPINE2, and SDF2L1”  should be softened as a potential mechanism/hypothesis rather than a proven mechanism. 
  5. Include the software version of all packages used (R, R packages, etc.) in the methods.

Minor comments:

Section 3.2. The opening sentence is repeated twice. 



Author Response

We thank the Reviewer for their time and constructive feedback. These comments have improved the clarity, rigor, and quality of our work. We have addressed each point and made the corresponding revisions throughout the manuscript.

1. Authors must standardize the RNA-seq processing. Re-process GSE99678 and GSE199210 raw data using a uniform processing pipeline against the latest ARS-UCD2.0 genome (which is already used in the promoter analysis). Relying on FPKM from one study and CPM from another, mapped to different assemblies, is not scientifically rigorous for a comparative computational analysis. Also, use TPM/CPM normalization instead of FPKM. FPKM fails to maintain a constant sum across samples and is not recommended by the bioinformatics community.

We thank the Reviewer for this important suggestion. We agree that using heterogeneous processing strategies (e.g., FPKM in one dataset and CPM in another, mapped to different assemblies) is not appropriate for comparative computational analyses. Accordingly, we standardized our RNA-seq processing across all samples for which raw data were available. For GSE99678, we retrieved the raw sequencing reads from the SRA and reprocessed them using a uniform pipeline based on the current ARS-UCD2.0 reference. Briefly, transcript abundance was quantified with Salmon and imported with tximport to generate gene-level count matrices for differential expression testing in DESeq2 (cumulus cells vs. oocyte). In addition, for visualization and descriptive purposes, we report normalized abundances using TPM/CPM rather than FPKM.

For GSE199210, we attempted to obtain the raw FASTQ files to apply the same reprocessing workflow; however, the raw reads could not be retrieved successfully from the public repository at the time of analysis. Therefore, we retained the expression values from the original study (FPKM) only as supporting information in the Supplementary Material (Figure 2S) and explicitly avoided using FPKM values for direct quantitative cross-study comparisons. We have clarified these points in the revised manuscript to make the harmonization strategy and its limitations transparent.
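The normalization point raised in this comment can be illustrated with a small worked example. The sketch below is not the authors' pipeline; it uses hypothetical toy counts and gene lengths (all names and numbers are assumptions) to show that TPM values sum to the same constant in every sample, whereas FPKM sums differ between samples.

```python
# Hypothetical toy data (assumption): raw counts for three genes in two samples.
gene_lengths_kb = [2.0, 1.0, 4.0]  # gene lengths in kilobases
counts = {
    "sample_A": [100, 200, 300],
    "sample_B": [100, 500, 50],
}


def fpkm(sample_counts, lengths_kb):
    """Fragments per kilobase per million mapped fragments."""
    total_millions = sum(sample_counts) / 1e6
    return [c / total_millions / l for c, l in zip(sample_counts, lengths_kb)]


def tpm(sample_counts, lengths_kb):
    """Transcripts per million: length-normalise first, then rescale to 1e6."""
    rates = [c / l for c, l in zip(sample_counts, lengths_kb)]
    scale = sum(rates)
    return [r / scale * 1e6 for r in rates]


# Per-sample totals under each normalisation.
fpkm_sums = {name: sum(fpkm(c, gene_lengths_kb)) for name, c in counts.items()}
tpm_sums = {name: sum(tpm(c, gene_lengths_kb)) for name, c in counts.items()}

for name in counts:
    print(name, "FPKM sum:", round(fpkm_sums[name], 1),
          "TPM sum:", round(tpm_sums[name], 1))
```

Because the TPM total is fixed, a TPM value can be read as a proportion of the sample's transcript pool, which is what makes it more suitable for descriptive comparison between samples within a dataset.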

2. Currently, authors perform a targeted analysis of 15 candidate genes (Figure 3) using a public RNA-seq dataset and then apply a simple threshold of p < 0.05 for significance. In the context of high-throughput transcriptomic data, this introduces a selection bias and does not rule out the possibility that the expression differences among these genes are due to random chance. Authors must perform global DE analysis (using  DESeq2 or edgeR) on the reprocessed datasets and verify that the adjusted p-values (FDR) are significant for these genes. 

    1. Also, using an unpaired Welch t-test is inappropriate for matched-sample designs. Given a small sample size, using paired non-parametric testing is recommended.

We thank the Reviewer for raising this important concern. We agree that applying a nominal p < 0.05 threshold to a targeted gene set in the context of transcriptome-wide data can introduce selection bias and may overstate significance if multiple testing is not properly controlled. To address this, we performed a transcriptome-wide differential expression analysis on the reprocessed GSE99678 dataset. Briefly, raw reads were reprocessed using a uniform pipeline against the current ARS-UCD2.0 reference, gene-level counts were generated, and differential expression testing (cumulus cells vs. oocyte) was conducted using DESeq2 with Benjamini–Hochberg correction to control the false discovery rate (FDR). We then verified the candidate genes in these global DESeq2 results and updated Figure 3 and the corresponding Supplementary Table S2 to report log2 fold-changes and adjusted p-values (FDR), rather than relying on a nominal p < 0.05 threshold from a targeted analysis.

During this verification and standardization step in GSE99678, we also refined the candidate list to ensure that all genes can be consistently traced to current public annotations. TNC was removed because it is no longer retrievable/available in the corresponding NCBI record used in our annotation workflow, and we aimed to avoid reporting results for entries that cannot be reliably referenced. In addition, SRPX was removed because, with the updated quantification reference used to build the Salmon index for GSE99678, SRPX transcripts were not present/quantifiable in the indexed transcriptome, and therefore, no gene-level counts could be generated for DESeq2 testing.

For GSE199210, we attempted to retrieve and reprocess the raw FASTQ files in the same manner; however, due to technical limitations in accessing the raw data from the public repository, we were unable to complete raw-read reprocessing and global differential expression analysis for this dataset. Therefore, we retained the expression values from the original study only as supporting information in the Supplementary Material and explicitly avoided using them for direct cross-study quantitative comparisons.

Regarding the statistical test, we also agree that an unpaired Welch’s t-test is not appropriate when samples are matched/paired, and that small sample sizes warrant more robust approaches. Accordingly, we removed the unpaired Welch t-test–based inference and rely on the DESeq2 framework for statistical significance in GSE99678, emphasizing effect sizes and FDR-adjusted results.
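To make the difference between a nominal p < 0.05 threshold and an FDR-adjusted one concrete, the following is a minimal sketch of the Benjamini–Hochberg adjustment. It is a plain re-implementation for illustration only; DESeq2 performs additional steps that are not reproduced here, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values never
    # increase as the rank decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal p-values for four genes (assumption, not study data).
nominal = [0.001, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(nominal)

for p, q in zip(nominal, adjusted):
    print(f"nominal p = {p:.3f}  adjusted p = {q:.4f}")
```

In this toy case, two of the genes pass a nominal p < 0.05 cut-off but not an adjusted 0.05 cut-off, which is the situation the Reviewer's comment warns about.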

3. The correlation network analysis uses presence/absence data. This ignores the actual expression values and could lead to weak/misleading correlations. For a more rigorous analysis, consider network analysis using relative fold changes for each gene. 

We thank the Reviewer for this thoughtful and important comment. We would like to clarify that the objective of our presence/absence-based analysis was not to quantify relative expression levels, but to determine whether candidate genes show tissue/compartment-specific expression occurrence and to evaluate co-occurrence patterns across the studied groups. Our analytical approach was intentionally designed to assess whether gene expression events tend to co-occur across samples/compartments, rather than to compare expression magnitudes. In this context, the presence/absence classification provides biologically meaningful information for identifying coordinated detection patterns and potential functional associations. We clarified in the revised manuscript that the network relationships should be interpreted as qualitative co-detection (co-occurrence) rather than fold–change–based co-expression.

In addition, regarding the suggestion to recalculate correlation matrices using quantitative expression data, our tissue profiling was designed as a detection-based screen rather than an abundance-based comparison. For this reason, the correlation/PCA analyses were performed to evaluate whether candidate transcripts tend to be detected together across compartments, using a predefined detection threshold; thus, we used a binary matrix (present = 1; absent = 0). Recomputing correlations using normalized quantitative values (e.g., ΔCt or 2^−ΔΔCt) would instead address transcript abundance variation across samples and shift the analysis away from the study’s central objective (co-detection/co-occurrence). Therefore, we retained the detection-based framework for the main analyses and included normalized quantitative RT-qPCR values in the Supplementary Information (Figure S1) to provide additional context.
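As an illustration of what a correlation on a binary detection matrix measures, the sketch below computes the phi coefficient, which is the Pearson correlation applied to 0/1 vectors. The response does not state which coefficient was used, so this choice is an assumption, and the gene labels and detection calls are hypothetical.

```python
from math import sqrt


def phi_coefficient(x, y):
    """Pearson correlation of two 0/1 vectors (the phi coefficient)."""
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # Undefined when a transcript is detected in all samples or in none.
        return float("nan")
    return (n11 * n00 - n10 * n01) / denom


# Hypothetical detection calls across six tissues (present = 1, absent = 0).
detection = {
    "gene_a": [1, 1, 1, 0, 0, 0],
    "gene_b": [1, 1, 1, 0, 0, 0],
    "gene_c": [0, 0, 0, 1, 1, 1],
    "gene_d": [1, 1, 0, 1, 0, 0],
}

phi_ab = phi_coefficient(detection["gene_a"], detection["gene_b"])
phi_ac = phi_coefficient(detection["gene_a"], detection["gene_c"])
phi_ad = phi_coefficient(detection["gene_a"], detection["gene_d"])
print(phi_ab, phi_ac, phi_ad)
```

The coefficient depends only on whether transcripts are detected in the same samples, so two genes with very different abundances can still be perfectly correlated. This is why such a network is better read as co-detection than as co-expression.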

4. Computational motif analysis often has a high false-positive rate. Without functional assays, the conclusion statement “These endocrine and paracrine signals converge on a shared upstream transcriptional control module involving INHBA, SERPINE2, and SDF2L1”  should be softened as a potential mechanism/hypothesis rather than a proven mechanism. 

We thank the Reviewer for this important observation regarding the potential for false positives in computational motif analyses. In response, we softened the conclusion to clearly frame this as a putative mechanism/hypothesis rather than a proven regulatory mechanism.

Revised conclusion sentence (as now written in the manuscript):
“These results suggest that endocrine and paracrine signals may converge on a shared upstream transcriptional control module involving INHBA, SERPINE2, and SDF2L1.”

This change can be found in the Conclusions section, lines 712–713 of the revised manuscript.

5. Include the software version of all packages used (R, R packages, etc.) in the methods.

We thank the Reviewer for this comment. We have now specified the RStudio version used for all analyses. This information has been added in the Materials and Methods section, where it now reads: “implemented in RStudio (Posit Software, PBC; version 2026.01.0+392) …”. The Reviewer can find this update around line 203 of the revised manuscript.

Minor comments:
Section 3.2. The opening sentence is repeated twice. 

Thank you for noting this minor issue. We have corrected it by removing the repeated opening sentence in Section 3.2 in the revised manuscript.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I believe this revised version is suitable for publication.

Author Response

We sincerely thank the reviewer for this positive evaluation and for the time and effort devoted to assessing our revised manuscript. We greatly appreciate the reviewer’s comment indicating that the revised version is suitable for publication.

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors

 

Thank you for the revised version of the paper, applbiosci-4149620, entitled “Secretome Screening and Multi-Level Expression Analyses Reveal a Coordinated INHBA-SERPINE2-SDF2L1 Axis within the bovine cumulus-oocyte complex”. I observed that the authors have incorporated my suggestions into the text. Also, my questions were answered, and the requested information was added to the text. The figures were also improved, especially Figure 1, which is now easily read. The manuscript was greatly improved.

 

Regards

Author Response

We sincerely thank the reviewer for this thoughtful and encouraging assessment of our revised manuscript. We greatly appreciate the reviewer’s time, careful evaluation, and positive feedback. We are especially grateful for the recognition that the requested information was adequately incorporated, the questions were addressed, and the figures were improved, particularly Figure 1. The reviewer’s comments were very helpful in strengthening the quality and clarity of our manuscript.

     

Reviewer 3 Report

Comments and Suggestions for Authors

Minor comment: 
Software/Package versions are still missing in the methods. 
For reproducibility, indicate the version of R (currently it shows v4.x) and R packages (DESeq2, readr, dplyr, etc.).

Author Response

We sincerely thank the reviewer for this valuable observation. In response, we carefully revised the Materials and Methods section and added the version information for all software tools and packages used throughout the study. For clarity, all newly incorporated version details have been highlighted in yellow in the revised manuscript.
