Open Access Article

Peer-Review Record

MedSegNet10: A Publicly Accessible Network Repository for Split Federated Medical Image Segmentation

Bioengineering 2026, 13(1), 104; https://doi.org/10.3390/bioengineering13010104

by Chamani Shiranthika *,†, Zahra Hafezi Kafshgari *,†, Hadi Hadizadeh and Parvaneh Saeedi

Reviewer 1: Anonymous

Reviewer 2: Yeliz Karaca

Reviewer 3: Yuanshen Zhao

Submission received: 30 October 2025 / Revised: 26 December 2025 / Accepted: 10 January 2026 / Published: 15 January 2026

(This article belongs to the Special Issue Medical Imaging Analysis: Current and Future Trends)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Dear authors,

I have now completed my review of the manuscript, which I found interesting and, in general, fairly well written.

However, I still have some suggestions to further improve the quality of the manuscript.

I would like to suggest that the authors address these limitations in the article, either by discussing them in the limitations section or, where feasible, by making the appropriate revisions:

1. Testing on only three medical imaging datasets (blastocysts, skin lesions, and polyps) is insufficient to demonstrate generalizability across the diverse landscape of medical imaging applications. The paper claims applications "extending beyond these examples" without validation.

2. Some recent findings could be stated in the introduction. For example, "Explainable AI in Clinical Decision Support Systems: A Meta-Analysis of Methods, Applications, and Usability Challenges" - This would complement your understanding of how AI models (including federated learning systems) can be made interpretable in clinical settings, which is crucial when deploying segmentation models in healthcare.

3. The Blastocyst dataset contains only 801 images, and KVASIR-SEG only 1,000 images. These are extremely small datasets by modern deep learning standards. The HAM10K dataset at 10,015 images provides more substantial data, but even this is modest. The repository's utility for real-world healthcare applications with larger-scale data remains undemonstrated.

4. The Discussion could be extended by briefly mentioning the latest research, to show readers future research possibilities. I recommend that the authors cite some articles exploring the role of artificial intelligence in smart healthcare. There are capability- and function-oriented reviews on this topic, and mentioning them would provide broader context for how AI techniques like split federated learning fit into the larger smart healthcare ecosystem.

5. The random distribution of data across clients does not reflect realistic healthcare scenarios where different hospitals or institutions have systematically different patient populations, imaging equipment, or annotation practices. The paper does not test for data heterogeneity (non-IID data), which is one of the primary challenges in federated learning.

Thank you for your valuable contributions to our field of research. I look forward to receiving the revised manuscript.

Author Response

1. Summary
Thank you very much for taking the time to review our manuscript. Please find the detailed responses below and the corresponding revisions/corrections highlighted in blue in the resubmitted file.

2. Point-by-point response to comments and suggestions for authors

Comment 1: Testing on only three medical imaging datasets (blastocysts, skin lesions, and polyps) is insufficient to demonstrate generalizability across the diverse landscape of medical imaging applications. The paper claims applications "extending beyond these examples" without validation.
Response 1: We agree that the phrase “extending beyond these examples” is ambiguous. To avoid overstating generalizability, we revised the text to describe the specific medical image types included in MedSegNet10. The updated statement now reads:

“MedSegNet10 provides a collection of pre-trained neural network architectures optimized for various medical image types, including microscopic images of human blastocysts, dermatoscopic images of skin lesions, and endoscopic images of lesions, polyps, and ulcers”

Our work has thus been tested on three distinct, commonly studied image types that are among the most widely available in the medical domain.

We also added an explicit clarification in the Limitations & Future Work section to acknowledge this restricted scope of our evaluation and the inherent scarcity of large open-source medical datasets.

“First, our evaluation is limited to three distinct and commonly studied publicly available image types. Although we considered both multi-class (Blastocyst) and binary (HAM10K and KVASIR-SEG) segmentation datasets with varying sample sizes and styles to broaden the scope of generalization, these datasets may still not fully reflect the diversity of imaging characteristics, modalities, and annotation practices encountered in large-scale clinical deployments. Consequently, the reported results should not be regarded as definitive evidence of cross-domain generalizability. Future work will involve evaluating MedSegNet10 across a wider range of imaging modalities (e.g., CT, MRI, and multimodal acquisitions), annotation practices, and datasets to more comprehensively assess its robustness.”

Comment 2: Some recent findings could be stated in the introduction. For example, "Explainable AI in Clinical Decision Support Systems: A Meta-Analysis of Methods, Applications, and Usability Challenges" - This would complement your understanding of how AI models (including federated learning systems) can be made interpretable in clinical settings, which is crucial when deploying segmentation models in healthcare.
Response 2: We thank the reviewer for this suggestion.
We cited this paper and added the following text to the Introduction:
“Alongside these technical challenges, interpretability has become a parallel requirement for the successful deployment of AI systems in real clinical environments. Clinicians expect models not only to perform accurately but also to provide insights that are consistent with clinical reasoning. Meta-analyses such as \citep{abbas2025explainable, barragan2022towards} underscore that transparency, usability, and clinician-aligned explanations are crucial for trustworthy AI-assisted decision support. This growing emphasis on interpretability poses additional challenges for decentralized learning: if medical image segmentation models are to be trained collaboratively across institutions, they must also provide insight into how decisions are made, particularly in high-stakes medical settings.”
We believe this revision strengthens the introduction by connecting segmentation performance with clinical interpretability requirements.

Comment 3: The Blastocyst dataset contains only 801 images, and KVASIR-SEG only 1,000 images. These are extremely small datasets by modern deep learning standards. The HAM10K dataset at 10,015 images provides more substantial data, but even this is modest. The repository's utility for real-world healthcare applications with larger-scale data remains undemonstrated.
Response 3: We thank the reviewer for this observation. We acknowledge that the datasets used in this research are modest in size. However, this setting reflects real-world medical scenarios, where some clients have access to far fewer samples than others. Within a Split Federated framework, clients of all sizes should be able to contribute meaningfully to the final global model. Moreover, many medical datasets are not publicly available due to strict privacy regulations, which limits access to large-scale collections. Therefore, we relied on publicly accessible datasets that represent common segmentation scenarios.
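
As an illustration of how clients of different sizes can each contribute to a global model, the following is a minimal sketch of sample-size-weighted averaging in the style of federated averaging. The response does not state which aggregation rule MedSegNet10 uses, so the weighting scheme, the function name, and the toy numbers are all assumptions made for illustration.

```python
def weighted_average(client_weights, client_sizes):
    """Average per-client parameter vectors, weighting each client by its sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    averaged = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Hypothetical example: a small client (100 images) and a large one (900 images).
small_client = [1.0, 0.0]
large_client = [0.0, 1.0]
global_weights = weighted_average([small_client, large_client], [100, 900])
print(global_weights)
```

Under this assumed rule the small client still shifts the global parameters, but in proportion to the amount of data it holds.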

We modified the “Limitations & Future works” section as follows:

“Second, the datasets used in this study are modest in size. This limitation reflects the broader reality of medical imaging research: large open-source datasets are rare, expert-annotated data are costly, many image types are difficult to obtain, and privacy regulations further restrict access. Consequently, it is naturally infeasible to evaluate decentralized frameworks on large-scale public datasets simply because such resources do not exist for many medical imaging tasks. MedSegNet10 should therefore be viewed as a foundational resource rather than a demonstration of large-scale scalability. Future work will involve expanding MedSegNet10 using larger institutional datasets and multi-centre cohorts, enabling more rigorous evaluation under realistic data volumes, acquisition variability, and deployment conditions.”

Comment 4: The Discussion could be extended by briefly mentioning the latest research, to show readers future research possibilities. I recommend that the authors cite some articles exploring the role of artificial intelligence in smart healthcare. There are capability- and function-oriented reviews on this topic, and mentioning them would provide broader context for how AI techniques like split federated learning fit into the larger smart healthcare ecosystem.
Response 4: Thank you for this valuable suggestion.
We modified the “Limitations and Future works” section as follows:
“Finally, recent capability-oriented reviews in smart healthcare \citep{nasr2021smart} highlight how AI contributes to integrated monitoring \citep{nasr2021smart, rath2024artificial}, remote diagnostics \citep{mansour2021artificial}, decentralized decision support \citep{moreira2019comprehensive}, and data-driven hospital systems \citep{katal2024ai}. SplitFed architectures align naturally with these developments because they enable collaboration across institutions while preserving data privacy. Incorporating SplitFed networks from MedSegNet10 into smart-healthcare frameworks could facilitate interoperable, privacy-preserving segmentation tools that operate across hospitals or global imaging networks. This integration represents a promising avenue for extending MedSegNet10 beyond standalone model training towards deployment in real clinical infrastructures.”

Comment 5: The random distribution of data across clients does not reflect realistic healthcare scenarios where different hospitals or institutions have systematically different patient populations, imaging equipment, or annotation practices. The paper does not test for data heterogeneity (non-IID data), which is one of the primary challenges in federated learning.
Response 5: Thank you for this comment. We intentionally used IID data partitions to establish a controlled baseline for benchmark reproducibility. However, real healthcare institutions show non-IID characteristics such as device variability, demographic differences, and annotation style inconsistencies. Our goal in this work was to standardize the evaluation environment rather than simulate cross-institution heterogeneity.
To address this valuable comment, we added the following paragraph to the “Limitations and future works” section:
“Third, the current experiments intentionally adopt IID data partitions to establish a controlled baseline and support benchmark reproducibility. This design choice does not reflect real clinical environments, where hospitals often exhibit strongly non-IID data distributions due to differing demographics, imaging devices, and annotation protocols. Non-IID robustness is a central challenge in federated learning, and evaluating MedSegNet10 under a range of realistic non-IID scenarios is an important direction for future work.”
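
The difference between the IID baseline and a non-IID scenario can be sketched as follows. This is a toy illustration rather than the partitioning code used in the study: the function names are hypothetical, and the per-image "label" stands in for any attribute that could differ between institutions (for example, an acquisition site or device), which is an assumption here.

```python
import random
from collections import Counter


def iid_partition(labels, n_clients, seed=0):
    """Shuffle sample indices and deal them out evenly: every client sees a similar mix."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    return [indices[c::n_clients] for c in range(n_clients)]


def label_skew_partition(labels, n_clients):
    """Sort indices by label and cut contiguous shards: each client sees only a few labels."""
    indices = sorted(range(len(labels)), key=lambda i: labels[i])
    shard = len(indices) // n_clients
    parts = [indices[c * shard:(c + 1) * shard] for c in range(n_clients)]
    parts[-1].extend(indices[n_clients * shard:])  # leftovers go to the last client
    return parts


# Hypothetical toy attribute: 4 groups with 25 samples each.
labels = [k for k in range(4) for _ in range(25)]
for name, parts in [("IID", iid_partition(labels, 4)),
                    ("label-skew", label_skew_partition(labels, 4))]:
    print(name, [dict(Counter(labels[i] for i in p)) for p in parts])
```

In the IID split each client receives a mix of all groups, whereas in the skewed split each client holds a single group, which is the kind of heterogeneity the paragraph above defers to future work.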

4. Additional Clarifications
N/A

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The study is on MedSegNet10 for medical image segmentation by means of split-federated learning. The first documented repository of SplitFed networks for medical image segmentation tasks is presented in the work. As suggestions, the following may be integrated:

The main objective of the paper should be clearly indicated. What has been achieved should be indicated as well.

Broad statements can be avoided, and more specific ones should be added to reflect the essence and motivation of the paper. (i.e. By leveraging SplitFed’s benefits, MedSegNet10 allows collaborative training on privately stored, horizontally split data, ensuring privacy and integrity.) + (“As SplitFed continues to evolve, we believe that MedSegNet10 would serve as a valuable foundation for ongoing developments, guiding researchers toward more precise and impactful solutions in medical image segmentation.”)

The paper can be compared with previous ones so that the novel and original parts of the paper can be shown.

If there are parts generated by AI, those parts can be checked and restated.

It should be ensured that all the figures and tables are cited and interpreted adequately.

Yours faithfully,

Author Response

2. Point-by-point response to comments and suggestions for authors
Comment 1: The main objective of the paper should be clearly indicated. What has been achieved should be indicated as well.
Response 1: We thank the reviewer for this suggestion. We modified the text in the Introduction to state the objective of the paper clearly, as follows:
“The primary objective of this work is to develop and release MedSegNet10, a publicly accessible repository of SplitFed-ready medical image segmentation networks. Our goal is to standardize, benchmark, and streamline the development of SplitFed architectures by providing rigorously implemented and fully reproducible SplitFed variants of widely used segmentation models. MedSegNet10 serves as a practical and comprehensive resource, offering both novice and experienced practitioners a unified platform for implementing and evaluating SplitFed-based segmentation. It consolidates access to multiple architectures, enabling systematic comparison of structural differences, performance characteristics, and SplitFed behaviour across models. Importantly, this work does not introduce new SplitFed algorithms or novel segmentation architectures; instead, its contribution lies in establishing the first unified, reusable, and reproducible SplitFed repository for medical image segmentation, with standardized split-point definitions, consistent client–server partitioning, and harmonized training pipelines.”

Comment 2: Broad statements can be avoided, and more specific ones should be added to reflect the essence and motivation of the paper. (i.e. By leveraging SplitFed’s benefits, MedSegNet10 allows collaborative training on privately stored, horizontally split data, ensuring privacy and integrity.) + (“As SplitFed continues to evolve, we believe that MedSegNet10 would serve as a valuable foundation for ongoing developments, guiding researchers toward more precise and impactful solutions in medical image segmentation.”)
Response 2: Thank you for the suggestion. We replaced the above text as follows:
“By leveraging SplitFed’s benefits, MedSegNet10 allows collaborative training on privately stored, horizontally split data, ensuring privacy and integrity.” was changed to: “MedSegNet10 implements SplitFed versions of ten established segmentation architectures, enabling collaborative training without centralizing raw data and labels, reducing the computational load required at client sites.”
“As SplitFed continues to evolve, we believe that MedSegNet10 would serve as a valuable foundation for ongoing developments, guiding researchers toward more precise and impactful solutions in medical image segmentation.” was changed to: “MedSegNet10 provides a structured starting point for future work on SplitFed architectures, including extensions to more diverse datasets, non-IID clinical scenarios, and interpretability-oriented model designs.”
Further, we added a “Limitations and Future works” section to support this sentence.

Comment 3: The paper can be compared with previous ones so that the novel and original parts of the paper can be shown.
Response 3: We revised Section 2.2 as follows:
“Although these repositories provide valuable tools for federated experimentation, they primarily focus on general-purpose FL pipelines, data-sharing frameworks, or isolated model implementations. Existing platforms do not provide a unified, reusable collection of SplitFed architectures for medical image segmentation, nor do they standardize split-point definitions, client–server model partitioning, or consistent training pipelines across multiple segmentation networks. Prior SplitFed studies typically evaluate a single model or a narrow experimental setup, limiting reproducibility and cross-architecture comparison.
These limitations reveal the need for a robust and adaptable SplitFed network repository that supports collaborative and reproducible research. MedSegNet10 directly addresses this gap by streamlining the implementation, training, and evaluation of SplitFed architectures in medical image segmentation.”

3. Response to comments on the quality of English Language and Figures
The reviewer stated that “The English is fine and does not require any improvement.” Therefore, no changes were made in this regard.
We re-added all the figures in .png format to ensure high quality.
Please also refer to the attachment: Medsegnet paper_Figures.pdf for the modified figures.

4. Additional Clarifications
N/A

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

1. The paper claims that MedSegNet10 is the "first SplitFed repository for medical image segmentation". However, the core value of this "repository" is merely the "integration of SplitFed versions of 10 existing segmentation models", with no original technical contributions proposed.

2. Existing federated learning repositories, such as FATE and MedAugment, already support FL-based training for medical images. MedSegNet10 only replaces "FL" with "SplitFed" and aggregates 10 models, without incorporating any additional practical functionalities.

3. Section 2.1 only lists studies related to SplitFed, failing to establish a logical progression of technical evolution. It also does not compare the core improvements, applicable scenarios, or performance bottlenecks of different studies.

4. Section 2.3 provides sufficient individual descriptions of each model but fails to extract a "unified framework for split point design". It does not explore whether there are common patterns in the "layer proportion, parameter proportion, or computational load proportion" of the Front-End/Back-End sub-models across different architectures.

5. When comparing the three model types (centralized, locally centralized, and SplitFed), the paper fails to ensure "consistency of all variables except the training paradigm", resulting in a lack of fairness in performance comparison.

6. The comparison with existing methods is severely inadequate. In Table 5, for the Blastocyst dataset, only the authors’ previous research is included for comparison, while current mainstream methods, such as SAM, are not incorporated.

7. The paper does not specify the minimum hardware requirements for clients, such as CPU model, GPU memory size, or inference frame rates of different models on edge devices (NVIDIA Jetson). Additionally, it provides no user documentation for the repository code (including environment configuration steps or adaptation tutorials for custom datasets), making practical deployment impossible for practitioners.

8. The paper does not objectively analyze the research limitations, such as its support only for RGB images and a lack of coverage for mainstream medical imaging modalities (CT, MRI).

Author Response

2. Point-by-point response to comments and suggestions for authors
Comment 1: The paper claims that MedSegNet10 is the "first SplitFed repository for medical image segmentation". However, the core value of this "repository" is merely the "integration of SplitFed versions of 10 existing segmentation models", with no original technical contributions proposed.
Response 1: Thank you very much for your valuable and constructive comments. MedSegNet10 does not propose a new SplitFed algorithm or a new segmentation architecture. The goal of this work is to fill a different gap: the absence of a unified, reusable, and reproducible SplitFed repository for medical image segmentation. Existing SplitFed studies typically evaluate only single models with inconsistent split designs and training setups, making systematic cross-architecture comparison difficult. Our contribution is therefore infrastructural rather than algorithmic, and we have clarified this explicitly in the Introduction to ensure this distinction is transparent.
We modified the relevant text in the Introduction as follows:
“The primary objective of this work is to develop and release a publicly accessible repository of SplitFed-ready medical image segmentation networks. In support of this objective, we standardize, benchmark, and streamline SplitFed development by rigorously implementing and fully reproducing SplitFed variants of ten well-established segmentation models. MedSegNet10 serves as a practical and comprehensive resource, offering both novice and experienced practitioners a unified platform for implementing and evaluating SplitFed-based segmentation. It consolidates access to multiple architectures, enabling systematic and meaningful comparisons of structural differences, performance characteristics, and SplitFed behavior across models. Importantly, this work does not introduce new SplitFed algorithms or novel segmentation architectures. Instead, its contribution lies in establishing the first unified, reusable, and reproducible SplitFed repository for medical image segmentation—where all included networks follow standardized split-point definitions, consistent client–server partitioning, and harmonized training pipelines. By creating this consistent experimental foundation, MedSegNet10 addresses a critical gap in the literature and provides the necessary infrastructure for rigorous cross-architecture comparison and future SplitFed research.”

Comment 2: Existing federated learning repositories, such as FATE and MedAugment, already support FL-based training for medical images. MedSegNet10 only replaces "FL" with "SplitFed" and aggregates 10 models, without incorporating any additional practical functionalities.
Response 2: We agree that several FL repositories exist, as we showed in Section 2.2; however, none provides SplitFed implementations with unified split-point definitions across multiple medical segmentation models. SplitFed differs fundamentally from FL in its client–server architecture, communication of features and gradients, and computational load distribution. We have revised Section 2.2 to clearly articulate how MedSegNet10 complements existing repositories by focusing specifically on SplitFed, which is not supported by FATE, MedAugment, or similar frameworks.
“Although these repositories provide valuable tools for federated experimentation, they primarily focus on general-purpose FL pipelines, data-sharing frameworks, or isolated model implementations. Existing platforms do not provide a unified, reusable collection of SplitFed architectures for medical image segmentation, and they do not standardize split-point definitions, client–server model partitioning, or consistent training pipelines across multiple segmentation networks. Prior SplitFed studies typically evaluate a single model or a narrow experimental setup, limiting reproducibility and cross-architecture comparison.
These limitations reveal the need for a robust and adaptable SplitFed network repository that supports collaborative and reproducible research. MedSegNet10 directly addresses this gap by streamlining the implementation, training, and evaluation of SplitFed architectures in medical image segmentation.”

Comment 3: Section 2.1 only lists studies related to SplitFed, failing to establish a logical progression of technical evolution. It also does not compare the core improvements, applicable scenarios, or performance bottlenecks of different studies.
Response 3: We thank the reviewer for this observation. In response, we have revised Section 2.1 to present a clear technical progression of SplitFed research. The updated text now describes how early SplitFed work established the basic client–server architecture, followed by later studies addressing specific limitations such as data heterogeneity, communication noise, label imperfections, and security risks. We also highlight recent transformer-based SplitFed extensions and discuss their benefits and computational challenges. Furthermore, the revised section explicitly compares the core contributions, applicable use cases, and bottlenecks of existing approaches. This motivates MedSegNet10 as a unified, reproducible SplitFed repository designed to overcome the fragmentation in the current SplitFed literature.

Comment 4: Section 2.3 provides sufficient individual descriptions of each model but fails to extract a "unified framework for split point design". It does not explore whether there are common patterns in the "layer proportion, parameter proportion, or computational load proportion" of the Front-End /Back-End sub-models across different architectures.
Response 4: We acknowledge the reviewer’s point. We added text to Section 2.4 clarifying the common design principles that guide split-point selection across all ten models. Although the network architectures differ, the underlying FE/Server/BE rationale is unified.
Added text in section 2.4:
“While the ten architectures used in this study differ structurally, our split-point design follows a consistent principle: (i) the FE contains low-level feature extractors that interface with sensitive data, (ii) the SS contains the majority of parameters and computational load, and (iii) the BE handles prediction and gradients tied to GT labels to fully preserve privacy. Although the proportion of layers varies across architectures, these design principles ensure functional consistency and reproducibility.”
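
The three-part principle quoted above can be sketched as a cut of an ordered layer list into FE, SS, and BE parts, followed by a check of how the parameters are distributed. The layer names, parameter counts, and split indices below are hypothetical and are not taken from any of the ten MedSegNet10 networks; the sketch only shows the kind of proportion analysis that such a split makes possible.

```python
# Hypothetical (layer name, parameter count) list for a small encoder-decoder network.
LAYERS = [
    ("enc_conv1", 1_792), ("enc_conv2", 36_928),
    ("enc_conv3", 73_856), ("bottleneck", 295_168),
    ("dec_conv3", 147_584), ("dec_conv2", 36_928),
    ("dec_conv1", 18_464), ("out_conv", 65),
]


def split_model(layers, fe_end, be_start):
    """Cut an ordered layer list into front-end (client), server, and back-end (client) parts."""
    if not 0 < fe_end < be_start < len(layers):
        raise ValueError("need at least one layer in each of FE, SS, and BE")
    return {
        "FE": layers[:fe_end],          # first layers, which touch the raw images
        "SS": layers[fe_end:be_start],  # bulk of the network, held by the server
        "BE": layers[be_start:],        # last layers, which touch the ground-truth labels
    }


def parameter_share(parts):
    """Fraction of all parameters held by each part."""
    total = sum(n for layers in parts.values() for _, n in layers)
    return {name: sum(n for _, n in layers) / total for name, layers in parts.items()}


parts = split_model(LAYERS, fe_end=1, be_start=7)
shares = parameter_share(parts)
print({name: round(share, 4) for name, share in shares.items()})
```

With these illustrative numbers the server part holds nearly all parameters, which matches principle (ii); the same function could be applied to layer counts or estimated compute instead of parameter counts.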

Comment 5: When comparing the three model types (centralized, locally centralized, and SplitFed), the paper fails to ensure "consistency of all variables except the training paradigm", resulting in a lack of fairness in performance comparison
Response 5: We thank the reviewer for raising this point. We clarified in Section 3.1 that all variables were held constant except for the training paradigm. Any differences in feature and gradient flow arise from SplitFed itself rather than from experimental bias. This statement makes the basis of the comparison explicit.
Changed text in section 3.1: “All experiments used the same loss function (Soft Dice Loss \citep{sudre_2017}), Adam optimizer, and stopping criteria. We used network-specific and dataset-specific initial learning rates. Intersection over Union (IoU) averaged over the sample size (average IoU), also known as the average Jaccard Index \citep{Cox_2008}, was used as the performance metric. For the Blastocyst dataset, we utilized the average IoU of the TE, ZP, BL, and ICM components. Each centralized model was trained for 120 epochs. Each SplitFed model was trained for 10 global communication rounds, while each client trained their local models for 12 local epochs. The only difference between the three settings (centralized, locally centralized, and SplitFed) is the training paradigm itself. This ensures a controlled, fair, and transparent comparison.”
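
For readers unfamiliar with the loss and metric named above, the following is a minimal sketch of average IoU and of one common soft Dice formulation, written for flat lists of values. It is not the repository's implementation: the function names and the smoothing constant `eps` are assumptions, and the exact Dice variant used in the paper may differ.

```python
def iou(pred, target):
    """Intersection over union (Jaccard index) of two binary masks given as flat 0/1 lists."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return 1.0 if union == 0 else inter / union  # two empty masks count as a perfect match


def average_iou(preds, targets):
    """Mean IoU over a set of samples."""
    return sum(iou(p, t) for p, t in zip(preds, targets)) / len(preds)


def soft_dice_loss(probs, target, eps=1e-6):
    """1 - soft Dice coefficient, computed on predicted probabilities rather than hard masks."""
    inter = sum(p * t for p, t in zip(probs, target))
    denom = sum(probs) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


# Toy example with two 4-pixel samples.
preds = [[1, 1, 0, 0], [0, 0, 1, 1]]
targets = [[1, 0, 0, 0], [0, 0, 1, 1]]
print(average_iou(preds, targets))
print(soft_dice_loss([0.9, 0.8, 0.1, 0.0], [1, 1, 0, 0]))
```

For a multi-class dataset such as Blastocyst, the same `iou` function would be applied per component and the results averaged.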

Comment 6: The comparison with existing methods is severely inadequate. In Table 5, for the Blastocyst dataset, only the authors’ previous research is included for comparison, while current mainstream methods, such as SAM, are not incorporated.
Response 6: We thank the reviewer for raising this important point. We agree that comparisons with state-of-the-art models such as SAM are valuable. However, SAM and other large models have not reported standardized results on the Blastocyst dataset, and pretrained SAM models are not compatible without significant pretraining. To ensure fairness, we only compared with methods that explicitly report average IoU on the same dataset.

This rationale is now included in Section 3.3.3.
“For fairness and consistency, we only included studies that explicitly report average IoUs or average Jaccard Indices on the same datasets used in this work. Several mainstream segmentation models, such as SAM \citep{kirillov2023segment}, do not provide reproducible public benchmarks or pretrained models for these datasets due to data availability, privacy restrictions, or data incompatibility. As a result, they cannot be reliably included in a quantitative comparison. In particular, the Blastocyst dataset is one of our proprietary datasets, so comparisons with SoTA are missing.”

Comment 7: The paper does not specify the minimum hardware requirements for clients, such as CPU model, GPU memory size, or inference frame rates of different models on edge devices (NVIDIA Jetson). Additionally, it provides no user documentation for the repository code—including environment configuration steps or adaptation tutorials for custom datasets—making practical deployment impossible for practitioners.
Response 7: We appreciate the reviewer’s suggestion. We added a paragraph describing the hardware used in our experiments. However, this study does not include deployment benchmarks on edge devices, and only minimal repository documentation is provided. We have added an environment.yml file to the repository to help readers reproduce the results. Actual deployment of the models is outside the scope of this initial release but will be included in future iterations of the repository.

Addition to section 3.1:

“Experiments were conducted on the Graham, Narval, and Cedar clusters using high-performance computing resources provided by the Digital Research Alliance of Canada. We used Simple Linux Utility for Resource Management (SLURM) scripts to request 1 GPU, 8 CPU cores, and 64 GB RAM per experiment on a single node. We have added the environment.yml file to the repository to support reproducibility. The codebase follows a standard PyTorch dataset/dataloader structure, allowing users to integrate their own datasets by implementing a dataset class consistent with the existing format. Actual deployments of the models are outside the scope of this initial release but will be included in future iterations of the repository.”
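
A custom dataset class of the kind mentioned above might look like the following minimal sketch. It implements only the map-style protocol (`__len__` and `__getitem__`) in plain Python so that it runs without PyTorch; in a PyTorch project the class would typically subclass `torch.utils.data.Dataset`. The class name, the `images/` and `masks/` folder layout, the `.png` extension, and the `load` hook are assumptions for illustration and are not the repository's actual format.

```python
from pathlib import Path


class SegmentationFolder:
    """Pairs each image with the mask that has the same file name.

    Assumed layout (hypothetical): <root>/images/*.png and <root>/masks/*.png.
    """

    def __init__(self, root, load=None):
        root = Path(root)
        self.image_paths = sorted((root / "images").glob("*.png"))
        self.mask_dir = root / "masks"
        missing = [p.name for p in self.image_paths
                   if not (self.mask_dir / p.name).exists()]
        if missing:
            raise FileNotFoundError(f"no mask for: {missing}")
        # `load` turns a path into an array or tensor; by default the path is returned unchanged.
        self.load = load or (lambda path: path)

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, index):
        image_path = self.image_paths[index]
        return self.load(image_path), self.load(self.mask_dir / image_path.name)
```

Passing an image-reading function as `load` would make each item an (image, mask) pair of arrays or tensors, which is the shape of sample a data loader usually expects for segmentation.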

Comment 8: The paper does not objectively analyze the research limitations, such as its support only for RGB images and a lack of coverage for mainstream medical imaging modalities (CT, MRI).
Response 8: We acknowledge the reviewer’s point. We have added a separate section titled “Limitations & Future works” to explain the limitations of the current version of MedSegNet10 clearly to the reader.

4. Additional Clarifications
N/A

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

All comments addressed

Author Response

Thank you very much for taking the time to review the revised version of our manuscript. We have further revised our manuscript and have highlighted our changes in blue in the resubmitted file.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

1、The core content of the manuscript is integrating SplitFed variants of 10 existing segmentation models into a repository, failing to propose any original technical contributions. Although the authors claim the contribution lies in "infrastructure construction", the constructed infrastructure still has significant deficiencies: it lacks a unified split-point design framework, fails to explore common patterns such as layer proportion, parameter proportion, and computational load proportion of sub-models, and provides no key practical support. Only an environment.yml file is added, which falls short of meeting the requirements for practical deployment. Such an incomplete infrastructure cannot effectively fill the so-called "research gap" and has limited guiding significance for subsequent research and applications.

2. The performance comparison of different training paradigms only supplements textual descriptions of "variable consistency" but provides no additional experimental verification or detailed parameter records, failing to fully ensure the fairness and reliability of the comparisons.

3. The comparison with existing methods is severely inadequate. While the authors' reason for excluding mainstream models such as SAM is understandable, the manuscript only compares with the authors' own previous research, making it impossible to fully reflect the performance and advantages of the proposed repository.

4. Regarding the comment that "Section 2.1 lacks a logical evolutionary context and comparative analysis of SplitFed-related studies", the authors only claim to have revised the text but do not clearly explain how the revised content constructs a technical evolution framework or effectively compares the core differences among existing studies.

5. For the issue that "the repository only supports RGB images and does not cover mainstream medical modalities such as CT and MRI", the authors merely added a "Limitations & Future Works" section without proposing feasible improvement plans or supplementing relevant verification to enhance the repository's applicability.

Author Response

Comment 1: The core content of the manuscript is integrating SplitFed variants of 10 existing segmentation models into a repository, failing to propose any original technical contributions. Although the authors claim the contribution lies in "infrastructure construction", the constructed infrastructure still has significant deficiencies: it lacks a unified split-point design framework, fails to explore common patterns such as layer proportion, parameter proportion, and computational load proportion of sub-models, and provides no key practical support. Only an environment.yml file is added, which falls short of meeting the requirements for practical deployment. Such an incomplete infrastructure cannot effectively fill the so-called "research gap" and has limited guiding significance for subsequent research and applications.

Response 1: We thank the reviewer for the thoughtful comment. As stated in the manuscript, MedSegNet10 is not intended to introduce new SplitFed algorithms or novel segmentation architectures. Its contribution lies in addressing a different and previously unaddressed need: the absence of a unified, reproducible, and multi-architecture SplitFed benchmark for medical image segmentation. The literature currently evaluates SplitFed on isolated models under inconsistent experimental conditions, making cross-architecture analysis impossible. MedSegNet10 is designed specifically to fill this reproducibility and benchmarking gap. It is meant to support methodological research, not to replace it.

In response to the reviewer’s concern regarding the lack of comparative structural analysis, we have added a new table (Table 5) that explicitly quantifies layer proportions, trainable parameter (TP) distributions, and computational load (FLOPs) across client-side and server-side sub-models for all included architectures. This table establishes a unified split-point analysis framework and reveals common patterns and trade-offs across architectures, directly addressing the reviewer’s concern about missing comparative insights.
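The kind of client/server breakdown described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `LayerStat` record, the `split_proportions` helper, the layer names, and all numbers are hypothetical and chosen only to show how layer, trainable-parameter, and FLOP shares could be computed once a split point has been fixed.

```python
from dataclasses import dataclass


@dataclass
class LayerStat:
    """Per-layer statistics (hypothetical record, not from the repository)."""
    name: str
    params: int  # trainable parameters in this layer
    flops: int   # forward-pass FLOPs for this layer


def split_proportions(layers, client_names):
    """Return the client/server share of layers, parameters, and FLOPs.

    `client_names` is the set of layer names assigned to the client side;
    every other layer is assumed to run on the server.
    """
    client = [layer for layer in layers if layer.name in client_names]
    totals = {
        "layers": len(layers),
        "params": sum(layer.params for layer in layers),
        "flops": sum(layer.flops for layer in layers),
    }
    client_totals = {
        "layers": len(client),
        "params": sum(layer.params for layer in client),
        "flops": sum(layer.flops for layer in client),
    }
    if any(value == 0 for value in totals.values()):
        raise ValueError("model must have at least one layer, parameter, and FLOP")
    shares = {}
    for key, total in totals.items():
        client_share = client_totals[key] / total
        shares[key] = {"client": client_share, "server": 1.0 - client_share}
    return shares


# Illustrative numbers only: the first and last layers stay on the client.
toy_model = [
    LayerStat("enc1", params=100, flops=1_000),
    LayerStat("enc2", params=300, flops=2_000),
    LayerStat("dec1", params=500, flops=6_000),
    LayerStat("head", params=100, flops=1_000),
]
toy_shares = split_proportions(toy_model, client_names={"enc1", "head"})
```

In this toy configuration the client holds half of the layers but only a fifth of the parameters and FLOPs, which is the sort of trade-off such a table is meant to expose.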

Regarding practical support, MedSegNet10 provides fully implemented SplitFed-ready model variants, standardized training pipelines, and reproducible experimental settings. While deployment-level development is outside the scope of this initial release, the repository is intentionally positioned as a foundational infrastructure for SplitFed research. We believe this scope is appropriate for a repository-focused contribution and that the added structural analysis significantly strengthens the repository’s guiding value for future research and applications.

Comment 2: The performance comparison of different training paradigms only supplements textual descriptions of "variable consistency" but provides no additional experimental verification or detailed parameter records, failing to fully ensure the fairness and reliability of the comparisons.

Response 2: We thank the reviewer for this comment. To ensure a fair comparison between the centralized, locally centralized, and SplitFed training paradigms, all experimental variables, including preprocessing steps, data augmentations, image resolution, optimizer (Adam), loss function (Soft Dice Loss), learning-rate settings, initialization, client assignments, training epochs, and stopping criteria, were held strictly constant across all experiments, with the training paradigm itself being the only differing factor. Because the purpose of the comparison is to evaluate the impact of the training mechanism under identical conditions, additional experimental verification would not change the fairness or reliability of the results.
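For readers unfamiliar with the loss named in this response, the following is a minimal sketch of one common Soft Dice formulation. The response does not state which variant or smoothing constant the authors used, so the function name `soft_dice_loss`, the flat-list inputs, and the `eps` default are assumptions made for illustration.

```python
def soft_dice_loss(probs, targets, eps=1e-6):
    """One common Soft Dice loss: 1 - (2*sum(p*t) + eps) / (sum(p) + sum(t) + eps).

    `probs` are predicted foreground probabilities in [0, 1] and `targets`
    are ground-truth labels (0 or 1), both flattened to equal-length sequences.
    `eps` is an assumed smoothing constant that avoids division by zero.
    """
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


# A perfect prediction gives a loss of 0; a fully disjoint one approaches 1.
perfect = soft_dice_loss([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
disjoint = soft_dice_loss([1.0, 0.0], [0, 1])
```

Because the loss is computed on probabilities rather than thresholded masks, it stays differentiable, which is why it is commonly paired with gradient-based optimizers such as Adam.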

Comment 3: The comparison with existing methods is severely inadequate. While the authors' reason for excluding mainstream models such as SAM is understandable, the manuscript only compares with the authors' own previous research, making it impossible to reflect the performance and advantages of the proposed repository fully.

Response 3: We appreciate the reviewer’s concern regarding the scope of external comparisons. Our intention in Table 5 was to provide only fair and directly comparable baselines, i.e., methods that explicitly report average IoU on the same datasets used in our study. Unfortunately, for the Blastocyst dataset, no mainstream methods, including SAM or other recent large segmentation models, provide reproducible benchmarks or publicly available pretrained weights, owing to dataset unavailability, licensing constraints, or incompatibility with the proprietary data. As a result, our prior work remains the only publicly documented baseline with matching evaluation metrics on the Blastocyst dataset. For the HAM10K and KVASIR-SEG datasets, however, we included all existing methods that report IoU-based segmentation results on exactly these datasets. We have clarified this rationale in the manuscript to make the comparison boundaries explicit. Importantly, the purpose of MedSegNet10 is not to exceed SoTA performance but to provide the first unified SplitFed-ready benchmark across multiple architectures; thus, its value lies in standardization and reproducibility rather than in outperforming external methods. Nevertheless, we acknowledge this limitation and emphasize in the revised “Limitations & Future Work” section that future versions of MedSegNet10 will incorporate comparisons with additional public models once reproducible dataset-aligned benchmarks become available.

Comment 4: Regarding the comment that "Section 2.1 lacks a logical evolutionary context and comparative analysis of SplitFed-related studies", the authors only claim to have revised the text but do not clearly explain how the revised content constructs a technical evolution framework or effectively compares the core differences among existing studies.

Response 4: We thank the reviewer for this comment. Section 2.1 in the revised manuscript already presents SplitFed research within a logical evolutionary context, starting from the original SplitFed formulation and progressing through studies that address specific technical limitations such as non-IID data, security, communication instability, label noise, and architectural scalability. Rather than providing a detailed algorithm-by-algorithm comparison, the section groups prior work according to the problems they aim to solve, which constitutes the intended comparative framework.

We emphasize that the goal of this paper is to introduce a unified SplitFed repository and benchmarking resource, not to perform an exhaustive comparative analysis of SplitFed algorithmic variants. Accordingly, Section 2.1 is designed to contextualize existing work sufficiently to motivate the need for standardization and reproducibility, which directly leads to MedSegNet10. We believe the current revision appropriately fulfills this purpose, and no further changes are made in the updated manuscript.

Comment 5: For the issue that "the repository only supports RGB images and does not cover mainstream medical modalities such as CT and MRI", the authors merely added a "Limitations & Future Works" section without proposing feasible improvement plans or supplementing relevant verification to enhance the repository's applicability.

Response 5: We thank the reviewer for this comment. Support for volumetric modalities such as CT and MRI is explicitly identified as a limitation of the current MedSegNet10 release. The scope of this initial repository is limited to 2D RGB image segmentation in order to establish a reproducible SplitFed benchmark. Extending MedSegNet10 to 3D architectures and modality-specific preprocessing pipelines is a non-trivial effort that requires systematic architectural redesign, memory-aware SplitFed training strategies, and modality-specific evaluation. These extensions are therefore intentionally deferred to future work. The current version provides a stable foundation upon which they can be built, and we believe this is an appropriate and feasible development path for a repository-focused contribution.

Author Response File: Author Response.pdf
