MambaDPF-Net: A Dual-Path Fusion Network with Selective State Space Modeling for Robust Low-Light Image Enhancement
Deokwoo Lee
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper presents a novel dual-path fusion network (MambaDPF-Net) for low-light image enhancement, integrating Retinex theory with selective state space modeling. The integration of sharpening, decoupling, denoising, and coupling modules is logically structured and contributes to both interpretability and performance. The incorporation of a Mamba-based selective state space module for cross-domain fusion is innovative and efficiently handles long-range dependencies. Extensive experiments on multiple datasets validate the method’s robustness and generalization ability. While the content is well-structured and informative, there are several recommendations for revisions to enhance its clarity and academic rigor as follows:
- It is recommended to move Figure 7 (Network Architecture of MambaDPF-Net) to Section 3.1 (Overview), and note that the caption for Figure 7 is currently missing in the text.
- It is recommended to incorporate the overall architecture of each sub-network (e.g., Sharp-Net, Decouple-Net, etc.) into the main text to clearly illustrate how features are processed. Existing figures can be adapted so that each subsection includes a single network architecture diagram, which should encompass both the overall structure and its individual components. Taking Section 3.2 as an example, the RM and BIU should be integrated into one figure along with the overall structure of the sub-network.
- This paper lacks a complete explanation of the variable definitions in the loss function formulas.
- This paper lacks a detailed description of the configuration for each dataset, such as the size of the training set and testing set.
- It is recommended to add qualitative comparative results on the LIME and VV datasets.
- The tables only provide partial comparative results from some methods, lacking sufficient data.
- Please thoroughly review the content of the paper, including vocabulary, grammar, figures, tables, formulas, etc.
Author Response
Dear Reviewer,
Thank you for your valuable feedback. We have carefully revised the manuscript based on your suggestions. A point-by-point response to your comments is provided in the attached document.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This paper proposes MambaDPF-Net for low-light image enhancement. Experiments on the LOL and LSRW datasets show that the proposed method outperforms RetinexNet, R2RNet, and Retinexformer, while additional tests on the LIME and VV datasets demonstrate robustness. The authors claim the main contribution lies in introducing Mamba into low-light enhancement to model long-range dependencies and non-local interactions with near-linear computational complexity, and in designing a dual-path fusion architecture.
1) Novelty: Retinex-based models have already been widely adopted in works such as RetinexNet, and the sharpening prior has also been explored in pre-enhancement approaches. The distinctive part of this paper is the introduction of Mamba, but the paper should provide stronger justification of Mamba’s unique advantages beyond its claimed near-linear complexity.
2) The paper emphasizes the efficiency of Mamba and claims linear complexity, but no quantitative results such as parameter counts, FLOPs, or inference speed (FPS) are provided.
3) The current ablation only compares Baseline → +BIU → Full model. There is no individual evaluation of the sharpening branch, FDAM, or SS2D. This makes it difficult to assess the contribution of each submodule.
4) Figure 7 does not clearly illustrate the processing flow, and the position of the sharpening branch within the framework is ambiguous. In addition, the text inside the diagram is too small to read. The figure should be redrawn for better readability.
5) The related work is lengthy but not well organized. Traditional methods and deep learning methods are mixed without clear categorization, and the limitations of existing cross-domain modeling are not well emphasized. This diminishes the visibility of the paper’s contributions.
6) Some recent work on low-light image enhancement should be discussed and its differences from the proposed method stated, e.g., the Mamba-based method PDCE: Patch-wise Dynamic Curve Estimation for Low-light Image Enhancement (ICASSP, 2025), as well as frequency-domain based methods.
7) Minor issues:
a) In Section 3.1, “deconvolution subnetwork” is mentioned, but based on the paper structure it seems this should be “decoupling subnetwork”?
b) Equations (1)–(3) in Section 3.3 are not clearly explained.
c) Some figures (e.g., Fig. 6) have low resolution and are difficult to read.
d) In Table 3, the best results are not highlighted in bold as stated.
e) In Section 3, both “Overview” and “Sharp-Net” are labeled as 3.1.
Author Response
Dear Reviewer,
Thank you for your valuable feedback. We have carefully revised the manuscript based on your suggestions. A point-by-point response to your comments is provided in the attached document.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
- The introduction states that it provides “comprehensive ablations, cross-dataset generalization, and complexity-throughput analysis,” yet the main text fails to present the essential metrics (number of model parameters, GFLOPs, inference latency (ms/image), throughput (img/s), peak memory (GB)) and measurement protocols required for such analysis. The PSNR/SSIM metrics alone are insufficient to validate the efficiency claims, resulting in a lack of reliability and reproducibility for the efficiency-related contributions.
- The authors propose MambaDPF-Net, which integrates sharpening priors for texture stabilization and follows a ‘decoupling–denoising–coupling’ paradigm. However, Section 4.2 states, “Our method outperforms not only Retinexformer but also the recently proposed MambaDPF-Net,” making it unclear whether this MambaDPF-Net refers to another method with the same name or is simply a typo. This creates confusion regarding the interpretation of contributions and comparison results.
- The Table 1 referenced in Section 4.2 is missing from the main text, and the cross-reference within the paragraph is broken. Consequently, the claim that “it outperforms Retinexformer and the recently proposed MambaDPF-Net on both datasets” cannot be verified. Please restore Table 1 and the broken reference, and clearly present the comparison metrics and the evaluation protocol.
- The items listed in the Methods column of each table are presented without author, year, or reference number, making identification difficult (e.g., the “Dong” entry in Table 2 does not specify the specific paper). Please include the exact reference (author, year, [reference number]) in a footnote or parenthetical citation at the bottom of each table. Additionally, provide a mapping table for abbreviations (e.g., RetinexNet, R2RNet, etc.) to their corresponding original papers.
- The implementation details state that LSRW/LOLv was used, but the results table/description compares against LSRW/LOLv2, creating an inconsistency. This makes it difficult to judge the fairness and reproducibility of the comparison. Clearly explain whether LOLv and LOLv2 are identical or different (including version differences and differences in quantity/splits). If different versions were mixed, describe the reason and its impact. Label each specific value in the tables/figures to indicate which version it was computed from.
- The mixed use of VV and vv notation makes it unclear whether they refer to the same dataset or different collections/subsets. This hinders the fairness of comparisons and the assessment of reproducibility. Clearly define the relationship between the formal name and the abbreviations (VV/vv) upon their first appearance. If they refer to the same dataset, select one standard notation and apply it consistently throughout the document. If they represent different datasets/subsets, explain the source and compositional differences, and clearly label each figure/table value to indicate which dataset it corresponds to.
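As an aside on the efficiency metrics requested above (parameter counts, FLOPs, latency), a minimal stdlib-only sketch of how such numbers are typically derived is shown below; the layer shapes and helper names are illustrative assumptions, not the paper's actual architecture, and in practice tools such as thop or fvcore would count these for a real model:

```python
# Hypothetical sketch: analytic parameter/FLOP counts for a standard
# Conv2d layer, plus a simple wall-clock latency measurement.
import time


def conv2d_params(in_ch: int, out_ch: int, k: int) -> int:
    """Parameters of a k x k Conv2d layer (weights + biases)."""
    return out_ch * (in_ch * k * k) + out_ch


def conv2d_flops(in_ch: int, out_ch: int, k: int, h_out: int, w_out: int) -> int:
    """Approximate FLOPs (counting multiply and add separately) for one forward pass."""
    return 2 * out_ch * in_ch * k * k * h_out * w_out


def measure_latency_ms(fn, warmup: int = 3, runs: int = 10) -> float:
    """Average wall-clock latency in milliseconds of fn(), after warmup calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs * 1e3


# Example: a 3x3 conv mapping 3 -> 64 channels on a 256x256 output map.
params = conv2d_params(3, 64, 3)                      # 1792 parameters
gflops = conv2d_flops(3, 64, 3, 256, 256) / 1e9       # ~0.226 GFLOPs
latency = measure_latency_ms(lambda: sum(range(10_000)))
```

Summing such per-layer counts over a network, and timing the full forward pass on the target hardware, yields the parameter/GFLOPs/latency figures the reviewers ask for.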
Author Response
Dear Reviewer,
Thank you for your valuable feedback. We have carefully revised the manuscript based on your suggestions. A point-by-point response to your comments is provided in the attached document.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
This paper presents a novel dual-path fusion network (MambaDPF-Net) for low-light image enhancement, integrating Retinex theory with selective state space modeling. The integration of sharpening, decoupling, denoising, and coupling modules is logically structured and contributes to both interpretability and performance. The incorporation of a Mamba-based selective state space module for cross-domain fusion is innovative and efficiently handles long-range dependencies. Extensive experiments on multiple datasets validate the method’s robustness and generalization ability. While the content is well-structured and informative, there are several recommendations for revisions to enhance its clarity and academic rigor as follows:
- It is recommended to move Figure 7 (Network Architecture of MambaDPF-Net) to Section 3.1 (Overview), and note that the caption for Figure 7 is currently missing in the text.
- It is recommended to incorporate the overall architecture of each sub-network (e.g., Sharp-Net, Decouple-Net, etc.) into the main text to clearly illustrate how features are processed. Existing figures can be adapted so that each subsection includes a single network architecture diagram, which should encompass both the overall structure and its individual components. Taking Section 3.2 as an example, the RM and BIU should be integrated into one figure along with the overall structure of the sub-network.
- Some existing papers about image enhancement can be cited and compared together, such as: Optics Express, 2024, 32(3): 3835-3851; Optics & Laser Technology, 2025, 187, 112900; Optics Letters, 2025. 50 (10), 3413-3416.
- This paper lacks a complete explanation of the variable definitions in the loss function formulas.
- This paper lacks a detailed description of the configuration for each dataset, such as the size of the training set and testing set.
- It is recommended to add qualitative comparative results on the LIME and VV datasets.
- The tables only provide partial comparative results from some methods, lacking sufficient data.
- Please thoroughly review the content of the paper, including vocabulary, grammar, figures, tables, formulas, etc.
Author Response
The detailed reply is provided in the attachment.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The manuscript has been revised properly, and it is now acceptable.
Author Response
Dear Reviewer,
We would like to express our sincere gratitude to Reviewer 3 for their positive feedback and for their time and effort in reviewing our manuscript.
We are very pleased to learn that the reviewer considers the revised manuscript acceptable for publication.
Thank you once again for your valuable support.
Sincerely,
Zikang Zhang and All Co-authors