Confidence-Guided Code Recognition for Shipping Containers Using Deep Learning
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
1. Could the authors provide a detailed layer-by-layer architecture (number of layers, filter sizes, activation functions, and output dimensions) for both the CCLN and CCRN models?
2. Was any learning rate scheduling, early stopping, or weight decay used during training?
3. What deep learning framework was used (e.g., PyTorch, TensorFlow)?
4. The dataset split of 70% training, 10% validation, and 20% testing deviates from standard practice (e.g., 80–10–10). Please clarify the reasoning behind this choice and whether the larger test portion affected model convergence or generalization performance.
5. Why was α = 0.5 chosen?
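To make the split proportions in comment 4 concrete, the following is a minimal sketch of a seeded train/validation/test split. The function name `split_indices`, the seed, and the sample count are hypothetical illustrations and are not taken from the manuscript or the authors' code.

```python
import random


def split_indices(n_samples, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle indices 0..n_samples-1 and cut them into train/val/test lists.

    `fractions` is (train, val, test); the test part receives whatever
    remains after the train and val parts are taken.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n_samples))
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * fractions[0])
    n_val = round(n_samples * fractions[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# Hypothetical dataset size, used only to compare the two ratios.
train_a, val_a, test_a = split_indices(1000, (0.7, 0.1, 0.2))
train_b, val_b, test_b = split_indices(1000, (0.8, 0.1, 0.1))
print(len(train_a), len(val_a), len(test_a))
print(len(train_b), len(val_b), len(test_b))
```

Under these assumptions, the 70–10–20 split leaves one eighth fewer training samples than 80–10–10, which is the effect on convergence and generalization that comment 4 asks about.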
Author Response
See attached
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This research presents a novel localization and recognition framework for container codes in real-world industrial environments. The task is challenging because of variable lighting, deformation, and occlusion. The authors propose a combined localization and recognition system (integrating CCLN and CCRN) that builds on existing text recognition architectures, with architectural changes that improve robustness under varied conditions. They conducted comprehensive experimental evaluations on real-world datasets, demonstrating better performance than other state-of-the-art methods. This research addresses a critical need for automated inspection systems, particularly in logistics and manufacturing environments where accurate, real-time reading of container codes is essential.
The experimental results are convincing and show clear improvements over previous state-of-the-art methods. However, the analysis lacks deeper comparative benchmarking, especially in terms of computational efficiency, latency, and memory footprint.
Additionally, the paper mentions future work on handling deformation but provides no strong plan or evaluation strategy for how such improvements would be implemented or validated. This could limit the model’s scalability.
Suggestions for the authors are:
1. The paper is well-written and easy to follow, but some minor corrections are needed on page 19 ("accesssed" instead of "accessed"). The overall clarity, coherence, and technical accuracy are sufficient.
2. While all figures have good resolution, Figures 1 and 7 are particularly difficult to read due to small labels and legends. I recommend enlarging these figures to improve readability.
3. The references are adequate. However, the literature review would be more complete if it included additional sources from the past three years.
4. The authors provided repository links of the datasets and details on the specifications of the system used for the experiments, which is commendable. However, they should also include a public repository (GitHub, Kaggle, etc.) that contains the full implementation of the proposed algorithms, including for example training scripts, model weights, and preprocessing pipelines. This would significantly enhance the reproducibility and usability of the work for the research community.
5. The paper does not discuss the computational cost of the proposed model, which is a critical factor for real-world deployment in industrial settings. The authors should also consider providing other details (latency, CPU and resource usage, inference time per image, model size, number of parameters, FLOPS, etc.).
6. Although deformation is mentioned under future work, a more detailed account of how varied conditions such as deformation would be addressed would strengthen the paper. The authors should propose specific techniques and describe how they would be implemented and evaluated.
Author Response
See attached
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
1. Innovation Assessment and Improvement Recommendations: The “confidence-guided code recognition” framework proposed in this paper demonstrates some innovation in integrating ResNet+UNet with CNN-LSTM architectures, though its overall novelty remains somewhat limited. Authors are advised to further clarify the unique contributions of the “confidence-guided” mechanism, such as: Does this mechanism introduce confidence-weighted CTC decoding? Does it optimize end-to-end training through confidence feedback? The current description does not sufficiently highlight the core innovation of “confidence-guided” learning.
2. Model Structure Documentation Needs Improvement: While the structures of CCLN and CCRN are described in the paper, clear network diagrams or tables comparing layer configurations (number of layers, convolution kernel sizes, activation functions, etc.) are lacking. We recommend supplementing the paper with detailed model structure tables (e.g., tabular listings of layer parameters) to facilitate reproducibility by other researchers.
3. Tables 1 and 2 compare multiple existing models, but the paper does not clarify whether these models were reimplemented or their original results referenced under identical datasets and training settings. Were all compared models retrained under the same hardware and data conditions? If results from original papers are referenced, this should be annotated in the tables (e.g., using † to indicate referenced results).
4. Some figures (e.g., Figures 7 and 8) lack unit labels or metric ranges on axes, impairing readability. Add color legends or figure captions to clarify curves.
5. Emphasize research trends over the next three years (2022–2025).
6. While the paper highlights the application value of port automation, it lacks a systematic evaluation. Authors are advised to revise by adding: experimental results from real-world deployment environments (e.g., recognition rate tests on port surveillance video streams); comparisons with efficiency gains over manual identification; and potential impacts on future smart port automation and security, along with ethical considerations (e.g., privacy and surveillance issues).
REFERENCE
China Futures Market and World Container Shipping Economy: An Exploratory Analysis Based on Deep Learning
Author Response
See attached
Author Response File:
Author Response.pdf
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
DONE

