Article
Peer-Review Record

Research on Multi-Scene Electronic Component Detection Algorithm with Anchor Assignment Based on K-Means

Electronics 2022, 11(4), 514; https://doi.org/10.3390/electronics11040514
by Zilin Xia, Jinan Gu *, Ke Zhang, Wenbo Wang and Jing Li
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 10 January 2022 / Revised: 5 February 2022 / Accepted: 7 February 2022 / Published: 9 February 2022
(This article belongs to the Section Artificial Intelligence)

Round 1

Reviewer 1 Report

The paper “Research on multi-scene electronic component detection algorithm with anchor assignment based on K-Means” proposes a method for detecting electronic components in various kinds of scenes. The method is based on an EfficientNetV2 backbone for feature extraction to achieve both accuracy and efficiency. The authors use a K-Means-based two-stage adaptive division strategy to better balance the positive and negative samples during the training phase. They also build a specific multi-scene electronic component dataset consisting of 1040 images covering 14 categories of common electronic components.
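
To illustrate the general idea only (a minimal sketch, not the authors' Algorithm 1, whose two-stage adaptive details are discussed later in this report), a K-Means split of anchors into positive and negative candidates based on their IoU with a ground-truth box could look like the following; the function name and the example IoU values are hypothetical:

```python
# Sketch of K-Means-based positive/negative anchor division: cluster the
# anchors' IoU scores with a ground-truth box into two groups and treat the
# higher-scoring cluster as positive candidates. Names and values are assumed
# for illustration and do not reproduce the paper's exact algorithm.
import numpy as np
from sklearn.cluster import KMeans

def divide_anchors_by_kmeans(ious: np.ndarray) -> np.ndarray:
    """ious: (N,) IoU of each anchor with one ground-truth box.
    Returns a boolean mask marking anchors kept as positive candidates."""
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(ious.reshape(-1, 1))
    # The cluster whose centre has the larger IoU is the "positive" cluster.
    positive_cluster = int(np.argmax(km.cluster_centers_.ravel()))
    return km.labels_ == positive_cluster

# Example: anchors with clearly separated IoU values.
ious = np.array([0.05, 0.10, 0.12, 0.55, 0.60, 0.72])
print(divide_anchors_by_kmeans(ious))  # -> [False False False  True  True  True]
```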

The paper is well written, well organized, and well illustrated. It starts with a pertinent discussion of the state of the art in the fields of deep-learning-based object detection and, more specifically, deep-learning-based electronic component detection. Then, the authors explain how the dataset was built. The proposed method is described in Section 3, where the use of K-Means for positive-negative sample division is also explained. The method is validated against both public and specific datasets. The overall impression of the paper is positive. However, there are some unclear points and issues, as follows:

- The authors do not explain why they chose the following combination for validation on public datasets: [225] “We mixed the PASCAL VOC2007 training set, validation set and PASCAL VOC2012 training set as our training set and validation set, which included a total of 21380 images, and the PASCAL VOC2007 test set as our test set, which included a total of 4952 images.” (e.g. why is PASCAL VOC2012 not used for the test set?)

- In Equation (1) [266] it is not clear how the values of lambda1 and lambda2 are set.

- In Algorithm 1 [324] it is not clear what happens to anchors having Cg = mg (lines 11, 12).

- How were the categories (cat and pottedplant) used for the comparisons in Figure 9 selected? Why do the authors not report and discuss the best/worst cases of algorithm performance across all classes?

- The validation on the specific dataset (electronic components) is less convincing, as the authors compare their results with general models (such as SSD, YOLOv3, YOLOv4, Faster R-CNN, FCOS) instead of with state-of-the-art methods (such as Kuo [30], Sun [31], Huang [32], Dong [34], etc.).
Indeed, since the state-of-the-art methods use improved versions of these models
(e.g. [144] “Sun et al.[31] proposed an improved SSD algorithm to implement electronic component detection in stacked scenes. Huang et al.[32] proposed an improved YOLOV3 algorithm to detect electronic components in stacked scenes. Li et al.[33] proposed an improved YOLOV3 algorithm to detect electronic components on PCB boards. Dong et al.[34] proposed an improved Mask R-CNN method to achieve the detection of electronic components in the stacked state.”),
they can be expected to obtain better results than the general models on which they are based.

- There are many unexplained acronyms, such as GTBOX [69], NAS [132], AP, mAP [385], etc.

- Long and unclear phrases are used throughout the paper, e.g. [141] “Kuo et al.[30] proposed a dataset for detecting generic electronic components on printed circuit boards and a three-stage detection method based on deep learning to implement electronic component detection on printed circuit boards.” and [181] “The detection of electronic components throughout the assembly process of electronic components, before assembly for the identification and positioning of electronic components, after assembly for PCB board electronic components review (detection of PCB board has been assembled electronic components, whether to miss the assembly, whether to complete the assembly). But the electronic components in the assembly before and after the assembly of different scenes show different features, for example, some electronic components in the assembly before its own gravity factor is lying down and assembly to the PCB is vertical.”

- Ambiguous sentences must be avoided (e.g. [457] “Since the complexity of the constructed multi-scene electronic components dataset is inferior to the public dataset.”)

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors have performed research on a multi-scene electronic component detection algorithm with anchor assignment based on K-Means. The paper is interesting, and I have the following observations:

1. “Firstly, a new dataset was constructed to describe the multi-scene electronic component scene”. How have the datasets been constructed for this study?

2. It has been observed that the authors have cited references after the full stop (“image.[2]”), which is not the correct way of citing a reference; therefore the authors need to make corrections everywhere carefully.

3. An intensive grammatical check is required for the whole manuscript. Some of the errors are:

Object detection based on deep learning has developed rapidly in recent years

Compared with the one-stage method, the two-stage method has more refinement and adjustment of the anchor, so its accuracy is higher, but the speed is slower.

As the first two-stage anchor-based object detection method, Faster RCNN[3], comes from R-CNN[4] and Fast RCNN[5].

4. The Introduction is full of text but does not provide a clear picture of the study presented. The major reason for this is the unnecessary use of superfluous text that makes the Introduction lengthy. The second reason is that the authors do not have a good command of the English language.

5. “In this paper, the different assembly scenes we described were built in a laboratory simulating a real assembly scene, using an industrial camera with a resolution of 4092×3000 acquired 1040 images. The pre-assembly scene contains 494 images, the in-assembly scene contains 256 images, and 205 the post-assembly scene contains images.” These lines need to be reconstructed.

6. In Figure 2, the Y-axis must be labelled as “Categories”, just as the X-axis is labelled as “Number”. Include a high-resolution figure.

7. The mathematical relations used in this article are well-known equations from the literature; thus, the authors need to cite the sources of these relations.

8. In Table 2, the authors have compared results from different models and concluded that the result of their own algorithm is the best among all. The question here is: have the authors performed experiments with all the models as tabulated, or are these results taken from the literature? If they are taken from the literature, then the authors need to cite the appropriate references in the table. The authors are also requested to change “Ours” to “Proposed model”. The same needs to be done with all the other tables.

9. All the figures are required in high resolution. They are hazy, and the text is also not very clear.

10. The authors need to revise the Conclusion section, as it does not provide a clear conclusion of the work presented, and it suffers from English language problems, e.g. “Through this study, although the proposed method is simple and efficient and can achieve the detection of electronic components in multiple scenes, the current collective amount of data is not large enough.” This line has grammatical issues.

I would recommend the article be reconsidered only after substantial revisions and corrections. The article requires an extensive grammatical check of the English; the authors are advised to use the professional language editing services of MDPI or another professional company.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The article is well written and easy to understand. However, a few of my comments can be considered to improve the quality of the paper.

  1. The Introduction may be improved by adding the highlights and the problem statement.
  2. Provide the experimental setup and the tools used for the study.
  3. If possible, provide a table of simulation parameters.
  4. You could improve the writing and better link the flow of ideas in the Introduction.
  5. Review the references, because some of them are not standardized.
  6. The conclusion needs improvement with respect to the major claimed contributions.
  7. Write some future directions in the conclusion section.
  8. The difference between your proposal and related works is not clear; you could detail it better. I suggest adding a comparative table in the “Related Literature” section to contrast your solution with related works.
  9. You could discuss the relationship between your solution and past literature.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

I accept the manuscript after minor revision. My detailed comments are presented in the file "Review_electronics-1570576.pdf" for the authors.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The revised version of the paper “Research on multi-scene electronic component detection algorithm with anchor assignment based on K-Means” better clarifies some aspects related to the proposed method and the interpretation of the results. It also corrects some textual imperfections. There are still some unclear points, especially regarding the comparison of results with general models. However, this version represents a clear improvement, so I can recommend it for publication.

Author Response

Thank you very much for your comment. When comparing with the general models, we ran all of the algorithms ourselves on our own devices, and the performance of all algorithms is compared and analyzed on the same test set. We begin the Experiments section with an explanation of how all the algorithms were run and of the experimental conditions, as described in the following passage: “And all algorithms are validated on the same device with E5-2678 V3 CPU, 16G RAM and 3090 graphics card with 24G video memory. All algorithms were trained for 100 epochs, and the first 50 epochs had a learning rate of 0.001 and the weights of the backbone feature extraction network were frozen. The second 50 epochs have a learning rate of 0.0001, the weights of the backbone feature extraction network are unfrozen, and the weights of the whole network are updated.”
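
As an illustration of this two-phase schedule only (a minimal sketch under assumed names, not the authors' actual training code), the freeze-then-unfreeze regime with the stated learning rates could be written as follows in PyTorch; `model.backbone`, `train_loader`, `compute_loss`, and the choice of Adam as optimizer are assumptions:

```python
# Minimal sketch of the quoted two-phase training schedule, assuming a PyTorch
# detector that exposes its backbone as `model.backbone`. All names and the
# optimizer choice are placeholders for illustration.
import torch

def set_backbone_frozen(model, frozen):
    # Freeze or unfreeze the backbone feature extraction network.
    for p in model.backbone.parameters():
        p.requires_grad = not frozen

def train_two_phase(model, train_loader, compute_loss, device="cuda"):
    model.to(device)
    # Phase 1: 50 epochs, lr = 0.001, backbone frozen.
    # Phase 2: 50 epochs, lr = 0.0001, backbone unfrozen (whole network updated).
    for epochs, lr, freeze in [(50, 1e-3, True), (50, 1e-4, False)]:
        set_backbone_frozen(model, freeze)
        # Rebuild the optimizer so that only currently trainable weights are updated.
        optimizer = torch.optim.Adam(
            [p for p in model.parameters() if p.requires_grad], lr=lr
        )
        for _ in range(epochs):
            for images, targets in train_loader:
                optimizer.zero_grad()
                loss = compute_loss(model(images.to(device)), targets)
                loss.backward()
                optimizer.step()
```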

Reviewer 2 Report

I am not satisfied with the revision submitted by the authors.

Points 4, 7, and 8 have not been addressed satisfactorily:

I can still see unnecessary text in the Introduction section;

no references have been provided for the mathematical relations;

and the authors claim to have performed the experiments with each algorithm presented in Table 2, but I cannot see those experiments in the article.

I do not recommend the article for publication.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The authors updated the paper as per my previous comments. No further updates are required.

Author Response

Thank you very much for your previous comments; they were very helpful for our paper. Sincere greetings to you.

Round 3

Reviewer 2 Report

Accepted in present form
