# Molecular-Clump Detection Based on an Improved YOLOv5 Joint Density Peak Clustering


## Abstract

## 1. Introduction

$^{12}$CO, $^{13}$CO, and C$^{18}$O(1-0) line emission. The first phase of MWISP has acquired a large volume of molecular-cloud observational data, providing sufficient training samples for a deep-learning-based clump-detection algorithm. This paper proposes a clump-detection method based on an improved YOLOv5 joint Density Peak Clustering (DPC) [24]. The method first locates the clumps in the position-position (PP) plane, i.e., in Galactic longitude and latitude, using the improved YOLOv5. DPC is then used to cluster the detections along the velocity direction, which yields clump detection in position-position-velocity (PPV) three-dimensional (3D) space. The method uses supervised deep learning and detects clumps from labeled areas of interest, so the detection results depend directly on the labeling scheme and on the features learned from the data. During detection, the 3D location of each target candidate is obtained quickly and with far less parameter tuning.

## 2. Data

$^{13}$CO(1-0) line emission data obtained by MWISP. The simulated clumps, the observational data, and the synthesized data are described separately below.

#### 2.1. Simulated Clumps

$^{13}$CO(1-0) line emission data of the M16 region obtained by MWISP are regarded as the input to the 3D Gaussian model. The $^{13}$CO(1-0) line emission of the M16 region covers $15^{\circ}15^{\prime} < l < 18^{\circ}15^{\prime}$ and $0^{\circ} < b < 1^{\circ}30^{\prime}$. The LDC algorithm [9] was applied to the $^{13}$CO(1-0) line emission of M16, and a total of 658 clumps were detected. We compiled the morphological parameters of these clumps and obtained the parameter ranges listed in Table 1. Five thousand simulated clumps were then generated at random, and their fluxes were calculated according to the parameters in Table 1. The flux distribution is shown in Figure 1. Because their parameters are drawn from these ranges, the simulated clumps remain consistent with the MWISP observational data and give a more realistic picture of the detection performance.
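As an illustration of the 3D Gaussian model, the sketch below draws one clump from the ranges in Table 1 and sums its pixel values as a simple stand-in for flux. It is a minimal sketch under stated assumptions, not the authors' generator: the function name `simulate_clump`, the cube size, and the use of the bracketed ranges directly as standard deviations in pixels are all assumptions (how the factor 2.3548 in Table 1 enters is not specified in this section).

```python
import numpy as np

def simulate_clump(shape, peak, sigma, center, theta_deg):
    """Evaluate a rotated 3D Gaussian clump on a (v, y, x) pixel grid.

    sigma  = (sigma_x, sigma_y, sigma_v) in pixels (assumed units)
    center = (x0, y0, v0) in pixels
    theta_deg rotates the Gaussian in the x-y (longitude-latitude) plane.
    """
    nv, ny, nx = shape
    v, y, x = np.meshgrid(np.arange(nv), np.arange(ny), np.arange(nx), indexing="ij")
    x0, y0, v0 = center
    sx, sy, sv = sigma
    t = np.deg2rad(theta_deg)
    dx, dy = x - x0, y - y0
    # Rotate the offsets into the principal axes of the Gaussian.
    xr = dx * np.cos(t) + dy * np.sin(t)
    yr = -dx * np.sin(t) + dy * np.cos(t)
    return peak * np.exp(-0.5 * ((xr / sx) ** 2 + (yr / sy) ** 2 + ((v - v0) / sv) ** 2))

# Draw one clump from the Table 1 ranges (hypothetical cube size).
rng = np.random.default_rng(0)
shape = (64, 48, 48)
peak = rng.uniform(0.7, 15.0)
sigma = (rng.uniform(1, 4), rng.uniform(1, 4), rng.uniform(1, 7))
center = (rng.uniform(0, shape[2] - 1), rng.uniform(0, shape[1] - 1), rng.uniform(0, shape[0] - 1))
theta = rng.uniform(0.0, 180.0)

clump = simulate_clump(shape, peak, sigma, center, theta)
flux = clump.sum()  # summed intensity over the cube, used here as the flux
```

Repeating the draw with different random seeds gives a population of clumps whose flux distribution can then be compared with the one in Figure 1.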

#### 2.2. Observational Data

$^{13}$CO(1-0) line emission data obtained from MWISP are selected as the background for generating the synthesized data. The typical noise level of the $^{13}$CO(1-0) line emission is about 0.23 K at a channel width of 0.167 km ${\mathrm{s}}^{-1}$. All three regions contain information in the three dimensions of Galactic longitude, Galactic latitude, and velocity. Because of the spiral-arm structure of the Milky Way, different quadrants contain different spiral arms and have different gas distributions. Each of the three backgrounds covers ${3}^{\circ}\times {2}^{\circ}$ on the Galactic plane and a velocity range of $70\,\mathrm{km}\,{\mathrm{s}}^{-1}$, corresponding to a size of $361\times 241\times 424$ pixels.

The region selected in the first Galactic quadrant covers ${13}^{\circ}<l<{16}^{\circ}$ and $-{1}^{\circ}{30}^{\prime}<b<{30}^{\prime}$, with velocities in the range $0\,\mathrm{km}\,{\mathrm{s}}^{-1}<v<70\,\mathrm{km}\,{\mathrm{s}}^{-1}$. It belongs to the inner Milky Way, in the direction of the Galactic center. The region selected in the second Galactic quadrant covers ${101}^{\circ}<l<{104}^{\circ}$ and ${2}^{\circ}<b<{4}^{\circ}$, with velocities in the range $-60\,\mathrm{km}\,{\mathrm{s}}^{-1}<v<10\,\mathrm{km}\,{\mathrm{s}}^{-1}$. The region selected in the third Galactic quadrant covers ${184}^{\circ}{30}^{\prime}<l<{187}^{\circ}{30}^{\prime}$ and $-{1}^{\circ}<b<{1}^{\circ}$, with velocities in the range $-10\,\mathrm{km}\,{\mathrm{s}}^{-1}<v<60\,\mathrm{km}\,{\mathrm{s}}^{-1}$. These two regions both belong to the outer Milky Way.

The gas density differs among the three regions: the background gas is dense in the first-quadrant region, sparse in the third-quadrant region, and intermediate in the second-quadrant region. Backgrounds of different gas density make it possible to assess the detection performance of the algorithm under different signal-to-noise conditions. Figure 2 shows the velocity-integrated intensity maps for each selected background region.
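The spatial dimensions quoted above can be checked with a short calculation. The sketch assumes a 30″ pixel grid with both edges of the range included; that pixel size is not stated in this section and is inferred from the quoted sizes (it also reproduces the $121\times 121$ pixels per ${1}^{\circ}$ block in Section 4.4). The helper name `n_pixels` is hypothetical, and the velocity axis is left out because its exact channel gridding is not given here.

```python
def n_pixels(extent_deg, pixel_arcsec=30.0):
    """Number of grid points spanning extent_deg, counting both edges.

    pixel_arcsec = 30 is an assumption inferred from the quoted cube sizes.
    """
    return round(extent_deg * 3600.0 / pixel_arcsec) + 1

n_lon = n_pixels(3.0)    # longitude axis of a 3-degree-wide background
n_lat = n_pixels(2.0)    # latitude axis of a 2-degree-high background
n_block = n_pixels(1.0)  # side of a 1-degree data block
print(n_lon, n_lat, n_block)
```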

#### 2.3. Synthesized Data

## 3. Method

#### 3.1. The Improved YOLOv5 − Molecular Clump Detection (MCD)-YOLOv5

#### 3.1.1. Coordinate Attention

#### 3.1.2. Normalized Wasserstein Distance

#### 3.2. Density Peak Clustering Algorithm

#### 3.3. MCD-YOLOv5 Joint DPC

## 4. Experiments and Discussion

#### 4.1. Evaluation of Indicators

#### 4.2. MCD-YOLOv5 Training and DPC

#### 4.2.1. MCD-YOLOv5 Dataset Generation

#### 4.2.2. Training MCD-YOLOv5

#### 4.2.3. Result of DPC

#### 4.3. Detection Results of Second-Quadrant Synthesized Data

#### 4.4. Detection Results of Observational Data

$^{13}$CO(1-0) line emission data obtained by MWISP for testing. The selected region covers ${180}^{\circ}<l<{195}^{\circ}$ and $-{5}^{\circ}<b<{5}^{\circ}$, with velocities in the range $-200\,\mathrm{km}\,{\mathrm{s}}^{-1}<v<200\,\mathrm{km}\,{\mathrm{s}}^{-1}$. This region is divided into data blocks on a ${1}^{\circ}$ grid, each corresponding to a size of $121\times 121\times 2411$ pixels. To obtain annotation information, we ran the LDC algorithm on this region and obtained the clumps and their corresponding parameters. From 200 data blocks containing 487 clumps, we obtained 8496 samples and their annotation information using the method described in Section 4.2.1. Each sample is $121\times 121$ pixels. Figure 15 shows examples from the observational dataset together with their annotation information. The dataset is divided in the same way as in Section 4.2.2, and training uses the default YOLOv5 parameters. To reduce the complexity of model training, we used transfer learning [35] when training MCD-YOLOv5 on the observational dataset: the model parameters trained on the synthesized dataset in Section 4.2.2 are used as the pre-training weights. Figure 16 shows the curves of loss, precision, and recall on the validation set of the observational dataset. Figure 16a,b shows that MCD-YOLOv5 is neither overfitted nor underfitted when trained on the observational dataset. We then selected some samples from the validation set and fed them into the trained MCD-YOLOv5 for detection.

## 5. Summary

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

$^{12}$CO, $^{13}$CO, and C$^{18}$O along the northern Galactic plane with PMO-13.7m telescope. We are grateful to all the members of the MWISP working group, particularly the staff members at PMO-13.7m telescope, for their long-term support.

## Conflicts of Interest

## Note

1. https://github.com/SunetK/MCD-YOLOv5-joint-DPC (accessed on 16 August 2023).

## References

1. Heyer, M.; Dame, T.M. Molecular Clouds in the Milky Way. Annu. Rev. Astron. Astrophys. **2015**, 53, 583–629.
2. Williams, J.P.; Blitz, L.; McKee, C.F. The Structure and Evolution of Molecular Clouds: From Clumps to Cores to the IMF. arXiv **1999**, arXiv:astro-ph/9902246.
3. Stutzki, J.; Guesten, R. High Spatial Resolution Isotopic CO and CS Observations of M17 SW: The Clumpy Structure of the Molecular Cloud Core. Astrophys. J. **1990**, 356, 513.
4. Krumholz, M.R.; McKee, C.F.; Tumlinson, J. The Star Formation Law in Atomic and Molecular Gas. Astrophys. J. **2009**, 699, 850–856.
5. Zinnecker, H.; Yorke, H.W. Toward Understanding Massive Star Formation. Annu. Rev. Astron. Astrophys. **2007**, 45, 481–563.
6. Williams, J.P.; de Geus, E.J.; Blitz, L. Determining Structure in Molecular Clouds. Astrophys. J. **1994**, 428, 693.
7. Berry, D.S. FellWalker-A clump identification algorithm. Astron. Comput. **2015**, 10, 22–31.
8. Kirk, H.; Di Francesco, J.; Johnstone, D.; Duarte-Cabral, A.; Sadavoy, S.; Hatchell, J.; Mottram, J.C.; Buckle, J.; Berry, D.S.; Broekhoven-Fiene, H.; et al. The JCMT Gould Belt Survey: A First Look at Dense Cores in Orion B. Astrophys. J. **2016**, 817, 167.
9. Luo, X.; Zheng, S.; Huang, Y.; Zeng, S.; Zeng, X.; Jiang, Z.; Chen, Z. Molecular Clump Extraction Algorithm Based on Local Density Clustering. Res. Astron. Astrophys. **2022**, 22, 015003.
10. Chen, Q.; Wang, Y.; Yang, T.; Zhang, X.; Cheng, J.; Sun, J. You Only Look One-Level Feature. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, Virtual, 19–25 June 2021; pp. 13039–13048.
11. Wang, J.; Song, L.; Li, Z.; Sun, H.; Sun, J.; Zheng, N. End-to-End Object Detection with Fully Convolutional Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, Virtual, 19–25 June 2021; pp. 15849–15858.
12. Yan, B.; Peng, H.; Wu, K.; Wang, D.; Fu, J.; Lu, H. LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, Virtual, 19–25 June 2021; pp. 15180–15189.
13. Kumar, A.; Rawat, Y.S. End-to-End Semi-Supervised Learning for Video Action Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022; pp. 14680–14690.
14. Liang, C.; Wang, W.; Zhou, T.; Yang, Y. Visual Abductive Reasoning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022; pp. 15544–15554.
15. Zhou, K.; Yang, J.; Loy, C.C.; Liu, Z. Conditional Prompt Learning for Vision-Language Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, New Orleans, LA, USA, 18–24 June 2022; pp. 16795–16804.
16. Kim, E.J.; Brunner, R.J. Star-galaxy classification using deep convolutional neural networks. Mon. Not. R. Astron. Soc. **2017**, 464, 4463–4475.
17. González, R.E.; Muñoz, R.P.; Hernández, C.A. Galaxy detection and identification using deep learning and data augmentation. Astron. Comput. **2018**, 25, 103–109.
18. Leung, H.W.; Bovy, J. Deep learning of multi-element abundances from high-resolution spectroscopic data. Mon. Not. R. Astron. Soc. **2019**, 483, 3255–3277.
19. Xie, J.; Bu, Y.; Liang, J.; Li, H.; Wang, X.; Pan, J. Improve the Search of Very Metal-poor Stars Using the Deep Learning Method. Astron. J. **2021**, 162, 155.
20. He, Z.; Qiu, B.; Luo, A.L.; Shi, J.; Kong, X.; Jiang, X. Deep learning applications based on SDSS photometric data: Detection and classification of sources. Mon. Not. R. Astron. Soc. **2021**, 508, 2039–2052.
21. Yi, Z.; Li, J.; Du, W.; Liu, M.; Liang, Z.; Xing, Y.; Pan, J.; Bu, Y.; Kong, X.; Wu, H. Automatic detection of low surface brightness galaxies from Sloan Digital Sky Survey images. Mon. Not. R. Astron. Soc. **2022**, 513, 3972–3981.
22. Cao, Z.; Yi, Z.; Pan, J.; Su, H.; Bu, Y.; Kong, X.; Luo, A. L-dwarf Detection from SDSS Images using Improved Faster R-CNN. Astron. J. **2023**, 165, 184.
23. Su, Y.; Yang, J.; Zhang, S.; Gong, Y.; Wang, H.; Zhou, X.; Wang, M.; Chen, Z.; Sun, Y.; Chen, X.; et al. The Milky Way Imaging Scroll Painting (MWISP): Project Details and Initial Results from the Galactic Longitudes of 25.°8-49.°7. Astrophys. J. Suppl. Ser. **2019**, 240, 9.
24. Rodriguez, A.; Laio, A. Clustering by fast search and find of density peaks. Science **2014**, 344, 1492–1496.
25. Matsubara, T. Analytic Minkowski functionals of the cosmic microwave background: Second-order non-Gaussianity with bispectrum and trispectrum. Phys. Rev. D **2010**, 81, 083505.
26. Li, Z.; Wang, Y.; Chen, K.; Yu, Z. Channel Pruned YOLOv5-based Deep Learning Approach for Rapid and Accurate Outdoor Obstacles Detection. arXiv **2022**, arXiv:2204.13699.
27. Darapaneni, N.; Kumar, S.; Krishnan, S.; Rajagopal, A.; Nagendra; Paduri, A.R. Implementing a Real-Time, YOLOv5 based Social Distancing Measuring System for COVID-19. arXiv **2022**, arXiv:2204.03350.
28. Ewaidat, H.A.; Brag, Y.E. Identification of lung nodules CT scan using YOLOv5 based on convolution neural network. arXiv **2022**, arXiv:2301.02166.
29. Jain, S. Adversarial Attack on Yolov5 for Traffic and Road Sign Detection. arXiv **2023**, arXiv:2306.0607.
30. Hou, Q.; Zhou, D.; Feng, J. Coordinate Attention for Efficient Mobile Network Design. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021.
31. Yu, J.; Jiang, Y.; Wang, Z.; Cao, Z.; Huang, T. UnitBox: An Advanced Object Detection Network. arXiv **2016**, arXiv:1608.01471.
32. Wang, J.; Xu, C.; Yang, W.; Yu, L. A Normalized Gaussian Wasserstein Distance for Tiny Object Detection. arXiv **2022**, arXiv:2110.13389.
33. Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression. arXiv **2019**, arXiv:1911.08287.
34. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man, Cybern. **1979**, 9, 62–66.
35. Pan, S.J.; Yang, Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. **2010**, 22, 1345–1359.

**Figure 2.** Velocity-integrated intensity maps for the selected background regions. (**a**) The first Galactic quadrant, with ${13}^{\circ}<l<{16}^{\circ}$ and $-{1}^{\circ}{30}^{\prime}<b<{30}^{\prime}$; the velocity range is $0\,\mathrm{km}\,{\mathrm{s}}^{-1}<v<70\,\mathrm{km}\,{\mathrm{s}}^{-1}$. (**b**) The second Galactic quadrant, with ${101}^{\circ}<l<{104}^{\circ}$ and ${2}^{\circ}<b<{4}^{\circ}$; the velocity range is $-60\,\mathrm{km}\,{\mathrm{s}}^{-1}<v<10\,\mathrm{km}\,{\mathrm{s}}^{-1}$. (**c**) The third Galactic quadrant, with ${184}^{\circ}{30}^{\prime}<l<{187}^{\circ}{30}^{\prime}$ and $-{1}^{\circ}<b<{1}^{\circ}$; the velocity range is $-10\,\mathrm{km}\,{\mathrm{s}}^{-1}<v<60\,\mathrm{km}\,{\mathrm{s}}^{-1}$.

**Figure 3.** Velocity-integrated intensity maps of the synthesized data: (**a**) the first Galactic quadrant synthesized data, (**b**) the second Galactic quadrant synthesized data, and (**c**) the third Galactic quadrant synthesized data.

**Figure 4.** Architecture of MCD-YOLOv5. MCD-YOLOv5 is mainly composed of three parts: backbone, neck, and YOLO head. The backbone consists of CBS, C3_1, and CA modules. CBS is a composite convolution module that encapsulates a convolutional layer, a batch normalization layer, and the SiLU activation function. MCD-YOLOv5 contains two types of C3 modules: C3_1 is applied in the backbone and C3_2 is applied in the neck. Both are built from CBS modules. In C3_1, the input feature map passes through three branches, while in C3_2 it passes through only two. The branches are finally concatenated along the channel dimension and then output through a CBS module. We add a CA module after each C3_1 layer. The neck consists of the SPPF (Spatial Pyramid Pooling-Fast) and CSP-PAN (CSP-Path Aggregation Network) modules. SPPF is a spatial pyramid pooling module that encapsulates a CBS module and three max-pooling layers; the three pooling results are concatenated with the input feature map along the channel dimension and passed through a CBS module. CSP-PAN is composed of CBS and C3_2 modules and fuses features from different layers through upsampling and downsampling, which addresses the multi-scale target problem to a certain extent. The main part of the YOLO head is three detectors, which use grid-based anchors to detect objects on feature maps at different scales.

**Figure 5.** Architecture of the Coordinate Attention module. The module is made of two average pooling layers, a convolution layer with concat operation, a batch normalization layer with an h_swish activation function, two convolution layers, and two sigmoid activation functions. The average pooling layers encode each channel of the feature map along the X and Y directions. The resulting feature maps are combined and transformed via the convolution layer with concat operation and the batch normalization layer with an h_swish activation function to create intermediate feature maps. These feature maps are then divided into separate tensors, which are again transformed and expanded by the convolution layer and sigmoid activation function to become the value of the attention weight assignment.
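The data flow described in the Figure 5 caption can be traced with a small NumPy forward pass. This is a shape-level sketch, not the trained module: the weights are random placeholders, the batch normalization step is omitted, and the names `coordinate_attention`, `w_mid`, `w_h`, and `w_w` are hypothetical.

```python
import numpy as np

def h_swish(x):
    """Hard-swish activation: x * ReLU6(x + 3) / 6."""
    return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv1x1(x, w):
    """1x1 convolution of a (C_in, H, W) map with weights of shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def coordinate_attention(x, w_mid, w_h, w_w):
    """Coordinate Attention forward pass for one (C, H, W) feature map."""
    c, h, w = x.shape
    pooled_h = x.mean(axis=2, keepdims=True)                      # (C, H, 1): average over width
    pooled_w = x.mean(axis=1, keepdims=True).transpose(0, 2, 1)   # (C, W, 1): average over height
    y = np.concatenate([pooled_h, pooled_w], axis=1)              # (C, H + W, 1)
    y = h_swish(conv1x1(y, w_mid))                                # (M, H + W, 1); batch norm omitted
    y_h, y_w = y[:, :h, :], y[:, h:, :]                           # split back into the two directions
    a_h = sigmoid(conv1x1(y_h, w_h))                              # (C, H, 1) attention weights
    a_w = sigmoid(conv1x1(y_w, w_w)).transpose(0, 2, 1)           # (C, 1, W) attention weights
    return x * a_h * a_w                                          # reweighted feature map

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 5, 7))   # C = 8 channels, H = 5, W = 7
w_mid = rng.normal(size=(4, 8))            # reduce 8 channels to 4 intermediate channels
w_h = rng.normal(size=(8, 4))
w_w = rng.normal(size=(8, 4))
out = coordinate_attention(feature_map, w_mid, w_h, w_w)
```

Because both attention maps lie in (0, 1), the output keeps the input shape and each value is scaled down according to its row and column weights.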

**Figure 6.** Flowchart of the clump-detection method based on MCD-YOLOv5 joint DPC. The four branches represent any four slices in a given data cube. Each slice undergoes 2D detection by MCD-YOLOv5 to obtain the anchor information of the targets on that slice. The remaining slices are indicated by ellipses. Heads shown in different font sizes indicate detection heads at different scales.

**Figure 7.** Examples from the MCD-YOLOv5 dataset and the annotation information: (**a**) synthesized data in the third Galactic quadrant, (**b**) synthesized data in the first Galactic quadrant, (**c**) annotation information on (**a**), and (**d**) annotation information on (**b**). The red-labeled boxes represent the clumps.

**Figure 8.** Comparison of the loss curves of MCD-YOLOv5 and YOLOv5 during training. The clump-detection task has only one category (clump), so the loss contains only localization loss and confidence loss: (**a**) variation in localization loss (Box Loss) with the number of training rounds (epochs), and (**b**) variation in confidence loss (Obj Loss) with the number of training rounds (epochs).

**Figure 9.** Variation curves of precision, recall, and AP as a function of the number of training rounds (epochs) for MCD-YOLOv5 and YOLOv5 on the validation set: (**a**) precision, (**b**) recall, and (**c**) AP.

**Figure 10.** A typical example of the detection results of MCD-YOLOv5 and YOLOv5. (**a**) Seven simulated clumps in the area. (**b**) MCD-YOLOv5 detected 7 clumps. (**c**) YOLOv5 detected 6 clumps. The red rectangular box in (**c**) represents the missed clump.

**Figure 11.** Display of the DPC results. The differently colored dots represent the center of mass of a clump detected by MCD-YOLOv5, the red rhombus represents a clustering center, and the green square represents the location of the center of mass of a simulated clump.

**Figure 12.** Velocity-integrated intensity maps of the detection result. The white circles are the center of mass positions of the simulated clumps, and the red dots indicate the center of mass positions of the clumps that are detected and matched with the simulated clumps by MCD-YOLOv5 joint DPC.

**Figure 13.** Information on the simulated clumps. (**a**,**b**) Two examples of the integrated intensity maps of clumps detected in the second-quadrant synthesized data. The top three subplots show the l−b, l−v, and b−v maps of a clump, integrated along the three directions, while the middle and bottom panels show the peak spectrum and average spectrum of the clump, respectively.

**Figure 14.** Variation of recall with flux and PSNR for ClumpFind, FellWalker, and MCD-YOLOv5 joint DPC: (**a**) recall variation with flux, and (**b**) recall variation with PSNR.

**Figure 15.** Examples from the MCD-YOLOv5 dataset with the annotation information labeled on the samples. (**a**–**d**) Labeled slices of the observational data in the third Galactic quadrant. The red-labeled boxes show the annotation information.

**Figure 16.** Variation curves of loss, precision, and recall on the validation set of the real dataset during MCD-YOLOv5 training: (**a**) variation in localization loss, (**b**) variation in confidence loss, and (**c**) variation in precision and recall.

**Figure 17.** Detection results of MCD-YOLOv5 on observational data. (**a**,**c**) Slices taken along the velocity channels of the observational data. (**b**,**d**) The detection results of MCD-YOLOv5. The red-labeled boxes are the annotation information, and the yellow-labeled boxes are the detection results of MCD-YOLOv5.

**Figure 18.** Velocity-integrated intensity maps of detection results. (**a**,**b**) Velocity-integrated intensity maps of the detection results for the two examples. The white circles are the center of mass positions of the clumps detected by MCD-YOLOv5 joint DPC, and the red dots indicate the center of mass positions of the clumps detected by FellWalker.

**Figure 19.** Information on the detected clumps. (**a**,**b**) Two examples of the integrated intensity maps of clumps detected in the observational data. The top three subplots show the l−b, l−v, and b−v maps of a clump, integrated along the three directions, while the middle and bottom panels show the peak spectrum and average spectrum of the clump, respectively.

Parameter Name | Explanation | Range |
---|---|---|
Peak | Peak intensity of the clump | [0.7, 15] |
${\sigma}_{x}$ | Standard deviation on the Galactic longitude | [1, 4] × 2.3548 |
${\sigma}_{y}$ | Standard deviation on the Galactic latitude | [1, 4] × 2.3548 |
${\sigma}_{v}$ | Standard deviation on the velocity direction | [1, 7] × 2.3548 |
(${x}_{0}$, ${y}_{0}$, ${v}_{0}$) | Position of the center of mass of the clump | Randomization |
$\theta $ | Rotation angle on the Galactic plane | ${0}^{\circ}$–${180}^{\circ}$ |

Parameter Name | Explanation |
---|---|
ID | Clump number |
Peak1, Peak2, Peak3 | Peak coordinates of clumps |
Cen1, Cen2, Cen3 | Coordinates of the center of mass of clumps |
Size1, Size2, Size3 | Axis lengths of clumps in the Galactic plane and velocity direction |
$\theta $ | Rotation angles of clumps on the Galactic plane |
Sum | Total flux of clumps |
Peak | Peak intensity of clumps |

Description | Parameter Name | Explanation |
---|---|---|
Input | ${x}_{i}$ | Galactic longitude coordinates of the center of mass in the region detected by MCD-YOLOv5 |
Input | ${y}_{i}$ | Galactic latitude coordinates of the center of mass in the region detected by MCD-YOLOv5 |
Input | ${v}_{i}$ | Channels in the velocity direction detected by MCD-YOLOv5 |
Input | ${I}_{i}$ | Intensity of (${x}_{i},{y}_{i},{v}_{i}$) in the synthesized data |
Output | numClust | Number of clustered clump categories |

Model | Precision | Recall | AP |
---|---|---|---|
MCD-YOLOv5 | 0.969 | 0.935 | 0.972 |
YOLOv5 | 0.969 | 0.910 | 0.956 |

Parameter Name and Default Value |
---|
FELLWALKER.ALLOWEDGE = 1 |
FELLWALKER.CLEANITER = 1 |
FELLWALKER.FLATSLOPE = 2 × RMS |
FELLWALKER.FWHMBEAM = 2 |
FELLWALKER.MAXBAD = 0.05 |
FELLWALKER.MAXJUMP = 4 |
FELLWALKER.MINDIP = 1 × RMS |
FELLWALKER.MINHEIGHT = 3 × RMS |
FELLWALKER.MINPIX = 27 |
FELLWALKER.NOISE = 2 × RMS |
FELLWALKER.VELORES = 2 |

Parameter Name and Default Value |
---|
CLUMPFIND.ALLOWEDGE = 1 |
CLUMPFIND.DELTAT = 2 × RMS |
CLUMPFIND.FWHMBEAM = 2 |
CLUMPFIND.IDLALG = 1 |
CLUMPFIND.MAXBAD = 0.05 |
CLUMPFIND.MINPIX = 27 |
CLUMPFIND.NAXIS = 3 |
CLUMPFIND.NOISE = 2 × RMS |
CLUMPFIND.TLOW = 3 × RMS |
CLUMPFIND.VELORES = 2 |

**Table 7.** MCD-YOLOv5 joint DPC parameters. minRho is the minimum peak intensity of a clump, i.e., a point with at least this intensity can be used as a cluster center during the DPC clustering process. minRho can be set according to the intensity characteristics of the clumps in different regions. minDelta is the minimum pixel distance needed to distinguish between two clumps. minDelta can be set according to the sparseness of the clump distribution.

Parameter Name | Explanation | Default Value |
---|---|---|
minRho | The minimum intensity of a clump | [2, 5] × RMS |
minDelta | The minimum pixel distance to distinguish between two clumps | 4 |
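To show how minRho and minDelta act as thresholds, the following sketch applies a density-peak rule to a handful of detections. It is a minimal sketch under stated assumptions rather than the authors' implementation: the intensity ${I}_{i}$ is used directly as the density, the brightest point is given an infinite delta (the original DPC convention uses the largest distance instead), and the function name `density_peak_clustering` and the toy data are hypothetical.

```python
import numpy as np

def density_peak_clustering(points, intensity, min_rho, min_delta):
    """Cluster (x, y, v) detections; returns (labels, number of clusters).

    Assumption: intensity plays the role of the DPC density rho.
    Points that cannot be linked to any center keep the label -1.
    """
    points = np.asarray(points, dtype=float)
    rho = np.asarray(intensity, dtype=float)
    n = len(rho)
    labels = np.full(n, -1)
    if n == 0:
        return labels, 0
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    order = np.argsort(-rho, kind="stable")   # indices sorted by decreasing intensity
    delta = np.full(n, np.inf)                # brightest point keeps delta = inf (simplification)
    nearest_higher = np.full(n, -1)
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]                 # all points at least as bright as i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]                 # distance to the nearest brighter point
        nearest_higher[i] = j
    is_center = (rho >= min_rho) & (delta >= min_delta)
    labels[is_center] = np.arange(is_center.sum())
    for i in order:                           # propagate labels from bright to faint
        if labels[i] == -1 and nearest_higher[i] != -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, int(is_center.sum())

# Toy input: two groups of detections separated along the velocity axis.
detections = [(10, 10, 5), (10, 10, 6), (10, 10, 7), (10, 10, 30), (10, 10, 31), (10, 10, 32)]
intensities = [1.0, 3.0, 1.0, 1.0, 4.0, 1.0]
labels, num_clust = density_peak_clustering(detections, intensities, min_rho=2.0, min_delta=4.0)
print(labels, num_clust)
```

In this toy case the two bright detections exceed both thresholds and become centers, and the fainter neighbors inherit the label of the nearest brighter detection, so `num_clust` plays the role of the numClust output.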

Method | Matched Clumps | Recall |
---|---|---|
MCD-YOLOv5 joint DPC | 9841 | 98.41% |
FellWalker | 9770 | 97.70% |
ClumpFind | 9631 | 96.31% |
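The recall values in the table follow from dividing the number of matched clumps by the number of simulated clumps. The short check below assumes a total of 10,000 simulated clumps; that total is not stated in this section and is inferred from the matched counts and percentages. The helper name `recall` is hypothetical.

```python
def recall(matched, total):
    """Fraction of simulated clumps that were detected and matched."""
    return matched / total

total_simulated = 10_000  # assumption inferred from 9841 matched -> 98.41%
matched_counts = {
    "MCD-YOLOv5 joint DPC": 9841,
    "FellWalker": 9770,
    "ClumpFind": 9631,
}
for method, matched in matched_counts.items():
    print(f"{method}: {recall(matched, total_simulated):.2%}")
```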

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Hu, J.-B.; Huang, Y.; Zheng, S.; Chen, Z.-W.; Zeng, X.-Y.; Luo, X.-Y.; Long, C.
Molecular-Clump Detection Based on an Improved YOLOv5 Joint Density Peak Clustering. *Universe* **2023**, *9*, 480.
https://doi.org/10.3390/universe9110480
