
Search Results (11)

Search Parameters:
Keywords = anchor-free two-stage detector

23 pages, 51004 KB  
Article
An Intelligent Ship Detection Algorithm Based on Visual Sensor Signal Processing for AIoT-Enabled Maritime Surveillance Automation
by Liang Zhang, Yueqiu Jiang, Wei Yang and Bo Liu
Sensors 2026, 26(3), 767; https://doi.org/10.3390/s26030767 - 23 Jan 2026
Viewed by 211
Abstract
Oriented object detection constitutes a fundamental yet challenging task in Artificial Intelligence of Things (AIoT)-enabled maritime surveillance, where real-time processing of dense visual streams is imperative. However, existing detectors suffer from three critical limitations: sequential attention mechanisms that fail to capture coupled spatial–channel dependencies, unconstrained deformable convolutions that yield unstable predictions for elongated vessels, and center-based distance metrics that ignore angular alignment in sample assignment. To address these challenges, we propose JAOSD (Joint Attention-based Oriented Ship Detection), an anchor-free framework incorporating three novel components: (1) a joint attention module that processes spatial and channel branches in parallel with coupled fusion, (2) an adaptive geometric convolution with two-stage offset refinement and spatial consistency regularization, and (3) an orientation-aware Adaptive Sample Selection strategy based on corner-aware distance metrics. Extensive experiments on three benchmarks demonstrate that JAOSD achieves state-of-the-art performance—94.74% mAP on HRSC2016, 92.43% AP50 on FGSD2021, and 80.44% mAP on DOTA v1.0—while maintaining real-time inference at 42.6 FPS. Cross-domain evaluation on the Singapore Maritime Dataset further confirms robust generalization capability from aerial to shore-based surveillance scenarios without domain adaptation. Full article
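The abstract names a corner-aware distance metric for sample assignment but does not spell it out. One plausible reading, sketched below purely as an illustration (function names and the averaging rule are assumptions, not the paper's formulation), scores a candidate location by its mean distance to the four corners of the rotated ground-truth box, so that the box's orientation, not just its center, affects assignment:

```python
import math

def rotated_corners(cx, cy, w, h, theta):
    """Four corners of an oriented box (center, size, angle in radians)."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        corners.append((cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t))
    return corners

def corner_aware_distance(px, py, box):
    """Mean distance from a candidate point to the four box corners.

    Unlike a plain center distance, this changes as the box rotates
    relative to the point, so angular alignment influences which
    locations are selected as positive samples.
    """
    return sum(math.hypot(px - x, py - y) for x, y in rotated_corners(*box)) / 4.0
```

For a point at the box center the score reduces to the corner radius; off-center points are penalized more the further they sit from the box's oriented extent.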

19 pages, 4284 KB  
Article
AOGC: Anchor-Free Oriented Object Detection Based on Gaussian Centerness
by Zechen Wang, Chun Bao, Jie Cao and Qun Hao
Remote Sens. 2023, 15(19), 4690; https://doi.org/10.3390/rs15194690 - 25 Sep 2023
Cited by 5 | Viewed by 3251
Abstract
Oriented object detection is a challenging task in scene text detection and remote sensing image analysis, and it has attracted extensive attention due to the development of deep learning in recent years. Currently, mainstream oriented object detectors are anchor-based methods. These methods increase the computational load of the network and cause a large amount of anchor box redundancy. In order to address this issue, we proposed an anchor-free oriented object detection method based on Gaussian centerness (AOGC), which is a single-stage anchor-free detection method. Our method uses contextual attention FPN (CAFPN) to obtain the contextual information of the target. Then, we designed a label assignment method for the oriented objects, which can select positive samples with higher quality and is suitable for large aspect ratio targets. Finally, we developed a Gaussian kernel-based centerness branch that can effectively determine the significance of different anchors. AOGC achieved a mAP of 74.30% on the DOTA-1.0 datasets and 89.80% on the HRSC2016 datasets, respectively. Our experimental results show that AOGC exhibits superior performance to other methods in single-stage oriented object detection and achieves similar performance to the two-stage methods. Full article
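The Gaussian-kernel centerness branch described above can be illustrated with a minimal sketch. The exact kernel and normalization used by AOGC are not given in the abstract; this version (with an assumed bandwidth `sigma`) just shows the core idea: a weight that is 1.0 at the box center and decays smoothly toward the edges, with offsets normalized by box size so large-aspect-ratio targets are treated comparably:

```python
import math

def gaussian_centerness(px, py, cx, cy, w, h, sigma=0.5):
    """Gaussian-kernel centerness for a point (px, py) inside a box.

    Offsets are normalized by the half-width and half-height, so a
    point halfway to the edge of a long, thin box gets the same score
    as a point halfway to the edge of a square box.
    """
    dx = (px - cx) / (w / 2)
    dy = (py - cy) / (h / 2)
    return math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
```

Such a weight can be used to down-rank predictions from locations far from object centers, the role the abstract assigns to the centerness branch.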

21 pages, 8083 KB  
Article
PointPainting: 3D Object Detection Aided by Semantic Image Information
by Zhentong Gao, Qiantong Wang, Zongxu Pan, Zhenyu Zhai and Hui Long
Sensors 2023, 23(5), 2868; https://doi.org/10.3390/s23052868 - 6 Mar 2023
Cited by 6 | Viewed by 6243
Abstract
A multi-modal 3D object-detection method, based on data from cameras and LiDAR, has become a subject of research interest. PointPainting proposes a method for improving point-cloud-based 3D object detectors using semantic information from RGB images. However, this method still needs to improve on the following two complications: first, there are faulty parts in the image semantic segmentation results, leading to false detections. Second, the commonly used anchor assigner only considers the intersection over union (IoU) between the anchors and ground truth boxes, meaning that some anchors contain few target LiDAR points assigned as positive anchors. In this paper, three improvements are suggested to address these complications. Specifically, a novel weighting strategy is proposed for each anchor in the classification loss. This enables the detector to pay more attention to anchors containing inaccurate semantic information. Then, SegIoU, which incorporates semantic information, instead of IoU, is proposed for the anchor assignment. SegIoU measures the similarity of the semantic information between each anchor and ground truth box, avoiding the defective anchor assignments mentioned above. In addition, a dual-attention module is introduced to enhance the voxelized point cloud. The experiments demonstrate that the proposed modules obtained significant improvements in various methods, consisting of single-stage PointPillars, two-stage SECOND-IoU, anchor-base SECOND, and an anchor-free CenterPoint on the KITTI dataset. Full article
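The abstract states that SegIoU compares the semantic information of an anchor and a ground-truth box rather than their geometric overlap, without giving the formula. A toy sketch of that idea, under the assumption that each region is summarized by the histogram of semantic classes painted onto its LiDAR points (the function name and histogram-intersection rule are illustrative, not the paper's definition):

```python
from collections import Counter

def seg_iou(anchor_classes, gt_classes):
    """Toy SegIoU-style score via histogram intersection-over-union.

    Each argument is the list of semantic class labels of the points
    inside a region. A purely geometric IoU would ignore whether the
    anchor actually covers any target points; this score is high only
    when the two regions contain similar semantic content.
    """
    a, g = Counter(anchor_classes), Counter(gt_classes)
    classes = a.keys() | g.keys()
    inter = sum(min(a[c], g[c]) for c in classes)
    union = sum(max(a[c], g[c]) for c in classes)
    return inter / union if union else 0.0
```

An anchor overlapping a car box but containing mostly road-labeled points would score low here even if its geometric IoU were high, which is the defective assignment the abstract aims to avoid.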
(This article belongs to the Section Sensing and Imaging)

26 pages, 11097 KB  
Article
ATSD: Anchor-Free Two-Stage Ship Detection Based on Feature Enhancement in SAR Images
by Canming Yao, Pengfei Xie, Lei Zhang and Yuyuan Fang
Remote Sens. 2022, 14(23), 6058; https://doi.org/10.3390/rs14236058 - 29 Nov 2022
Cited by 17 | Viewed by 3231
Abstract
Synthetic aperture radar (SAR) ship detection in harbors is challenging due to the similar backscattering of ship targets to surrounding background interference. Prevalent two-stage ship detectors usually use an anchor-based region proposal network (RPN) to search for the possible regions of interest on the whole image. However, most pre-defined anchor boxes are redundantly and randomly tiled on the image, yielding low-quality object proposals. To address these issues, this paper proposes a novel detection method combined with two feature enhancement modules to improve ship detection capability. First, we propose a flexible anchor-free detector (AFD) to generate fewer but higher-quality proposals around the object centers in a keypoint prediction manner, which completely avoids the complicated computation in RPN, such as calculating overlaps with anchor boxes. Second, we leverage the proposed spatial insertion attention (SIA) module to enhance the feature discrimination between ship targets and background interference, encouraging the detector to pay attention to the localization accuracy of ship targets. Third, a novel weighted cascade feature fusion (WCFF) module is proposed to adaptively aggregate multi-scale semantic features and thus help the detector boost the detection performance of multi-scale ships in complex scenes. Finally, combining the newly designed AFD and SIA/WCFF modules, we present a new detector, named the anchor-free two-stage ship detector (ATSD), for SAR ship detection under complex background interference. Extensive experiments on two public datasets, i.e., SSDD and HRSID, verify that our ATSD delivers state-of-the-art detection performance over conventional detectors. Full article
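The keypoint-style proposal generation the abstract describes — a few high-scoring center locations instead of thousands of tiled anchors — can be sketched as simple peak picking on a predicted center heatmap. This is an illustrative simplification (real detectors apply local-maximum suppression and regress box sizes at each peak), not the paper's AFD:

```python
def topk_center_proposals(heatmap, k=3):
    """Pick the k highest-scoring locations on a center heatmap.

    Each selected (x, y) serves as an object proposal, replacing the
    anchor-based RPN step of tiling boxes and computing anchor-to-
    ground-truth overlaps across the whole image.
    """
    scored = [(heatmap[y][x], x, y)
              for y in range(len(heatmap)) for x in range(len(heatmap[0]))]
    scored.sort(reverse=True)
    return [(x, y) for _, x, y in scored[:k]]
```

Because only k locations survive, the second stage refines a handful of high-quality candidates rather than filtering anchor redundancy.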
(This article belongs to the Special Issue Small or Moving Target Detection with Advanced Radar System)

44 pages, 25511 KB  
Article
Nemo: An Open-Source Transformer-Supercharged Benchmark for Fine-Grained Wildfire Smoke Detection
by Amirhessam Yazdi, Heyang Qin, Connor B. Jordan, Lei Yang and Feng Yan
Remote Sens. 2022, 14(16), 3979; https://doi.org/10.3390/rs14163979 - 16 Aug 2022
Cited by 21 | Viewed by 7697
Abstract
Deep-learning (DL)-based object detection algorithms can greatly benefit the community at large in fighting fires, advancing climate intelligence, and reducing health complications caused by hazardous smoke particles. Existing DL-based techniques, which are mostly based on convolutional networks, have proven to be effective in wildfire detection. However, there is still room for improvement. First, existing methods tend to have some commercial aspects, with limited publicly available data and models. In addition, studies aiming at the detection of wildfires at the incipient stage are rare. Smoke columns at this stage tend to be small, shallow, and often far from view, with low visibility. This makes finding and labeling enough data to train an efficient deep learning model very challenging. Finally, the inherent locality of convolution operators limits their ability to model long-range correlations between objects in an image. Recently, encoder–decoder transformers have emerged as interesting solutions beyond natural language processing to help capture global dependencies via self- and inter-attention mechanisms. We propose Nemo: a set of evolving, free, and open-source datasets, processed in standard COCO format, and wildfire smoke and fine-grained smoke density detectors, for use by the research community. We adapt Facebook’s DEtection TRansformer (DETR) to wildfire detection, which results in a much simpler technique, where the detection does not rely on convolution filters and anchors. Nemo is the first open-source benchmark for wildfire smoke density detection and Transformer-based wildfire smoke detection tailored to the early incipient stage. Two popular object detection algorithms (Faster R-CNN and RetinaNet) are used as alternatives and baselines for extensive evaluation. Our results confirm the superior performance of the transformer-based method in wildfire smoke detection across different object sizes. 
Moreover, we tested our model with 95 video sequences of wildfire starts from the public HPWREN database. Our model detected 97.9% of the fires in the incipient stage and 80% within 5 min from the start. On average, our model detected wildfire smoke within 3.6 min from the start, outperforming the baselines. Full article
(This article belongs to the Special Issue Artificial Intelligence for Natural Hazards (AI4NH))

41 pages, 4732 KB  
Review
Deep Learning for SAR Ship Detection: Past, Present and Future
by Jianwei Li, Congan Xu, Hang Su, Long Gao and Taoyang Wang
Remote Sens. 2022, 14(11), 2712; https://doi.org/10.3390/rs14112712 - 5 Jun 2022
Cited by 183 | Viewed by 17142
Abstract
After the revival of deep learning in computer vision in 2012, SAR ship detection also entered the deep learning era. Deep learning-based computer vision algorithms work in an end-to-end pipeline, without the need to design features manually, and deliver remarkable performance. As a result, they are also used to detect ships in SAR images. This direction began with the paper we published at BIGSARDATA 2017, in which the first dataset, SSDD, was used and shared with peers. Since then, many researchers have focused their attention on this field. In this paper, we analyze the past, present, and future of deep learning-based ship detection algorithms in SAR images. In the past section, we analyze the difference between traditional CFAR (constant false alarm rate)-based and deep learning-based detectors through theory and experiment: the traditional method is unsupervised while deep learning is strongly supervised, and their performance differs severalfold. In the present part, we analyze the 177 published papers on SAR ship detection, highlighting the dataset, algorithm, performance, deep learning framework, country, timeline, etc. After that, we discuss in detail the use of single-stage, two-stage, anchor-free, train-from-scratch, oriented-bounding-box, multi-scale, and real-time detectors across the 177 papers, and analyze the trade-offs between speed and accuracy. In the future part, we list the open problems and directions of this field. We find that, over the past five years, the AP50 on SSDD has risen from 78.8% in 2017 to 97.8% in 2022. Additionally, we think that researchers should design algorithms according to the specific characteristics of SAR images. What we should do next is to bridge the gap between SAR ship detection and computer vision by merging the small datasets into a large one and formulating corresponding standards and benchmarks. We expect that this survey of 177 papers can help people better understand these algorithms and stimulate more research in this field. Full article
(This article belongs to the Special Issue Synthetic Aperture Radar (SAR) Meets Deep Learning)

21 pages, 20011 KB  
Article
Oriented Object Detection in Remote Sensing Images with Anchor-Free Oriented Region Proposal Network
by Jianxiang Li, Yan Tian, Yiping Xu and Zili Zhang
Remote Sens. 2022, 14(5), 1246; https://doi.org/10.3390/rs14051246 - 3 Mar 2022
Cited by 20 | Viewed by 6744
Abstract
Oriented object detection is a fundamental and challenging task in remote sensing image analysis that has recently drawn much attention. Currently, mainstream oriented object detectors are based on densely placed predefined anchors. However, the high number of anchors aggravates the positive and negative sample imbalance problem, which may lead to duplicate detections or missed detections. To address the problem, this paper proposes a novel anchor-free two-stage oriented object detector. We propose the Anchor-Free Oriented Region Proposal Network (AFO-RPN) to generate high-quality oriented proposals without enormous predefined anchors. To deal with rotation problems, we also propose a new representation of an oriented box based on a polar coordinate system. To solve the severe appearance ambiguity problems faced by anchor-free methods, we use a Criss-Cross Attention Feature Pyramid Network (CCA-FPN) to exploit the contextual information of each pixel and its neighbors in order to enhance the feature representation. Extensive experiments on three public remote sensing benchmarks—DOTA, DIOR-R, and HRSC2016—demonstrate that our method can achieve very promising detection performance, with a mean average precision (mAP) of 80.68%, 67.15%, and 90.45%, respectively, on the benchmarks. Full article
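The abstract mentions a polar-coordinate representation of oriented boxes without detailing it. One natural polar parameterization, sketched below as an assumption rather than the paper's exact formulation, describes each corner by its radius and angle relative to the box center; rotating the box then shifts only the angles, leaving the radii invariant:

```python
import math

def box_to_polar(cx, cy, w, h, theta):
    """Represent an oriented box by the polar coordinates (r, phi) of
    its four corners relative to the box center.

    Rotation enters only through the angle phi, which can make angle
    regression better conditioned than regressing rotated Cartesian
    corner offsets directly.
    """
    corners_local = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    polar = []
    for dx, dy in corners_local:
        r = math.hypot(dx, dy)
        phi = math.atan2(dy, dx) + theta  # the box rotation shifts every corner angle equally
        polar.append((r, phi))
    return (cx, cy), polar
```

For a w x h box all four radii equal sqrt((w/2)^2 + (h/2)^2) regardless of theta, so size and orientation decouple cleanly in this representation.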
(This article belongs to the Special Issue Deep Learning and Computer Vision in Remote Sensing)

16 pages, 5409 KB  
Article
Transformer for Tree Counting in Aerial Images
by Guang Chen and Yi Shang
Remote Sens. 2022, 14(3), 476; https://doi.org/10.3390/rs14030476 - 20 Jan 2022
Cited by 32 | Viewed by 8017
Abstract
The number of trees and their spatial distribution are key information for forest management. In recent years, deep learning-based approaches have been proposed and shown promising results in lowering the expensive labor cost of a forest inventory. In this paper, we propose a new efficient deep learning model called density transformer or DENT for automatic tree counting from aerial images. The architecture of DENT contains a multi-receptive field convolutional neural network to extract visual feature representation from local patches and their wide context, a transformer encoder to transfer contextual information across correlated positions, a density map generator to generate spatial distribution map of trees, and a fast tree counter to estimate the number of trees in each input image. We compare DENT with a variety of state-of-art methods, including one-stage and two-stage, anchor-based and anchor-free deep neural detectors, and different types of fully convolutional regressors for density estimation. The methods are evaluated on a new large dataset we built and an existing cross-site dataset. DENT achieves top accuracy on both datasets, significantly outperforming most of the other methods. We have released our new dataset, called Yosemite Tree Dataset, containing a 10 km2 rectangular study area with around 100k trees annotated, as a benchmark for public access. Full article
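The density-map counting idea behind DENT's final stage can be sketched in a few lines: each annotated tree contributes a unit of mass (here a normalized Gaussian) to a map, so the total count is simply the sum over all pixels, with no per-object detection step. This is a generic density-estimation sketch, not DENT's architecture; grid size and sigma are arbitrary choices for illustration:

```python
import math

def gaussian_blob(size, cx, cy, sigma=1.0):
    """A Gaussian centered at one tree, normalized to unit total mass."""
    blob = [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]
    total = sum(map(sum, blob))
    return [[v / total for v in row] for row in blob]

def density_map(size, tree_centers, sigma=1.0):
    """Superimpose one unit-mass blob per annotated tree center."""
    dm = [[0.0] * size for _ in range(size)]
    for cx, cy in tree_centers:
        blob = gaussian_blob(size, cx, cy, sigma)
        for y in range(size):
            for x in range(size):
                dm[y][x] += blob[y][x]
    return dm

def count_trees(dm):
    # Each tree contributed unit mass, so the map's sum is the count.
    return sum(map(sum, dm))
```

A regressor trained to predict such maps from imagery inherits this property: summing its output yields the estimated tree count even in dense canopy where individual crowns overlap.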

25 pages, 36080 KB  
Article
Benchmarking Anchor-Based and Anchor-Free State-of-the-Art Deep Learning Methods for Individual Tree Detection in RGB High-Resolution Images
by Pedro Zamboni, José Marcato Junior, Jonathan de Andrade Silva, Gabriela Takahashi Miyoshi, Edson Takashi Matsubara, Keiller Nogueira and Wesley Nunes Gonçalves
Remote Sens. 2021, 13(13), 2482; https://doi.org/10.3390/rs13132482 - 25 Jun 2021
Cited by 32 | Viewed by 8392
Abstract
Urban forests contribute to maintaining livability and increase the resilience of cities in the face of population growth and climate change. Information about the geographical distribution of individual trees is essential for the proper management of these systems. RGB high-resolution aerial images have emerged as a cheap and efficient source of data, although detecting and mapping single trees in an urban environment is a challenging task. Thus, we propose the evaluation of novel methods for single tree crown detection, as most of these methods have not been investigated in remote sensing applications. A total of 21 methods were investigated, including anchor-based (one and two-stage) and anchor-free state-of-the-art deep-learning methods. We used two orthoimages divided into 220 non-overlapping patches of 512 × 512 pixels with a ground sample distance (GSD) of 10 cm. The orthoimages were manually annotated, and 3382 single tree crowns were identified as the ground-truth. Our findings show that the anchor-free detectors achieved the best average performance with an AP50 of 0.686. We observed that the two-stage anchor-based and anchor-free methods showed better performance for this task, emphasizing the FSAF, Double Heads, CARAFE, ATSS, and FoveaBox models. RetinaNet, which is currently commonly applied in remote sensing, did not show satisfactory performance, and Faster R-CNN had lower results than the best methods but with no statistically significant difference. Our findings contribute to a better understanding of the performance of novel deep-learning methods in remote sensing applications and could be used as an indicator of the most suitable methods in such applications. Full article

23 pages, 5360 KB  
Article
On the Performance of One-Stage and Two-Stage Object Detectors in Autonomous Vehicles Using Camera Data
by Manuel Carranza-García, Jesús Torres-Mateo, Pedro Lara-Benítez and Jorge García-Gutiérrez
Remote Sens. 2021, 13(1), 89; https://doi.org/10.3390/rs13010089 - 29 Dec 2020
Cited by 223 | Viewed by 23096
Abstract
Object detection using remote sensing data is a key task of the perception systems of self-driving vehicles. While many generic deep learning architectures have been proposed for this problem, there is little guidance on their suitability when using them in a particular scenario such as autonomous driving. In this work, we aim to assess the performance of existing 2D detection systems on a multi-class problem (vehicles, pedestrians, and cyclists) with images obtained from the on-board camera sensors of a car. We evaluate several one-stage (RetinaNet, FCOS, and YOLOv3) and two-stage (Faster R-CNN) deep learning meta-architectures under different image resolutions and feature extractors (ResNet, ResNeXt, Res2Net, DarkNet, and MobileNet). These models are trained using transfer learning and compared in terms of both precision and efficiency, with special attention to the real-time requirements of this context. For the experimental study, we use the Waymo Open Dataset, which is the largest existing benchmark. Despite the rising popularity of one-stage detectors, our findings show that two-stage detectors still provide the most robust performance. Faster R-CNN models outperform one-stage detectors in accuracy, being also more reliable in the detection of minority classes. Faster R-CNN Res2Net-101 achieves the best speed/accuracy tradeoff but needs lower resolution images to reach real-time speed. Furthermore, the anchor-free FCOS detector is a slightly faster alternative to RetinaNet, with similar precision and lower memory usage. Full article

16 pages, 7698 KB  
Article
SAFDet: A Semi-Anchor-Free Detector for Effective Detection of Oriented Objects in Aerial Images
by Zhenyu Fang, Jinchang Ren, He Sun, Stephen Marshall, Junwei Han and Huimin Zhao
Remote Sens. 2020, 12(19), 3225; https://doi.org/10.3390/rs12193225 - 3 Oct 2020
Cited by 13 | Viewed by 4254
Abstract
An oriented bounding box (OBB) is preferable over a horizontal bounding box (HBB) in accurate object detection. Most of existing works utilize a two-stage detector for locating the HBB and OBB, respectively, which have suffered from the misaligned horizontal proposals and the interference from complex backgrounds. To tackle these issues, region of interest transformer and attention models were proposed, yet they are extremely computationally intensive. To this end, we propose a semi-anchor-free detector (SAFDet) for object detection in aerial images, where a rotation-anchor-free-branch (RAFB) is used to enhance the foreground features via precisely regressing the OBB. Meanwhile, a center-prediction-module (CPM) is introduced for enhancing object localization and suppressing the background noise. Both RAFB and CPM are deployed during training, avoiding increased computational cost of inference. By evaluating on DOTA and HRSC2016 datasets, the efficacy of our approach has been fully validated for a good balance between the accuracy and computational cost. Full article
