Search Results (27)

Search Parameters:
Keywords = camouflaged object segmentation

28 pages, 9019 KB  
Article
SAF-SD: Self-Distillation Object Segmentation Method Based on Sequential Three-Way Mask and Attention Fusion
by Biao Wang, Jun Su, Volodymyr Kochan and Lingyu Yan
Sensors 2026, 26(7), 2170; https://doi.org/10.3390/s26072170 - 31 Mar 2026
Abstract
Transformer models have achieved powerful performance in various computer vision tasks. However, their black-box nature severely limits model interpretability and the reliability of real-world applications. Most existing interpretation methods generate explanation maps by perturbing masks from the last layer of the Transformer encoder, but they often overlook uncertain information in masks and detail loss during upsampling and downsampling, resulting in coarse localization, blurred boundaries, and significant background noise in explanations. To address these issues, this paper proposes a self-distillation object segmentation method based on sequential three-way mask and attention fusion (SAF-SD), targeting salient and camouflaged binary object segmentation tasks (sub-tasks of binary pixel-level segmentation). The method consists of two core modules: the sequential three-way mask (S3WM) module and the attention fusion (AF) module. The S3WM module performs strict threshold filtering on masks generated from the final-layer feature maps of the Transformer, aiming to accurately segment foreground objects from backgrounds via binary pixel-level prediction. The AF module aggregates attention matrices across all Transformer encoder layers to construct a cross-layer relation matrix, capturing global semantic dependencies among image patches (e.g., interactions between foreground, background, and edge regions). It then computes the importance score for each patch, refining details and suppressing noise in the initial explanation results. Extensive experimental results demonstrate that SAF-SD significantly outperforms existing baseline methods across key evaluation metrics.
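The cross-layer relation matrix described for the AF module is reminiscent of attention rollout: per-layer attention matrices are fused into a single patch-to-patch relation, from which per-patch importance scores follow. A minimal sketch of that general idea (not the authors' code; the function names and the residual weighting are assumptions):

```python
import numpy as np

def cross_layer_relation(attn_layers):
    """Fuse a list of row-stochastic attention matrices (one per encoder
    layer, heads already averaged) into one relation matrix by iterated
    matrix product with a residual term (attention-rollout style)."""
    n = attn_layers[0].shape[0]
    relation = np.eye(n)
    for attn in attn_layers:
        attn = 0.5 * attn + 0.5 * np.eye(n)          # residual connection
        attn = attn / attn.sum(axis=-1, keepdims=True)  # keep rows stochastic
        relation = attn @ relation
    return relation

def patch_importance(relation):
    # importance of a patch = average attention flowing into it
    return relation.mean(axis=0)
```

Because every fused layer stays row-stochastic, the resulting relation matrix is row-stochastic too, and the importance scores sum to one.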

30 pages, 16273 KB  
Article
PMG-SAM: Boosting Auto-Segmentation of SAM with Pre-Mask Guidance
by Jixue Gao, Xiaoyan Jiang, Anjie Wang, Yongbin Gao, Zhijun Fang and Michael S. Lew
Sensors 2026, 26(2), 365; https://doi.org/10.3390/s26020365 - 6 Jan 2026
Abstract
The Segment Anything Model (SAM), a foundational vision model, struggles with fully automatic segmentation of specific objects. Its “segment everything” mode, reliant on a grid-based prompt strategy, suffers from localization blindness and computational redundancy, leading to poor performance on tasks like Dichotomous Image Segmentation (DIS). To address this, we propose PMG-SAM, a framework that introduces a Pre-Mask Guided paradigm for automatic targeted segmentation. Our method employs a dual-branch encoder to generate a coarse global Pre-Mask, which then acts as a dense internal prompt to guide the segmentation decoder. A key component, our proposed Dense Residual Fusion Module (DRFM), iteratively co-refines multi-scale features to significantly enhance the Pre-Mask’s quality. Extensive experiments on challenging DIS and Camouflaged Object Segmentation (COS) tasks validate our approach. On the DIS-TE2 benchmark, PMG-SAM boosts the maximal F-measure from SAM’s 0.283 to 0.815. Notably, our fully automatic model’s performance surpasses even the ground-truth bounding box prompted modes of SAM and SAM2, while using only 22.9 M trainable parameters (58.8% of SAM2-Tiny). PMG-SAM thus presents an efficient and accurate paradigm for resolving the localization bottleneck of large vision models in prompt-free scenarios.
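The maximal F-measure quoted above (0.283 vs. 0.815) is obtained by sweeping binarization thresholds over the predicted map and keeping the best F-beta score. A sketch of the standard computation (β² = 0.3 is the usual saliency/DIS benchmark convention; the function name is illustrative):

```python
import numpy as np

def max_f_measure(pred, gt, beta2=0.3, steps=255):
    """Sweep thresholds over a [0, 1] prediction map against a binary
    ground-truth mask and return the best F-beta score found."""
    best = 0.0
    for t in np.linspace(0.0, 1.0, steps):
        binary = pred >= t
        tp = float(np.logical_and(binary, gt).sum())
        precision = tp / max(binary.sum(), 1)
        recall = tp / max(gt.sum(), 1)
        denom = beta2 * precision + recall
        if denom > 0:
            best = max(best, (1 + beta2) * precision * recall / denom)
    return best
```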
(This article belongs to the Section Intelligent Sensors)

16 pages, 7752 KB  
Article
Image Segmentation of Cottony Mass Produced by Euphyllura olivina (Hemiptera: Psyllidae) in Olive Trees Using Deep Learning
by Henry O. Velesaca, Francisca Ruano, Alice Gomez-Cantos and Juan A. Holgado-Terriza
Agriculture 2025, 15(23), 2485; https://doi.org/10.3390/agriculture15232485 - 29 Nov 2025
Abstract
The olive psyllid (Euphyllura olivina), previously considered a secondary pest in Spain, is becoming more prevalent due to climate change and rising average temperatures. Its cottony wax secretions can cause substantial damage to olive crops under certain climatic conditions. Traditional monitoring methods for this pest are often labor-intensive, subjective, and impractical for large-scale surveillance. This study presents an automatic image segmentation approach based on deep learning to detect and quantify the cottony masses produced by E. olivina in olive trees. A well-annotated image dataset is developed and published, and a thorough evaluation of current camouflaged object detection (COD) methods is carried out for this task. Our results show that deep learning-based segmentation enables accurate and non-invasive assessment of pest symptoms, even in challenging visual conditions. However, further calibration and field validation are required before these methods can be deployed for operational integrated pest management. This work establishes a public dataset and a baseline benchmark, providing a foundation for future research and decision-support tools in precision agriculture.

15 pages, 3988 KB  
Article
Boundary-Guided Differential Attention: Enhancing Camouflaged Object Detection Accuracy
by Hongliang Zhang, Bolin Xu and Sanxin Jiang
J. Imaging 2025, 11(11), 412; https://doi.org/10.3390/jimaging11110412 - 14 Nov 2025
Abstract
Camouflaged Object Detection (COD) is a challenging computer vision task aimed at accurately identifying and segmenting objects seamlessly blended into their backgrounds. This task has broad applications across medical image segmentation, defect detection, agricultural image detection, security monitoring, and scientific research. Traditional COD methods often struggle with precise segmentation due to the high similarity between camouflaged objects and their surroundings. In this study, we introduce a Boundary-Guided Differential Attention Network (BDA-Net) to address these challenges. BDA-Net first extracts boundary features by fusing multi-scale image features and applying channel attention. Subsequently, it employs a differential attention mechanism, guided by these boundary features, to highlight camouflaged objects and suppress background information. The weighted features are then progressively fused to generate accurate camouflage object masks. Experimental results on the COD10K, NC4K, and CAMO datasets demonstrate that BDA-Net outperforms most state-of-the-art COD methods, achieving higher accuracy. Here we show that our approach improves detection accuracy by up to 3.6% on key metrics, offering a robust solution for precise camouflaged object segmentation.
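As a rough illustration of how a boundary map can guide attention over features, the sketch below gates a feature map with a sigmoid boundary prior and subtracts a pooled background term. This is one plausible reading of "boundary-guided differential attention", not BDA-Net's actual formulation; all names are hypothetical:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def boundary_guided_attention(features, boundary_logits):
    """Illustrative sketch: a (H, W) boundary map gates a (C, H, W)
    feature map so responses near predicted boundaries are kept, while
    a globally pooled (background-dominated) response is subtracted."""
    gate = sigmoid(boundary_logits)                    # (H, W) in (0, 1)
    gated = features * gate[None, :, :]                # amplify boundary regions
    background = features.mean(axis=(1, 2), keepdims=True)  # (C, 1, 1)
    return gated - gate[None, :, :] * background       # emphasize deviations
```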
(This article belongs to the Section Computer Vision and Pattern Recognition)

14 pages, 1787 KB  
Article
HE-DMDeception: Adversarial Attack Network for 3D Object Detection Based on Human Eye and Deep Learning Model Deception
by Pin Zhang, Yawen Liu, Heng Liu, Yichao Teng, Jiazheng Ni, Zhuansun Xiaobo and Jiajia Wang
Information 2025, 16(10), 867; https://doi.org/10.3390/info16100867 - 7 Oct 2025
Abstract
This paper presents HE-DMDeception, a novel adversarial attack network that integrates human visual deception with deep model deception to enhance the security of 3D object detection. Existing patch-based and camouflage methods can mislead deep learning models but struggle to generate visually imperceptible, high-quality textures. Our framework employs a CycleGAN-based camouflage network to generate highly camouflaged background textures, while a dedicated deception module disrupts non-maximum suppression (NMS) and attention mechanisms through optimized constraints that balance attack efficacy and visual fidelity. To overcome the scarcity of annotated vehicle data, an image segmentation module based on the pre-trained Segment Anything (SAM) model is introduced, leveraging a two-stage training strategy combining semi-supervised self-training and supervised fine-tuning. Experimental results show that HE-DMDeception achieved minimum P@0.5 values of 50%, 55%, 20%, 25%, and 25% across the You Only Look Once version 8 (YOLOv8), Real-Time Detection Transformer (RT-DETR), Fast Region-based Convolutional Neural Network (Faster-RCNN), Single Shot MultiBox Detector (SSD), and Mask Region-based Convolutional Neural Network (Mask R-CNN) detection models, while maintaining high visual consistency with the original camouflage. These findings demonstrate the robustness and practicality of HE-DMDeception, offering new insights into 3D object detection adversarial attacks.
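The P@0.5 figures above depend on matching detections to ground truth at an IoU threshold of 0.5. A simplified sketch of that metric (greedy matching; real evaluators additionally handle confidence scores and one-to-one assignment):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision_at_iou(detections, ground_truths, thr=0.5):
    """Fraction of detections matching some ground-truth box with
    IoU >= thr (greedy, illustrative)."""
    if not detections:
        return 0.0
    hits = sum(1 for d in detections
               if any(box_iou(d, g) >= thr for g in ground_truths))
    return hits / len(detections)
```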

22 pages, 5943 KB  
Article
LiteCOD: Lightweight Camouflaged Object Detection via Holistic Understanding of Local-Global Features and Multi-Scale Fusion
by Abbas Khan, Hayat Ullah and Arslan Munir
AI 2025, 6(9), 197; https://doi.org/10.3390/ai6090197 - 22 Aug 2025
Abstract
Camouflaged object detection (COD) represents one of the most challenging tasks in computer vision, requiring sophisticated approaches to accurately extract objects that seamlessly blend within visually similar backgrounds. While contemporary techniques demonstrate promising detection performance, they predominantly suffer from computational complexity and resource requirements that severely limit their deployment in real-time applications, particularly on mobile devices and edge computing platforms. To address these limitations, we propose LiteCOD, an efficient lightweight framework that integrates local and global perceptions through holistic feature fusion and specially designed efficient attention mechanisms. Our approach achieves superior detection accuracy while maintaining the computational efficiency essential for practical deployment, with enhanced feature propagation and minimal computational overhead. Extensive experiments validate LiteCOD’s effectiveness: it surpasses existing lightweight methods with average improvements of 7.55% in the F-measure and an 8.08% overall performance gain across three benchmark datasets. The framework consistently outperforms 20 state-of-the-art methods across quantitative metrics, computational efficiency, and overall performance while achieving real-time inference with a significantly reduced parameter count of 5.15M. LiteCOD establishes a practical solution bridging the gap between detection accuracy and deployment feasibility in resource-constrained environments.

18 pages, 1956 KB  
Article
FCNet: A Transformer-Based Context-Aware Segmentation Framework for Detecting Camouflaged Fruits in Orchard Environments
by Ivan Roy Evangelista, Argel Bandala and Elmer Dadios
Technologies 2025, 13(8), 372; https://doi.org/10.3390/technologies13080372 - 20 Aug 2025
Abstract
Fruit segmentation is an essential task due to its importance in accurate disease prevention, yield estimation, and automated harvesting. However, accurate object segmentation in agricultural environments remains challenging due to visual complexities such as background clutter, occlusion, small object size, and color–texture similarities that lead to camouflaging. Traditional methods often struggle to detect partially occluded or visually blended fruits, leading to poor detection performance. In this study, we propose a context-aware segmentation framework designed for orchard-level mango fruit detection. We integrate multiscale feature extraction based on the PVTv2 architecture, a feature enhancement module using Atrous Spatial Pyramid Pooling (ASPP) and attention techniques, and a novel refinement mechanism employing Position-based Layer Normalization (PLN). We conducted a comparative study against established segmentation models, employing both quantitative and qualitative evaluations. Results demonstrate the superior performance of our model across all metrics. An ablation study validated the contributions of the enhancement and refinement modules, with the former yielding performance gains of 2.43%, 3.10%, 5.65%, 4.19%, and 4.35% in S-measure, mean E-measure, weighted F-measure, mean F-measure, and IoU, respectively, and the latter achieving improvements of 2.07%, 1.93%, 6.85%, 4.84%, and 2.73% in the same metrics.
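Atrous Spatial Pyramid Pooling, used here for feature enhancement, runs parallel dilated convolutions at several rates over the same input to capture multi-scale context. A naive single-channel NumPy sketch of the mechanism (kernels and rates are illustrative, not FCNet's configuration):

```python
import numpy as np

def dilated_conv3x3(x, kernel, rate):
    """Naive 3x3 dilated (atrous) convolution with zero padding; the
    dilation rate spaces out the kernel taps, enlarging the receptive
    field without adding parameters."""
    h, w = x.shape
    pad = rate
    xp = np.pad(x, pad)
    out = np.zeros_like(x, dtype=float)
    for i in range(3):
        for j in range(3):
            di, dj = (i - 1) * rate, (j - 1) * rate
            out += kernel[i, j] * xp[pad + di: pad + di + h,
                                     pad + dj: pad + dj + w]
    return out

def aspp(x, kernels, rates=(1, 6, 12)):
    """ASPP: run the input through parallel atrous branches at several
    rates and stack the responses as channels."""
    return np.stack([dilated_conv3x3(x, k, r)
                     for k, r in zip(kernels, rates)])
```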

27 pages, 1868 KB  
Article
SAM2-DFBCNet: A Camouflaged Object Detection Network Based on the Heira Architecture of SAM2
by Cao Yuan, Libang Liu, Yaqin Li and Jianxiang Li
Sensors 2025, 25(14), 4509; https://doi.org/10.3390/s25144509 - 21 Jul 2025
Abstract
Camouflaged Object Detection (COD) aims to segment objects that are highly integrated with their background, presenting significant challenges such as low contrast, complex textures, and blurred boundaries. Existing deep learning methods often struggle to achieve robust segmentation under these conditions. To address these limitations, this paper proposes a novel COD network, SAM2-DFBCNet, built upon the SAM2 Hiera architecture. Our network incorporates three key modules: (1) the Camouflage-Aware Context Enhancement Module (CACEM), which fuses local and global features through an attention mechanism to enhance contextual awareness in low-contrast scenes; (2) the Cross-Scale Feature Interaction Bridge (CSFIB), which employs a bidirectional convolutional GRU for the dynamic fusion of multi-scale features, effectively mitigating representation inconsistencies caused by complex textures and deformations; and (3) the Dynamic Boundary Refinement Module (DBRM), which combines channel and spatial attention mechanisms to optimize boundary localization accuracy and enhance segmentation details. Extensive experiments on three public datasets—CAMO, COD10K, and NC4K—demonstrate that SAM2-DFBCNet outperforms twenty state-of-the-art methods, achieving maximum improvements of 7.4%, 5.78%, and 4.78% in key metrics such as S-measure (Sα), F-measure (Fβ), and mean E-measure (Eϕ), respectively, while reducing the Mean Absolute Error (M) by 37.8%. These results validate the superior performance and robustness of our approach in complex camouflage scenarios.
(This article belongs to the Special Issue Transformer Applications in Target Tracking)

19 pages, 3691 KB  
Article
ATDMNet: Multi-Head Agent Attention and Top-k Dynamic Mask for Camouflaged Object Detection
by Rui Fu, Yuehui Li, Chih-Cheng Chen, Yile Duan, Pengjian Yao and Kaixin Zhou
Sensors 2025, 25(10), 3001; https://doi.org/10.3390/s25103001 - 9 May 2025
Abstract
Camouflaged object detection (COD) encounters substantial difficulties owing to the visual resemblance between targets and their environments, together with discrepancies in multiscale feature representation. Current methodologies face obstacles in feature discrimination, modeling long-range dependencies, fusing multi-scale details, and extracting boundary details. Consequently, we propose ATDMNet, a hybrid architecture that combines CNN and transformer components within a multi-stage feature extraction framework. ATDMNet employs Res2Net as the foundational encoder and incorporates two essential components: multi-head agent attention (MHA) and top-k dynamic mask (TDM). MHA improves local feature sensitivity and long-range dependency modeling by incorporating agent nodes and positional biases, whereas TDM sharpens attention with top-k operations and multiscale dynamic methods. The decoding phase utilizes bilinear upsampling and semantic guidance to enhance low-level features, ensuring precise segmentation. Enhanced performance is achieved through deep supervision and a hybrid loss function. Experiments on COD datasets (NC4K, COD10K, CAMO) demonstrate that ATDMNet establishes a new benchmark in both precision and efficiency.
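A top-k dynamic mask of the kind TDM describes can be sketched as keeping only the k strongest entries in each attention row and renormalizing; this is one plausible reading, not the authors' exact formulation:

```python
import numpy as np

def topk_mask(attn, k):
    """Zero all but the k largest entries per attention row, then
    renormalize so each row still sums to 1."""
    drop = np.argsort(attn, axis=-1)[..., :-k]   # indices of the n-k smallest
    masked = attn.copy()
    np.put_along_axis(masked, drop, 0.0, axis=-1)
    return masked / masked.sum(axis=-1, keepdims=True)
```

Sparsifying attention this way suppresses weak background correlations while preserving the dominant foreground relations.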
(This article belongs to the Special Issue Imaging and Sensing in Fiber Optics and Photonics: 2nd Edition)

1 page, 123 KB  
Correction
Correction: Kamran et al. Camouflage Object Segmentation Using an Optimized Deep-Learning Approach. Mathematics 2022, 10, 4219
by Muhammad Kamran, Saeed Ur Rehman, Talha Meraj, Khalid A. Alnowibet and Hafiz Tayyab Rauf
Mathematics 2025, 13(7), 1058; https://doi.org/10.3390/math13071058 - 25 Mar 2025
Abstract
In the original publication [...]
24 pages, 12658 KB  
Article
Camouflaged Object Detection with Enhanced Small-Structure Awareness in Complex Backgrounds
by Yaning Lv, Sanyang Liu, Yudong Gong and Jing Yang
Electronics 2025, 14(6), 1118; https://doi.org/10.3390/electronics14061118 - 12 Mar 2025
Abstract
Small-Structure Camouflaged Object Detection (SSCOD) is a highly promising yet challenging task, as small-structure targets often exhibit weaker features and occupy a significantly smaller proportion of the image compared to normal-sized targets. Such data are not only prevalent in existing benchmark camouflaged object detection datasets but also frequently encountered in real-world scenarios. Although existing camouflaged object detection (COD) methods have significantly improved detection accuracy, research specifically focused on SSCOD remains limited. To further advance the SSCOD task, we propose a detail-preserving multi-scale adaptive network architecture that incorporates the following key components: (1) An adaptive scaling strategy designed to mimic human visual perception when observing blurry targets. (2) An Attentive Atrous Spatial Pyramid Pooling (A2SPP) module, enabling each position in the feature map to autonomously learn the optimal feature scale. (3) A scale integration mechanism, leveraging Haar Wavelet-based Downsampling (HWD) and bilinear upsampling to preserve both contextual and fine-grained details across multiple scales. (4) A Feature Enhancement Module (FEM), specifically tailored to refine feature representations in small-structure detection scenarios. Extensive comparative experiments and ablation studies conducted on three camouflaged object detection datasets, as well as our proposed small-structure test datasets, demonstrated that our framework outperformed existing state-of-the-art (SOTA) methods. Notably, our approach achieved superior performance in detecting small-structured targets, highlighting its effectiveness and robustness in addressing the challenges of SSCOD tasks. Additionally, we conducted polyp segmentation experiments on four datasets, and the results showed that our framework is also well-suited for polyp segmentation, consistently outperforming other recent methods.
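Haar Wavelet-based Downsampling, used in the scale integration mechanism above, halves resolution while retaining the detail bands that plain pooling discards. A one-level 2D Haar decomposition can be sketched as follows (illustrative, single-channel):

```python
import numpy as np

def haar_downsample(x):
    """One level of 2D Haar decomposition on an (H, W) map with even
    sides: returns the low-pass LL band plus LH/HL/HH detail bands,
    each at half resolution."""
    a = x[0::2, 0::2]   # top-left of each 2x2 block
    b = x[0::2, 1::2]   # top-right
    c = x[1::2, 0::2]   # bottom-left
    d = x[1::2, 1::2]   # bottom-right
    ll = (a + b + c + d) / 4.0   # local average (what pooling keeps)
    lh = (a - b + c - d) / 4.0   # horizontal detail
    hl = (a + b - c - d) / 4.0   # vertical detail
    hh = (a - b - c + d) / 4.0   # diagonal detail
    return ll, lh, hl, hh
```

Because the four bands are an invertible transform of the input, no information is lost at the downsampling step, unlike average or max pooling.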

18 pages, 6037 KB  
Article
Cross-Layer Semantic Guidance Network for Camouflaged Object Detection
by Shiyu He, Chao Yin and Xiaoqiang Li
Electronics 2025, 14(4), 779; https://doi.org/10.3390/electronics14040779 - 17 Feb 2025
Abstract
Camouflaged Object Detection (COD) is a challenging task in computer vision due to the high visual similarity between camouflaged objects and their surrounding environments. Traditional methods relying on the late-stage fusion of high-level semantic features and low-level visual features have reached a performance plateau, limiting their ability to accurately segment object boundaries or enhance object localization. This paper proposes the Cross-layer Semantic Guidance Network (CSGNet), a novel framework designed to progressively integrate semantic and visual features across multiple stages, addressing these limitations. CSGNet introduces two innovative modules: the Cross-Layer Interaction Module (CLIM) and the Semantic Refinement Module (SRM). CLIM facilitates continuous cross-layer semantic interaction, refining high-level semantic information to provide consistent and effective guidance for detecting camouflaged objects. Meanwhile, SRM leverages this refined semantic guidance to enhance low-level visual features, employing feature-level attention mechanisms to suppress background noise and highlight critical object details. This progressive integration strategy ensures precise object localization and accurate boundary segmentation across challenging scenarios. Extensive experiments on three widely used COD benchmark datasets—CAMO, COD10K, and NC4K—demonstrate the effectiveness of CSGNet, achieving state-of-the-art performance with a mean error (M) of 0.042 on CAMO, 0.020 on COD10K, and 0.029 on NC4K.
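The mean error (M) values reported above are the per-pixel mean absolute error between the prediction map and the ground-truth mask, the standard M metric in COD benchmarks:

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error ('M' in COD benchmarks): average per-pixel
    absolute difference between a [0, 1] prediction map and the binary
    ground-truth mask."""
    return np.abs(pred.astype(float) - gt.astype(float)).mean()
```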

18 pages, 6655 KB  
Article
Curiosity-Driven Camouflaged Object Segmentation
by Mengyin Pang, Meijun Sun and Zheng Wang
Appl. Sci. 2025, 15(1), 173; https://doi.org/10.3390/app15010173 - 28 Dec 2024
Abstract
Camouflaged object segmentation refers to the task of accurately extracting objects that are seamlessly integrated within their surrounding environment. Existing deep-learning methods frequently encounter challenges in accurately segmenting camouflaged objects, particularly in capturing their complete and intricate details. To this end, we propose a novel method based on the Curiosity-Driven network, which is motivated by the innate human tendency for curiosity when encountering ambiguous regions and the subsequent drive to explore and observe objects’ details. Specifically, the proposed fusion bridge module aims to exploit the model’s inherent curiosity to fuse features extracted by the dual-branch feature encoder and capture the complete details of the object. Then, drawing inspiration from curiosity, the curiosity-refinement module is proposed to progressively refine the initial predictions by exploring unknown regions within the object’s surrounding environment. Notably, we develop a novel curiosity-calculation operation to discover and remove curiosity, leading to accurate segmentation results. Extensive quantitative and qualitative experiments demonstrate that the proposed model significantly outperforms the existing competitors on three challenging benchmark datasets. Compared with the recently proposed state-of-the-art method, our model achieves performance gains of 1.80% on average for Sα. Moreover, our model can be extended to the polyp and industrial defect segmentation tasks, validating its robustness and effectiveness.

20 pages, 6728 KB  
Article
Diffusion Model for Camouflaged Object Segmentation with Frequency Domain
by Wei Cai, Weijie Gao, Yao Ding, Xinhao Jiang, Xin Wang and Xingyu Di
Electronics 2024, 13(19), 3922; https://doi.org/10.3390/electronics13193922 - 3 Oct 2024
Abstract
The task of camouflaged object segmentation (COS) is a challenging endeavor that entails the identification of objects that closely blend in with their surrounding background. Furthermore, the camouflaged object’s obscure form and its subtle differentiation from the background present significant challenges during the feature extraction phase of the network. In order to extract more comprehensive information, thereby improving the accuracy of COS, we propose a diffusion model for a COS network that utilizes frequency domain information as auxiliary input, and we name it FreDiff. First, we propose a frequency auxiliary module (FAM) to extract frequency domain features. Then, we design a Global Fusion Module (GFM) to make FreDiff attend to global features. Finally, we propose an Upsample Enhancement Module (UEM) to enhance the detailed information of the features and perform upsampling before inputting them into the diffusion model. Additionally, taking into account the specific characteristics of COS, we develop a specialized training strategy for FreDiff. We compared FreDiff with 17 COS models on four challenging COS datasets. Experimental results showed that FreDiff outperforms or is on par with other state-of-the-art methods under five evaluation metrics.
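Frequency-domain auxiliary features of the kind FAM extracts are commonly obtained by splitting the centered FFT spectrum into low- and high-frequency bands. A sketch of that standard decomposition (the radius parameter is an assumption, not FreDiff's setting):

```python
import numpy as np

def frequency_split(img, radius_frac=0.1):
    """Split a 2D image into low- and high-frequency components using an
    ideal circular mask in the centered FFT spectrum."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    dist2 = (yy - h // 2) ** 2 + (xx - w // 2) ** 2
    mask = dist2 <= (radius_frac * min(h, w)) ** 2   # low-frequency disc
    low = np.fft.ifft2(np.fft.ifftshift(f * mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(f * (~mask))).real
    return low, high
```

Because the two masks partition the spectrum, the bands sum back to the original image, so no information is lost in the split.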
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)

20 pages, 2384 KB  
Article
A Cross-Level Iterative Subtraction Network for Camouflaged Object Detection
by Tongtong Hu, Chao Zhang, Xin Lyu, Xiaowen Sun, Shangjing Chen, Tao Zeng and Jiale Chen
Appl. Sci. 2024, 14(17), 8063; https://doi.org/10.3390/app14178063 - 9 Sep 2024
Abstract
Camouflaged object detection (COD) is a challenging task, aimed at segmenting objects that are similar in color and texture to their background. Sufficient multi-scale feature fusion is crucial for accurately segmenting object regions. However, most methods usually focus on information compensation, overlooking the difference between features, which is important for distinguishing the object from the background. To this end, we propose the cross-level iterative subtraction network (CISNet), which integrates information from cross-layer features and enhances details through iteration mechanisms. CISNet involves a cross-level iterative structure (CIS) for feature complementarity, where texture information is used to enrich high-level features and semantic information is used to enhance low-level features. In particular, we present a multi-scale strip convolution subtraction (MSCSub) module within CIS to extract difference information between cross-level features and fuse multi-scale features, which improves the feature representation and guides accurate segmentation. Furthermore, an enhanced guided attention (EGA) module is presented to refine features by deeply mining local context information and capturing a broader range of relationships between different feature maps in a top-down manner. Extensive experiments conducted on four benchmark datasets demonstrate that our model outperforms the state-of-the-art COD models in all evaluation metrics.
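The cross-level difference idea behind MSCSub can be illustrated by upsampling the coarser high-level map to the low-level resolution and taking an absolute difference, which highlights where the two levels disagree (typically at object boundaries). A minimal sketch, with nearest-neighbour upsampling assumed:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (H, W) feature map."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def subtraction_feature(low_level, high_level):
    """Cross-level subtraction (illustrative): bring the coarser
    high-level map to the low-level resolution, then take the absolute
    difference as a disagreement map."""
    return np.abs(low_level - upsample2x(high_level))
```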
