This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Dual-Task Learning for Fine-Grained Bird Species and Behavior Recognition via Token Re-Segmentation, Multi-Scale Mixed Attention, and Feature Interleaving

College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(2), 966; https://doi.org/10.3390/app16020966 (registering DOI)
Submission received: 11 December 2025 / Revised: 10 January 2026 / Accepted: 14 January 2026 / Published: 17 January 2026
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Birds are important ecological indicators that sensitively reflect changes in the environment and its health. However, bird monitoring is challenging because of species diversity, variable behaviors, and distinct morphological characteristics. We therefore propose a parallel dual-branch hybrid CNN–Transformer architecture for feature extraction that simultaneously captures local and global image features, addressing the “local feature similarity” issue in the dual tasks of bird species and behavior recognition. The dual-task framework comprises three main components: the Token Re-segmentation Module (TRM), the Multi-scale Adaptive Module (MAM), and the Feature Interleaving Structure (FIS). MAM fuses hybrid attention to handle birds that appear at different scales: it models the interdependencies between the spatial and channel dimensions of features from different scales, enabling the model to adaptively select scale-specific feature representations and accommodate inputs of different scales. FIS is an efficient feature-sharing mechanism between parallel CNN branches. It delivers and fuses CNN feature maps across parallel layers in an interleaved manner and combines them with the features of the corresponding Transformer layer, sharing local and global information at different depths and promoting deep feature fusion across the parallel networks. Finally, TRM addresses the challenge of visually similar but distinct bird species and of similar poses that correspond to distinct behaviors. TRM adopts a two-step approach: it first locates discriminative regions and then performs fine segmentation on them. This enables the network to allocate relatively more attention to key areas while merging non-essential information and reducing interference from irrelevant details. Experiments on the self-built dataset show that, compared with state-of-the-art classification networks, the proposed network achieves the best performance, with 79.70% accuracy in bird species recognition and 76.21% in behavior recognition, as well as the best performance in dual-task recognition.
Keywords: token re-segmentation; multi-scale mixed attention; feature interleaving structure; dual-task; fine-grained classification; posture recognition
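
The two-step idea behind TRM described in the abstract (locate discriminative regions, then segment them more finely while merging the rest) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `re_segment`, the helper functions, the use of externally supplied per-patch importance scores, the 2x2 split of each key patch, and the averaging of the remaining patches into a single token are all assumptions made purely for illustration.

```python
def flatten(block):
    """Flatten a 2D block (list of rows) into a 1D token vector."""
    return [v for row in block for v in row]


def crop(image, top, left, size):
    """Cut a size x size block out of a 2D image given as a list of rows."""
    return [row[left:left + size] for row in image[top:top + size]]


def re_segment(image, scores, patch=4, top_k=1):
    """Hypothetical two-step token re-segmentation sketch.

    image  : 2D list of numbers whose height and width are multiples of `patch`
    scores : dict mapping a (row, col) patch index to an importance score
             (assumed to be given; the paper's scoring mechanism is not shown)
    Returns (fine_tokens, merged_token).
    """
    # Step 1: locate the top_k highest-scoring (discriminative) patches.
    ranked = sorted(scores, key=scores.get, reverse=True)
    key_patches = set(ranked[:top_k])

    half = patch // 2
    fine_tokens, rest = [], []
    for (r, c) in sorted(scores):
        top, left = r * patch, c * patch
        if (r, c) in key_patches:
            # Step 2: split each key patch into four finer sub-patch tokens.
            for dr in (0, half):
                for dc in (0, half):
                    fine_tokens.append(
                        flatten(crop(image, top + dr, left + dc, half)))
        else:
            rest.append(flatten(crop(image, top, left, patch)))

    # Merge all non-key patches into one token by element-wise averaging.
    merged = [sum(col) / len(rest) for col in zip(*rest)] if rest else []
    return fine_tokens, merged


# Toy 8x8 "image" split into a 2x2 grid of 4x4 patches.
image = [[i * 8 + j for j in range(8)] for i in range(8)]
scores = {(0, 0): 0.1, (0, 1): 0.7, (1, 0): 0.1, (1, 1): 0.1}
fine, merged = re_segment(image, scores, patch=4, top_k=1)
print(len(fine), len(merged))
```

In this toy setting the highest-scoring patch yields four fine tokens, while the three remaining patches collapse into one averaged token, so the token budget shifts toward the key region.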

Share and Cite

MDPI and ACS Style

Zhang, C.; Chen, Z.; Lin, Y.; Huang, X.; Lin, C.-W. Dual-Task Learning for Fine-Grained Bird Species and Behavior Recognition via Token Re-Segmentation, Multi-Scale Mixed Attention, and Feature Interleaving. Appl. Sci. 2026, 16, 966. https://doi.org/10.3390/app16020966


