Search Results (2)

Search Parameters:
Keywords = TDANet

16 pages, 7008 KB  
Article
Improving Top-Down Attention Network in Speech Separation by Employing Hand-Crafted Filterbank and Parameter-Sharing Transformer
by Aye Nyein Aung and Jeih-weih Hung
Electronics 2024, 13(21), 4174; https://doi.org/10.3390/electronics13214174 - 24 Oct 2024
Viewed by 1675
Abstract
The “cocktail party problem”, the challenge of isolating individual speech signals from a noisy mixture, has traditionally been addressed using statistical methods. However, deep neural networks (DNNs), with their ability to learn complex patterns, have emerged as superior solutions. DNNs excel at capturing intricate relationships between mixed audio signals and their respective speech sources, enabling them to effectively separate overlapping speech signals in challenging acoustic environments. Recent advances in speech separation systems have drawn inspiration from the brain’s hierarchical sensory information processing, incorporating top-down attention mechanisms. The top-down attention network (TDANet) employs an encoder–decoder architecture with top-down attention to enhance feature modulation and separation performance. By leveraging attention signals from multi-scale input features, TDANet effectively modifies features across different scales using a global attention (GA) module in the encoder–decoder design. Local attention (LA) layers then convert these modulated signals into high-resolution auditory characteristics. In this study, we propose two key modifications to TDANet. First, we substitute the fully trainable convolutional encoder with a deterministic hand-crafted multi-phase gammatone filterbank (MP-GTF), which mimics human hearing. Experimental results demonstrated that this substitution yielded comparable or even slightly superior performance to the original TDANet with a trainable encoder. Second, we replace the single multi-head self-attention (MHSA) layer in the global attention module with a transformer encoder block consisting of multiple MHSA layers. To optimize GPU memory utilization, we introduce a parameter sharing mechanism, dubbed “Reverse Cycle”, across layers in the transformer-based encoder. 
Our experimental findings indicated that these proposed modifications enabled TDANet to achieve competitive separation performance, rivaling state-of-the-art techniques, while maintaining superior computational efficiency.
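The "Reverse Cycle" parameter-sharing idea can be sketched in a few lines. This is an illustration only: the abstract does not spell out the exact sharing pattern, so the sketch below assumes that the second half of a 2N-layer transformer stack reuses the first half's parameter sets in reverse order; all names are hypothetical, not from the paper.

```python
# Minimal sketch of a "Reverse Cycle" parameter-sharing scheme across a
# stack of MHSA layers. Assumption (not stated in the abstract): the
# second half of the stack walks the first half's parameters backwards.

def build_reverse_cycle_stack(unique_layers):
    """Run 2N layers in sequence while storing only N parameter sets."""
    return list(unique_layers) + list(reversed(unique_layers))

# Three hypothetical MHSA layers, stood in for by plain dicts.
layers = [{"name": f"mhsa_{i}"} for i in range(3)]
stack = build_reverse_cycle_stack(layers)

# Six layers execute, but only three parameter sets exist in memory,
# which is the kind of saving the paper targets for GPU memory.
assert [layer["name"] for layer in stack] == [
    "mhsa_0", "mhsa_1", "mhsa_2", "mhsa_2", "mhsa_1", "mhsa_0"
]
```

Because `stack[3]` and `stack[2]` are the same object, any update to a shared parameter set affects both of its positions in the stack; the scheme trades depth-wise independence for roughly half the parameter memory of an unshared stack of the same depth.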
(This article belongs to the Special Issue Natural Language Processing Method: Deep Learning and Deep Semantics)

17 pages, 5959 KB  
Article
TDA-Net: A Novel Transfer Deep Attention Network for Rapid Response to Building Damage Discovery
by Haiming Zhang, Mingchang Wang, Yongxian Zhang and Guorui Ma
Remote Sens. 2022, 14(15), 3687; https://doi.org/10.3390/rs14153687 - 1 Aug 2022
Cited by 7 | Viewed by 2971
Abstract
The rapid and accurate discovery of damage information about affected buildings is of great significance for post-disaster emergency rescue. In some related studies, the models involved can detect damaged buildings relatively accurately, but their time cost is high; models that guarantee both detection accuracy and high efficiency are urgently needed. In this paper, we propose a new transfer-learning deep attention network (TDA-Net) that achieves a balance between accuracy and efficiency. The backbone of TDA-Net is a pair of deep residual networks pretrained on a large-scale dataset of disaster-damaged buildings; the pretrained networks are strongly sensitive to damage information, which ensures effective preliminary feature extraction. To give the network a more robust perception of changed features, a set of deep attention bidirectional encoding and decoding modules is connected after the backbone. When performing a new task, only a small number of samples are needed to train the network, and the damage information of buildings across the whole area can be extracted. The bidirectional encoding and decoding structure allows the two images to be input into the model independently, which effectively captures the features of each individual image and thereby improves detection accuracy. Our experiments on the xView2 dataset and on three disaster-region datasets achieve high detection accuracy, demonstrating the feasibility of our method.
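The two-stream input idea described in the abstract can be sketched minimally. This assumes only that both images pass through the same weight-shared encoder independently before a decoder compares their features; the function names and the difference-based "decoder" below are hypothetical stand-ins, not the paper's actual residual and attention modules.

```python
# Sketch of independent two-image encoding with shared weights,
# followed by a feature comparison. Purely illustrative: the real
# TDA-Net uses deep residual encoders and attention-based decoding.

def encode(image, weights):
    # Stand-in for the weight-shared deep encoder: a per-element scaling.
    return [w * x for w, x in zip(weights, image)]

def detect_change(pre_image, post_image, weights):
    # Each image gets its own forward pass through the SAME encoder,
    # so single-image features are captured before comparison.
    f_pre = encode(pre_image, weights)
    f_post = encode(post_image, weights)
    # Stand-in for the decoder: per-element feature difference.
    return [abs(a - b) for a, b in zip(f_pre, f_post)]

weights = [0.5, 1.0, 2.0]
change = detect_change([1, 2, 3], [1, 4, 3], weights)
assert change == [0.0, 2.0, 0.0]  # nonzero only where the inputs differ
```

Encoding each image in its own pass, rather than concatenating the pair at the input, is what lets the model form per-image features first and compare them afterwards, which is the design choice the abstract credits for the accuracy gain.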
(This article belongs to the Special Issue Recent Progress of Change Detection Based on Remote Sensing)
