Natural Language Processing Method: Deep Learning and Deep Semantics

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Electronic Multimedia".

Deadline for manuscript submissions: 15 September 2024 | Viewed by 1561

Special Issue Editors

Dr. Wei Ji
School of Computing, National University of Singapore, Singapore 117417, Singapore
Interests: computer vision; video understanding; vision and language

Dr. Yiming Wu
School of Electrical and Information Engineering, The University of Sydney, Sydney, NSW 2006, Australia
Interests: computer vision; machine learning; vision and language

Special Issue Information

Dear Colleagues,

With the rapid development of deep learning technology, intelligent cross-modal systems have garnered a great deal of interest from academia and industry alike. Accordingly, we have witnessed the recent dramatic emergence of AI-based vision–language applications across a wide range of fields. This Special Issue invites original research addressing important, innovative, and timely challenges in the community. Potential topics include, but are not limited to:

  • visual captioning (image, video);
  • visual question answering (image, video);
  • visual text retrieval (image, video);
  • storytelling;
  • dense visual captioning;
  • visual dialog (image, video);
  • visual grounding;
  • scene graph generation.

Dr. Wei Ji
Dr. Yiming Wu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer vision
  • natural language processing
  • machine learning
  • artificial intelligence
  • visual understanding and recognition
  • deep learning

Published Papers (1 paper)


Research

13 pages, 1724 KiB  
Article
Context-Dependent Multimodal Sentiment Analysis Based on a Complex Attention Mechanism
by Lujuan Deng, Boyi Liu, Zuhe Li, Jiangtao Ma and Hanbing Li
Electronics 2023, 12(16), 3516; https://doi.org/10.3390/electronics12163516 - 20 Aug 2023
Cited by 1 | Viewed by 1238
Abstract
Multimodal sentiment analysis aims to understand people's attitudes and opinions from different data forms. Traditional modality fusion methods for multimodal sentiment analysis concatenate or multiply the various modalities without fully exploiting context information and the correlations between modalities. To address this problem, this article proposes a multimodal sentiment analysis framework based on a recurrent neural network with a complex attention mechanism. First, the raw data are preprocessed and numerical feature representations are obtained via feature extraction. Next, the numerical features are input into the recurrent neural network, and the outputs are multimodally fused by a complex attention mechanism layer. The objective of the complex attention mechanism is to leverage enhanced non-linearity to more effectively capture the inter-modal correlations, thereby improving the performance of multimodal sentiment analysis. Finally, the fused results are fed into the classification layer, which produces the sentiment output. This process effectively captures the semantic information and contextual relationships of the input sequences and fuses the different pieces of modal information. Our model was tested on the CMU-MOSEI dataset, achieving an accuracy of 82.04%.
(This article belongs to the Special Issue Natural Language Processing Method: Deep Learning and Deep Semantics)
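
The abstract above outlines a three-stage pipeline: per-modality feature extraction with recurrent encoders, fusion via a complex attention mechanism, and a final classification layer. The sketch below is a minimal, illustrative rendering of such a pipeline in PyTorch; it is not the authors' implementation, and the GRU encoders, the modulus-based attention scoring, and all names and dimensions are assumptions made for the example.

```python
# Minimal sketch of an RNN + complex-attention fusion pipeline in the spirit
# of the abstract above. NOT the paper's implementation: the architecture,
# names, and dimensions here are illustrative assumptions.
import torch
import torch.nn as nn


class ComplexAttentionFusion(nn.Module):
    """Fuse per-modality features with attention weights derived from a
    complex-valued projection (real + imaginary parts), using the modulus
    as the attention score for extra non-linearity."""

    def __init__(self, dim: int):
        super().__init__()
        self.proj_real = nn.Linear(dim, dim)  # real part of the projection
        self.proj_imag = nn.Linear(dim, dim)  # imaginary part

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, n_modalities, dim) -- one vector per modality.
        real, imag = self.proj_real(feats), self.proj_imag(feats)
        # Modulus of the complex projection -> one scalar score per modality.
        scores = torch.sqrt(real.pow(2) + imag.pow(2)).sum(dim=-1)
        weights = torch.softmax(scores, dim=-1)            # (batch, n_mod)
        return (weights.unsqueeze(-1) * feats).sum(dim=1)  # (batch, dim)


class MultimodalSentimentModel(nn.Module):
    """Per-modality GRU encoders -> complex-attention fusion -> classifier."""

    def __init__(self, in_dims: list[int], hidden: int = 64, n_classes: int = 2):
        super().__init__()
        self.encoders = nn.ModuleList(
            [nn.GRU(d, hidden, batch_first=True) for d in in_dims]
        )
        self.fusion = ComplexAttentionFusion(hidden)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, modalities: list[torch.Tensor]) -> torch.Tensor:
        # Each modality tensor: (batch, seq_len, feature_dim).
        # Take the final GRU hidden state as the modality summary vector.
        encoded = [enc(x)[1][-1] for enc, x in zip(self.encoders, modalities)]
        fused = self.fusion(torch.stack(encoded, dim=1))
        return self.classifier(fused)


if __name__ == "__main__":
    # Toy example with hypothetical text (300-d), audio (74-d), and
    # video (35-d) feature streams.
    model = MultimodalSentimentModel(in_dims=[300, 74, 35])
    batch = [torch.randn(4, 20, d) for d in (300, 74, 35)]
    print(model(batch).shape)  # torch.Size([4, 2])
```

Scoring attention by the modulus of a complex projection is one plausible reading of "complex attention"; the published article should be consulted for the mechanism actually used.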