Zero-Shot Learning in Natural Language Processing and Its Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: closed (15 March 2025) | Viewed by 20121

Special Issue Editors

Department of Computer Science, Durham University, Durham DH1 3LE, UK
Interests: machine learning; semantic analysis; natural language processing; deep learning; zero-shot learning

Guest Editor
Department of Computer Science, Durham University, Durham DH1 3LE, UK
Interests: explainable machine learning; natural language processing; optimisation

Special Issue Information

Dear Colleagues,

The growing industrial demand for natural language processing (NLP) and computer vision (CV) has motivated the development of semantic–visual modelling. Zero-shot learning (ZSL) has received increasing attention over the past two decades owing to its strength in coping with novel classes and out-of-distribution (OOD) scenarios. This Special Issue focuses on designing new NLP paradigms that improve generalization to (1) novel-distribution tasks; (2) novel modalities, e.g., visual images; and (3) novel representations, e.g., knowledge graphs. Backbone models and representation learning for general NLP tasks are out of the scope of the issue. The purpose of the issue is to thoroughly explore solutions that update existing ZSL paradigms in NLP, CV, and other modalities involving unlabelled novel classes and tasks. Publications in this Special Issue will contribute to the existing literature on benchmark establishment, paradigm design, model development, and application deployment of NLP in CV and other real-world modalities.

Dr. Long Yang
Dr. Noura Al Moubayed
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • zero-shot learning
  • natural language processing
  • computer vision
  • machine learning
  • deep learning
  • artificial intelligence

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (8 papers)


Research

16 pages, 1359 KiB  
Article
An Adaptive Hybrid Prototypical Network for Interactive Few-Shot Relation Extraction
by Bei Liu, Sanmin Liu, Subin Huang and Lei Zheng
Electronics 2025, 14(7), 1344; https://doi.org/10.3390/electronics14071344 - 27 Mar 2025
Viewed by 199
Abstract
Few-shot relation extraction is a critical task in natural language processing. Its aim is to train a model from a limited number of labeled samples when labeled data are scarce, enabling the model to rapidly learn and accurately identify relationships between entities in text. Prototypical networks are widely used in few-shot relation extraction for their simplicity and efficiency. However, prototypical networks derive each prototype by averaging the feature instances within a category; when the number of instances is small, the prototype may not adequately represent the true category centroid, diminishing classification accuracy. In this paper, we propose an approach for few-shot relation extraction that leverages instances from the query set to enhance the construction of prototypes from the support set. Weights are then assigned dynamically by quantifying the semantic similarity between sentences, strengthening the emphasis on critical samples while reducing the bias that mean-based prototypes incur in small-sample scenarios. Furthermore, an adaptive fusion module is introduced to integrate prototype and relational information more deeply, yielding more accurate prototype representations. Extensive experiments on the widely used FewRel benchmark show that our AIRE model surpasses existing baselines, reaching accuracies of 91.53% and 86.36% on the 5-way 1-shot and 10-way 1-shot tasks, respectively. Full article
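As a rough illustration of the core idea, the sketch below builds class prototypes as query-similarity-weighted means of support embeddings rather than plain averages. The function names, the softmax temperature `tau`, and the use of cosine similarity are our assumptions; the paper's adaptive fusion module is not reproduced here.

```python
import numpy as np

def weighted_prototypes(support, labels, query, tau=1.0):
    """Build class prototypes as query-similarity-weighted means of
    support embeddings instead of plain averages (illustrative sketch;
    tau and cosine similarity are assumptions, not the paper's design).

    support: (n, d) support-set embeddings
    labels:  (n,) integer class labels
    query:   (d,) a single query embedding
    """
    protos = []
    for c in np.unique(labels):
        s = support[labels == c]                        # (n_c, d)
        # cosine similarity of each support instance to the query
        sim = (s @ query) / (np.linalg.norm(s, axis=1)
                             * np.linalg.norm(query) + 1e-8)
        w = np.exp(sim / tau)
        w /= w.sum()                                    # softmax weights
        protos.append((w[:, None] * s).sum(axis=0))     # weighted mean
    return np.stack(protos)

def classify(query, protos):
    # nearest prototype by Euclidean distance, as in prototypical networks
    d = np.linalg.norm(protos - query, axis=1)
    return int(np.argmin(d))
```

Support instances that resemble the query pull the prototype toward the relevant region of the embedding space, which is the intuition behind using the query set to correct mean-based prototypes in low-shot regimes.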

16 pages, 467 KiB  
Article
A Zero-Shot Framework for Low-Resource Relation Extraction via Distant Supervision and Large Language Models
by Peisheng Han, Geng Liang and Yongfei Wang
Electronics 2025, 14(3), 593; https://doi.org/10.3390/electronics14030593 - 2 Feb 2025
Viewed by 675
Abstract
While Large Language Models (LLMs) have significantly advanced various benchmarks in Natural Language Processing (NLP), low-resource tasks remain challenging, primarily due to data scarcity and annotation difficulty. Low-resource tasks refer to scenarios where labeled data are so limited that traditional supervised learning approaches become impractical; this study aims to develop a robust framework that tackles these challenges while demonstrating the theoretical and practical implications of zero-shot relation extraction. We introduce LoRE, a framework for zero-shot relation extraction in low-resource settings that blends distant supervision with the capabilities of LLMs. LoRE addresses the data sparsity and noise inherent in traditional distant supervision, enabling high-quality relation extraction without extensive labeled data. By leveraging LLMs for zero-shot open information extraction and incorporating heuristic entity and relation alignment with semantic disambiguation, LoRE enhances the accuracy and relevance of the extracted data. The Chinese Person Relationship Extraction (CPRE) dataset, developed under this framework, demonstrates LoRE’s proficiency in extracting person-related triples; it consists of 1000 word pairs capturing diverse semantic relationships. Extensive experiments on the CPRE, IPRE, and DuIE datasets show significant improvements in dataset quality and a reduction in manual annotation effort. These findings highlight the potential of LoRE to advance both the theoretical understanding and practical applications of relation extraction in low-resource settings. Notably, LoRE’s performance on the manually annotated DuIE dataset attests to the quality of the CPRE dataset, which rivals manually curated datasets, and highlights LoRE’s potential for reducing the complexity and cost of dataset construction for zero-shot and low-resource tasks. Full article
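A minimal sketch of the heuristic relation-alignment step: LLM-extracted surface relations are mapped onto a canonical inventory by similarity, and triples below a threshold are discarded as noise. The inventory, aliases, threshold, and the use of plain string similarity in place of the paper's LLM-driven semantic disambiguation are all illustrative assumptions.

```python
from difflib import SequenceMatcher

# A toy canonical relation inventory with surface aliases
# (illustrative only; not the paper's actual schema).
RELATIONS = {
    "father_of": ["father", "dad of"],
    "spouse_of": ["wife", "husband", "married to"],
    "colleague_of": ["works with", "co-worker"],
}

def align_relation(surface, threshold=0.6):
    """Map an LLM-extracted surface relation onto a canonical label by
    best string similarity; return None below the threshold, i.e. the
    triple is dropped as noise."""
    best, best_score = None, 0.0
    for canon, aliases in RELATIONS.items():
        for alias in [canon] + aliases:
            score = SequenceMatcher(None, surface.lower(), alias).ratio()
            if score > best_score:
                best, best_score = canon, score
    return best if best_score >= threshold else None
```

In a full pipeline, the string matcher would be replaced by embedding similarity or an LLM disambiguation call; the thresholding pattern for filtering noisy distant-supervision triples stays the same.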

24 pages, 4734 KiB  
Article
A Benchmark Evaluation of Multilingual Large Language Models for Arabic Cross-Lingual Named-Entity Recognition
by Mashael Al-Duwais, Hend Al-Khalifa and Abdulmalik Al-Salman
Electronics 2024, 13(17), 3574; https://doi.org/10.3390/electronics13173574 - 9 Sep 2024
Viewed by 2570
Abstract
Multilingual large language models (MLLMs) have demonstrated remarkable performance across a wide range of cross-lingual Natural Language Processing (NLP) tasks. Their emergence has made it possible to transfer knowledge from high-resource to low-resource languages. Several MLLMs have been released for cross-lingual transfer tasks; however, no systematic evaluation comparing all models for Arabic cross-lingual Named-Entity Recognition (NER) is available. This paper presents a benchmark evaluation that empirically investigates the performance of state-of-the-art multilingual large language models for Arabic cross-lingual NER. Furthermore, we investigate different MLLM adaptation methods for better modelling the Arabic language, and present an error analysis of each. Our experimental results indicate that GigaBERT outperforms other models for Arabic cross-lingual NER, while language-adaptive pre-training (LAPT) proves to be the most effective adaptation method across all datasets. Our findings highlight the importance of incorporating language-specific knowledge to enhance performance on distant language pairs such as English and Arabic. Full article
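For readers reproducing benchmarks of this kind, NER systems are conventionally scored with entity-level exact-match F1 over (start, end, type) spans. The sketch below is a minimal version of that standard metric, not the paper's exact scorer.

```python
def ner_f1(gold, pred):
    """Entity-level precision/recall/F1 over (start, end, type) spans:
    an entity counts as correct only if both boundaries and the type
    match exactly (the usual NER benchmark convention)."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)                      # exact-match true positives
    p = tp / len(pred) if pred else 0.0        # precision
    r = tp / len(gold) if gold else 0.0        # recall
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

For example, predicting the right span with the wrong type (LOC vs. ORG) counts as both a false positive and a false negative under this convention.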

14 pages, 1397 KiB  
Article
Enhancing Cross-Lingual Sarcasm Detection by a Prompt Learning Framework with Data Augmentation and Contrastive Learning
by Tianbo An, Pingping Yan, Jiaai Zuo, Xing Jin, Mingliang Liu and Jingrui Wang
Electronics 2024, 13(11), 2163; https://doi.org/10.3390/electronics13112163 - 1 Jun 2024
Viewed by 2034
Abstract
Given their intricate nature and inherent ambiguity, sarcastic texts often mask deeper emotions, making it challenging to discern the genuine feelings behind the words. The sarcasm detection task was proposed to help systems more accurately understand the speaker’s true intention. Advanced methods, such as deep learning and neural networks, are widely used in sarcasm detection. However, most research focuses on sarcastic texts in English, as other languages lack corpora and annotated datasets. To address the challenge of low-resource languages in sarcasm detection, this paper proposes a zero-shot cross-lingual transfer learning method. The approach is based on prompt learning and helps the model understand downstream tasks through prompts. Specifically, the model uses prompt templates to cast training data as cloze-style questions, which are then used to train a pre-trained cross-lingual language model. Combining data augmentation and contrastive learning further improves the model’s capacity for cross-lingual transfer. To evaluate the proposed model, we use a publicly accessible English sarcasm dataset as training data in a zero-shot cross-lingual setting. When tested with Chinese as the target language for transfer, our model achieves F1-scores of 72.14% and 76.7% on two test datasets, outperforming strong baselines by significant margins. Full article
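The cloze-style construction can be sketched as follows; the template wording, mask token, and verbalizer words below are illustrative assumptions, not the paper's actual prompt.

```python
def to_cloze(sentence, mask_token="<mask>"):
    """Wrap a sentence in a cloze-style template so a pre-trained
    cross-lingual masked LM can fill the verbalizer slot.
    The template wording is illustrative, not the paper's prompt."""
    return f"{sentence} The tone of this sentence is {mask_token}."

# Verbalizer: label words mapped to class indices (assumed words).
VERBALIZER = {"sarcastic": 1, "sincere": 0}

def predict(fill_probs):
    """fill_probs: mapping from label word to the masked LM's probability
    of that word at the mask position; pick the higher-scoring label."""
    word = max(VERBALIZER, key=fill_probs.get)
    return VERBALIZER[word]
```

In practice, `fill_probs` would come from a cross-lingual masked LM's fill-mask scores; because the template and label words live in the model's shared multilingual space, the same prompt can transfer zero-shot from English training data to Chinese test data.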

20 pages, 802 KiB  
Article
Towards Cognition-Aligned Visual Language Models via Zero-Shot Instance Retrieval
by Teng Ma, Daniel Organisciak, Wenbao Ma and Yang Long
Electronics 2024, 13(9), 1660; https://doi.org/10.3390/electronics13091660 - 25 Apr 2024
Cited by 1 | Viewed by 1290
Abstract
The pursuit of Artificial Intelligence (AI) that emulates human cognitive processes is a cornerstone of ethical AI development, ensuring that emerging technologies can integrate into societal frameworks requiring nuanced understanding and decision-making. Zero-Shot Instance Retrieval (ZSIR) stands at the forefront of this endeavour, potentially providing a robust platform for AI systems, particularly large visual language models, to demonstrate and refine cognition-aligned learning without the need for direct experience. In this paper, we critically evaluate current cognition-alignment methodologies within traditional zero-shot learning paradigms, using visual attributes and word embeddings generated by large AI models. We propose a unified similarity function that quantifies the level of cognitive alignment, bridging the gap between AI processes and human-like understanding. Extensive experiments illustrate that this similarity function can effectively mirror the visual–semantic gap, steering the model towards enhanced performance in Zero-Shot Instance Retrieval. Our models achieve state-of-the-art bi-directional image-attribute retrieval accuracy on both the SUN (92.8% and 82.2%) and CUB (59.92% and 48.82%) datasets. This work not only benchmarks the cognition alignment of AI but also sets a precedent for developing visual language models attuned to the complexities of human cognition. Full article
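Bi-directional image-attribute retrieval of this kind can be sketched with plain cosine similarity; the paper's unified similarity function is more elaborate, so treat this as an assumed baseline, with all names our own.

```python
import numpy as np

def cosine_matrix(A, B):
    """Pairwise cosine similarity between rows of A and rows of B."""
    A = A / (np.linalg.norm(A, axis=1, keepdims=True) + 1e-8)
    B = B / (np.linalg.norm(B, axis=1, keepdims=True) + 1e-8)
    return A @ B.T

def retrieve(images, attributes):
    """Bi-directional retrieval baseline: for each image, rank attribute
    vectors; for each attribute vector, rank images (cosine similarity
    stands in for the paper's unified similarity function)."""
    S = cosine_matrix(images, attributes)      # (n_img, n_att)
    img_to_att = S.argmax(axis=1)              # best attribute per image
    att_to_img = S.argmax(axis=0)              # best image per attribute
    return img_to_att, att_to_img
```

Reporting accuracy in both directions, as the SUN and CUB numbers above do, checks that the learned joint space is symmetric rather than tuned to a single retrieval direction.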

18 pages, 3070 KiB  
Article
Harnessing Causal Structure Alignment for Enhanced Cross-Domain Named Entity Recognition
by Xiaoming Liu, Mengyuan Cao, Guan Yang, Jie Liu, Yang Liu and Hang Wang
Electronics 2024, 13(1), 67; https://doi.org/10.3390/electronics13010067 - 22 Dec 2023
Viewed by 1360
Abstract
Cross-domain named entity recognition (NER) is a crucial task in many practical applications, particularly when data availability in target domains is limited. Existing methodologies depend primarily on feature representation or model parameter sharing to transfer entity recognition capabilities across domains, but they often ignore the latent causal relationships inherent in invariant features. To address this limitation, we propose the Causal Structure Alignment-based Cross-Domain Named Entity Recognition (CSA-NER) framework, designed to harness the causally invariant features within causal structures to enhance the cross-domain transfer of entity recognition competence. CSA-NER first constructs a causal feature graph, using causal discovery to ascertain causal relationships between entities and contextual features across the source and target domains. It then performs graph structure alignment to extract causally invariant knowledge across domains via the graph optimal transport (GOT) method. Finally, the acquired causally invariant knowledge is refined and utilized through the integration of Gated Attention Units (GAUs). Comprehensive experiments on five English datasets and a specific CD-NER dataset demonstrate a notable improvement in the average performance of CSA-NER over existing cross-domain methods. These findings underscore the significance of unearthing and employing latent causal invariant knowledge to augment entity recognition in target domains, contributing a robust methodology to the broader realm of cross-domain natural language processing. Full article

14 pages, 6184 KiB  
Article
Explainable B2B Recommender System for Potential Customer Prediction Using KGAT
by Gyungah Cho, Pyoung-seop Shim and Jaekwang Kim
Electronics 2023, 12(17), 3536; https://doi.org/10.3390/electronics12173536 - 22 Aug 2023
Cited by 2 | Viewed by 2888
Abstract
The adoption of recommender systems in business-to-business (B2B) settings can make the management of companies more efficient. Although the importance of recommendation is increasing with the expansion of B2B e-commerce, few studies on B2B recommendation have been conducted, and the several differences between B2B and business-to-consumer (B2C) commerce mean that a B2B recommender system must be defined differently. This paper presents a new perspective on explainable B2B recommendation using the knowledge graph attention network for recommendation (KGAT). Unlike traditional recommender systems that suggest products to consumers, this study focuses on recommending potential buyers to sellers. Additionally, KGAT’s attention mechanism enables explanations for each company’s recommendations. The Market Transaction Dataset in South Korea is provided by the Korea Electronic Taxation System Association, and this research shows how the dataset is utilized in a knowledge graph (KG). The main contributions can be summarized in three points: (i) proposing an explainable B2B recommender system for recommending potential customers, (ii) extracting the performance-enhancing features of a knowledge graph, and (iii) enhancing keyword extraction for trading items to improve recommendation performance. We anticipate that B2B recommendation for potential-customer prediction will provide useful insight into the development of the industry. Full article

22 pages, 671 KiB  
Article
LLM-Informed Multi-Armed Bandit Strategies for Non-Stationary Environments
by J. de Curtò, I. de Zarzà, Gemma Roig, Juan Carlos Cano, Pietro Manzoni and Carlos T. Calafate
Electronics 2023, 12(13), 2814; https://doi.org/10.3390/electronics12132814 - 25 Jun 2023
Cited by 17 | Viewed by 8011
Abstract
In this paper, we introduce an innovative approach to handling the multi-armed bandit (MAB) problem in non-stationary environments, harnessing the predictive power of large language models (LLMs). With the realization that traditional bandit strategies, including epsilon-greedy and upper confidence bound (UCB), may struggle in the face of dynamic changes, we propose a strategy informed by LLMs that offers dynamic guidance on exploration versus exploitation, contingent on the current state of the bandits. We bring forward a new non-stationary bandit model with fluctuating reward distributions and illustrate how LLMs can be employed to guide the choice of bandit amid this variability. Experimental outcomes illustrate the potential of our LLM-informed strategy, demonstrating its adaptability to the fluctuating nature of the bandit problem, while maintaining competitive performance against conventional strategies. This study provides key insights into the capabilities of LLMs in enhancing decision-making processes in dynamic and uncertain scenarios. Full article
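A toy version of the setting described above, with a rule-based "advisor" standing in for the LLM's exploration-versus-exploitation guidance; all parameter values, the drift model, and the advisor rule are illustrative assumptions, not the paper's design.

```python
import random

def run_bandit(true_means, steps=1000, base_eps=0.1, seed=0):
    """Epsilon-greedy on a bandit whose reward means drift each step
    (non-stationarity). A simple advisor signal, standing in for the
    LLM's guidance, doubles exploration when recent rewards degrade."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    values = [0.0] * k          # incremental estimates of arm values
    recent = []                 # sliding window of recent rewards
    total = 0.0
    means = list(true_means)
    for _ in range(steps):
        # advisor rule (LLM stand-in): explore more when recent mean is low
        low = recent and sum(recent) / len(recent) < 0.3
        eps = base_eps * (2.0 if low else 1.0)
        if rng.random() < eps:
            arm = rng.randrange(k)                       # explore
        else:
            arm = max(range(k), key=lambda i: values[i]) # exploit
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        recent = (recent + [reward])[-50:]
        total += reward
        # non-stationarity: each arm's mean performs a clipped random walk
        means = [min(1.0, max(0.0, m + rng.uniform(-0.01, 0.01)))
                 for m in means]
    return total / steps
```

In the paper's setting, the advisor rule would be replaced by a query to an LLM describing the bandits' current state; the surrounding epsilon-greedy loop and the drifting reward means are the standard non-stationary MAB scaffolding.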
