Search Results (60)

Search Parameters:
Keywords = pre-trained contextualized embedding

19 pages, 528 KiB  
Article
Quantum-Inspired Attention-Based Semantic Dependency Fusion Model for Aspect-Based Sentiment Analysis
by Chenyang Xu, Xihan Wang, Jiacheng Tang, Yihang Wang, Lianhe Shao and Quanli Gao
Axioms 2025, 14(7), 525; https://doi.org/10.3390/axioms14070525 - 9 Jul 2025
Viewed by 377
Abstract
Aspect-Based Sentiment Analysis (ABSA), which emphasizes the aspect-level sentiment representation of sentences, has gained significant popularity in recent years. Current methods for ABSA often use pre-trained models and graph convolution to represent word dependencies. However, they struggle with long-range dependencies in lengthy texts, resulting in averaging and loss of contextual semantic information. In this paper, we explore how richer semantic relationships can be encoded more efficiently. Inspired by quantum theory, we construct superposition states from text sequences and use them, together with quantum measurements, to explicitly capture complex semantic relationships within word sequences. Specifically, we propose an attention-based semantic dependency fusion method for ABSA, which employs a quantum embedding module to create a superposition state of real-valued word-sequence features in a complex-valued Hilbert space. This approach yields a density matrix representation of the word sequence that improves the handling of long-range dependencies. Furthermore, we introduce a quantum cross-attention mechanism to integrate sequence features with the dependency relationships between specific word pairs, aiming to capture the associations between particular aspects and comments more comprehensively. Our experiments on the SemEval-2014 and Twitter datasets demonstrate the effectiveness of the quantum-inspired attention-based semantic dependency fusion model for the ABSA task. Full article
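
The density matrix construction described here can be illustrated with a minimal sketch: token feature vectors are normalized into unit "state" vectors and mixed as outer products. This toy version uses random real-valued features and uniform mixture weights, whereas the paper works in a complex-valued Hilbert space with learned components.

```python
import numpy as np

rng = np.random.default_rng(0)
seq = rng.normal(size=(6, 16))   # stand-in for 6 token feature vectors of dimension 16

# Normalize each token feature to a unit "state" vector.
states = seq / np.linalg.norm(seq, axis=1, keepdims=True)

# Uniform mixture weights over tokens (a simplification; attention-derived weights could be used).
weights = np.full(len(states), 1.0 / len(states))

# Density matrix: a weighted sum of outer products summarizing the whole word sequence.
rho = sum(w * np.outer(s, s) for w, s in zip(weights, states))
print(np.trace(rho))   # ~1.0, as required of a density matrix
```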

24 pages, 2410 KiB  
Article
UA-HSD-2025: Multi-Lingual Hate Speech Detection from Tweets Using Pre-Trained Transformers
by Muhammad Ahmad, Muhammad Waqas, Ameer Hamza, Sardar Usman, Ildar Batyrshin and Grigori Sidorov
Computers 2025, 14(6), 239; https://doi.org/10.3390/computers14060239 - 18 Jun 2025
Cited by 1 | Viewed by 1190
Abstract
The rise of social media has improved communication but has also amplified the spread of hate speech, creating serious societal risks. Automated detection remains difficult due to subjectivity, linguistic diversity, and implicit language. While prior research focuses on high-resource languages, this study addresses the underexplored multilingual challenges of Arabic and Urdu hate speech through a comprehensive approach. To achieve this objective, this study makes four key contributions. First, we created a unique multilingual, manually annotated binary and multi-class dataset (UA-HSD-2025) sourced from X, which covers the five most important multi-class categories of hate speech. Second, we created detailed annotation guidelines to ensure a robust, high-quality hate speech dataset. Third, we explore two strategies to address the challenges of multilingual data: a joint multilingual approach and a translation-based approach. The translation-based approach converts all input text into a single target language before applying a classifier, whereas the joint multilingual approach trains a unified model to handle multiple languages simultaneously, enabling it to classify text across different languages without translation. Finally, we conducted 54 experiments spanning machine learning with TF-IDF features, deep learning with pre-trained word embeddings such as FastText and GloVe, and pre-trained language models with advanced contextual embeddings. Based on the analysis of the results, our pre-trained language model (XLM-R) outperformed traditional supervised learning approaches, achieving 0.99 accuracy in binary classification for the Arabic, Urdu, and joint multilingual datasets, and 0.95, 0.94, and 0.94 accuracy in multi-class classification for the joint multilingual, Arabic, and Urdu datasets, respectively. Full article
(This article belongs to the Special Issue Recent Advances in Social Networks and Social Media)
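
As a point of reference for the classical baselines mentioned (machine learning with TF-IDF features), a minimal sketch of such a pipeline is shown below; the texts and labels are placeholders rather than the UA-HSD-2025 data, and character n-grams are just one reasonable configuration for this kind of text.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder tweets and binary labels (0 = not hate, 1 = hate); not the UA-HSD-2025 data.
texts = ["example tweet one", "example tweet two", "example tweet three", "example tweet four"]
labels = [0, 1, 0, 1]

# Character n-grams are one reasonable choice for noisy, morphologically rich Arabic/Urdu text.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4), min_df=1),
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)
print(clf.predict(["example tweet five"]))
```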

19 pages, 3185 KiB  
Article
Short Text Classification Based on Enhanced Word Embedding and Hybrid Neural Networks
by Cunhe Li, Zian Xie and Haotian Wang
Appl. Sci. 2025, 15(9), 5102; https://doi.org/10.3390/app15095102 - 4 May 2025
Cited by 1 | Viewed by 1663
Abstract
In recent years, text classification has found wide application in diverse real-world scenarios. In Chinese news classification tasks, the title text suffers from limitations such as sparse contextual information and semantic ambiguity. To improve the performance of short text classification, this paper proposes a Word2Vec-based enhanced word embedding method and designs a dual-channel hybrid neural network architecture to effectively extract semantic features. Specifically, we introduce a novel weighting scheme, Term Frequency-Inverse Document Frequency with Category Distribution Weight (TF-IDF-CDW), where the Category Distribution Weight (CDW) reflects the distribution pattern of words across different categories. By weighting the pre-trained Word2Vec vectors with TF-IDF-CDW and concatenating them with part-of-speech (POS) feature vectors, semantically enriched and more discriminative word embeddings are generated. Furthermore, we propose a dual-channel hybrid model based on a Gated Convolutional Neural Network (GCNN) and Bidirectional Long Short-Term Memory (BiLSTM), which jointly captures local features and long-range global dependencies. To evaluate the overall performance of the model, experiments were conducted on the Chinese short text datasets THUCNews and TNews. The proposed model achieved classification accuracies of 91.85% and 87.70%, respectively, outperforming several comparative models and demonstrating the effectiveness of the proposed method. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
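
The TF-IDF-CDW weighting can be approximated roughly as follows. The CDW formula used here (rewarding words concentrated in one category) is an assumption for illustration, and the corpus, word vectors, and POS features are toy stand-ins rather than THUCNews/TNews data or trained Word2Vec and POS-tagger outputs.

```python
import numpy as np
from collections import Counter

# Toy corpus of (tokenized title, category) pairs -- hypothetical data, not THUCNews/TNews.
docs = [(["market", "stocks", "rally"], "finance"),
        (["team", "wins", "final"], "sports"),
        (["stocks", "fall", "sharply"], "finance")]
categories = sorted({c for _, c in docs})
n_docs = len(docs)

df = Counter(w for toks, _ in docs for w in set(toks))      # document frequency per word
cat_df = {w: Counter() for w in df}                         # per-category document frequency
for toks, c in docs:
    for w in set(toks):
        cat_df[w][c] += 1

def tf_idf_cdw(word, toks):
    tf = toks.count(word) / len(toks)
    idf = np.log((1 + n_docs) / (1 + df[word])) + 1
    p = np.array([cat_df[word][c] for c in categories], dtype=float)
    p /= p.sum()
    cdw = 1.0 + p.max() - p.mean()      # assumed form: larger when a word concentrates in one category
    return tf * idf * cdw

rng = np.random.default_rng(1)
word_vecs = {w: rng.normal(size=50) for w in df}            # stand-in for pre-trained Word2Vec vectors
pos_onehot = {w: np.eye(3)[0] for w in df}                  # stand-in for POS feature vectors

toks = docs[0][0]
title_embedding = np.concatenate([
    np.mean([tf_idf_cdw(w, toks) * word_vecs[w] for w in toks], axis=0),
    np.mean([pos_onehot[w] for w in toks], axis=0),
])
print(title_embedding.shape)  # (53,)
```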

38 pages, 2033 KiB  
Article
DCAT: A Novel Transformer-Based Approach for Dynamic Context-Aware Image Captioning in the Tamil Language
by Jothi Prakash Venugopal, Arul Antran Vijay Subramanian, Manikandan Murugan, Gopikrishnan Sundaram, Marco Rivera and Patrick Wheeler
Appl. Sci. 2025, 15(9), 4909; https://doi.org/10.3390/app15094909 - 28 Apr 2025
Viewed by 653
Abstract
The task of image captioning in low-resource languages like Tamil is fraught with challenges due to limited linguistic resources and complex semantic structures. This paper addresses the problem of generating contextually and linguistically coherent captions in Tamil. We introduce the Dynamic Context-Aware Transformer (DCAT), a novel approach that synergizes the Vision Transformer (ViT) with the Generative Pre-trained Transformer (GPT-3), reinforced by a unique Context Embedding Layer. The DCAT model, tailored for Tamil, innovatively employs dynamic attention mechanisms during its Initialization, Training, and Inference phases to focus on pertinent visual and textual elements. Our method distinctively leverages the nuances of Tamil syntax and semantics, a novelty in the realm of low-resource language image captioning. Comparative evaluations against established models on datasets like Flickr8k, Flickr30k, and MSCOCO reveal DCAT’s superiority, with a notable 12% increase in BLEU score (0.7425) and a 15% enhancement in METEOR score (0.4391) over leading models. Despite its computational demands, DCAT sets a new benchmark for image captioning in Tamil, demonstrating potential applicability to other similar languages. Full article
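
One plausible reading of the Context Embedding Layer is cross-attention from caption tokens to ViT patch features; the sketch below implements that reading in PyTorch with random stand-in tensors. The dimensions, layer name, and residual wiring are assumptions rather than the paper's specification.

```python
import torch
import torch.nn as nn

class ContextEmbeddingLayer(nn.Module):
    """Cross-attention from caption tokens to ViT patch features (an assumed reading of the layer)."""
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_tokens, visual_tokens):
        ctx, _ = self.attn(text_tokens, visual_tokens, visual_tokens)  # queries = text, keys/values = image
        return self.norm(text_tokens + ctx)

text = torch.randn(2, 12, 256)     # 12 Tamil caption tokens per image (random stand-ins)
image = torch.randn(2, 196, 256)   # 14x14 ViT patch embeddings (random stand-ins)
print(ContextEmbeddingLayer()(text, image).shape)   # torch.Size([2, 12, 256])
```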

32 pages, 6581 KiB  
Article
Unveiling Technological Evolution with a Patent-Based Dynamic Topic Modeling Framework: A Case Study of Advanced 6G Technologies
by Jieru Jiang, Fangli Ying and Riyad Dhuny
Appl. Sci. 2025, 15(7), 3783; https://doi.org/10.3390/app15073783 - 30 Mar 2025
Cited by 1 | Viewed by 1353
Abstract
As the next frontier in wireless communication, the landscape of 6G technologies is characterized by rapid evolution and increasing complexity, driven by the need to address global challenges such as ubiquitous connectivity, ultra-high data rates, and intelligent applications. Given the significance of 6G in shaping the future of communication and its potential to revolutionize various industries, understanding the technological evolution within this domain is crucial. Traditional topic modeling approaches struggle to adapt to the rapidly changing and highly complex nature of patent-based topic analysis in this field, impeding a comprehensive understanding of technological evolution in terms of capturing temporal changes and uncovering semantic relationships. This study explores the evolving technologies of 6G in patent data through a novel dynamic topic modeling framework. Specifically, this work harnesses large language models to reduce noise in patent data pre-processing using a prompt-based summarization technique. We then propose an enhanced dynamic topic modeling framework based on BERTopic to capture the time-aware features of evolving topics across periods. Additionally, we conduct a comparative analysis of contextual embedding techniques and leverage an SBERT model pre-trained on patent data to extract content semantics from domain-specific patents within this framework. Finally, we apply a weak-signal analysis method to identify emerging topics in 6G technology over time, which makes the topic evolution analysis more interpretable than traditional topic modeling methods. The empirical results, validated by human experts, show that the proposed method can effectively uncover patterns of technological evolution, enabling its application to strategic decision-making in a highly competitive and rapidly evolving technological sector. Full article
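
A minimal sketch of the BERTopic-based dynamic stage might look like the following. The patent summaries and filing years are placeholders, a general-purpose SBERT checkpoint stands in for the patent-pretrained model the paper uses, and in practice BERTopic needs a substantially larger corpus for its UMAP/HDBSCAN steps.

```python
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer

# Placeholder patent summaries and filing years; in practice the cleaned, LLM-summarized
# patent corpus would be used, and BERTopic needs far more (and more varied) documents.
base = ["beamforming for terahertz links", "reconfigurable intelligent surface control",
        "ai-native radio access network scheduling", "terahertz channel estimation method",
        "semantic communication framework for 6g", "integrated sensing and communication waveform"]
docs = base * 30
years = [2019, 2020, 2021, 2022, 2023, 2024] * 30

# A general-purpose SBERT checkpoint stands in for the patent-pretrained SBERT used in the paper.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
topic_model = BERTopic(embedding_model=embedder, min_topic_size=10)
topics, _ = topic_model.fit_transform(docs)

# The dynamic part: track how each topic's weight shifts across filing years.
evolution = topic_model.topics_over_time(docs, years, nr_bins=3)
print(evolution.head())
```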

25 pages, 1451 KiB  
Article
A Graph Neural Network-Based Context-Aware Framework for Sentiment Analysis Classification in Chinese Microblogs
by Zhesheng Jin and Yunhua Zhang
Mathematics 2025, 13(6), 997; https://doi.org/10.3390/math13060997 - 18 Mar 2025
Cited by 1 | Viewed by 1137
Abstract
Sentiment analysis in Chinese microblogs is challenged by complex syntactic structures and fine-grained sentiment shifts. To address these challenges, a Contextually Enriched Graph Neural Network (CE-GNN) is proposed, integrating self-supervised learning, context-aware sentiment embeddings, and Graph Neural Networks (GNNs) to enhance sentiment classification. First, CE-GNN is pre-trained on a large corpus of unlabeled text through self-supervised learning, where Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) are leveraged to obtain contextualized embeddings. These embeddings are then refined through a context-aware sentiment embedding layer, which is dynamically adjusted based on the surrounding text to improve sentiment sensitivity. Next, syntactic dependencies are captured by Graph Neural Networks (GNNs), where words are represented as nodes and syntactic relationships are denoted as edges. Through this graph-based structure, complex sentence structures, particularly in Chinese, can be interpreted more effectively. Finally, the model is fine-tuned on a labeled dataset, achieving state-of-the-art performance in sentiment classification. Experimental results demonstrate that CE-GNN achieves superior accuracy, with a Macro F-measure of 80.21% and a Micro F-measure of 82.93%. Ablation studies further confirm that each module contributes significantly to the overall performance. Full article
(This article belongs to the Section E2: Control Theory and Mechanics)
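
The graph component can be illustrated with a single mean-aggregating graph-convolution layer over a dependency adjacency matrix; this is a generic GCN sketch, not the paper's exact CE-GNN layer.

```python
import torch
import torch.nn as nn

class DepGCNLayer(nn.Module):
    """One graph-convolution step over a syntactic dependency graph (words = nodes, arcs = edges)."""
    def __init__(self, dim: int):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h: (n_words, dim) contextual embeddings; adj: (n_words, n_words) adjacency with self-loops.
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.lin(adj @ h) / deg)

n, dim = 5, 32
h = torch.randn(n, dim)
adj = torch.eye(n)
adj[0, 1] = adj[1, 0] = 1.0       # e.g., a dependency arc between words 0 and 1
print(DepGCNLayer(dim)(h, adj).shape)   # torch.Size([5, 32])
```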

14 pages, 423 KiB  
Article
A Small-Scale Evaluation of Large Language Models Used for Grammatical Error Correction in a German Children’s Literature Corpus: A Comparative Study
by Phuong Thao Nguyen, Bernd Nuss, Roswita Dressler and Katie Ovens
Appl. Sci. 2025, 15(5), 2476; https://doi.org/10.3390/app15052476 - 25 Feb 2025
Viewed by 1365
Abstract
Grammatical error correction (GEC) has become increasingly important for enhancing the quality of OCR-scanned texts. This small-scale study explores the application of Large Language Models (LLMs) for GEC in German children’s literature, a genre with unique linguistic challenges due to modified language, colloquial expressions, and complex layouts that often lead to OCR-induced errors. While conventional rule-based and statistical approaches have been used in the past, advancements in machine learning and artificial intelligence have introduced models capable of more contextually nuanced corrections. Despite these developments, limited research has been conducted on evaluating the effectiveness of state-of-the-art LLMs, specifically in the context of German children’s literature. To address this gap, we fine-tuned encoder-based models GBERT and GELECTRA on German children’s literature, and compared their performance to decoder-based models GPT-4o and Llama series (versions 3.2 and 3.1) in a zero-shot setting. Our results demonstrate that all pretrained models, both encoder-based (GBERT, GELECTRA) and decoder-based (GPT-4o, Llama series), failed to effectively remove OCR-generated noise in children’s literature, highlighting the necessity of a preprocessing step to handle structural inconsistencies and artifacts introduced during scanning. This study also addresses the lack of comparative evaluations between encoder-based and decoder-based models for German GEC, with most prior work focusing on English. Quantitative analysis reveals that decoder-based models significantly outperform fine-tuned encoder-based models, with GPT-4o and Llama-3.1-70B achieving the highest accuracy in both error detection and correction. Qualitative assessment further highlights distinct model behaviors: GPT-4o demonstrates the most consistent correction performance, handling grammatical nuances effectively while minimizing overcorrection. Llama-3.1-70B excels in error detection but occasionally relies on frequency-based substitutions over meaning-driven corrections. Unlike earlier decoder-based models, which often exhibited overcorrection tendencies, our findings indicate that state-of-the-art decoder-based models strike a better balance between correction accuracy and semantic preservation. By identifying the strengths and limitations of different model architectures, this study enhances the accessibility and readability of OCR-scanned German children’s literature. It also provides new insights into the role of preprocessing in digitized text correction, the comparative performance of encoder- and decoder-based models, and the evolving correction tendencies of modern LLMs. These findings contribute to language preservation, corpus linguistics, and digital archiving, offering an AI-driven solution for improving the quality of digitized children’s literature while ensuring linguistic and cultural integrity. Future research should explore multimodal approaches that integrate visual context to further enhance correction accuracy for children’s books with image-embedded text. Full article
(This article belongs to the Special Issue Applications of Natural Language Processing to Data Science)
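
A zero-shot GEC call to a decoder-based model, of the kind evaluated here, might be set up as below; the prompt wording and decoding settings are assumptions rather than the study's protocol, and an OpenAI API key is required.

```python
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY in the environment

# Hypothetical OCR-noisy German sentence; the instruction text below is an assumed prompt,
# not the one used in the study.
ocr_sentence = "Der Hund lauft schnell übr die Wiese ."

response = client.chat.completions.create(
    model="gpt-4o",
    temperature=0,
    messages=[
        {"role": "system",
         "content": "Korrigiere Grammatik- und OCR-Fehler im folgenden deutschen Satz. "
                    "Gib nur den korrigierten Satz zurück."},
        {"role": "user", "content": ocr_sentence},
    ],
)
print(response.choices[0].message.content)
```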

21 pages, 3621 KiB  
Article
SAVE: Self-Attention on Visual Embedding for Zero-Shot Generic Object Counting
by Ahmed Zgaren, Wassim Bouachir and Nizar Bouguila
J. Imaging 2025, 11(2), 52; https://doi.org/10.3390/jimaging11020052 - 10 Feb 2025
Viewed by 2529
Abstract
Zero-shot counting is a subcategory of Generic Visual Object Counting, which aims to count objects of an arbitrary class in a given image. While few-shot counting relies on providing exemplars to the model to count objects of a similar class, zero-shot counting automates the operation for faster processing. This paper proposes a fully automated zero-shot method that outperforms both zero-shot and few-shot methods. By exploiting feature maps from a pre-trained detection-based backbone, we introduce a new Visual Embedding Module designed to generate semantic embeddings within object contextual information. These embeddings are then fed to a Self-Attention Matching Module to generate an encoded representation for the counting head. Our proposed method outperforms recent zero-shot approaches, achieving the best Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) results of 8.89 and 35.83, respectively, on the FSC147 dataset. Additionally, our method demonstrates competitive performance compared to few-shot methods, advancing the capabilities of visual object counting in various industrial applications such as tree counting, wildlife counting, and medical applications like blood cell counting. Full article
(This article belongs to the Special Issue Recent Trends in Computer Vision with Neural Networks)
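
A compressed sketch of the idea (self-attention over visual embeddings followed by a density-style counting head) is shown below; it folds the paper's Visual Embedding and Self-Attention Matching Modules into one toy module with assumed dimensions.

```python
import torch
import torch.nn as nn

class SelfAttentionCounter(nn.Module):
    """Self-attention over visual embeddings, then a non-negative density head summed into a count."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.density = nn.Sequential(nn.Linear(dim, 1), nn.ReLU())

    def forward(self, visual_embeds):
        enc, _ = self.attn(visual_embeds, visual_embeds, visual_embeds)
        per_location = self.density(enc).squeeze(-1)    # (batch, locations)
        return per_location.sum(dim=-1)                 # predicted object count per image

feats = torch.randn(2, 49, 256)   # e.g., a 7x7 grid of backbone feature vectors (random stand-ins)
print(SelfAttentionCounter()(feats))
```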

23 pages, 1202 KiB  
Article
CSP-DCPE: Category-Specific Prompt with Deep Contextual Prompt Enhancement for Vision–Language Models
by Chunlei Wu, Yixiang Wu, Qinfu Xu and Xuebin Zi
Electronics 2025, 14(4), 673; https://doi.org/10.3390/electronics14040673 - 9 Feb 2025
Viewed by 1070
Abstract
Recently, prompt learning has emerged as a viable technique for fine-tuning pre-trained vision–language models (VLMs). Prompts allow pre-trained VLMs to be quickly adapted to specific downstream tasks without updating the original pre-trained weights. Nevertheless, much of the existing work on prompt learning has focused on non-specific prompts, with little attention paid to category-specific information. In this paper, we present a novel method, the Category-Specific Prompt (CSP), which integrates task-oriented information into our model, augmenting its capacity to comprehend and execute complex tasks. To better exploit features and make fuller use of the combination of category-specific and non-specific prompts, we introduce a novel deep prompt-learning method, Deep Contextual Prompt Enhancement (DCPE). DCPE outputs features with rich text-embedding knowledge that adapt to the input through attention-based interactions, ensuring that our model contains instance-oriented information. Combining these two methods, our architecture CSP-DCPE contains both task-oriented and instance-oriented information and achieves state-of-the-art average scores on 11 benchmark image-classification datasets. Full article
(This article belongs to the Section Artificial Intelligence)
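
The category-specific prompt idea can be sketched as learnable context vectors with a shared part and a per-class offset, in the spirit of CoOp-style prompt learning; the initialization, shapes, and the way the prompts would feed a frozen text encoder are assumptions rather than the paper's design.

```python
import torch
import torch.nn as nn

class CategorySpecificPrompt(nn.Module):
    """Learnable context vectors: a shared (non-specific) prompt plus a per-class offset."""
    def __init__(self, n_classes: int, n_ctx: int, dim: int):
        super().__init__()
        self.shared = nn.Parameter(torch.empty(n_ctx, dim).normal_(std=0.02))
        self.per_class = nn.Parameter(torch.empty(n_classes, n_ctx, dim).normal_(std=0.02))

    def forward(self, class_name_embeds):
        # class_name_embeds: (n_classes, dim) embeddings of the class-name tokens.
        ctx = self.shared.unsqueeze(0) + self.per_class          # (n_classes, n_ctx, dim)
        return torch.cat([ctx, class_name_embeds.unsqueeze(1)], dim=1)

prompts = CategorySpecificPrompt(n_classes=10, n_ctx=4, dim=512)(torch.randn(10, 512))
print(prompts.shape)   # torch.Size([10, 5, 512]) -- ready to feed a frozen text encoder
```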

18 pages, 393 KiB  
Article
LLM-Augmented Linear Transformer–CNN for Enhanced Stock Price Prediction
by Lei Zhou, Yuqi Zhang, Jian Yu, Guiling Wang, Zhizhong Liu, Sira Yongchareon and Nancy Wang
Mathematics 2025, 13(3), 487; https://doi.org/10.3390/math13030487 - 31 Jan 2025
Cited by 6 | Viewed by 5720
Abstract
Accurately predicting stock prices remains a challenging task due to the volatile and complex nature of financial markets. In this study, we propose a novel hybrid deep learning framework that integrates a large language model (LLM), a Linear Transformer (LT), and a Convolutional Neural Network (CNN) to enhance stock price prediction using solely historical market data. The framework leverages the LLM as a professional financial analyst to perform daily technical analysis. The technical indicators, including moving averages (MAs), relative strength index (RSI), and Bollinger Bands (BBs), are calculated directly from historical stock data. These indicators are then analyzed by the LLM, generating descriptive textual summaries. The textual summaries are further transformed into vector representations using FinBERT, a pre-trained financial language model, to enhance the dataset with contextual insights. The FinBERT embeddings are integrated with features from two additional branches: the Linear Transformer branch, which captures long-term dependencies in time-series stock data through a linearized self-attention mechanism, and the CNN branch, which extracts spatial features from visual representations of stock chart data. The combined features from these three modalities are then processed by a Feedforward Neural Network (FNN) for final stock price prediction. Experimental results on the S&P 500 dataset demonstrate that the proposed framework significantly improves stock prediction accuracy by effectively capturing temporal, spatial, and contextual dependencies in the data. This multimodal approach highlights the importance of integrating advanced technical analysis with deep learning architectures for enhanced financial forecasting. Full article
(This article belongs to the Special Issue New Insights in Machine Learning (ML) and Deep Neural Networks)
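
The technical indicators the LLM analyzes (moving averages, RSI, Bollinger Bands) follow standard textbook formulas; a small pandas sketch is given below with a synthetic price series, and the exact window lengths and RSI smoothing used in the paper may differ.

```python
import numpy as np
import pandas as pd

def technical_indicators(close: pd.Series) -> pd.DataFrame:
    ma20 = close.rolling(20).mean()
    std20 = close.rolling(20).std()
    delta = close.diff()
    gain = delta.clip(lower=0).rolling(14).mean()
    loss = (-delta.clip(upper=0)).rolling(14).mean()
    rsi14 = 100 - 100 / (1 + gain / loss)               # simple-average RSI variant
    return pd.DataFrame({"ma20": ma20, "rsi14": rsi14,
                         "bb_upper": ma20 + 2 * std20,   # Bollinger Bands at +/- 2 standard deviations
                         "bb_lower": ma20 - 2 * std20})

prices = pd.Series(100 + np.cumsum(np.random.default_rng(0).normal(size=120)))   # synthetic price path
print(technical_indicators(prices).tail())
```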

29 pages, 863 KiB  
Article
Fake News Detection and Classification: A Comparative Study of Convolutional Neural Networks, Large Language Models, and Natural Language Processing Models
by Konstantinos I. Roumeliotis, Nikolaos D. Tselikas and Dimitrios K. Nasiopoulos
Future Internet 2025, 17(1), 28; https://doi.org/10.3390/fi17010028 - 9 Jan 2025
Cited by 7 | Viewed by 11138
Abstract
In an era where fake news detection has become a pressing issue due to its profound impact on public opinion, democracy, and social trust, accurately identifying and classifying false information is a critical challenge. In this study, we investigate the effectiveness of advanced machine learning models for robust fake news classification: convolutional neural networks (CNNs), bidirectional encoder representations from transformers (BERT), and generative pre-trained transformers (GPTs). Each model brings unique strengths to the task, from the pattern-recognition capabilities of CNNs to the contextual understanding of BERT and GPT in the embedding space. Our results demonstrate that the fine-tuned GPT-4 Omni models achieve 98.6% accuracy, significantly outperforming traditional models like CNNs, which achieved only 58.6%. Notably, the smaller GPT-4o mini model performed comparably to its larger counterpart, highlighting the cost-effectiveness of smaller models for specialized tasks. These findings emphasize the importance of fine-tuning large language models (LLMs) to optimize performance on complex tasks such as fake news classification, where capturing subtle contextual relationships in text is crucial. However, challenges such as computational costs and suboptimal outcomes in zero-shot classification persist, particularly when distinguishing fake content from legitimate information. By highlighting the practical application of fine-tuned LLMs and exploring the potential of few-shot learning for fake news detection, this research provides valuable insights for news organizations seeking to implement scalable and accurate solutions. Ultimately, this work contributes to fostering transparency and integrity in journalism through innovative AI-driven methods for fake news classification and automated classifier systems. Full article
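
For the transformer baselines, fine-tuning follows the usual sequence-classification recipe; the sketch below shows one supervised step with BERT on placeholder headlines. The GPT-4o models in the study are fine-tuned through OpenAI's service instead, which is not reproduced here.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Placeholder headlines; the label convention (1 = fake, 0 = legitimate) is assumed for illustration.
batch = tok(["Shocking cure doctors don't want you to know",
             "Central bank raises interest rates by 0.25%"],
            padding=True, truncation=True, return_tensors="pt")
labels = torch.tensor([1, 0])

outputs = model(**batch, labels=labels)   # forward pass with supervised loss
outputs.loss.backward()                   # an optimizer step would follow in a real training loop
print(float(outputs.loss))
```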

23 pages, 4893 KiB  
Article
Enhancing Software Effort Estimation with Pre-Trained Word Embeddings: A Small-Dataset Solution for Accurate Story Point Prediction
by Issa Atoum and Ahmed Ali Otoom
Electronics 2024, 13(23), 4843; https://doi.org/10.3390/electronics13234843 - 8 Dec 2024
Cited by 2 | Viewed by 1990
Abstract
Traditional software effort estimation methods, such as term frequency–inverse document frequency (TF-IDF), are widely used due to their simplicity and interpretability. However, they struggle with limited datasets, fail to capture intricate semantics, and suffer from high dimensionality, sparsity, and computational inefficiency. This study used pre-trained word embeddings, including FastText and GPT-2, to improve estimation accuracy in such cases. Seven pre-trained models were evaluated for their ability to effectively represent textual data, addressing the fundamental limitations of TF-IDF through contextualized embeddings. The results show that combining FastText embeddings with support vector machines (SVMs) consistently outperforms traditional approaches, reducing the mean absolute error (MAE) by 5–18% while achieving accuracy comparable to deep learning models like GPT-2. This approach demonstrates the adaptability of pre-trained embeddings to small datasets, balancing semantic richness with computational efficiency. The proposed method improves project planning and resource allocation through accurate story point prediction, while safeguarding privacy and security through data anonymization. Future research will explore task-specific embeddings tailored to software engineering domains and investigate how dataset characteristics, such as cultural variations, influence model performance, ensuring the development of adaptable, robust, and secure machine learning models for diverse contexts. Full article
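
The FastText-plus-SVM idea can be sketched by averaging subword-aware word vectors per user story and fitting a support vector regressor on story points; the stories, labels, and tiny FastText model below are illustrative stand-ins, and SVR is assumed as the regression form of the SVM used.

```python
import numpy as np
from gensim.models import FastText
from sklearn.svm import SVR

# Placeholder user stories and story points; the study's issue-tracker data is not reproduced here.
stories = [["add", "login", "button"], ["migrate", "database", "schema"],
           ["fix", "typo", "in", "readme"], ["implement", "payment", "gateway", "integration"]]
points = [2.0, 5.0, 1.0, 8.0]

# A tiny FastText model trained on the toy corpus stands in for the pre-trained embeddings.
ft = FastText(sentences=stories, vector_size=32, min_count=1, epochs=50)

def story_vector(tokens):
    return np.mean([ft.wv[t] for t in tokens], axis=0)   # average subword-aware word vectors

X = np.stack([story_vector(s) for s in stories])
reg = SVR(kernel="rbf").fit(X, points)                   # SVR assumed as the regression form of the SVM
print(reg.predict(story_vector(["add", "payment", "button"]).reshape(1, -1)))
```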

24 pages, 3815 KiB  
Article
A Multi-Level Embedding Framework for Decoding Sarcasm Using Context, Emotion, and Sentiment Feature
by Maryam Khanian Najafabadi, Thoon Zar Chi Ko, Saman Shojae Chaeikar and Nasrin Shabani
Electronics 2024, 13(22), 4429; https://doi.org/10.3390/electronics13224429 - 12 Nov 2024
Cited by 3 | Viewed by 1603
Abstract
Sarcasm detection in text poses significant challenges for traditional sentiment analysis, as it often requires an understanding of context, word meanings, and emotional undertones. For example, in the sentence “I totally love working on Christmas holiday”, detecting sarcasm depends on capturing the contrast between affective words and their context. Existing methods often focus on single-embedding levels, such as word-level or affective-level, neglecting the importance of multi-level context. In this paper, we propose SAWE (Sentence, Affect, and Word Embeddings), a framework that combines sentence-level, affect-level, and context-dependent word embeddings to improve sarcasm detection. We use pre-trained transformer models SBERT and RoBERTa, enhanced with a bidirectional GRU and self-attention, alongside SenticNet to extract affective words. The combined embeddings are processed through a CNN and classified using a multilayer perceptron (MLP). SAWE is evaluated on two benchmark datasets, Sarcasm Corpus V2 (SV2) and Self-Annotated Reddit Corpus 2.0 (SARC 2.0), outperforming previous methods, particularly on long texts, with a 4.2% improvement on F1-Score for SV2. Our results emphasize the importance of multi-level embeddings and contextual information in detecting sarcasm, demonstrating a new direction for future research. Full article
(This article belongs to the Special Issue Signal and Image Processing Applications in Artificial Intelligence)
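
A much-reduced sketch of the multi-level idea combines a sentence-level embedding with an affect-level embedding built from affective words; the hand-made affect lexicon stands in for SenticNet, a general SBERT checkpoint is used, and the paper's RoBERTa word-level channel, GRU, self-attention, and CNN stages are omitted.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

sbert = SentenceTransformer("all-MiniLM-L6-v2")   # general-purpose stand-in checkpoint

# A tiny hand-made affect lexicon stands in for SenticNet.
affect_lexicon = {"love", "hate", "great", "terrible"}

def multi_level_features(sentence: str) -> np.ndarray:
    tokens = sentence.lower().split()
    sentence_emb = sbert.encode(sentence)                              # sentence-level embedding
    affect_words = [w for w in tokens if w in affect_lexicon] or tokens
    affect_emb = np.mean(sbert.encode(affect_words), axis=0)           # affect-level embedding
    return np.concatenate([sentence_emb, affect_emb])                  # fused feature for a classifier

print(multi_level_features("I totally love working on Christmas holiday").shape)
```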

20 pages, 1853 KiB  
Article
Chinese Named Entity Recognition Based on Multi-Level Representation Learning
by Weijun Li, Jianping Ding, Shixia Liu, Xueyang Liu, Yilei Su and Ziyi Wang
Appl. Sci. 2024, 14(19), 9083; https://doi.org/10.3390/app14199083 - 8 Oct 2024
Cited by 3 | Viewed by 1736
Abstract
Named Entity Recognition (NER) is a crucial component of Natural Language Processing (NLP). When dealing with the high diversity and complexity of the Chinese language, existing Chinese NER models face challenges in addressing word sense ambiguity, capturing long-range dependencies, and maintaining robustness, which hinders the accuracy of entity recognition. To this end, a Chinese NER model based on multi-level representation learning is proposed. The model leverages a pre-trained word-based embedding to capture contextual information. A linear layer adjusts dimensions to fit an Extended Long Short-Term Memory (XLSTM) network, enabling the capture of long-range dependencies and contextual information, and providing deeper representations. An adaptive multi-head attention mechanism is proposed to enhance the ability to capture global dependencies and comprehend deep semantic context. Additionally, GlobalPointer with rotational position encoding integrates global information for entity category prediction. Projected Gradient Descent (PGD) is incorporated, introducing perturbations in the embedding layer of the pre-trained model to enhance stability in noisy environments. The proposed model achieves F1-scores of 96.89%, 74.89%, 72.19%, and 80.96% on the Resume, Weibo, CMeEE, and CLUENER2020 datasets, respectively, demonstrating improvements over baseline and comparison models. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
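
The PGD component can be illustrated independently of the rest of the model: perturb the embedding layer's output along the loss gradient while projecting back into a small L-infinity ball. The step size, radius, number of steps, and toy loss below are assumptions.

```python
import torch

def pgd_perturb(embeds, loss_fn, eps=1e-2, alpha=3e-3, steps=3):
    """Step along the loss gradient in embedding space, projecting back into an L-infinity ball."""
    delta = torch.zeros_like(embeds, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(embeds + delta)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (embeds + delta).detach()

embeds = torch.randn(4, 10, 128)                                  # toy embedding-layer output
adv = pgd_perturb(embeds, loss_fn=lambda e: (e ** 2).mean())      # toy loss standing in for the NER loss
print((adv - embeds).abs().max())                                 # bounded by eps
```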

26 pages, 6325 KiB  
Article
Improving the Accuracy and Effectiveness of Text Classification Based on the Integration of the Bert Model and a Recurrent Neural Network (RNN_Bert_Based)
by Chanthol Eang and Seungjae Lee
Appl. Sci. 2024, 14(18), 8388; https://doi.org/10.3390/app14188388 - 18 Sep 2024
Cited by 11 | Viewed by 7785
Abstract
This paper proposes a new robust model for text classification on the Stanford Sentiment Treebank v2 (SST-2) dataset in terms of model accuracy. We developed a Recurrent Neural Network Bert based (RNN_Bert_based) model designed to improve classification accuracy on the SST-2 dataset. This dataset consists of movie review sentences, each labeled with either positive or negative sentiment, making it a binary classification task. Recurrent Neural Networks (RNNs) are effective for text classification because they capture the sequential nature of language, which is crucial for understanding context and meaning. Bert excels in text classification by providing bidirectional context, generating contextual embeddings, and leveraging pre-training on large corpora. This allows Bert to capture nuanced meanings and relationships within the text effectively. Combining Bert with RNNs can be highly effective for text classification. Bert’s bidirectional context and rich embeddings provide a deep understanding of the text, while RNNs capture sequential patterns and long-range dependencies. Together, they leverage the strengths of both architectures, leading to improved performance on complex classification tasks. Next, we also developed an integration of the Bert model and a K-Nearest Neighbor based (KNN_Bert_based) method as a comparative scheme for our proposed work. Based on the results of experimentation, our proposed model outperforms traditional text classification models as well as existing models in terms of accuracy. Full article
(This article belongs to the Special Issue Natural Language Processing: Novel Methods and Applications)
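
A minimal sketch of feeding BERT's contextual embeddings through a BiLSTM before a classification head is shown below; the hidden size, pooling choice, and checkpoint are assumptions rather than the paper's exact RNN_Bert_based configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertRNNClassifier(nn.Module):
    """BERT contextual embeddings fed through a BiLSTM, then a linear head for binary sentiment."""
    def __init__(self, n_classes: int = 2, hidden: int = 128):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-uncased")
        self.rnn = nn.LSTM(self.bert.config.hidden_size, hidden,
                           batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, input_ids, attention_mask):
        h = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        out, _ = self.rnn(h)
        return self.head(out[:, 0])      # pool at the [CLS] position (one simple choice)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tok(["a gripping, beautifully shot film", "a tedious and predictable plot"],
            padding=True, return_tensors="pt")
print(BertRNNClassifier()(batch["input_ids"], batch["attention_mask"]).shape)  # torch.Size([2, 2])
```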
