MDPI - Publisher of Open Access Journals

21 pages, 29822 KB

Open Access Article

Research on Deep Learning-Based Identification Methods for Geological Interface Types and Their Application in Mineral Exploration Prediction—A Case Study of the Gouli Region in Qinghai, China

by Yawen Zong, Linfu Xue, Jianbang Wang, Peng Wang and Xiangjin Ran

Minerals 2025, 15(12), 1281; https://doi.org/10.3390/min15121281 - 4 Dec 2025

Viewed by 168

Abstract

Geological interfaces are crucial elements governing deposit formation; silica–calcium surfaces, intrusive contact interfaces, and unconformities can serve as key indicators for mineral exploration prediction. Geological maps provide relatively detailed representations of primary geological interfaces and their interrelationships. However, previous mineral resource predictions ignored the differences between geological interface types, even though these types vary greatly, which affected the validity of the prediction results. Manual interpretation and analysis of geological interfaces involve substantial workloads and make it difficult to effectively apply the rich geological information depicted on geological maps to mineral exploration prediction. Therefore, this study proposes a deep learning model for the intelligent identification of geological interface types. Based on the digital geological map of the Gouli gold mining area in Dulan County, Qinghai Province, China, the model extracts attribute information, such as the age and lithology of the geological bodies on both sides of each geological boundary arc. A learning dataset comprising 5900 sets of geological interface types was constructed through manual annotation of geological interfaces. Taking the arc segment as the basic element, the model uses natural language processing to embed the text attribute information of the geological bodies on both sides of the geological interface as word vectors. The resulting embedding vectors are fed into a convolutional neural network (CNN) for training to generate the geological interface type recognition model. This method can effectively identify the type of geological interface, with an identification accuracy of up to 96.52%. Quantitative analysis of the spatial relationship between different types of geological interfaces and ore points shows that they are well correlated in spatial distribution. Experimental results show that the proposed method can effectively improve the accuracy and efficiency of geological interface recognition, and that adding geological interface type information to the mineral prediction process can improve prediction accuracy to some extent.
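
The embedding-then-convolution step described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the attribute tokens, the `[SEP]` separator, the vector size, and the random untrained weights are all assumptions chosen for illustration, and only a single convolution filter with max pooling is shown.

```python
import random

def embed(tokens, table, dim, rng):
    """Look up (or lazily create) a vector for each attribute token."""
    vectors = []
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        vectors.append(table[tok])
    return vectors

def conv1d_max(vectors, kernel):
    """Slide one filter over the token sequence, apply ReLU, keep the max response."""
    width = len(kernel)
    responses = []
    for start in range(len(vectors) - width + 1):
        window = vectors[start:start + width]
        total = sum(w * x
                    for k_row, v in zip(kernel, window)
                    for w, x in zip(k_row, v))
        responses.append(max(0.0, total))
    return max(responses)

rng = random.Random(0)
dim = 4
table = {}

# Hypothetical attribute text for the bodies on the two sides of one boundary arc.
left_side = ["Triassic", "granodiorite"]
right_side = ["Proterozoic", "marble"]
tokens = left_side + ["[SEP]"] + right_side

vectors = embed(tokens, table, dim, rng)
# One width-2 filter with random (untrained) weights, for illustration only.
kernel = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(2)]
feature = conv1d_max(vectors, kernel)
print(feature)
```

In a trained classifier, many such filters would produce a feature vector that a final layer maps to an interface type; here the single value only shows the data flow.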

(This article belongs to the Section Mineral Exploration Methods and Applications)

22 pages, 3760 KB

Open Access Article

Embedded Implementation of Real-Time Voice Command Recognition on PIC Microcontroller

by Mohamed Shili, Salah Hammedi, Amjad Gawanmeh and Khaled Nouri

Automation 2025, 6(4), 79; https://doi.org/10.3390/automation6040079 - 28 Nov 2025

Viewed by 252

Abstract

This paper describes a real-time system for recognizing voice commands on resource-constrained embedded devices, specifically a PIC microcontroller. While most existing voice command solutions rely on high-performance processing platforms or cloud computation, the system described here performs fully embedded, low-power processing locally on the device. Sound is captured through a low-cost MEMS microphone and segmented into short audio frames, from which time-domain features are extracted, namely Zero-Crossing Rate (ZCR) and Short-Time Energy (STE). These features were chosen for their low power consumption, computational efficiency, and suitability for real-time processing on a microcontroller. For this experimental system, a small vocabulary of four command words (“ON”, “OFF”, “LEFT”, and “RIGHT”) was used to simulate real voice command interfaces. The main contribution is the combination of low-complexity, lightweight signal-processing techniques with embedded neural network inference, which completes a classification cycle in real time (under 50 ms). Confusion matrices and timing analysis of the classifier’s performance across vocabularies of varying complexity showed a classification accuracy of over 90%. This method is well suited to IoT and portable embedded applications, offering a low-latency alternative to more complex and resource-intensive classification architectures.
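
The two time-domain features named in this abstract are simple to compute per frame. The sketch below shows one common formulation. The frame size, the hop, and the choice to normalize STE by frame length are assumptions, since the abstract does not give the exact definitions used on the microcontroller.

```python
def frames(samples, size, hop):
    """Split a sample list into fixed-size frames (incomplete tail dropped)."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def short_time_energy(frame):
    """Mean squared amplitude of the frame (assumed normalization)."""
    return sum(x * x for x in frame) / len(frame)

# Toy signal: an alternating burst followed by a constant segment.
signal = [1.0, -1.0, 1.0, -1.0, 0.5, 0.5, 0.5, 0.5]
features = [(zero_crossing_rate(f), short_time_energy(f))
            for f in frames(signal, size=4, hop=4)]
print(features)  # [(1.0, 1.0), (0.0, 0.25)]
```

Both features need only comparisons, multiplications, and additions, which is consistent with the abstract's point that they suit a small microcontroller.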

36 pages, 1090 KB

Open Access Article

Integrating Linguistic and Eye Movements Features for Arabic Text Readability Assessment Using ML and DL Models

by Ibtehal Baazeem, Hend Al-Khalifa and Abdulmalik Al-Salman

Computation 2025, 13(11), 258; https://doi.org/10.3390/computation13110258 - 3 Nov 2025

Viewed by 853

Abstract

Evaluating text readability is crucial for supporting both language learners and native readers in selecting appropriate materials. Cognitive psychology research, leveraging behavioral data such as eye-tracking and electroencephalogram (EEG) signals, has demonstrated effectiveness in identifying cognitive activities associated with text difficulty during reading. However, the distinctive linguistic characteristics of Arabic present unique challenges for applying such data in readability assessments. While behavioral signals have been explored for this purpose, their potential for Arabic remains underutilized. This study aims to advance Arabic readability assessments by integrating eye-tracking features into computational models. It presents a series of experiments that utilize both text-based and gaze-based features within machine learning (ML) and deep learning (DL) frameworks. The gaze-based features were extracted from the AraEyebility corpus, which contains eye-tracking data collected from 15 native Arabic speakers. The experimental results show that ensemble ML models, particularly AdaBoost with linguistic and eye-tracking handcrafted features, outperform ML models using TF-IDF and DL models employing word embedding vectorization. Among the DL models, convolutional neural networks (CNNs) achieved the best performance with combined linguistic and eye-tracking features. These findings underscore the value of cognitive data and emphasize the need for further exploration to fully realize its potential in Arabic readability assessment.

(This article belongs to the Special Issue Recent Advances on Computational Linguistics and Natural Language Processing)

22 pages, 1250 KB

Open Access Article

Entity Span Suffix Classification for Nested Chinese Named Entity Recognition

by Jianfeng Deng, Ruitong Zhao, Wei Ye and Suhong Zheng

Information 2025, 16(10), 822; https://doi.org/10.3390/info16100822 - 23 Sep 2025

Viewed by 490

Abstract

Named entity recognition (NER) is one of the fundamental tasks in building knowledge graphs. In some domain-specific corpora, the text descriptions are poorly standardized and some entities are nested. Existing entity recognition methods suffer from problems such as noise introduced by word matching and difficulty in distinguishing different entity labels for the same character in sequence label prediction. This paper proposes a span-based feature reuse stacked bidirectional long short-term memory (BiLSTM) nested named entity recognition (SFRSN) model, which transforms entity recognition from sequence prediction into the problem of entity span suffix category classification. Firstly, character feature embeddings are generated through Bidirectional Encoder Representations from Transformers (BERT). Secondly, a feature reuse stacked BiLSTM is proposed to obtain deep context features while alleviating the problem of deep network degradation. Thirdly, the span feature is obtained through a dilated convolutional neural network (DCNN), and at the same time, a single-tail selection function is introduced to obtain the classification feature of the entity span suffix, with the aim of reducing the number of training parameters. Fourthly, a global feature gated attention mechanism is proposed, integrating span features and span suffix classification features to achieve span suffix classification. The experimental results on four Chinese domain-specific datasets demonstrate the effectiveness of our approach: SFRSN achieves micro-F1 scores of 83.34% on OntoNotes, 73.27% on Weibo, 96.90% on Resume, and 86.77% on the supply chain management dataset. This represents a maximum improvement of 1.55%, 4.94%, 2.48%, and 3.47% over state-of-the-art baselines, respectively. These results show that the model is effective in addressing nested entities and entity label ambiguity.
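
The micro-F1 scores quoted in this abstract are typically computed over labeled spans, which allows nested entities to be counted individually. The following sketch shows that computation under the assumption that each entity is a `(start, end, label)` tuple and that a prediction is correct only on an exact match; the example spans are invented.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over per-sentence sets of (start, end, label) spans."""
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical data: the first sentence has a LOC span nested inside an ORG span.
gold = [{(0, 4, "ORG"), (0, 2, "LOC")}, {(1, 3, "PER")}]
pred = [{(0, 4, "ORG")}, {(1, 3, "PER"), (4, 6, "LOC")}]

score = micro_f1(gold, pred)
print(round(score, 4))  # 0.6667
```

Because counts are pooled across all sentences and labels before the ratio is taken, frequent entity types dominate the micro average.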

(This article belongs to the Section Artificial Intelligence)

18 pages, 2884 KB

Open Access Article

Research on Multi-Path Feature Fusion Manchu Recognition Based on Swin Transformer

by Yu Zhou, Mingyan Li, Hang Yu, Jinchi Yu, Mingchen Sun and Dadong Wang

Symmetry 2025, 17(9), 1408; https://doi.org/10.3390/sym17091408 - 29 Aug 2025

Viewed by 636

Abstract

Recognizing Manchu words can be challenging due to their complex character variations, subtle differences between similar characters, and homographic polysemy. Most studies rely on character segmentation techniques for character recognition or use convolutional neural networks (CNNs) to encode word images for word recognition. However, these methods can lead to segmentation errors or a loss of semantic information, which reduces the accuracy of word recognition. To address the limitations of CNNs in modeling long-range dependencies and to enhance semantic coherence, we propose a hybrid architecture that fuses the spatial features of the original images with spectral features. Specifically, we first apply the Short-Time Fourier Transform (STFT) to the raw input images to obtain their multi-view spectral features. Then, we use a primary CNN block and a pair of symmetric CNN blocks to construct a symmetric spectral enhancement module, which encodes the raw input features and the multi-view spectral features. Subsequently, we design a feature fusion module based on the Swin Transformer to fuse the multi-view spectral embeddings and concatenate the result with the raw input embedding. Finally, we use a Transformer decoder to obtain the target output. We conducted extensive experiments on Manchu word benchmark datasets to evaluate the effectiveness of our proposed framework. The experimental results demonstrated that our framework performs robustly in word recognition tasks and exhibits excellent generalization capabilities. Additionally, our model outperformed other baseline methods in multiple writing-style font-recognition tasks.

(This article belongs to the Section Computer)

28 pages, 3746 KB

Open Access Article

BERNN: A Transformer-BiLSTM Hybrid Model for Cross-Domain Short Text Classification in Agricultural Expert Systems

by Xueyong Li, Menghao Zhang, Xiaojuan Guo, Jiaxin Zhang, Jiaxia Sun, Xianqin Yun, Liyuan Zheng, Wenyue Zhao, Lican Li and Haohao Zhang

Symmetry 2025, 17(9), 1374; https://doi.org/10.3390/sym17091374 - 22 Aug 2025

Cited by 1 | Viewed by 835

Abstract

With the advancement of artificial intelligence, Agricultural Expert Systems (AESs) show great potential in enhancing agricultural management efficiency and resource utilization. Accurate extraction of semantic features from agricultural short texts is fundamental to enabling key functions such as intelligent question answering, semantic retrieval, and decision support. However, existing single-structure deep neural networks struggle to capture the hierarchical linguistic patterns and contextual dependencies inherent in domain-specific texts. To address this limitation, we propose a hybrid deep learning model, the Bidirectional Encoder Recurrent Neural Network (BERNN), which combines a domain-specific pre-trained Transformer encoder (AgQsBERT) with a Bidirectional Long Short-Term Memory (BiLSTM) network. AgQsBERT generates contextualized word embeddings by leveraging domain-specific pretraining, effectively capturing the semantics of agricultural terminology. These embeddings are then passed to the BiLSTM, which models sequential dependencies in both directions, enhancing the model’s understanding of contextual flow and word disambiguation. Importantly, the bidirectional nature of the BiLSTM introduces a form of architectural symmetry, allowing the model to process input in both forward and backward directions. This symmetric design enables balanced context modeling, which improves the understanding of fragmented and ambiguous phrases frequently encountered in agricultural texts. The synergy between semantic abstraction from AgQsBERT and symmetric contextual modeling from BiLSTM significantly enhances the expressiveness and generalizability of the model. Evaluated on a self-constructed agricultural question dataset with 110,647 annotated samples, BERNN achieved a classification accuracy of 97.19%, surpassing the baseline by 3.2%. Cross-domain validation on the Tsinghua News dataset further demonstrates its robust generalization capability. This architecture provides a powerful foundation for intelligent agricultural question-answering systems, semantic retrieval, and decision support within smart agriculture applications.

(This article belongs to the Section Computer)

39 pages, 3230 KB

Open Access Article

Decoding Wine Narratives with Hierarchical Attention: Classification, Visual Prompts, and Emerging E-Commerce Possibilities

by Vlad Diaconita, Anda Belciu, Alexandra Maria Ioana Corbea and Iuliana Simonca

J. Theor. Appl. Electron. Commer. Res. 2025, 20(3), 212; https://doi.org/10.3390/jtaer20030212 - 14 Aug 2025

Viewed by 1617

Abstract

Wine reviews can connect words to flavours; they entwine sensory experiences into vivid stories. This research explores the intersection of artificial intelligence and oenology by using state-of-the-art neural networks to decipher the nuances in wine reviews. For more accurate wine classification and to capture the essence of what matters most to aficionados, we use Hierarchical Attention Networks enhanced with pre-trained embeddings. We also propose an approach to create captivating marketing images using advanced text-to-image generation models, mining a large review corpus for the most important descriptive terms and thus linking textual tasting notes to automatically generated imagery. Compared to more conventional models, our results show that hierarchical attention processes fused with rich linguistic embeddings better reflect the complexities of wine language. In addition to improving the accuracy of wine classification, this method provides consumers with immersive experiences by turning sensory descriptors into striking visual stories. Ultimately, our research helps modernise wine marketing and consumer engagement by merging deep learning with sensory analytics, proving how technology-driven solutions can amplify storytelling and shopping experiences in the digital marketplace.
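
The core idea of hierarchical attention, pooling word vectors into sentence vectors and sentence vectors into a document vector with learned attention weights, can be sketched as follows. This is a simplified illustration rather than the architecture used in the paper: the recurrent encoders and the projection layer of a full Hierarchical Attention Network are omitted, and the vectors and context vectors are invented toy values.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, context):
    """Weight each vector by its dot product with a context vector, then sum."""
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights

def document_vector(sentences, word_context, sentence_context):
    """Word-level attention per sentence, then sentence-level attention."""
    sentence_vectors = [attention_pool(words, word_context)[0] for words in sentences]
    return attention_pool(sentence_vectors, sentence_context)

# Toy review: two sentences of 2-dimensional word vectors (hypothetical values).
sentences = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
word_context = [1.0, 0.0]       # would be learned in a trained model
sentence_context = [0.0, 1.0]   # would be learned in a trained model

doc_vec, sentence_weights = document_vector(sentences, word_context, sentence_context)
print(doc_vec, sentence_weights)
```

The attention weights are what make such a model inspectable: they indicate which words and sentences contributed most to the final representation.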

(This article belongs to the Topic Data Science and Intelligent Management)

19 pages, 821 KB

Open Access Article

Multimodal Multisource Neural Machine Translation: Building Resources for Image Caption Translation from European Languages into Arabic

by Roweida Mohammed, Inad Aljarrah, Mahmoud Al-Ayyoub and Ali Fadel

Computation 2025, 13(8), 194; https://doi.org/10.3390/computation13080194 - 8 Aug 2025

Viewed by 1757

Abstract

Neural machine translation (NMT) models combining textual and visual inputs generate more accurate translations compared with unimodal models. Moreover, translation models with an under-resourced target language benefit from multisource inputs (source sentences are provided in different languages). Building MultiModal MultiSource NMT (M³S-NMT) systems requires significant effort to curate datasets suitable for such a multifaceted task. This work uses image caption translation as an example of multimodal translation and presents a novel public dataset for translating captions from multiple European languages (viz., English, German, French, and Czech) into the distant and under-resourced Arabic language. Moreover, it presents multitask learning models trained and tested on this dataset to serve as solid baselines to help further research in this area. These models involve two parts: one for learning the visual representations of the input images, and the other for translating the textual input based on these representations. The translations are produced from a framework of attention-based encoder–decoder architectures. The visual features are learned from a pretrained convolutional neural network (CNN). These features are then integrated with textual features learned through the very basic yet well-known recurrent neural networks (RNNs) with GloVe or BERT word embeddings. Despite the challenges associated with the task at hand, the results of these systems are very promising, reaching 34.57 and 42.52 METEOR scores.

(This article belongs to the Section Computational Social Science)

56 pages, 3118 KB

Open Access Article

Semantic Reasoning Using Standard Attention-Based Models: An Application to Chronic Disease Literature

by Yalbi Itzel Balderas-Martínez, José Armando Sánchez-Rojas, Arturo Téllez-Velázquez, Flavio Juárez Martínez, Raúl Cruz-Barbosa, Enrique Guzmán-Ramírez, Iván García-Pacheco and Ignacio Arroyo-Fernández

Big Data Cogn. Comput. 2025, 9(6), 162; https://doi.org/10.3390/bdcc9060162 - 19 Jun 2025

Viewed by 1988

Abstract

Large-language-model (LLM) APIs demonstrate impressive reasoning capabilities, but their size, cost, and closed weights limit the deployment of knowledge-aware AI within biomedical research groups. At the other extreme, standard attention-based neural language models (SANLMs), including encoder–decoder architectures such as Transformers, Gated Recurrent Units (GRUs), and Long Short-Term Memory (LSTM) networks, are computationally inexpensive. However, their capacity for semantic reasoning in noisy, open-vocabulary knowledge bases (KBs) remains unquantified. Therefore, we investigate whether compact SANLMs can (i) reason over hybrid OpenIE-derived KBs that integrate commonsense, general-purpose, and non-communicable-disease (NCD) literature; (ii) operate effectively on commodity GPUs; and (iii) exhibit semantic coherence as assessed through manual linguistic inspection. To this end, we constructed four training KBs by integrating ConceptNet (600k triples), a 39k-triple general-purpose OpenIE set, and an 18.6k-triple OpenNCDKB extracted from 1200 PubMed abstracts. Encoder–decoder GRU, LSTM, and Transformer models (1–2 blocks) were trained to predict the object phrase given the subject + predicate. Beyond token-level cross-entropy, we introduced the Meaning-based Selectional-Preference Test (MSPT): for each withheld triple, we masked the object, generated a candidate, and measured its surplus cosine similarity over a random baseline using word embeddings, with significance assessed via a one-sided t-test. Hyperparameter sensitivity (311 GRU/168 LSTM runs) was analyzed, and qualitative frame–role diagnostics completed the evaluation. Our results showed that all SANLMs learned effectively from the point of view of the cross entropy loss. In addition, our MSPT provided meaningful semantic insights: for the GRUs (256-dim, 2048-unit, 1-layer): mean similarity ($\mu_{sts}$) of 0.641 to the ground truth vs. 0.542 to the random baseline (gap 12.1%; $p < 10^{-180}$). For the 1-block Transformer: $\mu_{sts} = 0.551$ vs. $0.511$ (gap 4%; $p < 10^{-25}$). While Transformers minimized loss and accuracy variance, GRUs captured finer selectional preferences. Both architectures trained within <24 GB GPU VRAM and produced linguistically acceptable, albeit over-generalized, biomedical assertions. Due to their observed performance, LSTM results were designated as baseline models for comparison. Therefore, properly tuned SANLMs can achieve statistically robust semantic reasoning over noisy, domain-specific KBs without reliance on massive LLMs. Their interpretability, minimal hardware footprint, and open weights promote equitable AI research, opening new avenues for automated NCD knowledge synthesis, surveillance, and decision support.
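
The surplus-similarity measurement described for the MSPT can be sketched in a few lines. The version below is an assumption-laden illustration, not the paper's procedure: it treats the test as a paired comparison (similarity to the true object minus similarity to a random object, per triple), reports only the t statistic rather than a p-value, and uses invented 2-dimensional vectors in place of real word embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def surplus_t_statistic(generated, truths, randoms):
    """Mean similarity surplus over the random baseline and its t statistic."""
    gaps = [cosine(g, t) - cosine(g, r)
            for g, t, r in zip(generated, truths, randoms)]
    n = len(gaps)
    mean_gap = sum(gaps) / n
    variance = sum((d - mean_gap) ** 2 for d in gaps) / (n - 1)
    t_stat = mean_gap / math.sqrt(variance / n)
    return mean_gap, t_stat

# Hypothetical embeddings of generated objects, true objects, and random objects.
generated = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
truths = [[1.0, 0.1], [0.1, 1.0], [1.0, 0.9]]
randoms = [[0.0, 1.0], [1.0, 0.0], [1.0, -1.0]]

mean_gap, t_stat = surplus_t_statistic(generated, truths, randoms)
print(mean_gap, t_stat)
```

A one-sided test would then compare `t_stat` against the upper tail of a t distribution with `n - 1` degrees of freedom; a large positive value indicates that generated objects sit closer to the ground truth than to random objects.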

28 pages, 1007 KB

Open Access Article

Predicting the Event Types in the Human Brain: A Modeling Study Based on Embedding Vectors and Large-Scale Situation Type Datasets in Mandarin Chinese

by Xiaorui Ma and Hongchao Liu

Appl. Sci. 2025, 15(11), 5916; https://doi.org/10.3390/app15115916 - 24 May 2025

Viewed by 812

Abstract

Event types classify Chinese verbs based on the internal temporal structure of events. The categorization of verb event types is the most fundamental classification of concept types represented by verbs in the human brain. Meanwhile, event types exhibit strong predictive capabilities for exploring collocational patterns between words, making them crucial for Chinese teaching. This work focuses on constructing a statistically validated gold-standard dataset, forming the foundation for achieving high accuracy in recognizing verb event types. Utilizing a manually annotated dataset of verbs and aspectual markers’ co-occurrence features, the research conducts hierarchical clustering of Chinese verbs. The resulting dendrogram indicates that verbs can be categorized into three event types (state, activity, and transition) based on semantic distance. Two approaches are employed to construct vector matrices: a supervised method that derives word vectors based on linguistic features, and an unsupervised method that uses four models to extract embedding vectors, including Word2Vec, FastText, BERT and ChatGPT. The classification of verb event types is performed using three classifiers: multinomial logistic regression, support vector machines and artificial neural networks. Experimental results demonstrate the superior performance of embedding vectors. Employing the pre-trained FastText model in conjunction with an artificial neural network classifier, the model achieves an accuracy of 98.37% in predicting 3133 verbs, thereby enabling the automatic identification of event types at the level of Chinese verbs and validating the high accuracy and practical value of embedding vectors in addressing complex semantic relationships and classification tasks. This work constructs datasets of considerable semantic complexity, comprising a substantial volume of verbs along with their feature vectors and situation type labels, which can be used for evaluating large language models in the future.

(This article belongs to the Special Issue Application of Artificial Intelligence and Semantic Mining Technology)

19 pages, 3185 KB

Open Access Article

Short Text Classification Based on Enhanced Word Embedding and Hybrid Neural Networks

by Cunhe Li, Zian Xie and Haotian Wang

Appl. Sci. 2025, 15(9), 5102; https://doi.org/10.3390/app15095102 - 4 May 2025

Cited by 3 | Viewed by 3867

Abstract

In recent years, text classification has found wide application in diverse real-world scenarios. In Chinese news classification tasks, title text suffers from limitations such as sparse contextual information and semantic ambiguity. To improve the performance of short text classification, this paper proposes a Word2Vec-based enhanced word embedding method and presents a dual-channel hybrid neural network architecture to effectively extract semantic features. Specifically, we introduce a novel weighting scheme, Term Frequency-Inverse Document Frequency Category-Distribution Weight (TF-IDF-CDW), where Category Distribution Weight (CDW) reflects the distribution pattern of words across different categories. By weighting the pretrained Word2Vec vectors with TF-IDF-CDW and concatenating them with part-of-speech (POS) feature vectors, semantically enriched and more discriminative word embedding vectors are generated. Furthermore, we propose a dual-channel hybrid model based on a Gated Convolutional Neural Network (GCNN) and Bidirectional Long Short-Term Memory (BiLSTM), which jointly captures local features and long-range global dependencies. To evaluate the overall performance of the model, experiments were conducted on the Chinese short text datasets THUCNews and TNews. The proposed model achieved classification accuracies of 91.85% and 87.70%, respectively, outperforming several comparative models and demonstrating the effectiveness of the proposed method.
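
The weighting-and-concatenation step in this abstract can be illustrated with a short sketch. Several parts are assumptions: the abstract does not define CDW, so it appears only as a hypothetical `extra_weight` multiplier defaulting to 1.0; the POS feature is shown as a one-hot vector, although the paper may use a different encoding; and the corpus and word vector are toy values.

```python
import math

def tf_idf(term, doc, corpus):
    """Plain TF-IDF: relative term frequency times log inverse document frequency."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)
    return tf * math.log(len(corpus) / df)

def enhanced_embedding(term, pos, doc, corpus, vectors, pos_tags, extra_weight=1.0):
    """Scale a pretrained vector by TF-IDF (times a placeholder weight) and append POS."""
    weight = tf_idf(term, doc, corpus) * extra_weight  # extra_weight stands in for CDW
    weighted = [weight * x for x in vectors[term]]
    one_hot = [1.0 if tag == pos else 0.0 for tag in pos_tags]
    return weighted + one_hot

# Toy tokenized headlines and a toy 2-dimensional "pretrained" vector.
corpus = [["stock", "market", "falls"],
          ["team", "wins", "match"],
          ["market", "opens"]]
vectors = {"market": [0.5, -0.5]}
pos_tags = ["NOUN", "VERB"]

vec = enhanced_embedding("market", "NOUN", corpus[0], corpus, vectors, pos_tags)
print(vec)
```

The resulting vector has the original embedding dimensions, rescaled by how informative the word is, followed by the POS indicator dimensions.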

(This article belongs to the Section Computing and Artificial Intelligence)

26 pages, 3529 KB

Open Access Article

Protecting Intellectual Security Through Hate Speech Detection Using an Artificial Intelligence Approach

by Sadeem Alrasheed, Suliman Aladhadh and Abdulatif Alabdulatif

Algorithms 2025, 18(4), 179; https://doi.org/10.3390/a18040179 - 21 Mar 2025

Cited by 2 | Viewed by 1650

Abstract

Online social networks (OSNs) have become an integral part of daily life, with platforms such as X (formerly Twitter) being among the most popular in the Middle East. However, X faces the problem of widespread hate speech aimed at spreading hostility between communities, especially among Arabic-speaking users. This problem is exacerbated by the lack of effective tools for processing Arabic content and the complexity of the Arabic language, including its diverse grammar and dialects. This study developed a two-layer framework to detect and classify Arabic hate speech using machine learning and deep learning with various features and word embedding techniques. A large dataset of Arabic tweets was collected using the X API. The first layer of the framework focused on detecting hate speech, while the second layer classified it into religious, social, or political hate speech. Convolutional neural networks (CNNs) outperformed other models, achieving an accuracy of 92% in hate speech detection and 93% in classification. These results highlight the framework’s effectiveness in addressing Arabic language complexities and improving content monitoring tools, thereby contributing to intellectual security and fostering a safer digital space.

(This article belongs to the Special Issue Machine Learning for Pattern Recognition (2nd Edition))

25 pages, 1451 KB

Open Access Feature Paper Article

A Graph Neural Network-Based Context-Aware Framework for Sentiment Analysis Classification in Chinese Microblogs

by Zhesheng Jin and Yunhua Zhang

Mathematics 2025, 13(6), 997; https://doi.org/10.3390/math13060997 - 18 Mar 2025

Cited by 1 | Viewed by 2455

Abstract

Sentiment analysis in Chinese microblogs is challenged by complex syntactic structures and fine-grained sentiment shifts. To address these challenges, a Contextually Enriched Graph Neural Network (CE-GNN) is proposed, integrating self-supervised learning, context-aware sentiment embeddings, and Graph Neural Networks (GNNs) to enhance sentiment classification. First, CE-GNN is pre-trained on a large corpus of unlabeled text through self-supervised learning, where Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) are leveraged to obtain contextualized embeddings. These embeddings are then refined through a context-aware sentiment embedding layer, which is dynamically adjusted based on the surrounding text to improve sentiment sensitivity. Next, syntactic dependencies are captured by GNNs, where words are represented as nodes and syntactic relationships are denoted as edges. Through this graph-based structure, complex sentence structures, particularly in Chinese, can be interpreted more effectively. Finally, the model is fine-tuned on a labeled dataset, achieving state-of-the-art performance in sentiment classification. Experimental results demonstrate that CE-GNN achieves superior accuracy, with a Macro F-measure of 80.21% and a Micro F-measure of 82.93%. Ablation studies further confirm that each module contributes significantly to the overall performance.
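
The graph step in this abstract, words as nodes and syntactic dependencies as edges, can be illustrated with one round of neighbourhood averaging. This is a deliberately simplified sketch, not CE-GNN: there are no learned weight matrices or nonlinearities, edges are treated as undirected, and the three-word dependency tree and its feature vectors are invented.

```python
def message_pass(features, edges):
    """One propagation step: each node becomes the mean of itself and its neighbours."""
    n = len(features)
    neighbours = {i: {i} for i in range(n)}  # include a self-loop for every node
    for head, dependent in edges:
        neighbours[head].add(dependent)
        neighbours[dependent].add(head)
    dim = len(features[0])
    updated = []
    for i in range(n):
        group = [features[j] for j in sorted(neighbours[i])]
        updated.append([sum(v[d] for v in group) / len(group) for d in range(dim)])
    return updated

# Hypothetical 3-word sentence: word 1 (the verb) heads words 0 and 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(1, 0), (1, 2)]

updated = message_pass(features, edges)
print(updated[0], updated[2])  # [0.5, 0.5] [0.5, 1.0]
```

After one step, each word's representation already mixes in information from its syntactic head or dependents, which is how a graph layer lets sentiment cues travel along dependency arcs rather than only between adjacent tokens.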

(This article belongs to the Section E2: Control Theory and Mechanics)

22 pages, 4990 KB

Open AccessArticle

Edge-Centric Embeddings of Digraphs: Properties and Stability Under Sparsification

by Ahmed Begga, Francisco Escolano Ruiz and Miguel Ángel Lozano

Entropy 2025, 27(3), 304; https://doi.org/10.3390/e27030304 - 14 Mar 2025

Viewed by 1311

Abstract

In this paper, we define and characterize the embedding of edges and higher-order entities in directed graphs (digraphs) and relate these embeddings to those of nodes. Our edge-centric approach consists of the following: (a) Embedding line digraphs (or their iterated versions); (b) Exploiting the rank properties of these embeddings to show that edge/path similarity can be posed as a linear combination of node similarities; (c) Solving scalability issues through digraph sparsification; (d) Evaluating the performance of these embeddings for classification and clustering. We commence by identifying the motive behind the need for edge-centric approaches. Then we proceed to introduce all the elements of the approach, and finally, we validate it. Our edge-centric embedding entails a top-down mining of links, instead of inferring them from the similarities of node embeddings. This analysis is key to discovering inter-subgraph links that hold the whole graph connected, i.e., central edges. Using directed graphs (digraphs) allows us to cluster edge-like hubs and authorities. In addition, since directed edges inherit their labels from destination (origin) nodes, their embedding provides a proxy representation for node classification and clustering as well. This representation is obtained by embedding the line digraph of the original one. The line digraph provides nice formal properties with respect to the original graph; in particular, it produces more entropic latent spaces. With these properties at hand, we can relate edge embeddings to node embeddings. The main contribution of this paper is to set and prove the linearity theorem, which poses each element of the transition matrix for an edge embedding as a linear combination of the elements of the transition matrix for the node embedding. 
As a result, the rank preservation property explains why embedding the line digraph and using the labels of the destination nodes provides better classification and clustering performance than embedding the nodes of the original graph. In other words, we not only facilitate edge mining but also enforce node classification and clustering. However, computing the line digraph is challenging, and a sparsification strategy is implemented for the sake of scalability. Our experimental results show that the line digraph representation of the sparsified input graph is quite stable as we increase the sparsification level, and also that it outperforms the original (node-centric) representation. For the sake of simplicity, our theorem relies on node2vec-like (factorization) embeddings. However, we also include several experiments showing how line digraphs may improve the performance of Graph Neural Networks (GNNs), also following the principle of maximum entropy.
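
The line digraph that this abstract embeds can be sketched in a few lines, using the standard definition: every edge of the original digraph becomes a node, and edge (u, v) points to edge (v, w) whenever the first ends where the second starts. The function name `line_digraph`, the toy graph, and the node labels below are assumptions for illustration, not the authors' code; the label step follows the abstract's remark that directed edges inherit labels from their destination nodes.

```python
def line_digraph(edges):
    """Return (nodes, arcs) of the line digraph of a digraph given as (tail, head) pairs."""
    edges = list(edges)
    # Index original edges by their tail so successors can be looked up directly.
    out_by_tail = {}
    for edge in edges:
        out_by_tail.setdefault(edge[0], []).append(edge)
    arcs = []
    for (u, v) in edges:
        for successor in out_by_tail.get(v, []):
            arcs.append(((u, v), successor))  # (u, v) -> (v, w)
    return edges, arcs

# Hypothetical toy digraph with made-up node labels.
toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("b", "a")]
node_label = {"a": 0, "b": 1, "c": 1}

nodes, arcs = line_digraph(toy_edges)
# Each edge takes the label of its destination node.
edge_label = {edge: node_label[edge[1]] for edge in nodes}
```

The number of arcs grows with the product of in-degree and out-degree at each node, which is one way to see why the abstract pairs the construction with sparsification.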

(This article belongs to the Section Information Theory, Probability and Statistics)

22 pages, 1390 KB

Open AccessArticle

Emotion-Aware Embedding Fusion in Large Language Models (Flan-T5, Llama 2, DeepSeek-R1, and ChatGPT 4) for Intelligent Response Generation

by Abdur Rasool, Muhammad Irfan Shahzad, Hafsa Aslam, Vincent Chan and Muhammad Ali Arshad

AI 2025, 6(3), 56; https://doi.org/10.3390/ai6030056 - 13 Mar 2025

Cited by 25 | Viewed by 6096

Abstract

Empathetic and coherent responses are critical in automated chatbot-facilitated psychotherapy. This study addresses the challenge of enhancing the emotional and contextual understanding of large language models (LLMs) in psychiatric applications. We introduce Emotion-Aware Embedding Fusion, a novel framework integrating hierarchical fusion and attention mechanisms to prioritize semantic and emotional features in therapy transcripts. Our approach combines multiple emotion lexicons, including NRC Emotion Lexicon, VADER, WordNet, and SentiWordNet, with state-of-the-art LLMs such as Flan-T5, Llama 2, DeepSeek-R1, and ChatGPT 4. Therapy session transcripts, comprising over 2000 samples, are segmented into hierarchical levels (word, sentence, and session) using neural networks, while hierarchical fusion combines these features with pooling techniques to refine emotional representations. Attention mechanisms, including multi-head self-attention and cross-attention, further prioritize emotional and contextual features, enabling the temporal modeling of emotional shifts across sessions. The processed embeddings, computed using BERT, GPT-3, and RoBERTa, are stored in the Facebook AI similarity search vector database, which enables efficient similarity search and clustering across dense vector spaces. Upon user queries, relevant segments are retrieved and provided as context to LLMs, enhancing their ability to generate empathetic and contextually relevant responses. The proposed framework is evaluated across multiple practical use cases to demonstrate real-world applicability, including AI-driven therapy chatbots. The system can be integrated into existing mental health platforms to generate personalized responses based on retrieved therapy session data. 
The experimental results show that our framework enhances empathy, coherence, informativeness, and fluency, surpassing baseline models while improving LLMs’ emotional intelligence and contextual adaptability for psychotherapy.
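
The retrieval step in this abstract (stored embeddings are searched by similarity and the best-matching segments are passed to the LLM as context) can be sketched as follows. This is a brute-force cosine-similarity search in plain Python, standing in for the vector database named in the abstract. The segment identifiers, the 3-d vectors, and the function names `cosine` and `top_k` are assumptions for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(store.items(), key=lambda item: cosine(query, item[1]), reverse=True)
    return [segment_id for segment_id, _ in ranked[:k]]

# Hypothetical store of transcript-segment embeddings (made-up 3-d vectors).
store = {
    "seg-1": [1.0, 0.0, 0.0],
    "seg-2": [0.9, 0.1, 0.0],
    "seg-3": [0.0, 0.0, 1.0],
}
hits = top_k([1.0, 0.0, 0.0], store, k=2)
```

A dedicated vector index replaces the linear scan with approximate or optimized search, but the contract is the same: a query vector goes in, and the identifiers of the nearest stored segments come out.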

(This article belongs to the Special Issue Multimodal Artificial Intelligence in Healthcare)

Search Results (251)
