Search Results (278)

Search Parameters:
Keywords = BERT fine-tuning

30 pages, 1016 KB  
Article
Combining User and Venue Personality Proxies with Customers’ Preferences and Opinions to Enhance Restaurant Recommendation Performance
by Andreas Gregoriades, Herodotos Herodotou, Maria Pampaka and Evripides Christodoulou
AI 2026, 7(1), 19; https://doi.org/10.3390/ai7010019 - 9 Jan 2026
Abstract
Recommendation systems are popular information systems that help consumers manage information overload. Whilst personality has been recognised as an important factor influencing consumers’ choice, it has not yet been fully exploited in recommendation systems. This study proposes a restaurant recommendation approach that integrates customer personality traits, opinions and preferences, extracted either directly from online review platforms or derived from electronic word of mouth (eWOM) text using information extraction techniques. The proposed method leverages the concept of venue personality grounded in personality–brand congruence theory, which posits that customers are more satisfied with brands whose personalities align with their own. A novel model is introduced that combines fine-tuned BERT embeddings with linguistic features to infer users’ personality traits from the text of their reviews. Customers’ preferences are identified using a custom named-entity recogniser, while their opinions are extracted through structural topic modelling. The overall framework integrates neural collaborative filtering (NCF) features with both directly observed and derived information from eWOM to train an extreme gradient boosting (XGBoost) regression model. The proposed approach is compared to baseline collaborative filtering methods and state-of-the-art neural network techniques commonly used in industry. Results across multiple performance metrics demonstrate that incorporating personality, preferences and opinions significantly improves recommendation performance. Full article
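
Where the entry above describes feeding NCF latent features plus text-derived personality, preference, and opinion signals into an XGBoost regressor, a minimal sketch of that final stage might look as follows; all shapes, feature names, and data here are invented placeholders, not the paper's actual pipeline.

```python
# Final regression stage sketch: NCF-style latent features are concatenated
# with personality/preference/opinion features derived from eWOM text, then
# fed to an XGBoost regressor. All values below are synthetic placeholders;
# in the paper, a fine-tuned BERT model would supply the personality proxies.
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
n_pairs = 1000
ncf_features = rng.normal(size=(n_pairs, 16))    # user-venue latent factors
personality = rng.normal(size=(n_pairs, 5))      # Big Five proxies from reviews
opinions_prefs = rng.normal(size=(n_pairs, 10))  # topic/NER-derived signals
X = np.hstack([ncf_features, personality, opinions_prefs])
y = rng.uniform(1, 5, size=n_pairs)              # star ratings to predict

model = XGBRegressor(n_estimators=200, max_depth=6, learning_rate=0.1)
model.fit(X, y)
print(model.predict(X[:3]))
```
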
20 pages, 945 KB  
Article
A Pilot Study on Multilingual Detection of Irregular Migration Discourse on X and Telegram Using Transformer-Based Models
by Dimitrios Taranis, Gerasimos Razis and Ioannis Anagnostopoulos
Electronics 2026, 15(2), 281; https://doi.org/10.3390/electronics15020281 - 8 Jan 2026
Abstract
The rise of Online Social Networks has reshaped global discourse, enabling real-time conversations on complex issues such as irregular migration. Yet the informal, multilingual, and often noisy nature of content on platforms like X (formerly Twitter) and Telegram presents significant challenges for reliable automated analysis. This study presents an exploratory multilingual natural language processing (NLP) framework for detecting irregular migration discourse across five languages. Conceived as a pilot study addressing extreme data scarcity in sensitive migration contexts, this work evaluates transformer-based models on a curated multilingual corpus. It provides an initial baseline for monitoring informal migration narratives on X and Telegram. We evaluate a broad range of approaches, including traditional machine learning classifiers, SetFit sentence-embedding models, fine-tuned multilingual BERT (mBERT) transformers, and a Large Language Model (GPT-4o). The results show that GPT-4o achieves the highest performance overall (F1-score: 0.84), with scores reaching 0.89 in French and 0.88 in Greek. While mBERT excels in English, SetFit outperforms mBERT in low-resource settings, specifically in Arabic (0.79 vs. 0.70) and Greek (0.88 vs. 0.81). The findings highlight the effectiveness of transformer-based and large-language-model approaches, particularly in low-resource or linguistically heterogeneous environments. Overall, the proposed framework provides an initial, compact benchmark for multilingual detection of irregular migration discourse under extreme, low-resource conditions. The results should be viewed as exploratory indicators of model behavior on this synthetic, small-scale corpus, not as statistically generalizable evidence or deployment-ready tools. In this context, “multilingual” refers to robustness across different linguistic realizations of identical migration narratives under translation, rather than coverage of organically diverse multilingual public discourse. Full article
(This article belongs to the Special Issue Artificial Intelligence-Driven Emerging Applications)
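
A hedged sketch of one baseline named above, fine-tuning multilingual BERT (mBERT) for binary detection of migration-related posts; the texts, labels, and hyperparameters are placeholder assumptions, not the study's configuration.

```python
# One training step of an mBERT binary classifier. The two example posts
# and the learning rate are invented for illustration.
import torch
from transformers import (AutoTokenizer,
                          AutoModelForSequenceClassification)

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2)

texts = ["Example post about crossing the border", "Unrelated chatter"]
labels = torch.tensor([1, 0])
batch = tok(texts, padding=True, truncation=True, return_tensors="pt")

optim = torch.optim.AdamW(model.parameters(), lr=2e-5)
out = model(**batch, labels=labels)   # forward pass returns the loss
out.loss.backward()
optim.step()
print(float(out.loss))
```
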

22 pages, 1308 KB  
Article
From Edge Transformer to IoT Decisions: Offloaded Embeddings for Lightweight Intrusion Detection
by Frédéric Adjewa, Moez Esseghir and Leïla Merghem-Boulahia
Sensors 2026, 26(2), 356; https://doi.org/10.3390/s26020356 - 6 Jan 2026
Viewed by 95
Abstract
The convergence of Artificial Intelligence (AI) and the Internet of Things (IoT) is enabling a new class of intelligent applications. Specifically, Large Language Models (LLMs) are emerging as powerful tools not only for natural language understanding but also for enhancing IoT security. However, the integration of these computationally intensive models into resource-constrained IoT environments presents significant challenges. This paper provides an in-depth examination of how LLMs can be adapted to secure IoT ecosystems. We identify key application areas, discuss major challenges, and propose optimization strategies for resource-limited settings. Our primary contribution is a novel collaborative embeddings offloading mechanism for IoT intrusion detection named SEED (Semantic Embeddings for Efficient Detection). This system leverages a lightweight, fine-tuned BERT model, chosen for its proven contextual and semantic understanding of sequences, to generate rich network embeddings at the edge. A compact neural network deployed on the end-device then queries these embeddings to assess network flow normality. This architecture alleviates the computational burden of running a full transformer on the device while capitalizing on its analytical performance. Our optimized BERT model is reduced by approximately 90% from its original size to approximately 41 MB, making it suitable for the edge. The resulting compact neural network is a mere 137 KB, appropriate for IoT devices. This system achieves 99.9% detection accuracy with an average inference time of under 70 ms on a standard CPU. Finally, the paper discusses the ethical implications of LLM-IoT integration and evaluates the resilience of LLMs in dynamic and adversarial environments. Full article
(This article belongs to the Special Issue Feature Papers in the Internet of Things Section 2025)
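
The edge/device split described above could be sketched as follows, with the public DistilBERT checkpoint standing in for the paper's compressed model; the flow serialization format and the head size are assumptions.

```python
# Edge side: a distilled BERT encoder turns a serialized network flow into
# an embedding. Device side: a tiny MLP scores flow normality from that
# embedding. Checkpoint, flow format, and layer sizes are illustrative.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")  # runs at the edge

flow_text = "proto=tcp sport=443 dport=51022 bytes=1204 flags=SYN,ACK"
with torch.no_grad():
    emb = encoder(**tok(flow_text, return_tensors="pt")).last_hidden_state[:, 0]

# "End-device" side: a compact classification head, a few hundred KB at most.
head = nn.Sequential(nn.Linear(emb.shape[-1], 32), nn.ReLU(), nn.Linear(32, 2))
print(torch.softmax(head(emb), dim=-1))  # P(normal), P(anomalous)
```
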

22 pages, 3277 KB  
Article
FusionBullyNet: A Robust English—Arabic Cyberbullying Detection Framework Using Heterogeneous Data and Dual-Encoder Transformer Architecture with Attention Fusion
by Mohammed A. Mahdi, Muhammad Asad Arshed and Shahzad Mumtaz
Mathematics 2026, 14(1), 170; https://doi.org/10.3390/math14010170 - 1 Jan 2026
Viewed by 201
Abstract
Cyberbullying has become a pervasive threat on social media, impacting the safety and wellbeing of users worldwide. Most existing studies focus on monolingual content, limiting their applicability to multilingual online environments. This study aims to develop an approach that accurately detects abusive content in bilingual settings. Given the large volume of online content in English and Arabic, we propose a bilingual cyberbullying detection approach designed to deliver efficient, scalable, and robust performance. Several datasets were combined, processed, and augmented before the cyberbullying identification approach was developed. The proposed model (FusionBullyNet) is based on fine-tuning two transformer models (RoBERTa-base + bert-base-arabertv02-twitter), attention-based fusion, gradual layer unfreezing, and label smoothing to enhance generalization. The proposed approach achieved a test accuracy of 0.86, F1 scores of 0.83 for bullying and 0.88 for non-bullying, and an overall ROC-AUC of 0.929. To assess the robustness of the proposed model, several multilingual models, such as XLM-RoBERTa-Base, Microsoft/mdeberta-v3-base, and google-bert/bert-base-multilingual-cased, were also trained in this study, all achieving a test accuracy of 0.84. Furthermore, several machine learning models were trained, among which Logistic Regression, the XGBoost Classifier, and the LightGBM Classifier achieved the highest accuracy of 0.82. These results demonstrate that the proposed approach provides a reliable, high-performance solution for cyberbullying detection, contributing to safer online communication environments. Full article
(This article belongs to the Special Issue Computational Intelligence in Addressing Data Heterogeneity)
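
A minimal sketch of the dual-encoder attention-fusion idea, assuming 768-dimensional [CLS] vectors from the English and Arabic encoders; this is an illustration of the fusion pattern, not FusionBullyNet's actual architecture.

```python
# Each encoder contributes a [CLS] vector; learned attention weights mix
# the two views before classification. Random tensors stand in for real
# RoBERTa/AraBERT outputs, and the dimensions are assumptions.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, dim=768, n_classes=2):
        super().__init__()
        self.score = nn.Linear(dim, 1)        # scores each encoder's view
        self.clf = nn.Linear(dim, n_classes)

    def forward(self, h_en, h_ar):            # (batch, dim) each
        stacked = torch.stack([h_en, h_ar], dim=1)     # (batch, 2, dim)
        w = torch.softmax(self.score(stacked), dim=1)  # attention over views
        fused = (w * stacked).sum(dim=1)
        return self.clf(fused)

h_en, h_ar = torch.randn(4, 768), torch.randn(4, 768)  # stand-in CLS vectors
print(AttentionFusion()(h_en, h_ar).shape)             # torch.Size([4, 2])
```
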

25 pages, 573 KB  
Article
Enhancing IoT Security with Generative AI: Threat Detection and Countermeasure Design
by Alex Oacheșu, Kayode S. Adewole, Andreas Jacobsson and Paul Davidsson
Electronics 2026, 15(1), 92; https://doi.org/10.3390/electronics15010092 - 24 Dec 2025
Viewed by 188
Abstract
The rapid proliferation of Internet of Things (IoT) devices has increased the attack surface for cyber threats. Traditional intrusion detection systems often struggle to keep pace with novel or evolving threats. This study proposes an end-to-end generative AI-based intrusion detection and response pipeline designed for automated threat mitigation in smart home IoT environments. It leverages a Variational Autoencoder (VAE) trained on benign traffic to flag anomalies, a fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model to classify anomalies into five traffic categories (C&C, DDoS, Okiru, PortScan, and benign), and Grok3—a large language model—to generate tailored countermeasure recommendations. Using the Aposemat IoT-23 dataset, the VAE model achieves a recall of 0.999 and a precision of 0.961 for anomaly detection. The BERT model achieves an overall accuracy of 99.90% with per-class F1 scores exceeding 0.99. An end-to-end prototype simulation involving 10,000 network traffic samples demonstrates 98% accuracy in identifying cyber attacks and generating countermeasures to mitigate them. The pipeline integrates generative models for improved detection and automated security policy formulation in IoT settings, enabling quicker, actionable security responses to cyber threats targeting smart home environments. Full article
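
The first pipeline stage, a VAE that flags anomalous flows by reconstruction error, could be sketched as below; the feature dimension, latent size, and threshold are illustrative assumptions, not the paper's settings.

```python
# A VAE trained only on benign traffic reconstructs benign flows well; a
# flow whose reconstruction error exceeds a threshold is flagged and sent
# on to the BERT classifier. Dimensions and the 1.5 threshold are made up.
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, d=20, z=4):
        super().__init__()
        self.enc = nn.Linear(d, 2 * z)                 # outputs mu and logvar
        self.dec = nn.Linear(z, d)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        zs = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(zs), mu, logvar

vae, x = VAE(), torch.randn(8, 20)                     # 8 flows, 20 features
recon, mu, logvar = vae(x)
err = ((x - recon) ** 2).mean(dim=-1)
print(err > 1.5)                                       # True -> anomalous flow
```
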

16 pages, 2131 KB  
Article
A Generalizable Agentic AI Pipeline for Developing Chatbots Using Small Language Models: A Case Study on Thai Student Loan Fund Services
by Jakkaphong Inpun, Watcharaporn Cholamjiak, Piyada Phrueksawatnon and Kanokwatt Shiangjen
Computation 2025, 13(12), 297; https://doi.org/10.3390/computation13120297 - 18 Dec 2025
Viewed by 503
Abstract
The rising deployment of artificial intelligence in public services is constrained by computational costs and limited domain-specific data, particularly in multilingual contexts. This study proposes a generalizable Agentic AI pipeline for developing question–answer chatbot systems using small language models (SLMs), demonstrated through a case study on the Thai Student Loan Fund (TSLF). The pipeline integrates four stages: OCR-based document digitization using Typhoon2-3B, agentic question–answer dataset construction via a clean–check–plan–generate (CCPG) workflow, parameter-efficient fine-tuning with QLoRA on Typhoon2-1B and Typhoon2-3B models, and retrieval-augmented generation (RAG) for source-grounded responses. Evaluation using BERTScore and CondBERT confirmed high semantic consistency (FBERT = 0.9807) and stylistic reliability (FBERT = 0.9839) of the generated QA corpus. Fine-tuning improved the 1B model’s domain alignment (FBERT: 0.8593 → 0.8641), while RAG integration further enhanced factual grounding (FBERT = 0.8707) and citation transparency. Cross-validation with GPT-5 and Gemini 2.5 Pro demonstrated dataset transferability and reliability. The results establish that Agentic AI combined with SLMs offers a cost-effective, interpretable, and scalable framework for automating bilingual advisory services in resource-constrained government and educational institutions. Full article
(This article belongs to the Special Issue Generative AI in Action: Trends, Applications, and Implications)
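
A sketch of the QLoRA step under stated assumptions: the Hugging Face checkpoint id, LoRA rank, and target modules below are guesses for illustration, not the paper's settings.

```python
# Load a causal LM in 4-bit and attach LoRA adapters so only a small
# fraction of weights train. The checkpoint id is an assumed stand-in for
# the paper's Typhoon2-1B model; a GPU with bitsandbytes is required.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(load_in_4bit=True,
                         bnb_4bit_compute_dtype=torch.bfloat16)
base = AutoModelForCausalLM.from_pretrained(
    "scb10x/llama3.2-typhoon2-1b-instruct",  # assumed checkpoint id
    quantization_config=bnb, device_map="auto")

lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"])  # assumed targets
model = get_peft_model(base, lora)
model.print_trainable_parameters()   # typically <1% of weights are trainable
```
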

38 pages, 8382 KB  
Article
Ontology-Driven Emotion Multi-Class Classification and Influence Analysis of User Opinions on Online Travel Agency
by Putri Utami Rukmana, Muharman Lubis, Hanif Fakhrurroja, Asriana and Alif Noorachmad Muttaqin
Future Internet 2025, 17(12), 582; https://doi.org/10.3390/fi17120582 - 17 Dec 2025
Viewed by 399
Abstract
The rise of social media has transformed Online Travel Agencies (OTAs) into platforms where users actively share their experiences and opinions. However, conventional opinion mining approaches often fail to capture nuanced emotional expressions or connect them to user influence. To address this gap, this study introduces an ontology-driven opinion mining framework that integrates multi-class emotion classification, aspect-based analysis, and influence modeling using Indonesian-language discussions from the social media platform X. The framework combines an OTA-specific ontology that formally represents service aspects such as booking support, financial, platform experience, and event with fine-tuned IndoBERT for emotion recognition and sentiment polarity detection, and Social Network Analysis (SNA) enhanced by entropy weighting and TOPSIS to quantify and rank user influence. The results show that the fine-tuned IndoBERT performs strongly in aspect identification and sentiment polarity detection, with moderate results for multi-class emotion classification. Emotion labels enrich the ontology by linking user opinions to their affective context, enabling deeper interpretation of customer experiences and service-related issues. The influence analysis further reveals that structural network properties, particularly betweenness, closeness, and eigenvector centrality, serve as the primary determinants of user influence, while engagement indicators act as discriminative amplifiers that highlight users whose content attains high visibility. Overall, the proposed framework offers a comprehensive and interpretable approach to understanding public perception in Indonesian-language OTA discussions. It advances opinion mining for low-resource languages by bridging semantic ontology modeling, emotional understanding, and influence analysis, while providing practical insights for OTAs to enhance service responsiveness, manage emotional engagement, and strengthen digital communication strategies. Full article
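
The entropy-weighting plus TOPSIS ranking mentioned above follows a standard recipe; here is a small worked sketch with invented centrality and engagement values.

```python
# Entropy weights are derived from how informative each criterion is, then
# TOPSIS ranks users by closeness to the ideal point. All numbers are
# fabricated; all criteria are treated as benefits for simplicity.
import numpy as np

# rows = users; cols = betweenness, closeness, eigenvector, engagement
X = np.array([[0.10, 0.60, 0.30, 120],
              [0.40, 0.55, 0.70,  40],
              [0.05, 0.70, 0.20, 300]], dtype=float)

P = X / X.sum(axis=0)                                  # normalize per criterion
E = -(P * np.log(P + 1e-12)).sum(axis=0) / np.log(len(X))
w = (1 - E) / (1 - E).sum()                            # entropy weights

V = w * X / np.linalg.norm(X, axis=0)                  # weighted, vector-normalized
d_best = np.linalg.norm(V - V.max(axis=0), axis=1)
d_worst = np.linalg.norm(V - V.min(axis=0), axis=1)
print((d_worst / (d_best + d_worst)).round(3))         # TOPSIS closeness scores
```
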

36 pages, 8767 KB  
Article
AI-Powered Multimodal System for Haiku Appreciation Based on Intelligent Data Analysis: Validation and Cross-Cultural Extension Potential
by Renjie Fan and Yuanyuan Wang
Electronics 2025, 14(24), 4921; https://doi.org/10.3390/electronics14244921 - 15 Dec 2025
Viewed by 340
Abstract
This study proposes an artificial intelligence (AI)-powered multimodal system designed to enhance the appreciation of traditional poetry, using Japanese haiku as the primary application domain. At the core of the system is an intelligent data analysis pipeline that extracts key emotional features from poetic texts. A fine-tuned Japanese BERT model is employed to compute three affective indices—valence, energy, and dynamism—which form a quantitative emotional representation of each haiku. These features guide a generative AI workflow: ChatGPT constructs structured image prompts based on the extracted affective cues and contextual information, and these prompts are used by DALL·E to synthesize stylistically consistent watercolor illustrations. Simultaneously, background music is automatically selected from an open-source collection by matching each poem’s affective vector with that of instrumental tracks, producing a coherent multimodal (text, image, sound) experience. A series of validation experiments demonstrated the reliability and stability of the extracted emotional features, as well as their effectiveness in supporting consistent cross-modal alignment. These results indicate that poetic emotion can be represented within a low-dimensional affective space and used as a bridge across linguistic and artistic modalities. The proposed framework illustrates a novel integration of affective computing and natural language processing (NLP) within cultural computing. Because the underlying emotional representation is linguistically agnostic, the system holds strong potential for cross-cultural extensions, including applications to Chinese classical poetry and other forms of traditional literature. Full article
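
One plausible reading of the music-matching step is nearest-neighbour search in the (valence, energy, dynamism) space; a toy sketch with invented vectors and track names follows.

```python
# Each haiku gets an affective vector from the fine-tuned BERT model; the
# open-source track whose vector is most similar (cosine) is selected.
# Vectors and track names here are fabricated for illustration.
import numpy as np

haiku_affect = np.array([0.7, 0.3, 0.5])        # valence, energy, dynamism
tracks = {"koto_spring": np.array([0.8, 0.2, 0.4]),
          "taiko_storm": np.array([0.3, 0.9, 0.8]),
          "shakuhachi_dusk": np.array([0.6, 0.1, 0.2])}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

best = max(tracks, key=lambda name: cosine(haiku_affect, tracks[name]))
print(best)   # -> koto_spring, the closest match in affective space
```
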

22 pages, 1764 KB  
Article
A Domain-Finetuned Semantic Matching Framework Based on Dynamic Masking and Contrastive Learning for Specialized Text Retrieval
by Yiming Zhang, Yong Zhu, Zijie Zhu, Pengzhong Liu, Pengfei Xie and Cong Wu
Electronics 2025, 14(24), 4882; https://doi.org/10.3390/electronics14244882 - 11 Dec 2025
Viewed by 310
Abstract
Semantic matching is essential for understanding natural language, but traditional models like BERT face challenges with random masking strategies, limiting their ability to capture key information. Additionally, BERT’s sentence vectors may “collapse,” making it difficult to distinguish between different sentences. This paper introduces a domain-finetuned semantic matching framework that uses dynamic masking and contrastive learning techniques to address these issues. The dynamic masking strategy enhances the model’s ability to retain critical information, while contrastive learning improves sentence vector representations using a small amount of unlabeled text. This approach helps the model better align with the needs of various downstream tasks. Experimental results show that after private domain training, the model improves semantic similarity between entities by 16.9%, outperforming existing models. It also demonstrates an 8.0% average improvement in semantic matching for diverse text. Performance metrics such as A@1, A@3, and A@5 are at least 26.1% higher than those of competing models. For newly added entities, the model achieves a 44.3% average improvement, consistently surpassing other models by at least 30%. These results collectively validate the effectiveness and superiority of the proposed framework in domain-specific semantic matching tasks. Full article
(This article belongs to the Special Issue Advances in Text Mining and Analytics)
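
The contrastive-learning component resembles the widely used SimCSE recipe, in which dropout produces two views of each sentence that are pulled together against in-batch negatives; the sketch below shows that generic recipe, with the checkpoint and temperature as assumptions rather than the paper's training code.

```python
# Unsupervised contrastive step: encode the same batch twice with dropout
# active, treat the diagonal pairs as positives, and apply InfoNCE loss.
# This counters the sentence-vector "collapse" discussed above.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
model = AutoModel.from_pretrained("bert-base-chinese")
model.train()                                    # keep dropout active

sents = ["语义匹配示例一", "语义匹配示例二", "语义匹配示例三"]
batch = tok(sents, padding=True, return_tensors="pt")
z1 = model(**batch).last_hidden_state[:, 0]      # view 1
z2 = model(**batch).last_hidden_state[:, 0]      # view 2 (different dropout)

sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / 0.05
loss = F.cross_entropy(sim, torch.arange(len(sents)))  # diagonal = positives
loss.backward()
print(float(loss))
```
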

17 pages, 12946 KB  
Article
A Comparative Analysis of LLM-Based Customer Representation Learning Techniques
by Sangyeop Lee, Jong Seo Kim, Kisoo Kim, Bojung Ko, Junho Moon and Minsik Park
Electronics 2025, 14(24), 4783; https://doi.org/10.3390/electronics14244783 - 5 Dec 2025
Viewed by 448
Abstract
Recent advances in large language models (LLMs) have enabled the effective representation of customer behaviors, including purchases, repairs, and consultations. These LLM-based customer representation models can be applied to predicting a customer's future behavior or to clustering customers whose latent-vector representations are similar. Since these representation technologies depend on data, this paper examines whether training a recommendation model (BERT4Rec) from scratch or fine-tuning a pre-trained LLM (ELECTRA) is more effective for our customer data. To address this, a three-step approach is conducted: (1) defining a sequence of customer behaviors as textual inputs for LLM-based representation learning, (2) extracting customer representations as latent vectors by training or fine-tuning representation models on a dataset of 14 million customers, and (3) training classifiers to predict purchase outcomes for eight products. Our focus is on comparing the two primary approaches in step (2): training BERT4Rec from scratch versus fine-tuning pre-trained ELECTRA. The average AUC and F1-score of classifiers across the eight products reveal that the two methods differ by only 0.012 in AUC and 0.007 in F1-score. However, the fine-tuned ELECTRA achieves a 0.27 improvement in the top 10% lift for targeted marketing strategies. This result is particularly meaningful given that buyers of these products constitute only about 0.5% of the entire dataset. Beyond the three-step approach, we interpret the latent space in two dimensions and the attention shifts in the fine-tuned ELECTRA. Furthermore, we compare its efficiency advantages against fine-tuned LLaMA2. These findings provide practical insights for optimizing LLM-based representation models in industrial applications. Full article
(This article belongs to the Special Issue Machine Learning for Data Mining)
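
Step (1), serializing behavior sequences into text, might look like the following toy sketch; the event schema and token format are invented for illustration.

```python
# Turn a customer's behavior sequence into one text string so a pre-trained
# LLM (or a BERT4Rec-style model) can encode it. Field names, the [EVENT]
# token, and the event vocabulary are assumptions, not the paper's schema.
events = [
    {"ts": "2025-01-03", "type": "purchase", "item": "washer_WM9000"},
    {"ts": "2025-02-11", "type": "repair",   "item": "washer_WM9000"},
    {"ts": "2025-03-20", "type": "consult",  "item": "dryer_DV8000"},
]

def serialize(seq):
    # "[EVENT] type item @ date" tokens, oldest first
    return " ".join(f"[EVENT] {e['type']} {e['item']} @ {e['ts']}" for e in seq)

text = serialize(events)
print(text)  # feed this to a tokenizer plus an ELECTRA-style encoder
```
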

23 pages, 2741 KB  
Article
Subjective Evaluation of Operator Responses for Mobile Defect Identification in Remanufacturing: Application of NLP and Disagreement Tagging
by Abbirah Ahmed, Reenu Mohandas, Arash Joorabchi and Martin J. Hayes
Big Data Cogn. Comput. 2025, 9(12), 312; https://doi.org/10.3390/bdcc9120312 - 4 Dec 2025
Viewed by 372
Abstract
In the context of remanufacturing, particularly mobile device refurbishing, effective operator training is crucial for accurate defect identification and process inspection efficiency. This study examines the application of Natural Language Processing (NLP) techniques to evaluate operator expertise based on subjective textual responses gathered during a defect analysis task. Operators were asked to describe screen defects using open-ended questions, and their responses were compared with expert responses to evaluate their accuracy and consistency. We employed four NLP models, including fine-tuned Sentence-BERT (SBERT), pre-trained SBERT, Word2Vec, and Dice similarity, to determine their effectiveness in interpreting short, domain-specific text. A novel disagreement tagging framework was introduced to supplement traditional similarity metrics with explainable insights. This framework identifies the root causes of model–human misalignment across four categories: defect type, severity, terminology, and location. Results show that a fine-tuned SBERT model significantly outperforms the other models, achieving a Pearson's correlation of 0.93 with MAE and RMSE scores of 0.07 and 0.12, respectively, and providing more accurate and context-aware evaluations. In contrast, the other models exhibit limitations in semantic understanding and consistency. The results highlight the importance of fine-tuning NLP models for domain-specific applications and demonstrate how qualitative tagging methods can enhance interpretability and model debugging. This combined approach offers a scalable and transparent methodology for the evaluation of operator responses, supporting the development of more effective training programmes in industrial settings where remanufacturing and sustainability are key performance metrics. Full article
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
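
The SBERT similarity scoring could be wired up as below, with a generic public checkpoint standing in for the paper's fine-tuned model; the two defect descriptions are invented.

```python
# Embed operator and expert answers with SBERT and use cosine similarity
# as a proxy for response quality. "all-MiniLM-L6-v2" is a public stand-in
# checkpoint, not the fine-tuned model from the study.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
expert = "Deep scratch on the lower-left corner of the screen"
operator = "Screen has a big scratch near the bottom left"

e1, e2 = model.encode([expert, operator], convert_to_tensor=True)
score = util.cos_sim(e1, e2).item()
print(round(score, 3))   # high score -> operator matches the expert description
```
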

22 pages, 5082 KB  
Article
A Two-Stage Deep Learning Framework for AI-Driven Phishing Email Detection Based on Persuasion Principles
by Peter Tooher and Harjinder Singh Lallie
Computers 2025, 14(12), 523; https://doi.org/10.3390/computers14120523 - 1 Dec 2025
Viewed by 926
Abstract
AI-generated phishing emails present a growing cybersecurity threat, exploiting human psychology with high-quality, context-aware language. This paper introduces a novel two-stage detection framework that combines deep learning with psychological analysis to address this challenge. A new dataset containing 2995 GPT-o1-generated phishing emails, each labelled with Cialdini’s six persuasion principles, is created across five organisational sectors—forming one of the largest and most behaviourally annotated corpora in the field. The first stage employs a fine-tuned DistilBERT model to predict the presence of persuasion principles in each email. These confidence scores then feed into a lightweight dense neural network at the second stage for final binary classification. This interpretable design balances performance with insight into attacker strategies. The full system achieves 94% accuracy and 98% AUC, outperforming comparable methods while offering a clearer explanation of model decisions. Analysis shows that principles like authority, scarcity, and social proof are highly indicative of phishing, while reciprocation and likeability occur more often in legitimate emails. This research contributes an interpretable, psychology-informed framework for phishing detection, alongside a unique dataset for future study. Results demonstrate the value of behavioural cues in identifying sophisticated phishing attacks and suggest broader applications in detecting malicious AI-generated content. Full article
(This article belongs to the Section AI-Driven Innovations)
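
A minimal sketch of the two-stage design under assumptions: an untrained multi-label DistilBERT head stands in for the fine-tuned stage-1 model, and the dense stage-2 network is sized arbitrarily.

```python
# Stage 1: a multi-label DistilBERT head scores Cialdini's six persuasion
# principles; its sigmoid confidences feed a small dense network for the
# final phishing/legitimate call. Checkpoint and sizes are illustrative.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
stage1 = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6,
    problem_type="multi_label_classification")

email = "Your account will be suspended today unless you verify now."
with torch.no_grad():
    scores = torch.sigmoid(stage1(**tok(email, return_tensors="pt")).logits)

stage2 = nn.Sequential(nn.Linear(6, 16), nn.ReLU(), nn.Linear(16, 1))
print(torch.sigmoid(stage2(scores)))   # P(phishing), meaningful after training
```
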

18 pages, 2060 KB  
Article
A Context-Aware Representation-Learning-Based Model for Detecting Human-Written and AI-Generated Cryptocurrency Tweets Across Large Language Models
by Muhammad Asad Arshed, Ştefan Cristian Gherghina, Iqra Khalil, Hasnain Muavia, Anum Saleem and Hajran Saleem
Math. Comput. Appl. 2025, 30(6), 130; https://doi.org/10.3390/mca30060130 - 29 Nov 2025
Viewed by 704
Abstract
The extensive use of large language models (LLMs), particularly in the finance sector, raises concerns about the authenticity and reliability of generated text. Developing a robust method for distinguishing between human-written and AI-generated financial content is therefore essential. This study addressed this challenge by constructing a dataset based on financial tweets, where original financial tweet texts were regenerated using six LLMs, resulting in seven distinct classes: human-authored text, LLaMA3.2, Phi3.5, Gemma2, Qwen2.5, Mistral, and LLaVA. A context-aware representation-learning-based model, namely DeBERTa, was extensively fine-tuned for this task. Its performance was compared to that of other transformer variants (DistilBERT, BERT Base Uncased, ELECTRA, and ALBERT Base V1) as well as traditional machine learning models (logistic regression, naive Bayes, random forest, decision trees, XGBoost, AdaBoost, and voting (AdaBoost, GradientBoosting, XGBoost)) using Word2Vec embeddings. The proposed DeBERTa-based model achieved an impressive test accuracy, precision, recall, and F1-score, all reaching 94%. In contrast, competing transformer models achieved test accuracies ranging from 0.78 to 0.80, while traditional machine learning models yielded a significantly lower performance (0.39–0.80). These results highlight the effectiveness of context-aware representation learning in distinguishing between human-written and AI-generated financial text, with significant implications for text authentication, authorship verification, and financial information security. Full article
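
The seven-way authorship classifier might be set up as follows; the public microsoft/deberta-v3-base checkpoint is a stand-in, since the paper's fine-tuned weights are not given here.

```python
# A DeBERTa sequence-classification head with one label per text source
# (human plus six LLMs). Until fine-tuned on the tweet dataset described
# above, its predictions are arbitrary; the example tweet is invented.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ["human", "LLaMA3.2", "Phi3.5", "Gemma2", "Qwen2.5", "Mistral", "LLaVA"]
tok = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-base", num_labels=len(labels),
    id2label=dict(enumerate(labels)))

tweet = "BTC breaking resistance, momentum looks strong into the weekend"
pred = model(**tok(tweet, return_tensors="pt")).logits.argmax(-1).item()
print(labels[pred])   # untrained head -> arbitrary until fine-tuned
```
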

32 pages, 6691 KB  
Article
Fine-Tuning and Explaining FinBERT for Sector-Specific Financial News: A Reproducible Workflow
by Marian Pompiliu Cristescu, Claudiu Brândaș, Dumitru Alexandru Mara and Petrea Ioana
Electronics 2025, 14(23), 4680; https://doi.org/10.3390/electronics14234680 - 27 Nov 2025
Viewed by 1191
Abstract
The increasing use of complex “black-box” models for financial news sentiment analysis presents a challenge in high-stakes settings where transparency and trust are paramount. This study introduces and validates a finance-focused, fully reproducible, open-source workflow for building, explaining, and evaluating sector-specific sentiment models mapped to standard market taxonomies and investable proxies. We benchmark interpretable and transformer-based models on public datasets and a newly constructed, manually annotated gold-standard corpus of 1500 U.S. sector-tagged financial headlines. While a zero-shot FinBERT establishes a reasonable baseline (macro F1 = 0.555), fine-tuning on our gold data yields a robust macro F1 = 0.707, a substantial uplift. We extend explainability to the fine-tuned FinBERT with Integrated Gradients (IG) and LIME and perform a quantitative faithfulness audit via deletion curves and AOPC; LIME is most faithful (AOPC = 0.365). We also quantify the risks of weak supervision: accuracy drops (−21.0%) and explanations diverge (SHAP rank ρ = 0.11) relative to gold-label training. Crucially, econometric tests show the sentiment signal is reactive, not predictive, of next-day returns; yet it still supports profitable sector strategies (e.g., Technology long-short Sharpe 1.88). Novelty lies in a finance-aligned, sector-aware, trustworthiness blueprint that pairs fine-tuned FinBERT with audited explanations and uncertainty checks, all end-to-end reproducible and tied to investable sector ETFs. Full article
(This article belongs to the Special Issue AI-Driven Data Analytics and Mining)
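
The LIME side of the explainability audit could be wired up roughly as below, using the public ProsusAI/finbert checkpoint as a stand-in for the fine-tuned sector model; the headline and sample count are invented.

```python
# Wrap FinBERT in a probability function and let LimeTextExplainer
# attribute the sentiment call to individual tokens.
import numpy as np
from lime.lime_text import LimeTextExplainer
from transformers import pipeline

clf = pipeline("text-classification", model="ProsusAI/finbert", top_k=None)
order = ["positive", "negative", "neutral"]   # FinBERT's label set

def predict_proba(texts):
    out = clf(list(texts))
    return np.array([[next(d["score"] for d in row if d["label"] == lbl)
                      for lbl in order] for row in out])

headline = "Tech sector rallies as chipmaker beats earnings expectations"
explainer = LimeTextExplainer(class_names=order)
exp = explainer.explain_instance(headline, predict_proba,
                                 labels=(0,), num_features=5,
                                 num_samples=200)  # small for speed
print(exp.as_list(label=0))   # token -> weight pairs for "positive"
```
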

18 pages, 862 KB  
Article
Leveraging Large Language Models for Automating Outpatients’ Message Classifications of Electronic Medical Records
by Amima Shifa, G. G. Md. Nawaz Ali and Roopa Foulger
Healthcare 2025, 13(23), 3052; https://doi.org/10.3390/healthcare13233052 - 25 Nov 2025
Viewed by 387
Abstract
Background: The widespread adoption of digital systems in healthcare has produced large volumes of unstructured text data, including outpatient messages sent through electronic medical record (EMR) portals. Efficient classification of these messages is essential for improving workflow automation and enabling timely clinical responses. Methods: This study investigates the use of large language models (LLMs) for classifying real-world outpatient messages collected from a healthcare system in central Illinois. We compare general-purpose (GPT-4o) and domain-specific (BioBERT and ClinicalBERT) models, evaluating both fine-tuned and few-shot configurations against a TF-IDF + Logistic Regression baseline. Experiments were performed under a HIPAA-compliant environment using de-identified and physician-labeled data. Results and Conclusions: Fine-tuned GPT-4o achieved 97.5% accuracy in urgency detection and 97.8% in full message classification, outperforming BioBERT and ClinicalBERT. These results demonstrate the feasibility and validity of applying modern LLMs to outpatient communication triage while ensuring both interpretability and privacy compliance. Full article
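
Few-shot urgency triage with a general-purpose LLM might be prompted as in this sketch; the categories and example messages are invented, and real use would have to stay inside the HIPAA-compliant, de-identified setup the study describes.

```python
# Few-shot prompt for urgency classification of a patient portal message.
# Labels and examples are fabricated; OPENAI_API_KEY must be set.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
prompt = (
    "Classify the patient message as URGENT or ROUTINE.\n"
    "Message: 'Need my cholesterol prescription refilled.' -> ROUTINE\n"
    "Message: 'Chest pain since this morning, getting worse.' -> URGENT\n"
    "Message: 'Dizzy and short of breath after new medication.' ->"
)
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)
print(resp.choices[0].message.content.strip())
```
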
