Search Results (887)

Search Parameters:
Keywords = text features extraction

24 pages, 1991 KiB  
Article
A Multi-Feature Semantic Fusion Machine Learning Architecture for Detecting Encrypted Malicious Traffic
by Shiyu Tang, Fei Du, Zulong Diao and Wenjun Fan
J. Cybersecur. Priv. 2025, 5(3), 47; https://doi.org/10.3390/jcp5030047 - 17 Jul 2025
Abstract
With the increasing sophistication of network attacks, machine learning (ML)-based methods have shown promising performance in attack detection. However, ML-based methods often suffer from high false detection rates when tackling encrypted malicious traffic. To break through these bottlenecks, we propose EFTransformer, an encrypted flow transformer framework that combines semantic perception with multi-scale feature fusion to robustly and efficiently detect encrypted malicious traffic, making up for the shortcomings of ML methods in modeling ability and feature adequacy. EFTransformer introduces a channel-level extraction mechanism based on quintuples and a noise-aware clustering strategy to enhance the recognition of traffic patterns; adopts a dual-channel embedding method, using Word2Vec and FastText to capture global semantics and subword-level changes; and uses a Transformer-based classifier and attention pooling module to achieve dynamic feature-weighted fusion, thereby improving the robustness and accuracy of malicious traffic detection. Systematic experiments on the ISCX2012 dataset demonstrate that EFTransformer achieves the best detection performance, with an accuracy of up to 95.26%, a false positive rate (FPR) of 6.19%, and a false negative rate (FNR) of only 5.85%. Full article
(This article belongs to the Section Security Engineering & Applications)
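EFTransformer's dual-channel embedding pairs Word2Vec's whole-token vectors with FastText's subword-sensitive ones. The subword side can be illustrated with FastText's character n-gram scheme, a minimal sketch in which the n-gram range and `<`/`>` boundary markers follow FastText's conventions rather than details given in the abstract:

```python
def fasttext_subwords(token, n_min=3, n_max=5):
    """Character n-grams with FastText-style '<'/'>' word-boundary markers,
    so tokens like 'encrypt' and 'encrypted' share most subword features."""
    marked = f"<{token}>"
    return [marked[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(marked) - n + 1)]
```

Averaging the vectors of these n-grams is what lets a FastText-style channel produce embeddings for rare or unseen tokens that Word2Vec would miss.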

21 pages, 1500 KiB  
Article
Concurrent Acute Appendicitis and Cholecystitis: A Systematic Literature Review
by Adem Tuncer, Sami Akbulut, Emrah Sahin, Zeki Ogut and Ertugrul Karabulut
J. Clin. Med. 2025, 14(14), 5019; https://doi.org/10.3390/jcm14145019 - 15 Jul 2025
Abstract
Background: This systematic review aimed to comprehensively evaluate the clinical, diagnostic, and therapeutic features of synchronous acute cholecystitis (AC) and acute appendicitis (AAP). Methods: The review protocol was prospectively registered in PROSPERO (CRD420251086131) and conducted in accordance with PRISMA 2020 guidelines. A systematic search was performed across PubMed, MEDLINE, Web of Science, Scopus, Google Scholar, and Google databases for studies published from January 1975 to May 2025. Search terms included variations of “synchronous,” “simultaneous,” “concurrent,” and “coexistence” combined with “appendicitis,” “appendectomy,” “cholecystitis,” and “cholecystectomy.” Reference lists of included studies were screened. Studies reporting human cases with sufficient patient-level clinical data were included. Data extraction and quality assessment were performed independently by pairs of reviewers, with discrepancies resolved through consensus. No meta-analysis was conducted due to the descriptive nature of the data. Results: A total of 44 articles were included in this review. Of these, thirty-four were available in full text, one was accessible only as an abstract, and one was a literature review, while eight articles were inaccessible. Clinical data from forty patients, including two from our own cases, were evaluated, with a median age of 41 years. The gender distribution was equal, with a median age of 50 years among male patients and 36 years among female patients. Leukocytosis was observed in 25 of 33 patients with available laboratory data. Among 37 patients with documented diagnostic methods, ultrasonography and computed tomography were the most frequently utilized modalities, followed by physical examination. Twenty-seven patients underwent laparoscopic cholecystectomy and appendectomy. The remaining patients were managed with open surgery or conservative treatment. 
Postoperative complications occurred in five patients, including sepsis, perforation, leakage, diarrhea, and wound infections. Histopathological analysis revealed AAP in 25 cases and AC in 14. Additional findings included gangrenous inflammation and neoplastic lesions. Conclusions: Synchronous AC and AAP are rare and diagnostically challenging conditions. Early recognition via imaging and clinical evaluation is critical. Laparoscopic management remains the preferred approach. Histopathological examination of surgical specimens is essential for identifying unexpected pathology, thereby guiding appropriate patient management. Full article
(This article belongs to the Section Gastroenterology & Hepatopancreatobiliary Medicine)

35 pages, 1458 KiB  
Article
User Comment-Guided Cross-Modal Attention for Interpretable Multimodal Fake News Detection
by Zepu Yi, Chenxu Tang and Songfeng Lu
Appl. Sci. 2025, 15(14), 7904; https://doi.org/10.3390/app15147904 - 15 Jul 2025
Abstract
In order to address the pressing challenge posed by the proliferation of fake news in the digital age, we emphasize its profound and harmful impact on societal structures, including the misguidance of public opinion, the erosion of social trust, and the exacerbation of social polarization. Current fake news detection methods are largely limited to superficial text analysis or basic text–image integration, which face significant limitations in accurately identifying deceptive information. To bridge this gap, we propose the UC-CMAF framework, which comprehensively integrates news text, images, and user comments through an adaptive co-attention fusion mechanism. The UC-CMAF workflow consists of four key subprocesses: multimodal feature extraction, cross-modal adaptive collaborative attention fusion of news text and images, cross-modal attention fusion of user comments with news text and images, and finally, input of fusion features into a fake news detector. Specifically, we introduce multi-head cross-modal attention heatmaps and comment importance visualizations to provide interpretability support for the model’s predictions, revealing key semantic areas and user perspectives that influence judgments. Through the cross-modal adaptive collaborative attention mechanism, UC-CMAF achieves deep semantic alignment between news text and images and uses social signals from user comments to build an enhanced credibility evaluation path, offering a new paradigm for interpretable fake information detection. Experimental results demonstrate that UC-CMAF consistently outperforms 15 baseline models across two benchmark datasets, achieving F1 Scores of 0.894 and 0.909. These results validate the effectiveness of its adaptive cross-modal attention mechanism and the incorporation of user comments in enhancing both detection accuracy and interpretability. Full article
(This article belongs to the Special Issue Explainable Artificial Intelligence Technology and Its Applications)
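The cross-modal attention fusion that UC-CMAF builds on can be reduced to scaled dot-product attention, where features of one modality act as the query and another modality supplies keys and values. A toy single-head sketch (dimensions and function names are illustrative, not the paper's implementation):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(query, keys, values):
    """Scaled dot-product attention: weight each value vector by the
    similarity of its key to the (other-modality) query."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))]
```

The attention weights are exactly what the paper's heatmaps visualize for interpretability: they expose which image regions (or comment tokens) a given text query attended to.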

20 pages, 5700 KiB  
Article
Multimodal Personality Recognition Using Self-Attention-Based Fusion of Audio, Visual, and Text Features
by Hyeonuk Bhin and Jongsuk Choi
Electronics 2025, 14(14), 2837; https://doi.org/10.3390/electronics14142837 - 15 Jul 2025
Abstract
Personality is a fundamental psychological trait that exerts a long-term influence on human behavior patterns and social interactions. Automatic personality recognition (APR) has exhibited increasing importance across various domains, including Human–Robot Interaction (HRI), personalized services, and psychological assessments. In this study, we propose a multimodal personality recognition model that classifies the Big Five personality traits by extracting features from three heterogeneous sources: audio processed using Wav2Vec2, video represented as Skeleton Landmark time series, and text encoded through Bidirectional Encoder Representations from Transformers (BERT) and Doc2Vec embeddings. Each modality is handled through an independent Self-Attention block that highlights salient temporal information, and these representations are then summarized and integrated using a late fusion approach to effectively reflect both the inter-modal complementarity and cross-modal interactions. Compared to traditional recurrent neural network (RNN)-based multimodal models and unimodal classifiers, the proposed model achieves an improvement of up to 12 percent in the F1-score. It also maintains a high prediction accuracy and robustness under limited input conditions. Furthermore, a visualization based on t-distributed Stochastic Neighbor Embedding (t-SNE) demonstrates clear distributional separation across the personality classes, enhancing the interpretability of the model and providing insights into the structural characteristics of its latent representations. To support real-time deployment, a lightweight thread-based processing architecture is implemented, ensuring computational efficiency. By leveraging deep learning-based feature extraction and the Self-Attention mechanism, we present a novel personality recognition framework that balances performance with interpretability. 
The proposed approach establishes a strong foundation for practical applications in HRI, counseling, education, and other interactive systems that require personalized adaptation. Full article
(This article belongs to the Special Issue Explainable Machine Learning and Data Mining)
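The lightweight thread-based processing the authors mention for real-time deployment can be approximated with a standard worker-pool pattern: frames queue up, several threads run feature extraction concurrently, and results are collected. This is a generic stdlib sketch, not the paper's architecture:

```python
import queue
import threading

def run_pipeline(items, extract, n_workers=3):
    """Fan feature-extraction work out to worker threads; one result per item
    (order of completion is not guaranteed)."""
    tasks, results = queue.Queue(), queue.Queue()

    def worker():
        while True:
            item = tasks.get()
            if item is None:  # sentinel: stop this worker
                return
            results.put(extract(item))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()
    return [results.get() for _ in range(len(items))]
```

In a real multimodal setup each modality (audio, skeleton landmarks, text) would get its own extractor so slow encoders do not block fast ones.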

23 pages, 3614 KiB  
Article
A Multimodal Semantic-Enhanced Attention Network for Fake News Detection
by Weijie Chen, Yuzhuo Dang and Xin Zhang
Entropy 2025, 27(7), 746; https://doi.org/10.3390/e27070746 - 12 Jul 2025
Abstract
The proliferation of social media platforms has triggered an unprecedented increase in multimodal fake news, creating pressing challenges for content authenticity verification. Current fake news detection systems predominantly rely on isolated unimodal analysis (text or image), failing to exploit critical cross-modal correlations or leverage latent social context cues. To bridge this gap, we introduce the SCCN (Semantic-enhanced Cross-modal Co-attention Network), a novel framework that synergistically combines multimodal features with refined social graph signals. Our approach innovatively combines text, image, and social relation features through a hierarchical fusion framework. First, we extract modality-specific features and enhance semantics by identifying entities in both text and visual data. Second, an improved co-attention mechanism selectively integrates social relations while removing irrelevant connections to reduce noise and exploring latent informative links. Finally, the model is optimized via cross-entropy loss with entropy minimization. Experimental results for benchmark datasets (PHEME and Weibo) show that SCCN consistently outperforms existing approaches, achieving relative accuracy enhancements of 1.7% and 1.6% over the best-performing baseline methods in each dataset. Full article
(This article belongs to the Section Multidisciplinary Applications)
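SCCN is optimized via cross-entropy loss with entropy minimization. In a hedged sketch, the combined objective adds a small entropy penalty that pushes predicted distributions toward confident (low-entropy) outputs; the weight `beta` is an assumed hyperparameter, not a value from the paper:

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])

def prediction_entropy(probs):
    """Shannon entropy of a predicted distribution (nats)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def loss_with_entropy_min(probs, label, beta=0.1):
    """Cross-entropy plus a small entropy penalty rewarding confident predictions."""
    return cross_entropy(probs, label) + beta * prediction_entropy(probs)
```

A confident correct prediction incurs a lower loss than an uncertain correct one, which is the behavior the entropy term is there to encourage.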

17 pages, 1472 KiB  
Article
A Wallboard Outsourcing Recommendation Method Based on Dual-Channel Neural Networks and Probabilistic Matrix Factorization
by Hongen Yang, Shanhui Liu, Yangzhen Cao, Yuanyang Wang and Chaoyang Li
Electronics 2025, 14(14), 2792; https://doi.org/10.3390/electronics14142792 - 11 Jul 2025
Abstract
Wallboard outsourcing is a critical task in cloud-based manufacturing, where demand enterprises seek suitable suppliers for machining services through online platforms. However, the recommendation process faces significant challenges, including sparse rating data, unstructured textual descriptions from suppliers, and complex, non-linear user preferences. To address these issues, this paper proposes AttVAE-PMF, a novel recommendation method based on dual-channel neural networks and probabilistic matrix factorization. Specifically, an attention-enhanced long short-term memory (LSTM) is employed to extract semantic features from free-text supplier descriptions, while a variational autoencoder (VAE) is used to model latent preferences from sparse demand-side ratings. These two types of latent representations are then fused via probabilistic matrix factorization (PMF) to complete the rating matrix and infer enterprise preferences. Experiments conducted on both the wallboard dataset and the MovieLens-100K dataset demonstrate that AttVAE-PMF outperforms baseline methods—including PMF, DLCRS, and SSAERec—in terms of convergence speed and robustness to data sparsity, validating its effectiveness in handling sparse and heterogeneous information in wallboard outsourcing recommendation scenarios. Full article
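The probabilistic matrix factorization step at the core of AttVAE-PMF completes a sparse rating matrix by factoring it into low-rank user and item matrices. A minimal SGD sketch of plain PMF, without the paper's LSTM- and VAE-derived latent representations; all hyperparameters here are illustrative:

```python
import random

def pmf_fit(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=500, seed=0):
    """Learn latent user (U) and item (V) factors from (user, item, rating) triples by SGD."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(U[u][f] * V[i][f] for f in range(k))
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                U[u][f] += lr * (err * vf - reg * uf)  # gradient of squared error + L2
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V

def pmf_predict(U, V, u, i):
    """Fill one cell of the completed rating matrix."""
    return sum(a * b for a, b in zip(U[u], V[i]))
```

In AttVAE-PMF the two latent matrices are not free parameters as here but are informed by the text-side (attention-LSTM) and rating-side (VAE) encoders.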

16 pages, 1535 KiB  
Article
Clinical Text Classification for Tuberculosis Diagnosis Using Natural Language Processing and Deep Learning Model with Statistical Feature Selection Technique
by Shaik Fayaz Ahamed, Sundarakumar Karuppasamy and Ponnuraja Chinnaiyan
Informatics 2025, 12(3), 64; https://doi.org/10.3390/informatics12030064 - 7 Jul 2025
Abstract
Background: In the medical field, various deep learning (DL) algorithms have been effectively used to extract valuable information from unstructured clinical text data, potentially leading to more effective outcomes. This study utilized clinical text data to classify clinical case reports into tuberculosis (TB) and non-tuberculosis (non-TB) groups using natural language processing (NLP), a pre-processing technique, and DL models. Methods: This study used 1743 open-source respiratory disease clinical text data, labeled via fuzzy matching with ICD-10 codes to create a labeled dataset. Two tokenization methods preprocessed the clinical text data, and three models were evaluated: the existing Text-CNN, the proposed Text-CNN with t-test, and Bio_ClinicalBERT. Performance was assessed using multiple metrics and validated on 228 baseline screening clinical case text data collected from ICMR–NIRT to demonstrate effective TB classification. Results: The proposed model achieved the best results in both the test and validation datasets. On the test dataset, it attained a precision of 88.19%, a recall of 90.71%, an F1-score of 89.44%, and an AUC of 0.91. Similarly, on the validation dataset, it achieved 100% precision, 98.85% recall, 99.42% F1-score, and an AUC of 0.982, demonstrating its effectiveness in TB classification. Conclusions: This study highlights the effectiveness of DL models in classifying TB cases from clinical notes. The proposed model outperformed the other two models. The TF-IDF and t-test showed statistically significant feature selection and enhanced model interpretability and efficiency, demonstrating the potential of NLP and DL in automating TB diagnosis in clinical decision settings. Full article
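The paper's t-test feature selection scores each TF-IDF feature by how strongly its values differ between the TB and non-TB classes, keeping only the most discriminative ones. A minimal sketch using Welch's t-statistic; the exact test variant and selection threshold are assumptions, as the abstract does not specify them:

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t-statistic for one feature's values across two classes."""
    va, vb = variance(sample_a), variance(sample_b)
    return (mean(sample_a) - mean(sample_b)) / math.sqrt(
        va / len(sample_a) + vb / len(sample_b))

def select_features(X_pos, X_neg, top_k):
    """Indices of the top_k features with the largest |t| between the classes."""
    n_features = len(X_pos[0])
    scored = []
    for j in range(n_features):
        t = welch_t([row[j] for row in X_pos], [row[j] for row in X_neg])
        scored.append((abs(t), j))
    scored.sort(reverse=True)
    return sorted(j for _, j in scored[:top_k])
```

Pruning features this way both shrinks the Text-CNN's input and, as the paper notes, improves interpretability, since each surviving TF-IDF dimension is statistically tied to the class distinction.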

27 pages, 7617 KiB  
Article
Emoji-Driven Sentiment Analysis for Social Bot Detection with Relational Graph Convolutional Networks
by Kaqian Zeng, Zhao Li and Xiujuan Wang
Sensors 2025, 25(13), 4179; https://doi.org/10.3390/s25134179 - 4 Jul 2025
Abstract
The proliferation of malicious social bots poses severe threats to cybersecurity and social media information ecosystems. Existing detection methods often overlook the semantic value and emotional cues conveyed by emojis in user-generated tweets. To address this gap, we propose ESA-BotRGCN, an emoji-driven multi-modal detection framework that integrates semantic enhancement, sentiment analysis, and multi-dimensional feature modeling. Specifically, we first establish emoji–text mapping relationships using the Emoji Library, leverage GPT-4 to improve textual coherence, and generate tweet embeddings via RoBERTa. Subsequently, seven sentiment-based features are extracted to quantify statistical disparities in emotional expression patterns between bot and human accounts. An attention gating mechanism is further designed to dynamically fuse these sentiment features with user description, tweet content, numerical attributes, and categorical features. Finally, a Relational Graph Convolutional Network (RGCN) is employed to model heterogeneous social topology for robust bot detection. Experimental results on the TwiBot-20 benchmark dataset demonstrate that our method achieves a superior accuracy of 87.46%, significantly outperforming baseline models and validating the effectiveness of emoji-driven semantic and sentiment enhancement strategies. Full article
(This article belongs to the Section Sensor Networks)
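The first step of ESA-BotRGCN maps emojis to textual descriptions so the text encoder can read them. A toy sketch of that mapping; the table here is invented for illustration, whereas the paper uses a full Emoji Library with a far larger vocabulary:

```python
# Illustrative emoji-to-text table; the paper's Emoji Library covers far more.
EMOJI_MAP = {
    "😀": " grinning face ",
    "😡": " angry face ",
    "🤖": " robot ",
}

def demojize(text, table=EMOJI_MAP):
    """Replace each known emoji with its textual name so a text encoder can read it."""
    for emoji, name in table.items():
        text = text.replace(emoji, name)
    return " ".join(text.split())  # normalize the padding whitespace
```

After this substitution the tweet is coherent plain text, which the pipeline then smooths with GPT-4 and embeds with RoBERTa.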

18 pages, 2148 KiB  
Article
A Cross-Spatial Differential Localization Network for Remote Sensing Change Captioning
by Ruijie Wu, Hao Ye, Xiangying Liu, Zhenzhen Li, Chenhao Sun and Jiajia Wu
Remote Sens. 2025, 17(13), 2285; https://doi.org/10.3390/rs17132285 - 3 Jul 2025
Abstract
Remote Sensing Image Change Captioning (RSICC) aims to generate natural language descriptions of changes in bi-temporal remote sensing images, providing more semantically interpretable results than conventional pixel-level change detection methods. However, existing approaches often rely on stacked Transformer modules, leading to suboptimal feature discrimination. Moreover, direct difference computation after feature extraction tends to retain task-irrelevant noise, limiting the model’s ability to capture meaningful changes. This study proposes a novel cross-spatial Transformer and symmetric difference localization network (CTSD-Net) for RSICC to address these limitations. The proposed Cross-Spatial Transformer adaptively enhances spatial-aware feature representations by guiding the model to focus on key regions across temporal images. Additionally, a hierarchical difference feature integration strategy is introduced to suppress noise by fusing multi-level differential features, while residual-connected high-level features serve as query vectors to facilitate bidirectional change representation learning. Finally, a causal Transformer decoder creates accurate descriptions by linking visual information with text. CTSD-Net achieved BLEU-4 scores of 66.32 and 73.84 on the LEVIR-CC and WHU-CDC datasets, respectively, outperforming existing methods in accurately locating change areas and describing them semantically. This study provides a promising solution for enhancing interpretability in remote sensing change analysis. Full article
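CTSD-Net's hierarchical difference feature integration fuses differences taken at multiple feature levels instead of a single post-extraction subtraction, so noise at any one level is damped. A schematic sketch; the weighted absolute differences here are an assumption for illustration, since the paper's fusion is learned:

```python
def fuse_level_differences(levels_a, levels_b, level_weights):
    """Weighted sum of per-level absolute feature differences between two
    feature pyramids extracted from bi-temporal images."""
    dim = len(levels_a[0])
    fused = [0.0] * dim
    for feat_a, feat_b, w in zip(levels_a, levels_b, level_weights):
        for d in range(dim):
            fused[d] += w * abs(feat_a[d] - feat_b[d])
    return fused
```

Down-weighting shallow levels (which carry more texture noise) relative to deep semantic levels is one plausible choice of `level_weights`.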

28 pages, 2850 KiB  
Article
Quantification and Evolution of Online Public Opinion Heat Considering Interactive Behavior and Emotional Conflict
by Zhengyi Sun, Deyao Wang and Zhaohui Li
Entropy 2025, 27(7), 701; https://doi.org/10.3390/e27070701 - 29 Jun 2025
Abstract
With the rapid development of the Internet, the speed and scope of sudden public events disseminating in cyberspace have grown significantly. Current methods of quantifying public opinion heat often neglect emotion-driven factors and user interaction behaviors, making it difficult to accurately capture fluctuations during dissemination. To address these issues, this study first tackled the complexity of interaction behaviors by introducing an approach that employs the information gain ratio as a weighting indicator to measure the "interaction heat" contributed by different interaction attributes during event evolution. Second, this study built on SnowNLP and expanded textual features to conduct in-depth sentiment mining of large-scale opinion texts, defining the variance of netizens' emotional tendencies as an indicator of emotional fluctuations, thereby capturing "emotional heat". We then integrated the interaction and emotional-conflict assessments into a comprehensive heat index for quantifying and analyzing the dynamic evolution of online public opinion heat. Subsequently, we used the Hodrick–Prescott filter to separate long-term trends from short-term fluctuations, extracted six key quantitative features (number of peaks, time of first peak, maximum amplitude, decay time, peak emotional conflict, and overall duration), and applied the K-means clustering algorithm to classify events into three propagation patterns: extreme burst, normal burst, and long-tail. Finally, this study conducted ablation experiments on critical external intervention nodes to quantify the distinct contribution of each intervention to the propagation trend by observing changes in the model's goodness-of-fit (R²) after removing different interventions. Through an empirical analysis of six representative public opinion events from 2024, this study verified the effectiveness of the proposed framework and uncovered critical characteristics of opinion dissemination, including explosiveness versus persistence, multi-round dissemination with recurring emotional fluctuations, and the interplay of multiple driving factors. Full article
(This article belongs to the Special Issue Statistical Physics Approaches for Modeling Human Social Systems)
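The interaction-heat weighting uses the information gain ratio, i.e., the information gain of an interaction attribute over the outcome, normalized by that attribute's own split entropy. A stdlib sketch for discrete attributes (continuous interaction counts would first need binning, a preprocessing step the abstract does not detail):

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a list of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(attribute, labels):
    """Information gain of the attribute over the labels, divided by split entropy."""
    n = len(labels)
    conditional = sum(
        len(subset) / n * entropy(subset)
        for v in set(attribute)
        for subset in [[l for a, l in zip(attribute, labels) if a == v]]
    )
    split = entropy(attribute)
    return (entropy(labels) - conditional) / split if split else 0.0
```

The split-entropy denominator is what keeps many-valued attributes (e.g., raw retweet counts) from being over-weighted relative to coarse ones (e.g., comment vs. no comment).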

24 pages, 1664 KiB  
Review
A Comprehensive Review of Multimodal Emotion Recognition: Techniques, Challenges, and Future Directions
by You Wu, Qingwei Mi and Tianhan Gao
Biomimetics 2025, 10(7), 418; https://doi.org/10.3390/biomimetics10070418 - 27 Jun 2025
Abstract
This paper presents a comprehensive review of multimodal emotion recognition (MER), a process that integrates multiple data modalities such as speech, visual, and text to identify human emotions. Grounded in biomimetics, the survey frames MER as a bio-inspired sensing paradigm that emulates the way humans seamlessly fuse multisensory cues to communicate affect, thereby transferring principles from living systems to engineered solutions. By leveraging various modalities, MER systems offer a richer and more robust analysis of emotional states compared to unimodal approaches. The review covers the general structure of MER systems, feature extraction techniques, and multimodal information fusion strategies, highlighting key advancements and milestones. Additionally, it addresses the research challenges and open issues in MER, including lightweight models, cross-corpus generalizability, and the incorporation of additional modalities. The paper concludes by discussing future directions aimed at improving the accuracy, explainability, and practicality of MER systems for real-world applications. Full article
(This article belongs to the Special Issue Intelligent Human–Robot Interaction: 4th Edition)

22 pages, 1754 KiB  
Article
Enhancing Startup Financing Success Prediction Based on Social Media Sentiment
by Zhen Qiu, Yifan Qu, Shaochen Yang, Wuji Zhang, Wei Xu and Hong Zhao
Systems 2025, 13(7), 520; https://doi.org/10.3390/systems13070520 - 27 Jun 2025
Abstract
Accurately predicting the success of startup financing is critical for strategic business planning and informed investor decision-making. Traditional financing prediction models typically focus on a company’s financial indicators to explore the impact of factors such as resource allocation and strategic choices on financing success, yet they often overlook the important role of social media as an external source of information in influencing financing performance. To address this gap, this paper focuses on the role of social media sentiment in predicting startup financing success and proposes a decision support system (DSS) framework that integrates multi-source data. Specifically, this study combines financial data from the Crunchbase platform with company-related social media news data from Twitter. The BERTweet model is used to perform sentiment analysis on the social media texts, extracting sentiment features such as polarity and intensity to capture public attitudes and expectations toward the company. Subsequently, financial indicators, social media numerical features, and sentiment features are combined to construct a decision support system for predicting financing success using a deep neural network (DNN). Experimental results show that the decision support system incorporating social media data significantly outperforms traditional decision support systems in prediction accuracy, with sentiment features further enhancing the model’s ability to identify a company’s financing performance. Our study provides strong support for understanding the profound influence of public sentiment, offering practical guidance for startups to optimize financing strategies and for investors to make informed decisions. Full article
(This article belongs to the Section Systems Practice in Social Science)
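The sentiment features fed into the DSS include polarity and intensity. As a toy stand-in for BERTweet, a simple lexicon count illustrates what the two quantities measure; the word lists here are invented for illustration and the paper's features come from a neural model, not a lexicon:

```python
# Invented toy lexicon; the paper derives sentiment with BERTweet instead.
POSITIVE = {"growth", "profit", "strong", "innovative"}
NEGATIVE = {"loss", "risk", "weak", "lawsuit"}

def sentiment_features(tokens):
    """Return (polarity in [-1, 1], intensity = share of sentiment-bearing tokens)."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    hits = pos + neg
    polarity = (pos - neg) / hits if hits else 0.0
    return polarity, hits / len(tokens)
```

Per-company tweet streams would be aggregated (e.g., averaged over a window) before being concatenated with the Crunchbase financial indicators.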

16 pages, 1027 KiB  
Article
Enhancing Review-Based Recommendations Through Local and Global Feature Fusion
by Namhun Kim, Haebin Lim, Qinglong Li, Xinzhe Li, Seokkwan Kim and Jaekyeong Kim
Electronics 2025, 14(13), 2540; https://doi.org/10.3390/electronics14132540 - 23 Jun 2025
Abstract
With the rapid advancement of information and communication technology, the number of items users encounter has increased exponentially. Consequently, recommendation systems have become important for reducing the time and effort users spend selecting items. Recently, among various studies on recommendation systems, there has been significant interest in leveraging review text as auxiliary information. This study proposes a novel model to enhance recommendation performance by effectively analyzing review texts through the fusion of local and global features. By combining convolutional neural networks (CNN), which excel in extracting local features, and the RoBERTa model, renowned for capturing global contextual features, the proposed approach effectively uncovers users’ latent preferences embedded within review texts. The proposed model comprises three key components: the user–item interaction module, which learns complex interactions between users and items; the feature extraction module, which extracts both local and global features using CNN and RoBERTa; and the preference prediction module, which combines the output vectors from the previous modules to predict user preferences for specific items. Extensive experiments conducted on three datasets collected from the Amazon platform demonstrate that the proposed model significantly outperforms baseline models. These findings highlight the effectiveness of the proposed approach in considering both local and global features for extracting user preferences from review texts. Full article
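The CNN branch extracts local features by sliding a small window over token embeddings and max-pooling over time. A single-filter sketch (one filter, no nonlinearity or bias, purely illustrative of the local-feature idea):

```python
def conv_max_pool(embeddings, kernel):
    """Slide a window of len(kernel) tokens over the sequence, take a dot
    product at each position, then max-pool over time to one scalar."""
    width, dim = len(kernel), len(kernel[0])
    activations = [
        sum(kernel[j][d] * embeddings[i + j][d]
            for j in range(width) for d in range(dim))
        for i in range(len(embeddings) - width + 1)
    ]
    return max(activations)
```

A bank of such filters yields the local feature vector that the model fuses with RoBERTa's global contextual representation.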

21 pages, 2777 KiB  
Article
PR-CLIP: Cross-Modal Positional Reconstruction for Remote Sensing Image–Text Retrieval
by Jihong Guan, Yulou Shu, Wengen Li, Zihan Song and Yichao Zhang
Remote Sens. 2025, 17(13), 2117; https://doi.org/10.3390/rs17132117 - 20 Jun 2025
Abstract
With the development of satellite technology, remote sensing images have become increasingly accessible, making multi-modal remote sensing retrieval ever more important. However, most existing methods rely on global visual and textual features to compute similarity, ignoring the positional correspondence between image regions and textual descriptions. To address this issue, we propose a novel cross-modal retrieval model named PR-CLIP, which leverages a cross-modal positional information reconstruction task to learn position-aware correlations between modalities. Specifically, PR-CLIP first uses a cross-modal positional information extraction module to extract complementary features between images and texts. Then, a unimodal positional information filtering module filters the complementary information out of the unimodal features to generate embeddings for reconstruction. Finally, the cross-modal positional information reconstruction module reconstructs the unimodal embeddings of images and texts from the complete embeddings of the opposite modality, guided by a cross-modal positional consistency loss that ensures reconstruction quality. During retrieval inference, PR-CLIP directly computes the similarity between unimodal features without executing the reconstruction modules. By combining the advantages of dual-stream and single-stream models, PR-CLIP achieves a good balance between performance and efficiency. Extensive experiments on multiple public datasets demonstrate the effectiveness of PR-CLIP. Full article
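At inference, the abstract says retrieval reduces to comparing unimodal embeddings directly, with the reconstruction modules switched off. A minimal cosine-similarity retrieval sketch, assuming precomputed embeddings (the encoders themselves are not shown, and the embedding dimension is an assumption):

```python
import numpy as np

def retrieve(text_emb, image_embs, top_k=3):
    """Rank gallery images by cosine similarity to a text query embedding."""
    text_emb = text_emb / np.linalg.norm(text_emb)
    image_embs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    sims = image_embs @ text_emb          # cosine similarity per image
    order = np.argsort(-sims)[:top_k]     # indices of the top-k matches
    return order, sims[order]

rng = np.random.default_rng(1)
gallery = rng.standard_normal((100, 512))              # 100 image embeddings, dim 512 assumed
query = gallery[42] + 0.05 * rng.standard_normal(512)  # text query close to image 42
idx, scores = retrieve(query, gallery)
```

Because only two matrix products are needed per query, this dual-stream-style inference keeps retrieval fast regardless of how expensive the cross-modal training objective was.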
(This article belongs to the Special Issue Advanced AI Technology for Remote Sensing Analysis)
29 pages, 3879 KiB  
Article
Fusion of Sentiment and Market Signals for Bitcoin Forecasting: A SentiStack Network Based on a Stacking LSTM Architecture
by Zhizhou Zhang, Changle Jiang and Meiqi Lu
Big Data Cogn. Comput. 2025, 9(6), 161; https://doi.org/10.3390/bdcc9060161 - 19 Jun 2025
Abstract
This paper proposes a comprehensive deep-learning framework, SentiStack, for Bitcoin price forecasting and trading strategy evaluation that integrates multimodal data sources, including market indicators, macroeconomic variables, and sentiment information extracted from financial news and social media. The model architecture is based on a Stacking-LSTM ensemble, which captures complex temporal dependencies and non-linear patterns in high-dimensional financial time series. To enhance predictive power, sentiment embeddings derived from full-text analysis with the DeepSeek language model are fused with traditional numerical features through early and late data fusion techniques. Empirical results demonstrate that the proposed model significantly outperforms baseline strategies, including Buy & Hold and Random Trading, in cumulative return and risk-adjusted performance. Feature ablation experiments further reveal the critical role of sentiment and macroeconomic inputs in improving forecasting accuracy. The sentiment-enhanced model also exhibits strong performance in identifying high-return market movements, suggesting its practical value for data-driven investment decision-making. Overall, this study highlights the importance of incorporating soft information, such as investor sentiment, alongside traditional quantitative features in financial forecasting models. Full article
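The early versus late fusion distinction the abstract mentions can be illustrated with a toy sketch. All feature dimensions, the placeholder per-modality models, and the averaging step (standing in for a learned stacking layer) are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 30                                   # days in the input window (assumed)
market = rng.standard_normal((T, 5))     # market indicators, e.g. OHLCV-style (assumed)
macro = rng.standard_normal((T, 3))      # macroeconomic variables (assumed)
sentiment = rng.standard_normal((T, 8))  # sentiment embeddings from news text (assumed)

# Early fusion: concatenate all modalities per time step before the sequence model
early = np.concatenate([market, macro, sentiment], axis=1)  # shape (T, 16)

# Late fusion: run a (placeholder) model per modality, then combine their outputs
def tiny_model(x, seed):
    w = np.random.default_rng(seed).standard_normal(x.shape[1])
    return float(np.tanh(x[-1] @ w))     # stand-in for an LSTM head's scalar forecast

preds = [tiny_model(m, s) for s, m in enumerate([market, macro, sentiment])]
late = float(np.mean(preds))             # simple average standing in for stacking
```

Early fusion lets one sequence model learn cross-modality interactions directly, while late fusion keeps the modalities separate until a final combiner; a stacking ensemble replaces the plain average here with a learned meta-model.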