Search Results (630)

Search Parameters:
Keywords = multi-perspective features

18 pages, 1506 KB  
Article
MFA-CNN: An Emotion Recognition Network Integrating 1D–2D Convolutional Neural Network and Cross-Modal Causal Features
by Jing Zhang, Anhong Wang, Suyue Li, Debiao Zhang and Xin Li
Brain Sci. 2025, 15(11), 1165; https://doi.org/10.3390/brainsci15111165 - 29 Oct 2025
Abstract
Background/Objectives: Exploring brain-information-processing mechanisms through physiological signals such as electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) has become a major research direction in affective computing. However, existing research has mostly focused on feature- and decision-level fusion, with little investigation into the causal relationship between these two modalities. Methods: In this paper, we propose a novel emotion recognition framework for simultaneously acquired EEG and fNIRS signals. This framework integrates the Granger causality (GC) method and a modality–frequency attention mechanism within a convolutional neural network backbone (MFA-CNN). First, we employed GC to quantify the causal relationships between the EEG and fNIRS signals, revealing emotional-processing mechanisms from the perspectives of neuro-electrical activity and hemodynamic interactions. Then, we designed a 1D–2D-CNN framework that fuses temporal and spatial representations and introduced the MFA module to dynamically allocate weights across modalities and frequency bands. Results: Experimental results demonstrated that the proposed method outperforms strong baselines under both single-modal and multi-modal conditions, showing the effectiveness of causal features in emotion recognition. Conclusions: These findings indicate that combining GC-based cross-modal causal features with modality–frequency attention improves EEG–fNIRS-based emotion recognition and provides a more physiologically interpretable view of emotion-related brain activity.
(This article belongs to the Special Issue Advances in Emotion Processing and Cognitive Neuropsychology)
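The abstract above uses Granger causality to quantify directed influence between EEG and fNIRS signals. As an illustration only (not the authors' implementation), a minimal NumPy sketch of a pairwise GC index, computed as the log ratio of restricted to full autoregression residual sums of squares, might look like this; the function names and the lag choice are hypothetical:

```python
import numpy as np

def _lagged(z, lag):
    """Design matrix whose columns are z[t-1], ..., z[t-lag]."""
    n = len(z)
    return np.column_stack([z[lag - 1 - k: n - 1 - k] for k in range(lag)])

def granger_causality_index(x, y, lag=2):
    """GC index from x to y: log(RSS_restricted / RSS_full).

    The restricted model predicts y from its own past only; the full
    model adds lagged values of x. A clearly positive index suggests
    that past x helps predict y (x Granger-causes y).
    """
    Y = y[lag:]
    own = _lagged(y, lag)
    full = np.hstack([own, _lagged(x, lag)])

    def rss(X):
        X1 = np.column_stack([np.ones(len(Y)), X])  # add intercept
        beta, *_ = np.linalg.lstsq(X1, Y, rcond=None)
        r = Y - X1 @ beta
        return r @ r

    return float(np.log(rss(own) / rss(full)))

# Toy demo: y is driven by x with a one-step delay, so GC(x -> y)
# should be large while GC(y -> x) stays near zero.
rng = np.random.default_rng(0)
x = rng.standard_normal(600)
y = np.zeros(600)
for t in range(1, 600):
    y[t] = 0.8 * x[t - 1] + 0.1 * rng.standard_normal()
gc_xy = granger_causality_index(x, y, lag=2)
gc_yx = granger_causality_index(y, x, lag=2)
```

Because the restricted model is nested in the full one, the index is non-negative by construction; asymmetry between the two directions is what signals causality.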
32 pages, 2144 KB  
Article
Trapezium Cloud Decision-Making Method with Probabilistic Multi-Granularity Symmetric Linguistic Information and Its Application in Standing Timber Evaluation
by Zhiteng Chen, Jian Lin and Zhiwei Gong
Symmetry 2025, 17(11), 1820; https://doi.org/10.3390/sym17111820 - 29 Oct 2025
Abstract
It is crucial to evaluate the quality of standing timber for the rational and effective management of forest land. In practice, it is often difficult to obtain accurate data on the various indicators of standing timber due to constraints such as measurement conditions, accuracy, and cost. Therefore, this study developed a multi-attribute decision-making method based on trapezium clouds and applied it to evaluate the standing timber quality of forest land. Firstly, a trapezium cloud transformation method was designed to handle multi-granularity symmetric linguistic information arising from decision-makers' differing knowledge backgrounds; the symmetric structure inherent in trapezium clouds helps ensure balanced processing of information from various asymmetric cognitive perspectives. Secondly, a trapezium cloud generalized weighted Heronian mean was proposed for the information aggregation process. Then, the concept of trapezium cloud interval similarity was defined, and an optimization model was constructed to determine the normalized interval weights of attributes. Based on the symmetric numerical features, a formula for the approximate centroid coordinates of trapezium clouds was derived, from which a ranking method for trapezium clouds was obtained. Finally, taking the evaluation of standing timber quality in forest land as a numerical example, the applicability of the constructed multi-attribute decision-making method was demonstrated. In addition, a comparative analysis verified the superiority and effectiveness of the proposed method.
(This article belongs to the Section Mathematics)
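The aggregation operator above builds on the generalized Heronian mean. The paper's weighted variant operates on trapezium cloud numerical characteristics, which is beyond a short sketch, but the underlying scalar operator, under its standard definition, can be illustrated as follows (a sketch, not the authors' operator):

```python
def generalized_heronian_mean(a, p=1.0, q=1.0):
    """Generalized Heronian mean of positive numbers a_1..a_n:

        GHM^{p,q}(a) = ( 2/(n(n+1)) * sum_{i<=j} a_i^p a_j^q )^(1/(p+q))

    With p = q = 1/2 and n = 2 this reduces to the classical Heronian
    mean (a + sqrt(ab) + b) / 3; it captures pairwise interrelations
    between the aggregated values rather than treating them independently.
    """
    n = len(a)
    s = sum(a[i] ** p * a[j] ** q for i in range(n) for j in range(i, n))
    return (2.0 / (n * (n + 1)) * s) ** (1.0 / (p + q))
```

A quick sanity check: the operator is idempotent (aggregating identical values returns that value), which is one of the properties such decision-making aggregators are required to satisfy.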

31 pages, 2985 KB  
Article
Heterogeneous Ensemble Sentiment Classification Model Integrating Multi-View Features and Dynamic Weighting
by Song Yang, Jiayao Xing, Zongran Dong and Zhaoxia Liu
Electronics 2025, 14(21), 4189; https://doi.org/10.3390/electronics14214189 - 27 Oct 2025
Abstract
With the continuous growth of user reviews, identifying the underlying sentiment of multi-source texts efficiently and accurately has become a significant challenge in NLP. Traditional single models in cross-domain sentiment analysis often exhibit insufficient stability, limited generalization, and sensitivity to class imbalance, while existing ensemble methods predominantly rely on static weighting or voting among homogeneous models, failing to fully exploit the complementary strengths of different models. To address these issues, this study proposes a heterogeneous ensemble sentiment classification model integrating multi-view features and dynamic weighting. At the feature learning layer, the model constructs three complementary base learners: a RoBERTa-FC for extracting global semantic features, a BERT-BiGRU for capturing temporal dependencies, and a TextCNN-Attention for focusing on local semantic features, thereby achieving multi-level text representation. At the decision layer, a meta-learner fuses the multi-view features, and dynamic uncertainty weighting and attention weighting strategies adaptively adjust the outputs of the base learners. Experimental results across multiple domains demonstrate that the proposed model consistently outperforms single learners and comparison methods in terms of Accuracy, Precision, Recall, F1 Score, and Macro-AUC. On average, the ensemble model achieves a Macro-AUC of 0.9582 ± 0.023 across five datasets, with an Accuracy of 0.9423, an F1 Score of 0.9590, and a Macro-AUC of 0.9797 on the AlY_ds dataset. Moreover, in a cross-dataset ranking evaluation based on equally weighted metrics, the model consistently ranks within the top two, confirming its superior cross-domain adaptability and robustness. These findings highlight the effectiveness of the proposed framework in enhancing sentiment classification performance and provide valuable insights for future research on lightweight dynamic ensembles and multilingual and multimodal applications.
(This article belongs to the Section Artificial Intelligence)
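The decision layer described above weights base learners dynamically by their predictive uncertainty. One common way to realize this idea (a sketch under assumed conventions, not the paper's exact scheme) is to weight each model, per sample, by the softmax of its negative predictive entropy, so that confident models dominate the fused output:

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy of a probability distribution along `axis`."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def dynamic_weighted_fusion(probs):
    """Fuse base-learner outputs by per-sample confidence.

    probs: (n_models, n_samples, n_classes) class probabilities.
    Each model's weight for a sample is softmax(-entropy), so a model
    that is uncertain (near-uniform output) contributes less.
    Returns (n_samples, n_classes) fused probabilities.
    """
    H = entropy(probs)                                  # (n_models, n_samples)
    w = np.exp(-H) / np.exp(-H).sum(axis=0, keepdims=True)
    return np.einsum('ms,msc->sc', w, probs)

# Demo: model 0 is confident, model 1 is uninformative (uniform).
probs = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # confident learner
    [[0.5, 0.5], [0.5, 0.5]],   # uniform learner
])
fused = dynamic_weighted_fusion(probs)
```

Since the per-sample weights sum to one and each model outputs a distribution, the fused rows remain valid probability distributions.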

22 pages, 979 KB  
Article
Multi-Modal Semantic Fusion for Smart Contract Vulnerability Detection in Cloud-Based Blockchain Analytics Platforms
by Xingyu Zeng, Qiaoyan Wen and Sujuan Qin
Electronics 2025, 14(21), 4188; https://doi.org/10.3390/electronics14214188 - 27 Oct 2025
Abstract
With the growing demand for trusted computing in big data analysis, cloud computing platforms are reshaping trusted data infrastructure by integrating Blockchain as a Service (BaaS), which uses elastic resource scheduling and heterogeneous hardware acceleration to support petabyte-level, multi-institution secure data exchange in medical, financial, and other fields. As the core hub of data-intensive scenarios, the BaaS platform combines privacy computing with process automation. However, its deep dependence on smart contracts introduces new code-layer vulnerabilities that can maliciously contaminate analysis results. Existing detection schemes are limited to a single-source data perspective, which makes it difficult to capture both global semantic associations and local structural details in a cloud computing environment, leading to bottlenecks in scalability and detection accuracy. To address these challenges, this paper proposes a smart contract vulnerability detection method based on multi-modal semantic fusion for cloud-based blockchain analysis platforms. Firstly, the contract source code is parsed into an abstract syntax tree, and key code is precisely located using a predefined vulnerability feature set. Then, text features and graph structure features of the key code are extracted in parallel and deeply fused. Finally, with attention enhancement, the vulnerability probability is output through a fully connected network. Experiments on Ethereum benchmark datasets show that the detection accuracy of our method for re-entrancy, timestamp, overflow/underflow, and delegatecall vulnerabilities reaches 92.2%, 96.3%, 91.4%, and 89.5%, respectively, surpassing previous methods. Additionally, our method has the potential for practical deployment in cloud-based blockchain service environments.
(This article belongs to the Special Issue New Trends in Cloud Computing for Big Data Analytics)
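The first step above, parsing source code into an AST and locating key code against a predefined feature set, targets Solidity contracts in the paper. The same idea can be illustrated with Python's standard `ast` module; the `SUSPICIOUS_CALLS` set below is a hypothetical stand-in for a real vulnerability feature set:

```python
import ast

# Hypothetical feature set; a real detector would match patterns such as
# external calls before state updates (re-entrancy) or block.timestamp use.
SUSPICIOUS_CALLS = {"eval", "exec"}

def locate_key_code(source):
    """Parse `source` into an AST and return (lineno, call_name) for
    every simple function call whose name is in the feature set."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in SUSPICIOUS_CALLS:
                hits.append((node.lineno, node.func.id))
    return hits

hits = locate_key_code("x = 1\neval('2+2')\n")
```

Working on the AST rather than raw text gives each hit a precise location and syntactic context, which is what allows the later stages to extract text and graph features for exactly the flagged region.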

19 pages, 2598 KB  
Article
DOCB: A Dynamic Online Cross-Batch Hard Exemplar Recall for Cross-View Geo-Localization
by Wenchao Fan, Xuetao Tian, Long Huang, Xiuwei Zhang and Fang Wang
ISPRS Int. J. Geo-Inf. 2025, 14(11), 418; https://doi.org/10.3390/ijgi14110418 - 26 Oct 2025
Abstract
Image-based geo-localization is a challenging task that aims to determine the geographic location of a ground-level query image captured by an Unmanned Ground Vehicle (UGV) by matching it to geo-tagged nadir-view (top-down) images from an Unmanned Aerial Vehicle (UAV) stored in a reference database. The challenge comes from the perspective inconsistency between matched objects. In this work, we propose a novel metric learning scheme for hard exemplar mining to improve the performance of cross-view geo-localization. Specifically, we introduce a Dynamic Online Cross-Batch (DOCB) hard exemplar mining scheme that addresses the lack of hard exemplars in mini-batches during the middle and late stages of training, which otherwise leads to training stagnation. It mines cross-batch hard negative exemplars according to the current network state and reloads them into the network so that the gradients of negative exemplars participate in back-propagation. Since the feature representations of cross-batch negative exemplars adapt to the current network state, the triplet loss calculation becomes more accurate. Compared with methods that only consider the gradients of anchors and positives, adding the gradients of negative exemplars helps obtain the correct gradient direction. Therefore, our DOCB scheme can better guide the network to learn valuable metric information. Moreover, we design a simple Siamese-like network, called multi-scale feature aggregation (MSFA), which generates multi-scale feature aggregations by learning and fusing multiple local spatial embeddings. Experimental results demonstrate that our DOCB scheme and MSFA network achieve an accuracy of 95.78% on the CVUSA dataset and 86.34% on the CVACT_val dataset, outperforming other existing methods in the field.
(This article belongs to the Topic Artificial Intelligence Models, Tools and Applications)
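The core of the DOCB scheme is selecting, from a cross-batch pool, the negative closest to the anchor and feeding it into a triplet margin loss. A minimal NumPy sketch of that selection step (function name and margin are illustrative assumptions, not the authors' code):

```python
import numpy as np

def hardest_negative_triplet_loss(anchor, positive, negatives, margin=0.3):
    """Pick the hardest negative from a candidate pool and compute the
    margin-based triplet loss.

    anchor, positive: (d,) embeddings of a matched cross-view pair.
    negatives: (k, d) candidate negatives, e.g. mined from past batches.
    Returns (loss, index_of_hardest_negative).
    """
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(negatives - anchor, axis=1)  # distance to each candidate
    hard = d_an.min()                                  # closest negative = hardest
    loss = max(0.0, d_ap - hard + margin)
    return loss, int(d_an.argmin())

loss, idx = hardest_negative_triplet_loss(
    np.array([0.0, 0.0]),                 # anchor
    np.array([1.0, 0.0]),                 # positive, distance 1.0
    np.array([[3.0, 0.0], [0.5, 0.0]]),   # pool: second candidate is harder
)
```

Re-embedding the mined negatives with the current network, as the abstract describes, keeps `d_an` consistent with the present parameters, which is why the loss (and its gradient through the negative branch) stays informative late in training.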

37 pages, 10732 KB  
Review
Advances on Multimodal Remote Sensing Foundation Models for Earth Observation Downstream Tasks: A Survey
by Guoqing Zhou, Lihuang Qian and Paolo Gamba
Remote Sens. 2025, 17(21), 3532; https://doi.org/10.3390/rs17213532 - 24 Oct 2025
Abstract
Remote sensing foundation models (RSFMs) have demonstrated excellent feature extraction and reasoning capabilities under the self-supervised learning paradigm of “unlabeled datasets—model pre-training—downstream tasks”. These models achieve superior accuracy and performance compared to existing models across numerous open benchmark datasets. However, when confronted with multimodal data such as optical, LiDAR, SAR, text, video, and audio, RSFMs exhibit limitations in cross-modal generalization and multi-task learning. Although several reviews have addressed RSFMs, there is currently no comprehensive survey dedicated to vision–X (vision, language, audio, position) multimodal RSFMs (MM-RSFMs). To fill this gap, this article provides a systematic review of MM-RSFMs from a novel perspective. Firstly, the key technologies underlying MM-RSFMs are reviewed and analyzed, and the available multimodal RS pre-training datasets are summarized. Then, recent advances in MM-RSFMs are classified according to the development of backbone networks and cross-modal interaction methods of vision–X, such as vision–vision, vision–language, vision–audio, vision–position, and vision–language–audio. Finally, potential challenges are analyzed, and perspectives for MM-RSFMs are outlined. This survey reveals that current MM-RSFMs face the following key challenges: (1) a scarcity of high-quality multimodal datasets, (2) limited capability for multimodal feature extraction, (3) weak cross-task generalization, (4) absence of unified evaluation criteria, and (5) insufficient security measures.
(This article belongs to the Section AI Remote Sensing)

43 pages, 6958 KB  
Review
From Multi-Field Coupling Behaviors to Self-Powered Monitoring: Triboelectric Nanogenerator Arrays for Deep-Sea Large-Scale Cages
by Kefan Yang, Shengqing Zeng, Keqi Yang, Dapeng Zhang and Yi Zhang
J. Mar. Sci. Eng. 2025, 13(11), 2042; https://doi.org/10.3390/jmse13112042 - 24 Oct 2025
Abstract
As global marine resource development continues to expand into deep-sea and ultra-deep-sea domains, the intelligent and green transformation of deep-sea aquaculture equipment has become a key direction for the high-quality development of the marine economy. Large deep-sea cages are considered essential equipment for deep-sea aquaculture. However, ensuring their structural integrity and long-term monitoring capabilities in the complex marine environments characteristic of deep-sea aquaculture poses significant challenges. This study focuses on large deep-sea cages, addressing their dynamic response challenges and long-term monitoring power supply needs in complex marine environments. It investigates the nonlinear vibration characteristics of flexible net structures under complex fluid loads; to this end, a multi-field coupled dynamic model is constructed to reveal vibration response patterns and instability mechanisms. A self-powered sensing system based on triboelectric nanogenerator (TENG) technology has been developed, featuring a curved-surface-adaptive TENG array for real-time monitoring of net vibration states. This review focuses on optimizing the design of curved-surface-adaptive TENG arrays and deep-sea cage monitoring. It further examines the mechanisms of energy transfer and cooperative capture within multi-body coupled cage systems, along with the biomechanics of fish–cage flow field interactions and micro-energy capture technologies. By integrating different disciplinary perspectives and adopting innovative approaches, this work aims to break through key technical bottlenecks, laying the theoretical and technical foundations for optimizing the design and safe operation of large deep-sea cages.

23 pages, 6498 KB  
Article
A Cross-Modal Deep Feature Fusion Framework Based on Ensemble Learning for Land Use Classification
by Xiaohuan Wu, Houji Qi, Keli Wang, Yikun Liu and Yang Wang
ISPRS Int. J. Geo-Inf. 2025, 14(11), 411; https://doi.org/10.3390/ijgi14110411 - 23 Oct 2025
Abstract
Land use classification based on multi-modal data fusion has gained significant attention due to its potential to capture the complex characteristics of urban environments. However, effectively extracting and integrating discriminative features derived from heterogeneous geospatial data remains challenging. This study proposes an ensemble learning framework for land use classification that fuses cross-modal deep features from both physical and socioeconomic perspectives. Specifically, the framework utilizes a Masked Autoencoder (MAE) to extract global spatial dependencies from remote sensing imagery and applies long short-term memory (LSTM) networks to model the spatial distribution patterns of points of interest (POIs) based on type co-occurrence. Furthermore, we employ inter-modal contrastive learning to enhance the representation of physical and socioeconomic features. To verify the superiority of the ensemble learning framework, we apply it to map the land use distribution of Beijing. By coupling various physical and socioeconomic features, the framework achieves an average accuracy of 84.33%, surpassing several comparative baseline methods. The framework also demonstrates comparable performance on a Shenzhen dataset, confirming its robustness and generalizability. These findings highlight the importance of fully extracting and effectively integrating multi-source deep features in land use classification, providing a robust solution for urban planning and sustainable development.
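The inter-modal contrastive learning step above pulls matched imagery and POI embeddings together while pushing mismatched pairs apart. A common objective for this is the symmetric InfoNCE loss; the NumPy sketch below is illustrative (the paper does not specify this exact formulation, and the temperature value is an assumption):

```python
import numpy as np

def info_nce(z_img, z_poi, temperature=0.1):
    """Symmetric InfoNCE over L2-normalized embeddings.

    z_img, z_poi: (n, d) batches where row i of each matrix describes
    the same spatial unit (imagery view vs. POI view). Matched rows are
    positives; all other rows in the batch serve as negatives.
    """
    a = z_img / np.linalg.norm(z_img, axis=1, keepdims=True)
    b = z_poi / np.linalg.norm(z_poi, axis=1, keepdims=True)
    logits = a @ b.T / temperature          # (n, n) similarity matrix

    def ce_diag(l):
        # cross-entropy with targets on the diagonal
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(logp))

    return 0.5 * (ce_diag(logits) + ce_diag(logits.T))

z = np.eye(4)                               # 4 well-separated embeddings
loss_matched = info_nce(z, z)               # correct pairing: low loss
loss_shuffled = info_nce(z, np.roll(z, 1, axis=0))  # wrong pairing: high loss
```

Minimizing this loss aligns the two modality-specific encoders in a shared space, which is what lets the downstream ensemble couple physical and socioeconomic features coherently.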

20 pages, 645 KB  
Article
Developing a Safety Planning Smartphone App to Support Adolescents’ Self-Management During Emotional Crises
by Tamara Großmann, Jana Hörger, Nadine Bayer, Sophie Bückle, Daniel Buschek, Jörg M. Fegert, Peter Laurenz, Matthias Lühr, Franziska Marek, Miriam Rassenhofer and Nathalie Oexle
Int. J. Environ. Res. Public Health 2025, 22(11), 1607; https://doi.org/10.3390/ijerph22111607 - 22 Oct 2025
Abstract
Suicide is a leading cause of death among adolescents, highlighting the need for effective suicide prevention strategies. Safety planning is a best-practice intervention that has recently shifted toward smartphone-based formats. This study explored stakeholder perspectives (adolescents, parents, practitioners) and described the development of an age-tailored app. A qualitative study was conducted in Germany (2023–2024) with focus groups involving adolescents (n = 7), parents (n = 4), and practitioners (n = 4). Adolescents (14–21 years) were eligible if they had received inpatient treatment, experienced suicidal thoughts within the past 24 months, and had prior experience with safety planning. Parents and practitioners had experience or expertise with suicidality among adolescents. Data were analyzed using Kuckartz's qualitative content analysis. App development drew, among other sources, on insights from the focus groups and pertinent theories. Stakeholders expressed differing needs regarding app content, settings, and adjustability. The developed emira-app includes an interactive safety plan to support users in self-managing emotional crises, along with additional features (e.g., a digital HopeBox and diary) to promote integration into users' daily routines. This multi-component safety planning app was developed specifically for adolescents, and its participatory development process allowed an intensive exploration of key stakeholders' perspectives.
(This article belongs to the Section Behavioral and Mental Health)

23 pages, 3142 KB  
Article
Cross-Group EEG Emotion Recognition Based on Phase Space Reconstruction Topology
by Xuanpeng Zhu, Mu Zhu, Dong Li and Yu Song
Entropy 2025, 27(10), 1084; https://doi.org/10.3390/e27101084 - 20 Oct 2025
Abstract
Due to the interference of artifacts and the nonlinearity of electroencephalogram (EEG) signals, extracting representative features is a challenge in EEG emotion recognition. In this work, we reduce the dimensionality of phase space trajectories by introducing locally linear embedding (LLE), which projects the trajectories onto a 2-D plane while preserving their local topological structure, and we construct 16 novel topological features from different perspectives to quantitatively describe, at multiple scales, the nonlinear dynamic patterns induced by emotions. Using independent feature evaluation, we select core features with significant discriminative power and combine the activation patterns of brain topography with model gain ranking to optimize the electrode channels. Validation on the SEED and HIED datasets yielded subject-dependent average accuracies of 90.33% for normal-hearing subjects (3-class) and 77.17% for hearing-impaired subjects (4-class), and we also used differential entropy (DE) features to explore the potential of integrating topological features. By quantifying topological features, the 6-class task achieved an average accuracy of 77.5% in distinguishing emotions across different subject groups.
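Phase space reconstruction, the starting point of the pipeline above, is typically done by time-delay (Takens) embedding of the 1-D EEG signal before LLE reduces the trajectory back to 2-D. A minimal sketch of the embedding step (dimension and delay values are illustrative, not the paper's settings):

```python
import numpy as np

def delay_embed(signal, dim=3, tau=2):
    """Time-delay embedding of a 1-D signal.

    Returns an (n, dim) trajectory whose row t is
    (x[t], x[t + tau], ..., x[t + (dim - 1) * tau]),
    reconstructing the system's phase-space geometry from one observable.
    """
    n = len(signal) - (dim - 1) * tau
    return np.column_stack([signal[k * tau: k * tau + n] for k in range(dim)])

emb = delay_embed(np.arange(10), dim=3, tau=2)
```

In practice, `dim` and `tau` are chosen with criteria such as false nearest neighbors and the first minimum of mutual information; the resulting trajectory is what the topological features are then computed on.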

23 pages, 1986 KB  
Article
GMHCA-MCBILSTM: A Gated Multi-Head Cross-Modal Attention-Based Network for Emotion Recognition Using Multi-Physiological Signals
by Xueping Li, Yanbo Li, Yuhang Li and Yuan Yang
Algorithms 2025, 18(10), 664; https://doi.org/10.3390/a18100664 - 20 Oct 2025
Abstract
To address the limitations of single-modal electroencephalogram (EEG) signals, such as their single physiological dimension, weak anti-interference ability, and inability to fully reflect emotional states, this paper proposes a gated multi-head cross-attention module (GMHCA) for the multimodal fusion of EEG, electrooculography (EOG), and electrodermal activity (EDA). The attention module employs three independent, parallel attention computation units to assign independent attention weights to different feature subsets across modalities. Combined with a modality complementarity metric, a gating mechanism suppresses redundant heads and enhances the information transmission of key heads. Through multi-head concatenation, cross-modal interaction results from different perspectives are fused. For the backbone network, a multi-scale convolutional and bidirectional long short-term memory network (MC-BiLSTM) is designed for feature extraction, tailored to the characteristics of each modality. Experiments show that this method, which primarily fuses eight-channel EEG with peripheral physiological signals, achieves an emotion recognition accuracy of 89.45%, a 7.68% improvement over single-modal EEG. In addition, in cross-subject experiments on the SEED-IV dataset, the EEG+EOG modality achieved a classification accuracy of 92.73%. Both results were significantly better than the baseline methods, demonstrating the effectiveness of the proposed GMHCA module and MC-BiLSTM feature extraction network for multimodal fusion. Through the novel attention gating mechanism, higher recognition accuracy is achieved while significantly reducing the number of EEG channels, providing new attention- and gated-fusion-based ideas for multimodal emotion recognition in resource-constrained environments.
(This article belongs to the Special Issue Machine Learning in Medical Signal and Image Processing (4th Edition))
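The GMHCA idea, parallel cross-attention heads over separate feature subsets, each scaled by a gate before concatenation, can be sketched in NumPy as follows. This is an illustration of the mechanism only; the gate values would come from the paper's complementarity metric, which is not reproduced here:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_cross_attention(q_feats, kv_feats, heads=3, gates=None):
    """Gated multi-head cross-attention between two modalities.

    q_feats: (Tq, d) query-modality features (e.g. EEG).
    kv_feats: (Tk, d) key/value-modality features (e.g. EOG).
    Each head attends over an independent d/heads feature subset; a
    scalar gate in [0, 1] scales each head's output (gate ~ usefulness,
    so a redundant head can be suppressed to zero) before concatenation.
    """
    d = q_feats.shape[1]
    assert d % heads == 0, "feature dim must split evenly across heads"
    hd = d // heads
    gates = np.ones(heads) if gates is None else gates
    outs = []
    for h in range(heads):
        q = q_feats[:, h * hd:(h + 1) * hd]
        k = kv_feats[:, h * hd:(h + 1) * hd]
        v = kv_feats[:, h * hd:(h + 1) * hd]
        attn = softmax(q @ k.T / np.sqrt(hd), axis=-1)  # (Tq, Tk)
        outs.append(gates[h] * (attn @ v))
    return np.concatenate(outs, axis=1)                 # (Tq, d)

rng = np.random.default_rng(1)
q = rng.standard_normal((4, 6))
kv = rng.standard_normal((5, 6))
out = gated_cross_attention(q, kv, heads=3, gates=np.array([1.0, 0.0, 1.0]))
```

Setting a gate to zero removes that head's contribution entirely, which is the suppression behavior the abstract attributes to redundant heads.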

21 pages, 11040 KB  
Article
DPDN-YOLOv8: A Method for Dense Pedestrian Detection in Complex Environments
by Yue Liu, Linjun Xu, Baolong Li, Zifan Lin and Deyue Yuan
Mathematics 2025, 13(20), 3325; https://doi.org/10.3390/math13203325 - 18 Oct 2025
Abstract
Accurate pedestrian detection from a robotic perspective has become increasingly critical, especially in complex environments with crowded, high-density scenes. Existing methods suffer low accuracy due to multi-scale pedestrians and dense occlusion in such environments. To address these drawbacks, a dense pedestrian detection network architecture based on YOLOv8n (DPDN-YOLOv8) is introduced for complex environments, aiming to improve robots' pedestrian detection. Firstly, the C2f modules in the backbone network are replaced with C2f_ODConv modules integrating omni-dimensional dynamic convolution (ODConv), enabling multi-dimensional feature focusing on detected targets. Secondly, the up-sampling operator Content-Aware Reassembly of Features (CARAFE) replaces the Up-Sample module to reduce the loss of up-sampling information. Then, an Adaptive Spatial Feature Fusion detection head with four detector heads (ASFF-4) is introduced to enhance the ability to detect small targets. Finally, to accelerate network convergence, the Focaler-Shape-IoU is adopted as the bounding box regression loss function. Experimental results show that, compared with YOLOv8n, the mAP@0.5 of DPDN-YOLOv8 increases from 80.5% to 85.6%. Although the number of parameters increases from 3×10⁶ to 5.2×10⁶, the model can still meet the requirements for deployment on mobile devices.
(This article belongs to the Special Issue Artificial Intelligence: Deep Learning and Computer Vision)
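The Focaler-Shape-IoU regression loss mentioned above extends plain intersection-over-union with shape- and difficulty-aware terms; those extensions are not reproduced here, but the base IoU quantity that all such losses build on is simple to state (a generic sketch, not the paper's loss):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Intersection area divided by union area; 1.0 for identical boxes,
    0.0 for disjoint ones. IoU-family regression losses are typically
    variations on (1 - IoU) plus penalty terms.
    """
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

Variants such as Shape-IoU add penalties sensitive to box shape and scale, and Focaler-style reweighting focuses the loss on easy or hard samples; both reduce to this quantity when their extra terms vanish.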

42 pages, 104137 KB  
Article
A Hierarchical Absolute Visual Localization System for Low-Altitude Drones in GNSS-Denied Environments
by Qing Zhou, Haochen Tang, Zhaoxiang Zhang, Yuelei Xu, Feng Xiao and Yulong Jia
Remote Sens. 2025, 17(20), 3470; https://doi.org/10.3390/rs17203470 - 17 Oct 2025
Abstract
Current drone navigation systems primarily rely on Global Navigation Satellite Systems (GNSSs), but their signals are susceptible to interference, spoofing, or suppression in complex environments, leading to degraded positioning performance or even failure. To enhance the positioning accuracy and robustness of low-altitude drones in satellite-denied environments, this paper investigates an absolute visual localization solution. This method achieves precise localization by matching real-time images with reference images that have absolute position information. To address the issue of insufficient feature generalization capability due to the complex and variable nature of ground scenes, a visual-based image retrieval algorithm is proposed, which utilizes a fusion of shallow spatial features and deep semantic features, combined with generalized average pooling to enhance feature representation capabilities. To tackle the registration errors caused by differences in perspective and scale between images, an image registration algorithm based on cyclic consistency matching is designed, incorporating a reprojection error loss function, a multi-scale feature fusion mechanism, and a structural reparameterization strategy to improve matching accuracy and inference efficiency. Based on the above methods, a hierarchical absolute visual localization system is constructed, achieving coarse localization through image retrieval and fine localization through image registration, while also integrating IMU prior correction and a sliding window update strategy to mitigate the effects of scale and rotation differences. The system is implemented on the ROS platform and experimentally validated in a real-world environment. The results show that the localization success rates for the h, s, v, and w trajectories are 95.02%, 64.50%, 64.84%, and 91.09%, respectively. Compared to similar algorithms, it demonstrates higher accuracy and better adaptability to complex scenarios. 
These results indicate that the proposed technology can achieve high-precision and robust absolute visual localization without the need for initial conditions, highlighting its potential for application in GNSS-denied environments.
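The retrieval stage above mentions generalized average pooling (commonly called GeM pooling) for turning a convolutional feature map into a compact descriptor. The sketch below shows the standard GeM formula in NumPy; the abstract does not specify the pooling exponent, so `p=3.0` is an illustrative default, not the paper's setting.

```python
import numpy as np

def gem_pool(feat, p=3.0, eps=1e-6):
    """Generalized-mean (GeM) pooling over the spatial dimensions.

    feat: (C, H, W) feature map. Returns a (C,) descriptor computed as
    (mean over H*W of x^p)^(1/p). p=1 recovers average pooling; large p
    approaches max pooling. Values are clipped at eps so the power is
    well-defined for non-positive activations.
    """
    x = np.clip(feat, eps, None)
    return np.power(np.mean(np.power(x, p), axis=(1, 2)), 1.0 / p)
```

The descriptor is typically L2-normalized afterwards before computing cosine similarity between the real-time image and the georeferenced reference images.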

24 pages, 13555 KB  
Article
A Visual Trajectory-Based Method for Personnel Behavior Recognition in Industrial Scenarios
by Houquan Wang, Tao Song, Zhipeng Xu, Songxiao Cao, Bin Zhou and Qing Jiang
Sensors 2025, 25(20), 6331; https://doi.org/10.3390/s25206331 - 14 Oct 2025
Abstract
Accurate recognition of personnel behavior in industrial environments is essential for asset protection and workplace safety, yet complex environmental conditions pose a significant challenge to its accuracy. This paper presents a novel, lightweight framework to address these issues. We first enhance a YOLOv8n model with Receptive Field Attention Convolution (RFAConv) and Efficient Multi-scale Attention (EMA) mechanisms, achieving a 6.9% increase in AP50 and a 4.2% increase in AP50:95 over the baseline. Continuous motion trajectories are then generated using the BOT-SORT algorithm and geometrically corrected via perspective transformation to produce a high-fidelity bird’s-eye view. Finally, a set of discriminative trajectory features is classified using a Random Forest model, attaining F1-scores exceeding 82% for all behaviors on our proprietary industrial dataset. The proposed framework provides a robust and efficient solution for real-time personnel behavior recognition in challenging industrial settings. Future work will focus on exploring more advanced algorithms and validating the framework’s performance on edge devices.
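The perspective transformation mentioned above maps image-plane trajectories onto a ground-plane bird's-eye view. A standard way to do this is to estimate a 3x3 homography from four (or more) known ground-plane correspondences via the direct linear transform (DLT) and then warp each tracked point. The sketch below is a generic NumPy implementation under that assumption, not the paper's code; the point coordinates in the usage are invented for illustration.

```python
import numpy as np

def find_homography(src, dst):
    """Estimate a 3x3 homography H mapping src points to dst points (DLT).

    src, dst: sequences of (x, y) correspondences, at least four, no
    three collinear. Each pair contributes two rows to a linear system
    A h = 0; the solution is the right singular vector of A with the
    smallest singular value.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so the bottom-right entry is 1

def warp_point(H, pt):
    """Apply homography H to a 2D point in homogeneous coordinates."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)
```

Once every trajectory point is warped into the bird's-eye frame, speed and direction features computed from the corrected coordinates are no longer distorted by camera perspective, which is what makes them usable as inputs to a downstream classifier.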
(This article belongs to the Section Sensing and Imaging)

16 pages, 308 KB  
Review
Osteosarcoma: A Comprehensive Morphological and Molecular Review with Prognostic Implications
by Alessandro El Motassime, Raffaele Vitiello, Rocco Maria Comodo, Giacomo Capece, Guido Bocchino, Maria Beatrice Bocchi, Giulio Maccauro and Cesare Meschini
Biology 2025, 14(10), 1407; https://doi.org/10.3390/biology14101407 - 13 Oct 2025
Abstract
Osteosarcoma (OS) is the most common primary malignant bone tumor, predominantly affecting adolescents and young adults. Despite advances in surgery and multi-agent chemotherapy, survival rates for metastatic or recurrent OS remain poor, highlighting the need for novel prognostic and therapeutic strategies. This review integrates histopathologic, molecular, and immune perspectives to provide a comprehensive understanding of OS biology in the context of precision medicine. We discuss classic morphologic and radiographic features alongside recent insights into the tumor microenvironment, including the role of tumor-infiltrating lymphocytes, tumor-associated macrophages, and immune checkpoint expression. Emerging molecular markers, such as gene expression–based immune risk signatures, circulating tumor DNA, and gasdermin D overexpression, are evaluated for their prognostic and therapeutic relevance. Key dysregulated pathways, including WNT/β-catenin and JAK/STAT, are examined in relation to metastasis, chemoresistance, and immune evasion, with emphasis on current targeted approaches under development. By bridging histopathology, immunogenomics, and translational research, this work outlines how integrated biomarker assessment can refine patient stratification and guide the implementation of individualized treatment strategies in OS.