Search Results (3,033)

Search Parameters:
Keywords = visual transformers

21 pages, 21564 KiB  
Article
Remote Visualization and Optimization of Fluid Dynamics Using Mixed Reality
by Sakshi Sandeep More, Brandon Antron, David Paeres and Guillermo Araya
Appl. Sci. 2025, 15(16), 9017; https://doi.org/10.3390/app15169017 - 15 Aug 2025
Abstract
This study presents an innovative pipeline for processing, compressing, and remotely visualizing large-scale numerical simulations of fluid dynamics in a virtual wind tunnel (VWT), leveraging virtual and augmented reality (VR/AR) for enhanced analysis and high-end visualization. The workflow addresses the challenges of handling massive databases generated using Direct Numerical Simulation (DNS) while maintaining visual fidelity and ensuring efficient rendering for user interaction. Fully immersive visualization of supersonic (Mach number 2.86) spatially developing turbulent boundary layers (SDTBLs) over strong concave and convex curvatures was achieved. The comprehensive DNS data provides insights on the transport phenomena inside turbulent boundary layers under strong deceleration or an Adverse Pressure Gradient (APG) caused by concave walls as well as strong acceleration or a Favorable Pressure Gradient (FPG) caused by convex walls under different wall thermal conditions (i.e., Cold, Adiabatic, and Hot walls). The process begins with a .vts file input from a DNS, which is visualized using ParaView software. These visualizations, representing different fluid behaviors based on a DNS with a high spatial/temporal resolution and employing millions of “numerical sensors”, are treated as individual time frames and exported in GL Transmission Format (GLTF), which is a widely used open-source file format designed for efficient transmission and loading of 3D scenes. To support the workflow, optimized Extract–Transform–Load (ETL) techniques were implemented for high-throughput data handling. Conversion of exported Graphics Library Transmission Format (GLTF) files into Graphics Library Transmission Format Binary files (typically referred to as GLB) reduced the storage by 25% and improved the load latency by 60%. This research uses Unity’s Profile Analyzer and Memory Profiler to identify performance limitations during contour rendering, focusing on the GPU and CPU efficiency. Further, immersive VR/AR analytics are achieved by connecting the processed outputs to Unity engine software and Microsoft HoloLens Gen 2 via Azure Remote Rendering cloud services, enabling real-time exploration of fluid behavior in mixed-reality environments. This pipeline constitutes a significant advancement in the scientific visualization of fluid dynamics, particularly when applied to datasets comprising hundreds of high-resolution frames. Moreover, the methodologies and insights gleaned from this approach are highly transferable, offering potential applications across various other scientific and engineering disciplines. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

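The GLTF-to-GLB packaging step described above is easy to reproduce outside the authors' pipeline. Below is a minimal sketch, assuming the pygltflib Python package and a frames/ directory of exported .gltf files (both assumptions; the paper's own ETL tooling is not published in this listing):

```python
# Hedged sketch: fold each exported .gltf (JSON + external buffers) into a
# single binary .glb, the packaging the abstract credits with ~25% smaller
# storage and faster loads. Paths and package choice are assumptions.
from pathlib import Path

from pygltflib import GLTF2, BufferFormat

def gltf_to_glb(src: Path, dst: Path) -> None:
    gltf = GLTF2().load(str(src))
    # Embed external .bin buffers as one binary blob before saving as GLB.
    gltf.convert_buffers(BufferFormat.BINARYBLOB)
    gltf.save_binary(str(dst))

for frame in sorted(Path("frames").glob("*.gltf")):  # one file per time frame
    gltf_to_glb(frame, frame.with_suffix(".glb"))
```
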
16 pages, 7955 KiB  
Article
Development and Validation of a Computer Vision Dataset for Object Detection and Instance Segmentation in Earthwork Construction Sites
by JongHo Na, JaeKang Lee, HyuSoung Shin and IlDong Yun
Appl. Sci. 2025, 15(16), 9000; https://doi.org/10.3390/app15169000 - 14 Aug 2025
Abstract
Construction sites report the highest rate of industrial accidents, prompting the active development of smart safety management systems based on deep learning-based computer vision technology. To support the digital transformation of construction sites, securing site-specific datasets is essential. In this study, raw data were collected from an actual earthwork site. Key construction equipment and terrain objects primarily operated at the site were identified, and 89,766 images were processed to build a site-specific training dataset. This dataset includes annotated bounding boxes for object detection and polygon masks for instance segmentation. The performance of the dataset was validated using representative models—YOLO v7 for object detection and Mask R-CNN for instance segmentation. Quantitative metrics and visual assessments confirmed the validity and practical applicability of the dataset. The dataset used in this study has been made publicly available for use by researchers in related fields. This dataset is expected to serve as a foundational resource for advancing object detection applications in construction safety. Full article
(This article belongs to the Section Civil Engineering)

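For readers who want to probe a dataset like this one, instance-segmentation inference of the kind used for validation can be sketched with torchvision; the COCO-pretrained Mask R-CNN below is a stand-in, since the site-specific earthwork weights are not bundled here:

```python
# Sketch of instance-segmentation inference of the kind used to validate the
# dataset, with torchvision's COCO-pretrained Mask R-CNN standing in for the
# site-specific model; the image path and score cutoff are assumptions.
import torch
from torchvision.io import read_image
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.transforms.functional import convert_image_dtype

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

img = convert_image_dtype(read_image("site_frame.jpg"), torch.float)
with torch.no_grad():
    pred = model([img])[0]  # dict with "boxes", "labels", "scores", "masks"

keep = pred["scores"] > 0.5  # confidence threshold for a quick visual check
print(pred["boxes"][keep].shape, pred["labels"][keep])
```
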
32 pages, 6394 KiB  
Article
Neuro-Bridge-X: A Neuro-Symbolic Vision Transformer with Meta-XAI for Interpretable Leukemia Diagnosis from Peripheral Blood Smears
by Fares Jammal, Mohamed Dahab and Areej Y. Bayahya
Diagnostics 2025, 15(16), 2040; https://doi.org/10.3390/diagnostics15162040 - 14 Aug 2025
Abstract
Background/Objectives: Acute Lymphoblastic Leukemia (ALL) poses significant diagnostic challenges due to its ambiguous symptoms and the limitations of conventional methods like bone marrow biopsies and flow cytometry, which are invasive, costly, and time-intensive. Methods: This study introduces Neuro-Bridge-X, a novel neuro-symbolic hybrid model designed for automated, explainable ALL diagnosis using peripheral blood smear (PBS) images. Leveraging two comprehensive datasets, ALL Image (3256 images from 89 patients) and C-NMC (15,135 images from 118 patients), the model integrates deep morphological feature extraction, vision transformer-based contextual encoding, fuzzy logic-inspired reasoning, and adaptive explainability. To address class imbalance, advanced data augmentation techniques were applied, ensuring equitable representation across benign and leukemic classes. The proposed framework was evaluated through 5-fold cross-validation and fixed train-test splits, employing Nadam, SGD, and Fractional RAdam optimizers. Results: Results demonstrate exceptional performance, with SGD achieving near-perfect accuracy (1.0000 on ALL, 0.9715 on C-NMC) and robust generalization, while Fractional RAdam closely followed (0.9975 on ALL, 0.9656 on C-NMC). Nadam, however, exhibited inconsistent convergence, particularly on C-NMC (0.5002 accuracy). A Meta-XAI controller enhances interpretability by dynamically selecting optimal explanation strategies (Grad-CAM, SHAP, Integrated Gradients, LIME), ensuring clinically relevant insights into model decisions. Conclusions: Visualizations confirm that SGD and RAdam models focus on morphologically critical features, such as leukocyte nuclei, while Nadam struggles with spurious attributions. Neuro-Bridge-X offers a scalable, interpretable solution for ALL diagnosis, with potential to enhance clinical workflows and diagnostic precision in oncology. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

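As a point of reference, the 5-fold cross-validation protocol mentioned above looks roughly like the following sketch; the random vectors stand in for image embeddings and a logistic regression stands in for Neuro-Bridge-X, since neither the model nor the PBS datasets are distributed with this listing:

```python
# Sketch of the 5-fold evaluation protocol on synthetic stand-ins (assumptions,
# not the paper's data or model).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 64))    # placeholder embeddings of PBS images
y = rng.integers(0, 2, size=400)  # placeholder labels: 0 = benign, 1 = leukemic

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (tr, te) in enumerate(skf.split(X, y)):
    clf = LogisticRegression(max_iter=1000).fit(X[tr], y[tr])
    print(f"fold {fold}: accuracy = {clf.score(X[te], y[te]):.4f}")
```
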
25 pages, 15383 KiB  
Article
SplitGround: Long-Chain Reasoning Split via Modular Multi-Expert Collaboration for Training-Free Scene Knowledge-Guided Visual Grounding
by Xilong Qin, Yue Hu, Wansen Wu, Xinmeng Li and Quanjun Yin
Big Data Cogn. Comput. 2025, 9(8), 209; https://doi.org/10.3390/bdcc9080209 - 14 Aug 2025
Abstract
Scene Knowledge-guided Visual Grounding (SK-VG) is a multi-modal detection task built upon conventional visual grounding (VG) for human–computer interaction scenarios. It utilizes an additional passage of scene knowledge apart from the image and context-dependent textual query for referred object localization. Due to the inherent difficulty in directly establishing correlations between the given query and the image without leveraging scene knowledge, this task imposes significant demands on a multi-step knowledge reasoning process to achieve accurate grounding. Off-the-shelf VG models underperform under such a setting due to the requirement of detailed description in the query and a lack of knowledge inference based on implicit narratives of the visual scene. Recent Vision–Language Models (VLMs) exhibit improved cross-modal reasoning capabilities. However, their monolithic architectures, particularly in lightweight implementations, struggle to maintain coherent reasoning chains across sequential logical deductions, leading to error accumulation in knowledge integration and object localization. To address the above-mentioned challenges, we propose SplitGround—a collaborative framework that strategically decomposes complex reasoning processes by fusing the input query and image with knowledge through two auxiliary modules. Specifically, it implements an Agentic Annotation Workflow (AAW) for explicit image annotation and a Synonymous Conversion Mechanism (SCM) for semantic query transformation. This hierarchical decomposition enables VLMs to focus on essential reasoning steps while offloading auxiliary cognitive tasks to specialized modules, effectively splitting long reasoning chains into manageable subtasks with reduced complexity. Comprehensive evaluations on the SK-VG benchmark demonstrate the significant advancements of our method. Remarkably, SplitGround attains an accuracy improvement of 15.71% on the hard split of the test set over the previous training-required SOTA, using only a compact VLM backbone without fine-tuning, which provides new insights for knowledge-intensive visual grounding tasks. Full article

28 pages, 19126 KiB  
Article
Digital Geospatial Twinning for Revaluation of a Waterfront Urban Park Design (Case Study: Burgas City, Bulgaria)
by Stelian Dimitrov, Bilyana Borisova, Antoaneta Ivanova, Martin Iliev, Lidiya Semerdzhieva, Maya Ruseva and Zoya Stoyanova
Land 2025, 14(8), 1642; https://doi.org/10.3390/land14081642 - 14 Aug 2025
Abstract
Digital twins play a crucial role in linking data with practical solutions. They convert raw measurements into actionable insights, enabling spatial planning that addresses environmental challenges and meets the needs of local communities. This paper presents the development of a digital geospatial twin for a residential district in Burgas, the largest port city on Bulgaria’s southern Black Sea coast. The aim is to provide up-to-date geospatial data quickly and efficiently, and to merge available data into a single, accurate model. This model is used to test three scenarios for revitalizing coastal functions and improving a waterfront urban park in collaboration with stakeholders. The methodology combines aerial photogrammetry, ground-based mobile laser scanning (MLS), and airborne laser scanning (ALS), allowing for robust 3D modeling and terrain reconstruction across different land cover conditions. The current topography, areas at risk from geological hazards, and the vegetation structure with detailed attribute data for each tree are analyzed. These data are used to evaluate the strengths and limitations of the site concerning the desired functionality of the waterfront, considering urban priorities, community needs, and the necessity of addressing contemporary climate challenges. The carbon storage potential under various development scenarios is assessed. Through effective visualization and communication with residents and professional stakeholders, collaborative development processes have been facilitated through a series of workshops focused on coastal transformation. The results aim to support the design of climate-neutral urban solutions that mitigate natural risks without compromising the area’s essential functions, such as residential living and recreation. Full article

23 pages, 2744 KiB  
Article
CASF: Correlation-Alignment and Significance-Aware Fusion for Multimodal Named Entity Recognition
by Hui Li, Yunshi Tao, Huan Wang, Zhe Wang and Qingzheng Liu
Algorithms 2025, 18(8), 511; https://doi.org/10.3390/a18080511 - 14 Aug 2025
Abstract
As the content of social media platforms grows richer, Multimodal Named Entity Recognition (MNER) faces the dual challenges of fusing heterogeneous features and recognizing entities accurately. To address the key problems of inconsistently distributed textual and visual information, insufficient feature alignment, and noise-contaminated fusion, this paper proposes CASF-MNER, a multimodal named entity recognition model based on a dual-stream Transformer. The model applies cross-modal cross-attention to visual and textual features, building a bidirectional interaction mechanism between single-layer features that models higher-order semantic correlations and aligns the modalities; it constructs a saliency-aware perception mechanism based on multiscale pooling, whose entropy weighting over the global feature distribution adaptively suppresses redundant noise and strengthens key features; and it establishes a deep semantic fusion method based on a hybrid isomorphic model, whose progressive cross-modal interaction structure, combined with contrastive learning, achieves global fusion of the deep semantic space and optimizes representational consistency. Experimental results show that CASF-MNER achieves excellent performance on both the Twitter-2015 and Twitter-2017 public datasets, verifying the effectiveness of the proposed method. Full article
(This article belongs to the Section Algorithms for Multidisciplinary Applications)

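The bidirectional cross-modal cross-attention at the heart of CASF-MNER can be illustrated with a short PyTorch sketch; the dimensions, head count, and module name below are assumptions, not the paper's exact design:

```python
# Illustrative sketch of bidirectional cross-modal cross-attention: text tokens
# attend over visual patches and vice versa. Sizes are assumed.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.t2v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.v2t = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, text: torch.Tensor, vision: torch.Tensor):
        # Text queries attend over visual patches; vision queries over tokens.
        text_enh, _ = self.t2v(query=text, key=vision, value=vision)
        vis_enh, _ = self.v2t(query=vision, key=text, value=text)
        return text_enh, vis_enh

txt = torch.randn(2, 32, 256)  # (batch, text tokens, dim)
img = torch.randn(2, 49, 256)  # (batch, 7x7 visual patches, dim)
t, v = CrossModalAttention()(txt, img)
print(t.shape, v.shape)
```
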
36 pages, 9430 KiB  
Article
Numerical Method for Internal Structure and Surface Evaluation in Coatings
by Tomas Kačinskas and Saulius Baskutis
Inventions 2025, 10(4), 71; https://doi.org/10.3390/inventions10040071 - 13 Aug 2025
Abstract
This study introduces a MATrix LABoratory (MATLAB, version R2024b, update 1 (24.2.0.2740171))-based automated system for the detection and measurement of indication areas in coated surfaces, enhancing the accuracy and efficiency of quality control processes in metal, polymeric and thermoplastic coatings. The developed code identifies various indication characteristics in the image and provides numerical results, assesses the size and quantity of indications and evaluates conformity to ISO standards. A comprehensive testing method, involving non-destructive penetrant testing (PT) and radiographic testing (RT), allowed for an in-depth analysis of surface and internal porosity across different coating methods, including aluminum-, copper-, polytetrafluoroethylene (PTFE)- and polyether ether ketone (PEEK)-based materials. Initial findings clearly indicated a non-homogeneous surface in the coatings obtained, which were manufactured using different technologies and materials. Whereas researchers using non-destructive testing (NDT) methods typically rely on visual inspection and manual counting, the system under study automates this process. Each sample image is loaded into MATLAB and analyzed using the Image Processing Toolbox, Computer Vision Toolbox, and Statistics and Machine Learning Toolbox. The custom code performs essential tasks such as image conversion, filtering, boundary detection, layering operations and calculations. These processes are integral to rendering images with developed indications according to NDT method requirements, providing a detailed visual and numerical representation of the analysis. RT also validated the observations made through surface indication detection, revealing either the absence of hidden defects or, conversely, internal porosity correlating with surface conditions. Matrix and graphical representations were used to facilitate the comparison of test results, highlighting more advanced methods and materials as the superior choice for achieving optimal mechanical and structural integrity. This research contributes to addressing challenges in surface quality assurance, advancing digital transformation in inspection processes and exploring more advanced alternatives to traditional coating technologies and materials. Full article
(This article belongs to the Section Inventions and Innovation in Advanced Manufacturing)

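The paper's implementation is MATLAB-based; a rough open-source analogue of its core steps (grayscale conversion, thresholding, region labeling, area measurement) can be sketched with scikit-image for readers without MATLAB. The file name, threshold polarity, and minimum-area cutoff below are assumptions:

```python
# Rough Python analogue of the MATLAB workflow: convert to grayscale, threshold,
# label connected indication regions, and report their areas.
from skimage import color, filters, io, measure

img = io.imread("coating_sample.png")        # assumed RGB photograph of the coating
gray = color.rgb2gray(img[..., :3])          # drop an alpha channel if present
mask = gray < filters.threshold_otsu(gray)   # assume indications are darker than the coating
labels = measure.label(mask)                 # connected-component labeling

for region in measure.regionprops(labels):
    if region.area >= 20:                    # ignore tiny noise blobs (assumed cutoff)
        print(f"indication at {region.centroid}: area = {region.area} px")
```
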
20 pages, 6570 KiB  
Article
Autonomous Vehicle Maneuvering Using Vision–LLM Models for Marine Surface Vehicles
by Tae-Yeon Kim and Woen-Sug Choi
J. Mar. Sci. Eng. 2025, 13(8), 1553; https://doi.org/10.3390/jmse13081553 - 13 Aug 2025
Abstract
Recent advances in vision–language models (VLMs) have transformed the field of robotics: researchers are combining the reasoning capabilities of large language models (LLMs) with the visual information processing capabilities of VLMs across many domains. However, most efforts have focused on terrestrial robots and transfer poorly to volatile environments such as ocean surfaces and underwater settings, where real-time judgment is required. We propose a system that integrates the cognition, decision making, path planning, and control of autonomous marine surface vehicles in the ROS2–Gazebo simulation environment, using a multimodal vision–LLM system with zero-shot prompting for real-time adaptability. Experiments were conducted to verify the effectiveness of the proposed system and to evaluate its performance with and without a path-planning step. Across 30 experiments, adding the path-planning mode increased the success rate from 23% to 73%, while the average distance traveled increased from 39 m to 45 m and the task-completion time from 483 s to 672 s; the final algorithm with the path-planning sub-process therefore trades some efficiency for a much higher success rate. We achieve real-time environmental adaptability through prompt engineering and the addition of a path-planning sub-process within a constrained structure in which the LLM state is re-initialized with every application programming interface call (zero-shot prompting). Additionally, the developed system is independent of the underlying vision–LLM, making it scalable and adaptable to future models. Full article
(This article belongs to the Special Issue Intelligent Measurement and Control System of Marine Robots)

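The zero-shot prompting structure described above, where the LLM keeps no state between calls, can be sketched as follows. This assumes an OpenAI-style vision chat API and a hypothetical four-action vocabulary; the paper's system is deliberately agnostic to the underlying model:

```python
# Hedged sketch of the per-step decision loop: each call sends only the current
# frame and a fixed instruction, so the model keeps no state between calls
# (zero-shot prompting). Model name and action set are illustrative assumptions.
import base64

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def decide(frame_path: str) -> str:
    with open(frame_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any vision-capable chat model (assumption)
        messages=[{
            "role": "user",  # fresh, single-message history on every call
            "content": [
                {"type": "text",
                 "text": "You pilot a marine surface vehicle. From this view, "
                         "answer with exactly one of: FORWARD, LEFT, RIGHT, STOP."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content.strip()
```
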
26 pages, 4766 KiB  
Article
RetinoDeep: Leveraging Deep Learning Models for Advanced Retinopathy Diagnostics
by Sachin Kansal, Bajrangi Kumar Mishra, Saniya Sethi, Kanika Vinayak, Priya Kansal and Jyotindra Narayan
Sensors 2025, 25(16), 5019; https://doi.org/10.3390/s25165019 - 13 Aug 2025
Abstract
Diabetic retinopathy (DR), a leading cause of vision loss worldwide, poses a critical challenge to healthcare systems due to its silent progression and the reliance on labor-intensive, subjective manual screening by ophthalmologists, especially amid a global shortage of eye care specialists. Addressing the pressing need for scalable, objective, and interpretable diagnostic tools, this work introduces RetinoDeep—deep learning frameworks integrating hybrid architectures and explainable AI to enhance the automated detection and classification of DR across seven severity levels. Specifically, we propose four novel models: an EfficientNetB0 combined with an SPCL transformer for robust global feature extraction; a ResNet50 ensembled with Bi-LSTM to synergize spatial and sequential learning; a Bi-LSTM optimized through genetic algorithms for hyperparameter tuning; and a Bi-LSTM with SHAP explainability to enhance model transparency and clinical trustworthiness. The models were trained and evaluated on a curated dataset of 757 retinal fundus images, augmented to improve generalization, and benchmarked against state-of-the-art baselines (including EfficientNetB0, Hybrid Bi-LSTM with EfficientNetB0, Hybrid Bi-GRU with EfficientNetB0, ResNet with filter enhancements, Bi-LSTM optimized using Random Search Algorithm (RSA), Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), and a standard Convolutional Neural Network (CNN)), using metrics such as accuracy, F1-score, and precision. Notably, the Bi-LSTM with Particle Swarm Optimization (PSO) outperformed other configurations, achieving superior stability and generalization, while SHAP visualizations confirmed alignment between learned features and key retinal biomarkers, reinforcing the system’s interpretability. By combining cutting-edge neural architectures, advanced optimization, and explainable AI, this work sets a new standard for DR screening systems, promising not only improved diagnostic performance but also potential integration into real-world clinical workflows. Full article

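One of the hybrids listed above, EfficientNetB0 feeding a Bi-LSTM head, can be sketched in Keras as follows; the layer sizes and the patch-to-sequence reshape are assumptions rather than the paper's exact configuration:

```python
# Illustrative Keras sketch of an EfficientNetB0 + Bi-LSTM hybrid: backbone
# feature maps are flattened into a sequence of patches that a bidirectional
# LSTM then reads. Sizes are assumed, not the paper's.
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetB0

backbone = EfficientNetB0(include_top=False, input_shape=(224, 224, 3))
x = layers.Reshape((49, 1280))(backbone.output)  # 7x7 spatial grid -> 49-step sequence
x = layers.Bidirectional(layers.LSTM(128))(x)    # sequential pass over the patches
out = layers.Dense(7, activation="softmax")(x)   # seven DR severity levels

model = models.Model(backbone.input, out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```
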
22 pages, 10765 KiB  
Article
Exploring the Cognitive Reconstruction Mechanism of Generative AI in Outcome-Based Design Education: A Study on Load Optimization and Performance Impact Based on Dual-Path Teaching
by Qidi Dong, Jiaxi He, Nanxin Li, Binzhu Wang, Heng Lu and Yingyin Yang
Buildings 2025, 15(16), 2864; https://doi.org/10.3390/buildings15162864 - 13 Aug 2025
Abstract
Undergraduate design education faces a structural contradiction characterized by high cognitive load (CL) and relatively low innovation output. Meanwhile, existing generative AI tools predominantly emphasize the generation of visual outcomes, often overlooking the logical guidance mechanisms inherent in design thinking. This study proposes a Dual-Path teaching model integrating critical reconstruction behaviors to examine how AI enhances design thinking. It adopts structured interactions with the DeepSeek large language model, CL theory, and Structural Equation Modeling for analysis. Quantitative results indicate that AI-assisted paths significantly enhance design quality (72.43 vs. 65.60 in traditional paths). This improvement is attributed to a “direct effect + multiple mediators” model: specifically, AI reduced the mediating role of Extraneous Cognitive Load from 0.907 to 0.017, while simultaneously enhancing its investment in Germane Cognitive Load to support deep, innovative thinking. Theoretically, this study is among the first to integrate AI-driven critical reconstruction behaviors (e.g., iteration count, cross-domain terms) into CL theory, validating the “logical chain externalization → load optimization” mechanism in design education contexts. Practically, it provides actionable strategies for the digital transformation of design education, fostering interdisciplinary thinking and advancing a teaching paradigm where low-order cognition is outsourced to reinforce high-order creative thinking. Full article
(This article belongs to the Topic Architectural Education)

29 pages, 12262 KiB  
Article
3D Heritage Reconstruction Through HBIM and Multi-Source Data Fusion: Geometric Change Analysis Across Decades
by Przemysław Klapa, Andrzej Żygadło and Massimiliano Pepe
Appl. Sci. 2025, 15(16), 8929; https://doi.org/10.3390/app15168929 - 13 Aug 2025
Abstract
The reconstruction of historic buildings requires the integration of diverse data sources, both geometric and non-geometric. This study presents a multi-source data analysis methodology for heritage reconstruction using 3D modeling and Historic Building Information Modeling (HBIM). The proposed approach combines geometric data, including point clouds acquired via Terrestrial Laser Scanning (TLS), with architectural documentation and non-geometric information such as photographs, historical records, and technical descriptions. The case study focuses on a wooden Orthodox church in Żmijowiska, Poland, analyzing geometric changes in the structure over multiple decades. The reconstruction process integrates modern surveys with archival sources and, in the absence of complete geometric data, utilizes semantic, topological, and structural information. Geometric datasets from the 1990s, 1930s, and the turn of the 20th century were analyzed, supplemented by intermediate archival photographs and technical documentation. This integrated method enabled the identification of transformation phases and verification of discrepancies between historical records and the building’s actual condition. The findings confirm that the use of HBIM and multi-source data fusion facilitates accurate reconstruction of historical geometry and supports visualization of spatial changes across decades. Full article

20 pages, 4191 KiB  
Article
A Deep Transfer Contrastive Learning Network for Few-Shot Hyperspectral Image Classification
by Gan Yang and Zhaohui Wang
Remote Sens. 2025, 17(16), 2800; https://doi.org/10.3390/rs17162800 - 13 Aug 2025
Abstract
Over recent decades, the hyperspectral image (HSI) classification landscape has undergone significant transformations driven by advances in deep learning (DL). Despite substantial progress, few-shot scenarios remain a significant challenge, primarily due to the high cost of manual annotation and the unreliability of visual interpretation. Traditional DL models require massive datasets to learn sophisticated feature representations, hindering their full potential in data-scarce contexts. To tackle this issue, a deep transfer contrastive learning network is proposed. A spectral data augmentation module is incorporated to expand limited sample pairs. Subsequently, a spatial–spectral feature extraction module is designed to fuse the learned feature information. The weights of the spatial feature extraction network are initialized with knowledge transferred from source-domain pretraining, while the spectral residual network acquires rich spectral information. Furthermore, contrastive learning is integrated to enhance discriminative representation learning from scarce samples, effectively mitigating obstacles arising from the high inter-class similarity and large intra-class variance inherent in HSIs. Experiments on four public HSI datasets demonstrate that our method achieves competitive performance against state-of-the-art approaches. Full article

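The contrastive component can be illustrated with a generic NT-Xent (InfoNCE) loss over two augmented views of the same batch; this is the standard formulation commonly used for representation learning, not necessarily the paper's exact loss:

```python
# Minimal NT-Xent / InfoNCE sketch: each sample's positive is its other
# augmented view; all remaining embeddings in the batch act as negatives.
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """z1, z2: (N, d) embeddings of two augmented views of the same N samples."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, d), unit norm
    sim = z @ z.t() / tau                               # cosine similarities
    sim.fill_diagonal_(float("-inf"))                   # exclude self-pairs
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

a, b = torch.randn(8, 128), torch.randn(8, 128)
print(nt_xent(a, b))
```
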
11 pages, 1650 KiB  
Article
A RUBY Reporter for Efficient Banana Transformation and Development of Betalain-Rich Musa Germplasm
by Weidi He, Huoqing Huang, Shuxian Wang, Dalin Wang, Yanling Xie and Chunhua Hu
Int. J. Mol. Sci. 2025, 26(16), 7805; https://doi.org/10.3390/ijms26167805 - 13 Aug 2025
Abstract
Bananas are economically important crops valued for both their nutritional and dietary uses. However, the global banana industry suffers from a narrow base dominated by a single variety. Developing novel varieties enriched in health-promoting compounds such as betalains can help diversify banana germplasm and meet evolving consumer demands. In this study, the RUBY reporter system was employed to produce betalain-rich bananas via stable and transient genetic transformations. Transient transformation by injecting 3 mL of Agrobacterium suspension into immature fruits produced vivid red-purple pulp containing up to 1.78 mg/g of betalains. For stable transformation, embryonic cell suspensions expressing RUBY exhibited a red-purple coloration after the first screening, reducing the selection period from 45 to 15 days. These findings demonstrate that RUBY is a reliable visual reporter for efficient screening and can be used to develop nutritionally enhanced bananas. Full article
(This article belongs to the Special Issue Plant Breeding and Genetics: New Findings and Perspectives)

18 pages, 1534 KiB  
Article
TSGformer: A Unified Temporal–Spatial Graph Transformer with Adaptive Cross-Scale Modeling for Multivariate Time Series
by Yan Chen, Cheng Li and Xiaoli Zhao
Systems 2025, 13(8), 688; https://doi.org/10.3390/systems13080688 - 12 Aug 2025
Abstract
Multivariate time series forecasting requires modeling complex and evolving spatio-temporal dependencies as well as frequency-domain patterns; however, the existing Transformer-based approaches often struggle to effectively capture dynamic inter-series correlations and disentangle relevant spectral components, leading to limited forecasting accuracy and robustness under non-stationary conditions. To address these challenges, we propose TSGformer, a Transformer-based architecture that integrates multi-scale adaptive graph learning, adaptive spectral decomposition, and cross-scale interactive fusion modules to jointly model temporal, spatial, and spectral dynamics in multivariate time series data. Specifically, TSGformer constructs dynamic graphs at multiple temporal scales to adaptively learn evolving inter-variable relationships, applies an adaptive spectral enhancement module to emphasize critical frequency components while suppressing noise, and employs interactive convolution blocks to fuse multi-domain features effectively. Extensive experiments across eight benchmark datasets show that TSGformer achieves the best results on five datasets, with an MSE of 0.354 on Exchange, improving upon the best baselines by 2.4%. Ablation studies further verify the effectiveness of each proposed component, and visualization analyses reveal that TSGformer captures meaningful dynamic correlations aligned with real-world patterns. Full article

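One ingredient named above, an adaptively learned inter-series graph, is often built from trainable node embeddings (as in Graph WaveNet / MTGNN); a minimal sketch of that construction follows, with sizes as assumptions rather than TSGformer's actual multi-scale design:

```python
# Sketch of adaptive graph learning: a directed adjacency matrix is derived
# from two trainable node-embedding tables, so inter-series relationships are
# learned end to end instead of being fixed in advance.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveGraph(nn.Module):
    def __init__(self, num_series: int, emb_dim: int = 16):
        super().__init__()
        self.e1 = nn.Parameter(torch.randn(num_series, emb_dim))
        self.e2 = nn.Parameter(torch.randn(num_series, emb_dim))

    def forward(self) -> torch.Tensor:
        # Row-normalized adjacency: softmax over ReLU similarity scores.
        return F.softmax(F.relu(self.e1 @ self.e2.t()), dim=1)

adj = AdaptiveGraph(num_series=8)()
print(adj.shape, adj.sum(dim=1))  # rows sum to 1
```
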
21 pages, 6057 KiB  
Article
PFSKANs: A Novel Pixel-Level Feature Selection Model Based on Kolmogorov–Arnold Networks
by Rui Yang, Michael V. Basin, Guangzhe Yao and Hongzheng Zeng
Sensors 2025, 25(16), 4982; https://doi.org/10.3390/s25164982 - 12 Aug 2025
Abstract
Inspired by the interpretability of Kolmogorov–Arnold Networks (KANs), a novel Pixel-level Feature Selection (PFS) model based on KANs (PFSKANs) is proposed as a fundamentally distinct alternative to trainable Convolutional Neural Networks (CNNs) and transformers in computer vision tasks. We modify the simplification techniques of KANs to detect key pixels with high contribution scores directly in the input image. Specifically, the trainable selection procedure is intuitively visualized and performed only once, since the resulting interpretable pixels can subsequently be identified and dimensionally standardized using the proposed mathematical approach. Experiments on image classification tasks using the MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets demonstrate that PFSKANs achieve performance comparable to CNNs in terms of accuracy, parameter efficiency, and training time. Full article

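The pixel-selection idea, ranking pixels once by contribution score and reusing a fixed top-k subset as a standardized input vector, can be sketched generically; the random scores below are stand-ins for the KAN-derived contributions, and k is an assumption:

```python
# Generic sketch of pixel-level feature selection: score every pixel position
# once, keep the k highest-scoring positions, and reduce each image to that
# fixed k-dimensional vector.
import numpy as np

rng = np.random.default_rng(0)
scores = rng.random((28, 28))               # per-pixel contribution scores (stand-in)
k = 64
flat_idx = np.argsort(scores.ravel())[-k:]  # indices of the k highest-scoring pixels

def select_pixels(img: np.ndarray) -> np.ndarray:
    """Reduce a (28, 28) image to the fixed k-dimensional selected-pixel vector."""
    return img.ravel()[flat_idx]

img = rng.random((28, 28))
print(select_pixels(img).shape)             # (64,)
```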