Search Results (3,637)

Search Parameters:
Keywords = architectural visualization

26 pages, 33536 KB  
Article
A Global Collaborative Discriminative Denoising Network for Text-to-Image Person Re-Identification
by Shaozhen Han and Shuai Guo
Sensors 2026, 26(11), 3604; https://doi.org/10.3390/s26113604 (registering DOI) - 5 Jun 2026
Abstract
Text-to-Image Person Re-Identification (TI-ReID) aims to retrieve target pedestrians from large-scale image galleries using natural language descriptions. Despite recent progress achieved by dual-tower architectures based on vision-language pre-training, these methods remain susceptible to semantic misalignment and noise induced by occlusions, background clutter, and fine-grained attribute distractions. To mitigate these issues, we propose a Global Collaborative Discriminative Denoising Network (GCDD), a dual-tower fine-tuning framework built upon a CLIP visual encoder and a BERT text encoder. Specifically, GCDD introduces three complementary branches for robust feature enhancement. First, Discriminative Token Selection (DTS) performs adaptive hard filtering to suppress low-informative tokens. Second, Global-Guided Feature Adaptation (GFA) leverages modality-specific global semantics to recalibrate local features. Third, Query-Driven Aggregation (QDA) constructs more discriminative global representations via attentive pooling, where the backbone global feature serves as the query. The outputs of the three branches are fused through a parameter-free averaging strategy to produce the final representation. Extensive experiments on three standard TI-ReID benchmarks demonstrate that GCDD achieves strong competitive performance, validating the effectiveness of the proposed feature enhancement framework. Full article
(This article belongs to the Section Sensing and Imaging)
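
The fusion described in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each branch emits one feature vector, that the attentive pooling uses scaled dot-product scores, and that features are plain Python lists; the function names `attentive_pool` and `average_fusion` are hypothetical and are not taken from the paper.

```python
import math


def attentive_pool(query, tokens):
    """Pool token vectors using softmax weights from scaled dot-product scores.

    `query` plays the role of the backbone global feature; `tokens` are local
    features. The scoring function is an assumption, not the paper's definition.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * t for q, t in zip(query, tok)) / scale for tok in tokens]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [
        sum(w * tok[d] for w, tok in zip(weights, tokens))
        for d in range(len(query))
    ]


def average_fusion(branch_outputs):
    """Parameter-free fusion: element-wise mean of the branch feature vectors."""
    count = len(branch_outputs)
    return [sum(values) / count for values in zip(*branch_outputs)]


if __name__ == "__main__":
    global_feature = [1.0, 0.0]
    local_tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    pooled = attentive_pool(global_feature, local_tokens)
    # Stand-ins for the three branch outputs (DTS, GFA, QDA).
    fused = average_fusion([[1.0, 2.0], [3.0, 4.0], pooled])
    print(fused)
```

Averaging adds no learnable weights, which is what makes the fusion step parameter-free; any weighting between branches would have to come from the branches themselves.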

25 pages, 1924 KB  
Article
A Human-in-the-Loop Assistive Navigation Platform for UAS-Based Infrastructure Visual Inspection: System Architecture and Proof-of-Concept Demonstration
by Martin Xu, Yuxiang Zhao, Zixin Wang and Mohamad Alipour
Sensors 2026, 26(11), 3615; https://doi.org/10.3390/s26113615 (registering DOI) - 5 Jun 2026
Abstract
While Unmanned Aerial Systems (UAS) are increasingly used for infrastructure inspection, a critical gap exists between optimized path planning and reliable real-world execution. Fully autonomous flights face regulatory constraints and environmental risks, whereas manual piloting introduces inconsistencies that compromise data quality. To address this gap, this study proposes a human-in-the-loop assistive navigation platform that enables pilots to follow preplanned inspection trajectories while maintaining manual control. The proposed system integrates an Augmented Reality (AR)-based guidance module that provides real-time viewpoint localization with a mesh-coupled quality monitoring module that continuously evaluates view redundancy and triangulation uncertainty. A proof-of-concept field demonstration through an on-site façade inspection example indicates that the proposed platform has the potential to improve the consistency of viewpoint distribution, achieving closer adherence to planned spacing and stand-off distance. This results in more uniform spatial sampling, enhanced view redundancy, and reduced variability in theoretical uncertainty, leading to improved geometric conditions for Structure-from-Motion (SfM) reconstruction. Overall, the field demonstration highlights the potential of combining computational guidance with human decision-making to support reliable and high-quality UAS-based infrastructure inspection. Full article
33 pages, 4252 KB  
Article
An Efficient Multimodal Framework for Barley Drought Stress Detection on Resource-Constrained Devices
by Rihab Boukouba, Dalenda Ben Aissa, Amira Guidara, Nadia Smaoui and Chantal Ebel
AgriEngineering 2026, 8(6), 230; https://doi.org/10.3390/agriengineering8060230 (registering DOI) - 5 Jun 2026
Abstract
Drought stress significantly impacts barley (Hordeum vulgare L.) production, necessitating early and accurate detection systems for precision agriculture. Traditional monitoring approaches rely on manual inspection or single-modality sensing, which often fail to capture the complex physiological responses to water deficit. This study presents a novel multimodal deep learning framework that integrates RGB imaging with environmental sensor data (temperature and humidity) for real-time drought stress classification in barley plants. The proposed architecture employs EfficientNetV2-S for visual feature extraction, coupled with a dedicated sensor encoding branch, unified through a cross-modal attention mechanism and gated multimodal fusion strategy. To address the computational constraints of agricultural IoT systems, we implemented comprehensive CPU optimization techniques and model compression via TensorFlow Lite INT8 quantization, achieving a 68.5% reduction in training time and 90% reduction in model size. The system was validated on a custom greenhouse dataset (379 samples, 80/20 split) and the PlantVillage dataset (26,000 images, binary reformulation). A 10-seed evaluation protocol demonstrated that the full multimodal model achieves 98.3 ± 1.5% accuracy, outperforming both an image-only baseline (97.4 ± 1.8%) and a sensor-only MLP (73.8 ± 3.5%). Across seeds, the model also achieved an F1-score of 98.34 ± 1.48% and ROC-AUC of 99.93 ± 0.13%. Ablation analysis with ANOVA (F(4,36) = 4.44, p = 0.005) confirmed that multimodal fusion improves accuracy by 0.92% over image-only models, with the full gated cross-modal attention mechanism outperforming all simplified baselines, including AgriFusionNet (75.22%), Shallow CNN (92.54%), Logistic Regression multimodal (92.11%), and Random Forest multimodal (89.91%). These results further show that relying on environmental data alone is insufficient, reinforcing the benefit of multimodal fusion. 
External validation on PlantVillage achieved 99.97% accuracy, demonstrating strong generalization capabilities. The optimized model operates efficiently on CPU-only hardware (training time: 9.1 min/epoch), making it suitable for edge deployment in resource-constrained agricultural environments. This work demonstrates that a low-cost, CPU-compatible multimodal deep learning system can reliably detect drought stress in barley under real greenhouse conditions and provides a practical and scalable solution for early stress monitoring in precision agriculture. Full article
(This article belongs to the Special Issue Precision Agriculture: Sensor-Based Systems and IoT-Enabled Machinery)
24 pages, 5910 KB  
Article
Digital Heritage Conservation of Historical Villages Using UAV Photogrammetry–LiDAR Fusion and AI-Based Façade Material Analytics
by Junpeng Fan, Zao Zhang, Anbang Dai, Hongxi Yin and Yasushi Ikeda
Geomatics 2026, 6(3), 66; https://doi.org/10.3390/geomatics6030066 (registering DOI) - 5 Jun 2026
Abstract
The accelerating deterioration of Chinese historical villages necessitates advanced digital approaches for systematic documentation and conservation. The present research proposes a novel Digital Heritage Framework that integrates UAV-based 3D oblique photogrammetry, LiDAR point cloud modeling, and computer vision. Unlike single-technology approaches, our methodology solves modeling issues for complex terrain mapping. This especially applies to the interior and roof works of buildings. The framework implements a customized Rhino-Grasshopper. The 3D model is able to resolve issues of shadow occlusion and spatial discontinuity by integrating aerial and ground-based datasets into spatially coherent formats. This makes use of the Meta-AI-SAM2 deep learning model for semantic segmentation and identification of materials. The computer vision (CV) approach gives semi-automated façade analysis. It enables documentation of complex architectural features non-invasively. We developed a Unity-based visualization platform. It features multiscale representations, ranging from village-scale layouts to centimeter-accurate scans of heritage structures such as the Qinchuan Ancestral Hall. Integration with the Unity platform optimizes dataset organization and hierarchical structuring. This significantly enhances database operational efficiency. This integration reduces manual processing complexity and hardware demands. Demonstrating documented efficiency and precision, this workflow presents a scalable solution for endangered heritage sites. Future research will explore AI-assisted detail reconstruction and cross-cultural adaptations. It potentially establishes this framework as a comprehensive tool for sustainable digital conservation. Full article

40 pages, 5078 KB  
Article
Designing Human-Centred Adaptive AI Navigation for Blind and Visually Impaired Individuals: A Cognitive Load-Aware Framework for Accessible Urban Mobility
by Pilar Herrero-Martín and Álvaro García-Ballestero
AI 2026, 7(6), 206; https://doi.org/10.3390/ai7060206 (registering DOI) - 5 Jun 2026
Abstract
Artificial intelligence systems increasingly mediate high-stakes human activities, yet urban navigation remains highly challenging for blind and visually impaired individuals. Although digital navigation technologies have significantly improved route planning and accessibility, many existing systems still rely on generic interaction paradigms that insufficiently account for cognitive load, contextual uncertainty, and the adaptive needs of vulnerable users. This challenge highlights the importance of Human-Centred AI approaches capable of supporting not only functional accessibility, but also cognitively sustainable and trustworthy interaction. This paper introduces LAZAR, a human-centred adaptive AI framework for accessible urban mobility grounded in a user-centred design methodology and formalised through a structured Software Requirements Specification. Rather than focusing exclusively on route optimisation, LAZAR approaches assistive navigation as an adaptive human–AI interaction problem in which instructional granularity, interaction frequency, and feedback mechanisms are designed to support user autonomy and situational awareness whilst limiting unnecessary cognitive burden. The proposed framework integrates high-fidelity prototyping, accessibility-oriented interaction modelling, and a modular multi-agent architecture intended to support adaptive and personalised guidance. Central to the approach is a cognitive load-aware interaction layer designed to regulate the presentation and timing of navigational assistance according to user needs and contextual conditions. The proposed multi-agent architecture is presented as a modular design framework whose interaction principles and interface logic were partially operationalised in the evaluated prototype. The complete integration of all adaptive coordination mechanisms, together with large-scale real-world validation, remains part of ongoing and future development work. 
This work contributes a structured methodology for the design of adaptive assistive AI systems that integrates accessibility requirements, human-centred interaction principles, and cognitively informed guidance strategies. A formative usability evaluation involving eleven visually impaired participants provides preliminary empirical evidence regarding usability, accessibility, and perceived usefulness of the proposed interaction model. The framework establishes a foundation for future research on inclusive and adaptive AI-based navigation systems in urban environments. Full article
(This article belongs to the Special Issue Human-Computer Interaction and Human-Centered AI)

28 pages, 3181 KB  
Article
FedVI: Financial Cross-Domain Federated Learning with Scarce Overlapping Samples via Visual Representation of Heterogeneous Tabular Data and Meta-Optimization
by Kaiqing Yuan and Jiang Wu
Entropy 2026, 28(6), 637; https://doi.org/10.3390/e28060637 - 4 Jun 2026
Abstract
Federated learning offers a promising approach for cross-institutional financial risk control modeling but encounters two key challenges in practice: feature space heterogeneity and low sample overlap rate. Current federated transfer learning methods often rely heavily on sufficient overlapping samples or explicit feature alignment. However, these approaches frequently result in negative transfer when enforced alignment is applied in highly heterogeneous environments. To address this issue, we propose FedVI, a novel federated transfer learning framework that integrates tabular-to-image conversion and meta-learning mechanisms. Moving beyond conventional methods that rely on sample-level alignment, FedVI employs a federated dual-stream feature alignment strategy to securely reconstruct a unified global feature map across institutions. Subsequently, FedVI integrates federated Image Generator for Tabular Data (IGTD) with tabular Transformer technology to convert one-dimensional tabular data into two-dimensional visual-semantic tensors. These tensors effectively fuse spatial topology and semantic information while embedding an independent Mask channel to explicitly retain the true missingness patterns of features. Finally, FedVI adopts the Model-Agnostic Meta-Learning (MAML) architecture to facilitate global parameter optimization. We evaluated FedVI on the real-world Lending Club credit dataset and Home Credit Default Risk datasets under highly heterogeneous federated settings (i.e., heterogeneous feature spaces across three clients and scarce overlapping samples). The results reveal that FedVI achieves competitive performance against advanced baselines such as FedProx, FedRep, and FedKT, particularly in recall and F1-Score. These findings indicate that FedVI can effectively support cross-domain adaptation under heterogeneous federated learning settings. Full article

39 pages, 3956 KB  
Review
Converging Functional Layers in Bridge Digital Twin Research: A Scientometric Analysis of Intellectual Structures
by Sung-Hoon Kim, Do Young Kim and Sang-Ho Lee
Buildings 2026, 16(11), 2271; https://doi.org/10.3390/buildings16112271 - 4 Jun 2026
Abstract
Bridge maintenance research has increasingly expanded toward Digital Twin (DT), Structural Health Monitoring (SHM), Artificial Intelligence (AI), sensing technologies, and object-based information management. As maintenance paradigms shift from reactive to preventive and prescriptive approaches, digital twins have gained attention as a means of integrating fragmented technological components. However, the growing emphasis on AI- and DT-based analytics raises questions about how object-based information structures, sensing systems, SHM, AI-based analytics, and interoperability mechanisms are thematically connected and structurally associated. This study conducted a scientometric analysis of publications retrieved from the Web of Science (WoS) database without year restrictions. To avoid predetermining the importance of any single information-modeling technology, the main search query excluded BIM-related terms and combined the bridge domain, DT-related technology layer, and maintenance domain. After applying document type, language, and research-area filters, 406 records were screened by title and abstract. Six records that were not directly related to bridge DT maintenance research were excluded, resulting in a final analytical corpus of 400 records. Among these, 77 records were identified as the BIM-related subset for sensitivity analysis. Using VOSviewer-based bibliographic coupling as the core method, supported by keyword co-occurrence, density and overlay visualization, and CiteSpace analysis, this study examined contemporary research structures and historical intellectual bases. The results show that bridge DT development is not detached from existing technological foundations but reflects the cumulative convergence of object-based information modeling, sensing, SHM, AI-based analytics, and interoperability mechanisms within integrated DT architectures. Full article

23 pages, 1962 KB  
Article
Real-Time Water Quality Monitoring System in an Aquaponics Pilot Culture
by Josefina Ortiz-Arreola, Pedro Avila-Pérez, José Luis García-Rivas, Carlos Eduardo Barrera-Díaz, Sonia Martínez-Gallegos, Gabriela Roa-Morales and Ernesto de la Cruz-Reyes
Appl. Sci. 2026, 16(11), 5638; https://doi.org/10.3390/app16115638 - 4 Jun 2026
Abstract
Water-quality monitoring is critical for maintaining the symbiotic balance and productivity of aquaponic systems. This study presents the design, implementation, and evaluation of a remote, real-time monitoring system based on the Internet of Things (IoT) paradigm. The system continuously monitors the key parameters of temperature, pH, electrical conductivity, total dissolved solids, salinity, dissolved oxygen, turbidity, and total suspended solids. Utilizing a modular architecture, the platform provides real-time visualization, cloud-based data management, and automated alerts via SMS and e-mail to notify operators of deviations from established tolerance ranges. The system was experimentally validated over a six-month period in a pilot-scale aquaponics system cultivating common carp (Cyprinus carpio). Statistical analysis demonstrated a 97% data acquisition reliability rate. Furthermore, no statistically significant differences (p > 0.05) were observed between the sensor-based measurements and reference laboratory analyses, confirming the system’s high accuracy. This versatile and cost-effective tool enables data-driven decision-making, facilitates timely interventions to reduce production losses, and ensures the long-term environmental stability of integrated aquaculture systems. Full article
(This article belongs to the Special Issue Innovative Technologies in Ecological Quality Assessment)
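
The alerting behaviour described in the abstract above (notifying operators when a parameter leaves its tolerance range) can be sketched as follows. The abstract does not give the tolerance values, so the ranges in `TOLERANCES` are hypothetical placeholders, and `out_of_range` is an illustrative helper rather than the authors' implementation; delivery by SMS or e-mail is left out.

```python
# Hypothetical tolerance ranges for illustration only; the abstract does not
# state the thresholds used in the pilot system.
TOLERANCES = {
    "temperature_c": (20.0, 28.0),
    "ph": (6.5, 8.5),
    "dissolved_oxygen_mg_l": (5.0, 12.0),
}


def out_of_range(reading, tolerances):
    """Return one alert message per monitored parameter outside its range.

    Parameters missing from `reading` are skipped rather than treated as faults.
    """
    alerts = []
    for name, (low, high) in tolerances.items():
        value = reading.get(name)
        if value is None:
            continue
        if value < low or value > high:
            alerts.append(f"{name}={value} outside [{low}, {high}]")
    return alerts


if __name__ == "__main__":
    sample = {"temperature_c": 24.0, "ph": 9.1}
    for message in out_of_range(sample, TOLERANCES):
        print(message)
```

Keeping the ranges in a plain mapping means thresholds can be adjusted per species or per tank without touching the checking logic.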

20 pages, 3101 KB  
Article
Dual-Stream Wavelet Network for Early Knee Osteoarthritis Grading in IoT-Enabled Smart Clinics
by Lassaad Ben Ammar, Altahir Saad and Ahod Alghuried
Future Internet 2026, 18(6), 304; https://doi.org/10.3390/fi18060304 - 4 Jun 2026
Abstract
Knee Osteoarthritis (KOA) is a leading contributor to global physical disability, where delayed diagnosis often results in irreversible joint damage and socio-economic cost. Early diagnosis remains challenging due to subtle radiographic biomarkers and limited access to specialized expertise, particularly in distributed healthcare settings. Within the evolving landscape of the Future Internet, characterized by Internet of Medical Things (IoMT), edge–cloud computing, and intelligent digital health infrastructures, there is an increasing demand for scalable, low-latency, and explainable AI-driven diagnostic solutions. In this work, we propose a Dual-Stream Wavelet Fusion Network (DS-WFN) alongside a distributed edge-cloud architectural roadmap tailored for deployment in distributed and edge-enabled healthcare ecosystems. The framework integrates a spatial morphological stream with a spectral wavelet stream, augmented by an Adaptive Wavelet Selection Mechanism (AWSM). The AWSM dynamically selects optimal frequency bases (Haar, Symlet, Daubechies) to preserve fine-grained diagnostic features typically lost in conventional CNN architectures. An Adaptive Spatial Alignment (ASA) module further ensures efficient fusion of heterogeneous representations, enabling robust feature integration across computational nodes. Experimental results across a five-fold patient-isolated cross-validation protocol demonstrate that the DS-WFN achieves a mean classification accuracy of 76.3% (95% CI: 71.6–80.8%) and a macro-averaged F1-score of 0.747 (95% CI: 0.697–0.795), consistently outperforming single-stream baselines while preventing patient-level data leakage. Furthermore, Grad-CAM visualizations provide interpretable outputs aligned with clinical diagnostic criteria, supporting trustworthy AI integration into digital healthcare workflows. 
Furthermore, we disclose a methodological framework for edge-based implementation, highlighting how localized inference ensures data sovereignty and real-time clinical support. By combining multiscale signal processing with deep learning under a Future Internet paradigm, this work contributes a scalable, explainable, and edge-ready diagnostic framework for early KOA detection, enabling intelligent, connected, and resource-efficient healthcare services. Full article
(This article belongs to the Special Issue Distributed Intelligence for IoT and Smart Systems)
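
For readers unfamiliar with the wavelet bases named in the abstract above, the following is a minimal sketch of a single-level one-dimensional Haar transform. It illustrates only the Haar basis on a short list of numbers; it is not the paper's spectral stream or its Adaptive Wavelet Selection Mechanism, which operate on radiographs, and the function names are hypothetical.

```python
import math


def haar_dwt(signal):
    """Single-level orthonormal Haar transform of an even-length sequence.

    Returns (approximation, detail): scaled pairwise sums carry the coarse
    trend, scaled pairwise differences carry the fine local changes.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, rebuilding the original samples from both bands."""
    root2 = math.sqrt(2.0)
    samples = []
    for a, d in zip(approx, detail):
        samples.extend([(a + d) / root2, (a - d) / root2])
    return samples


if __name__ == "__main__":
    row = [4.0, 4.0, 2.0, 6.0]
    low, high = haar_dwt(row)
    print(low, high)
    print(haar_idwt(low, high))
```

The detail band is zero wherever neighbouring samples are equal, which is why wavelet coefficients are useful for isolating fine edges and textures.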

28 pages, 26785 KB  
Article
LIVAS-Net: A Parameter-Efficient 3D Architecture for Intracranial Artery Segmentation in TOF-MRA
by Mekhla Sarkar, Prasan Kumar Sahoo and Yen-Chu Huang
Electronics 2026, 15(11), 2450; https://doi.org/10.3390/electronics15112450 - 3 Jun 2026
Abstract
Cerebrovascular diseases, including stroke and intracranial aneurysm, affect millions worldwide and remain a leading cause of mortality and disability. Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) enables non-invasive visualization of intracranial arteries. However, the complex cerebrovascular anatomy, characterized by variable diameters, tortuous trajectories, and intricate branching, renders manual segmentation time-consuming, subjective, and prone to inter-observer variability. While deep learning models achieve strong segmentation performance, existing 3D approaches typically require millions of parameters, limiting deployment in resource-constrained clinical settings. To address this challenge, this paper proposes a Lightweight Intracranial Vascular Segmentation Network (LIVAS-Net), a parameter-efficient 3D encoder–decoder architecture using 3D Ghost convolution modules. It incorporates a novel Vessel Continuity Refinement Branch (VCRB), which aims to correct discontinuities in logit space through per-voxel learnable gating. Two model variants are introduced, LIVAS-Net (129K parameters, 18.3 GFLOPs) and LIVAS-L Net (2.97M parameters, 87.8 GFLOPs), achieving 7.9× and 1.6× fewer FLOPs than the standard 3D U-Net (144.5 GFLOPs), respectively. Evaluation on the multi-center COSTA benchmark shows a DSC of 0.8943 (HD95: 1.97 mm) and 0.9235 (HD95: 0.77 mm) on the ADAM test set, outperforming 3D U-Net (DSC: 0.8762). Cross-center evaluation on three external COSTA datasets yields overall DSCs of 0.7834 and 0.7967 versus 0.6998 for 3D UNet. Further evaluation on the CereVessMRA dataset (N = 271) reveals that LIVAS-Net achieves the highest DSC (0.669), demonstrating promising experimental results warranting future clinical validation in resource-constrained settings. Full article
(This article belongs to the Special Issue Feature Papers in "Computer Science & Engineering", 3rd Edition)
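
Two quantities in the abstract above lend themselves to a short worked example: the Dice similarity coefficient (DSC) and the FLOPs reduction factors. The sketch below uses the standard DSC definition on flattened binary masks and recomputes the ratios from the GFLOPs figures quoted in the abstract; `dice_coefficient` is an illustrative helper, not the authors' evaluation code.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two flattened binary masks (0/1 values).

    DSC = 2 * |pred AND target| / (|pred| + |target|). Two empty masks are
    treated as a perfect match by convention here.
    """
    pred_sum = sum(pred)
    target_sum = sum(target)
    if pred_sum + target_sum == 0:
        return 1.0
    intersection = sum(p * t for p, t in zip(pred, target))
    return 2.0 * intersection / (pred_sum + target_sum)


# GFLOPs values as quoted in the abstract.
UNET_3D_GFLOPS = 144.5
VARIANT_GFLOPS = {"LIVAS-Net": 18.3, "LIVAS-L Net": 87.8}

if __name__ == "__main__":
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))
    for name, gflops in VARIANT_GFLOPS.items():
        print(name, round(UNET_3D_GFLOPS / gflops, 1))
```

The ratios 144.5 / 18.3 and 144.5 / 87.8 round to 7.9 and 1.6, matching the reduction factors stated in the abstract.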

18 pages, 1091 KB  
Article
Aircraft Classification via Dual-Branch Color–Shape Feature Learning and Cross-Attention Fusion
by Xianyun Qian and Peilin Liu
Appl. Sci. 2026, 16(11), 5604; https://doi.org/10.3390/app16115604 - 3 Jun 2026
Abstract
Aircraft type classification plays a crucial role in various applications, including remote sensing, surveillance, and aviation management. Since the development of deep learning techniques, nearly all related methods are based on neural networks, achieving excellent classification results. However, existing classification networks primarily focus on optimizing single-branch architectures, often overlooking the underlying factors driving recognition performance. Our analysis suggests that color and shape are two important and complementary visual cues for aircraft classification, with their relative importance varying across datasets and imaging scenarios. Motivated by this insight, we propose a novel dual-branch network architecture that separately processes shape and color cues, allowing each branch to emphasize one type of visual information before adaptive fusion. Specifically, we designed two dedicated modules: a Shape Feature Module (SFM) and a Color Feature Module (CFM), tailored for extracting shape and color information independently. Furthermore, we introduced a Color–Shape Cross-Attention-based Fusion Module (CSCAFM) to integrate these features. Within CSCAFM, the separated shape and color features are adaptively fused through a cross-attention mechanism, enabling the network to dynamically weigh the contributions of shape and color. Experimental results on benchmark datasets demonstrate the effectiveness of our approach. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

27 pages, 17585 KB  
Article
Comparative Analysis of Glass Façade Systems: Daylight Modulation, Architectural Composition, and Visual Communication
by Alina Lipowicz-Budzyńska
Buildings 2026, 16(11), 2259; https://doi.org/10.3390/buildings16112259 - 3 Jun 2026
Abstract
Contemporary glass façade systems play a crucial role in shaping both the environmental performance and the architectural expression of buildings. This study presents a comparative analysis of selected façade solutions, including internal louvres, adaptive façades, louvre systems combined with glass, and façades incorporating printed graphics. This research is based on in situ measurements of light reduction, digital analysis of enamel coverage, and a multi-criteria evaluation of compositional and communicative aspects. The analysis covers twelve European public buildings and focuses on the relationship between daylight modulation, solar protection, and the visual articulation of façades. The results indicate that façade systems differ significantly in their ability to control light and shape architectural expression. Adaptive façades and louvre-based systems demonstrate high efficiency in daylight modulation, while façade graphics integrated with selective glazing offer a balanced performance, combining effective solar protection with high daylight transmittance. This study highlights the role of façade design as a multi-functional element that integrates environmental performance with compositional and communicative functions. The proposed comparative framework provides a useful tool for evaluating façade strategies in the early stages of architectural design. The findings suggest that façade graphics, when integrated with contemporary glazing systems, provide a balanced solution combining environmental performance with architectural and communicative functions. Full article
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

38 pages, 7245 KB  
Article
A Hybrid Architecture of CNN–Swin-T Integrated with Attention Mechanism and Explainable AI for Alzheimer’s Disease Classification
by Saeed Mohsen, Saada Khadragy, Norah Alnaim, Noorah Albehaijan and Ahmed F. Ibrahim
Computers 2026, 15(6), 361; https://doi.org/10.3390/computers15060361 - 3 Jun 2026
Viewed by 148
Abstract
Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that requires early and accurate diagnosis to improve patient outcomes. In this paper, an attention-enhanced hybrid deep learning (DL) framework is proposed that combines Convolutional Neural Network (CNN) and Swin Transformer (Swin-T) architectures for multi-class Alzheimer’s classification. The proposed model integrates an attention mechanism to enhance feature representation and improve classification performance. Experiments are conducted on a dataset containing three classes: Mild Demented, Very Mild Demented, and Non-Demented. Data augmentation techniques are applied to improve the model’s generalization. Additionally, three explainable artificial intelligence (XAI) techniques, namely Grad-CAM++, Integrated Gradients, and Saliency maps, are employed to interpret the model’s predictions and to provide visual insights into its decision-making processes. The proposed attention-enhanced hybrid CNN–Swin-T model achieves a testing accuracy of 99.92% and reaches 99.71%, 99.73%, and 99.72% for precision, recall, and F1-score, respectively. The hybrid CNN–Swin-T with attention outperforms three other implemented models: baseline CNN, standalone Swin-T, and hybrid CNN–Swin-T without attention. The explainability results validate the proposed model’s focus on relevant regions, increasing trust in automated diagnosis systems. Finally, a comparative analysis with an ablation study is presented to demonstrate that integrating the attention mechanism with a hybrid CNN–Swin-T architecture leads to the highest performance and more reliable predictions compared to the other three models. Full article
(This article belongs to the Section AI-Driven Innovations)

62 pages, 16802 KB  
Review
Infrared Imaging for Autonomous Power Inspection: A Review from Detector to System Integration
by Yingye Guo, Yuxi Du, Run Mao, Yongyin Zhao and Junxiong Guo
Sensors 2026, 26(11), 3552; https://doi.org/10.3390/s26113552 - 3 Jun 2026
Viewed by 248
Abstract
The transition toward smart grids and Industry 4.0 demands a fundamental shift in maintenance strategies, as manual inspection methods are increasingly being supplanted by automated monitoring systems. Among the advanced technologies for smart inspection, infrared imaging offers non-contact operation, intuitive visualization, and predictive capabilities, and has become a cornerstone for autonomous inspection of critical power infrastructure. This review surveys recent advancements in infrared imaging, with a specific focus on automated power system inspection. The discussion starts with an overview of the fundamental principles and system architectures, emphasizing the pivotal role of infrared detectors. A detailed analysis traces the technological evolution from traditional photon detectors to current uncooled microbolometers, and critically assesses emerging low-dimensional materials. The analysis highlights inherent performance trade-offs among sensitivity, operating temperature, and fabrication cost. Subsequently, the review explores advanced signal processing algorithms, such as real-time non-uniformity correction and adaptive noise suppression, which are typically implemented on FPGA platforms. Advanced optical configurations, encompassing computational imaging, lensless designs, and scattering suppression methods, are also discussed, demonstrating how their convergence enhances image fidelity and operational reliability in complex field environments. Representative application paradigms are surveyed, including drone-based transmission line inspections, patrol robots in substations, and fault diagnosis in photovoltaic plants; for each, operational efficacy and economic benefits are assessed. Despite considerable progress, several challenges persist, notably the performance–stability–cost trilemma in novel detector development, the substantial computational demands of end-to-end optimized systems, and a lack of standardization. Finally, the review outlines future research directions, such as high-performance uncooled arrays, AI-driven co-design of optics and algorithms, and the development of standardized, low-cost, intelligent inspection platforms. Full article
(This article belongs to the Section Sensing and Imaging)

49 pages, 2508 KB  
Review
Sensing the Action: Rethinking Sensor Modalities and Multi-Modal Fusion in Vision–Language–Action Models for Robotic Manipulation
by Byoung Chul Ko
Sensors 2026, 26(11), 3541; https://doi.org/10.3390/s26113541 - 3 Jun 2026
Viewed by 105
Abstract
Recent Vision–Language–Action (VLA) models have rapidly emerged as general-purpose robotic policies that integrate language understanding, visual perception, and robot control. However, prior studies and surveys have primarily emphasized backbone architectures, action decoders, training recipes, and benchmark performance, whereas relatively limited systematic attention has been given to sensor modality selection, heterogeneous signal alignment and fusion, and their connection to action generation, all of which are critical to the performance and safety of real-world robotic manipulation. This survey addresses this gap by reinterpreting VLA within the framework of a sensor–fusion–action pipeline. This study first presents a systematic taxonomy of major sensor modalities, including RGB, depth, tactile sensing, force/torque, proprioception and inertial measurement unit, multi-spectral/thermal, and event-based vision, and compares them in terms of the physical information they provide, their characteristic failure modes, and their deployment constraints. This survey further reviews teleoperation-, human video-, and simulation-based data collection pipelines, together with representative dataset configurations, and analyzes the multi-modal design space from a sensor-centric perspective, including early and late fusion, cross-attention, token-level fusion, adapters, mixture of experts, and multi-rate action representations. In addition, this study identifies a strong bias in existing benchmarks toward RGB-centric inputs and single success-rate metrics and emphasizes the need for a multidimensional evaluation framework incorporating robustness, worst-case performance, safety, latency, and efficiency. By shifting the focus away from a model-centric narrative and explicitly accounting for real-world sensor complexity, this survey seeks to establish a sensor-centered foundation for the next generation of Physical AI. Full article
(This article belongs to the Special Issue Feature Review Papers in Sensors and Robotics)