Search Results (1,355)

Search Parameters:
Keywords = visual navigation

18 pages, 2055 KiB  
Article
Language-Driven Cross-Attention for Visible–Infrared Image Fusion Using CLIP
by Xue Wang, Jiatong Wu, Pengfei Zhang and Zhongjun Yu
Sensors 2025, 25(16), 5083; https://doi.org/10.3390/s25165083 - 15 Aug 2025
Abstract
Language-guided multimodal fusion, which integrates information from both visible and infrared images, has shown strong performance in image fusion tasks. In low-light or complex environments, a single modality often fails to fully capture scene features, whereas fused images enable robots to obtain multidimensional scene understanding for navigation, localization, and environmental perception. This capability is particularly important in applications such as autonomous driving, intelligent surveillance, and search-and-rescue operations, where accurate recognition and efficient decision-making are critical. To enhance the effectiveness of multimodal fusion, we propose a text-guided infrared and visible image fusion network. The framework consists of two key components: an image fusion branch, which employs a cross-domain attention mechanism to merge multimodal features, and a text-guided module, which leverages the CLIP model to extract semantic cues from image descriptions containing visible content. These semantic parameters are then used to guide the feature modulation process during fusion. By integrating visual and linguistic information, our framework is capable of generating high-quality color-fused images that not only enhance visual detail but also enrich semantic understanding. On benchmark datasets, our method achieves strong quantitative performance: SF = 2.1381, Qab/f = 0.6329, MI = 14.2305, SD = 0.8527, VIF = 45.1842 on LLVIP, and SF = 1.3149, Qab/f = 0.5863, MI = 13.9676, SD = 94.7203, VIF = 0.7746 on TNO. These results highlight the robustness and scalability of our model, making it a promising solution for real-world multimodal perception applications. Full article
(This article belongs to the Section Sensors and Robotics)
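As a rough illustration of the text-guided fusion idea (not the authors' implementation), the sketch below merges visible and infrared feature maps and modulates them with a text embedding via cross-attention; the tensor shapes and the precomputed CLIP-style text embedding are assumptions.

```python
import torch
import torch.nn as nn

class TextGuidedFusion(nn.Module):
    """Toy text-guided fusion block: visible/infrared features are merged,
    then modulated by a text embedding through cross-attention (sketch only)."""
    def __init__(self, dim=256, text_dim=512, heads=4):
        super().__init__()
        self.merge = nn.Conv2d(2 * dim, dim, kernel_size=1)       # fuse the two modalities
        self.text_proj = nn.Linear(text_dim, dim)                 # project CLIP-style text embedding
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feat_vis, feat_ir, text_emb):
        fused = self.merge(torch.cat([feat_vis, feat_ir], dim=1))  # (B, C, H, W)
        b, c, h, w = fused.shape
        tokens = fused.flatten(2).transpose(1, 2)                  # (B, H*W, C) image tokens as queries
        text = self.text_proj(text_emb).unsqueeze(1)               # (B, 1, C) text token as key/value
        attended, _ = self.attn(tokens, text, text)                # image tokens attend to the text cue
        out = (tokens + attended).transpose(1, 2).reshape(b, c, h, w)
        return out

# Usage with dummy tensors (a real pipeline would use CLIP to produce text_emb).
feat_vis = torch.randn(1, 256, 32, 32)
feat_ir = torch.randn(1, 256, 32, 32)
text_emb = torch.randn(1, 512)
print(TextGuidedFusion()(feat_vis, feat_ir, text_emb).shape)  # torch.Size([1, 256, 32, 32])
```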

21 pages, 9031 KiB  
Article
A Pyramid Convolution-Based Scene Coordinate Regression Network for AR-GIS
by Haobo Xu, Chao Zhu, Yilong Wang, Huachen Zhu and Wei Ma
ISPRS Int. J. Geo-Inf. 2025, 14(8), 311; https://doi.org/10.3390/ijgi14080311 - 15 Aug 2025
Abstract
Camera tracking plays a pivotal role in augmented reality geographic information systems (AR-GIS) and location-based services (LBS), serving as a crucial component for accurate spatial awareness and navigation. Current learning-based camera tracking techniques, while achieving superior accuracy in pose estimation, often overlook changes in scale. This oversight results in less stable localization performance and challenges in coping with dynamic environments. To address these challenges, we propose a pyramid convolution-based scene coordinate regression network (PSN). Our approach leverages a pyramidal convolutional structure, integrating kernels of varying sizes and depths, alongside grouped convolutions that alleviate computational demands while capturing multi-scale features from the input imagery. Subsequently, the network incorporates a novel randomization strategy, effectively diminishing correlated gradients and markedly bolstering the training process’s efficiency. A final regression layer then maps 2D pixel coordinates to their corresponding 3D scene coordinates with high precision. The experimental outcomes show that our proposed method achieves centimeter-level accuracy in small-scale scenes and decimeter-level accuracy in large-scale scenes after only a few minutes of training. It offers a favorable balance between localization accuracy and efficiency, and effectively supports augmented reality visualization in dynamic environments. Full article
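For intuition, here is a minimal PyTorch sketch of a pyramidal convolution block (parallel grouped convolutions with different kernel sizes) feeding a per-pixel 3D scene-coordinate regression head; the channel counts and grouping are assumptions, not the paper's PSN architecture.

```python
import torch
import torch.nn as nn

class PyramidConvBlock(nn.Module):
    """Parallel grouped convolutions with different kernel sizes, concatenated
    to capture multi-scale context (illustrative sketch, not the paper's PSN)."""
    def __init__(self, in_ch=64, branch_ch=32, kernels=(3, 5, 7), groups=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, branch_ch, k, padding=k // 2, groups=g)
            for k, g in zip(kernels, groups)
        ])
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(torch.cat([b(x) for b in self.branches], dim=1))

# Per-pixel regression of 3D scene coordinates from the multi-scale features.
backbone = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
    PyramidConvBlock(64, 32),
    nn.Conv2d(96, 3, 1),  # 3 output channels = (X, Y, Z) scene coordinate per pixel
)
img = torch.randn(1, 3, 120, 160)
print(backbone(img).shape)  # torch.Size([1, 3, 120, 160])
```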

24 pages, 2716 KiB  
Article
Interactive Indoor Audio-Map as a Digital Equivalent of the Tactile Map
by Dariusz Gotlib, Krzysztof Lipka and Hubert Świech
Appl. Sci. 2025, 15(16), 8975; https://doi.org/10.3390/app15168975 - 14 Aug 2025
Abstract
There are still relatively few applications that serve the function of a traditional tactile map, allowing visually impaired individuals to explore a digital map by sliding their fingers across it. Moreover, existing technological solutions either lack a spatial learning mode or provide only limited functionality, focusing primarily on navigating to a selected destination. To address these gaps, the authors have proposed an original concept for an indoor mobile application that enables map exploration by sliding a finger across the smartphone screen, using audio spatial descriptions as the primary medium for conveying information. The spatial descriptions are hierarchical and contextual, each anchored in space with a defined extent of influence. The basis for data management and analysis is GIS technology. The application is designed to support spatial orientation during user interaction with the digital map. The research emphasis was on creating an effective cartographic communication message, utilizing voice-based delivery of spatial information stored in a virtual building model (within a database) and tags placed in real-world buildings. Techniques such as Text-to-Speech, TalkBack, and QR codes were employed to achieve this. Preliminary tests conducted with both blind and sighted people demonstrated the usefulness of the proposed concept. Although designed to support people with disabilities, the proposed solution can also be useful and attractive to all users of navigation applications and may influence how such applications develop. Full article
(This article belongs to the Section Earth Sciences)
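To make the interaction concrete, here is a tiny hypothetical sketch of the core loop such an audio-map might use: map the touched screen position to map coordinates, find the nearest described feature whose radius of influence covers it, and hand its description to a text-to-speech engine. The data model and helper names are illustrative, not from the paper.

```python
from dataclasses import dataclass
from math import hypot

@dataclass
class MapFeature:
    name: str          # e.g. "Main entrance"
    x: float           # map coordinates in metres
    y: float
    radius: float      # extent of influence of the audio description
    description: str   # hierarchical, contextual spatial description

FEATURES = [
    MapFeature("Main entrance", 0.0, 0.0, 2.0, "Main entrance. Corridor to the north."),
    MapFeature("Elevator", 5.0, 1.5, 1.0, "Elevator. Stairs two metres to the right."),
]

def describe_touch(px, py, scale=0.05, origin=(0.0, 0.0)):
    """Convert a touch position (pixels) to map coordinates and return the
    description of the nearest feature whose influence radius covers it."""
    mx, my = origin[0] + px * scale, origin[1] + py * scale
    best = min(FEATURES, key=lambda f: hypot(f.x - mx, f.y - my))
    if hypot(best.x - mx, best.y - my) <= best.radius:
        return best.description        # would be passed to TTS / TalkBack
    return "Open space."               # default cue outside any described feature

print(describe_touch(10, 5))
```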

25 pages, 24334 KiB  
Article
Unsupervised Knowledge Extraction of Distinctive Landmarks from Earth Imagery Using Deep Feature Outliers for Robust UAV Geo-Localization
by Zakhar Ostrovskyi, Oleksander Barmak, Pavlo Radiuk and Iurii Krak
Mach. Learn. Knowl. Extr. 2025, 7(3), 81; https://doi.org/10.3390/make7030081 - 13 Aug 2025
Viewed by 158
Abstract
Vision-based navigation is a common solution for the critical challenge of GPS-denied Unmanned Aerial Vehicle (UAV) operation, but a research gap remains in the autonomous discovery of robust landmarks from aerial survey imagery needed for such systems. In this work, we propose a framework to fill this gap by identifying visually distinctive urban buildings from aerial survey imagery and curating them into a landmark database for GPS-free UAV localization. The proposed framework constructs semantically rich embeddings using intermediate layers from a pre-trained YOLOv11n-seg segmentation network. This novel technique requires no additional training. An unsupervised landmark selection strategy, based on the Isolation Forest algorithm, then identifies objects with statistically unique embeddings. Experimental validation on the VPAIR aerial-to-aerial benchmark shows that the proposed max-pooled embeddings, assembled from selected layers, significantly improve retrieval performance. The top-1 retrieval accuracy for landmarks more than doubled compared to typical buildings (0.53 vs. 0.31), and a Recall@5 of 0.70 is achieved for landmarks. Overall, this study demonstrates that unsupervised outlier selection in a carefully constructed embedding space yields a highly discriminative, computation-friendly set of landmarks suitable for real-time, robust UAV navigation. Full article
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition, 2nd Edition)
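The landmark-selection step can be pictured with the short sketch below: given one embedding vector per building (here random stand-ins for the max-pooled intermediate-layer features the paper describes), scikit-learn's IsolationForest flags the statistically unusual ones as candidate landmarks. The embedding source and contamination rate are assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Stand-in embeddings: in the paper these come from max-pooled intermediate
# layers of a pre-trained segmentation network, one vector per building.
typical = rng.normal(0.0, 1.0, size=(200, 128))   # ordinary buildings
unusual = rng.normal(4.0, 1.0, size=(10, 128))    # visually distinctive ones
embeddings = np.vstack([typical, unusual])

# Unsupervised outlier selection: -1 marks embeddings that are isolated
# quickly by random splits, i.e. candidate landmarks.
forest = IsolationForest(contamination=0.05, random_state=0)
labels = forest.fit_predict(embeddings)

landmark_idx = np.flatnonzero(labels == -1)
print(f"{len(landmark_idx)} candidate landmarks out of {len(embeddings)} buildings")
```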

27 pages, 15885 KiB  
Article
Model-Free UAV Navigation in Unknown Complex Environments Using Vision-Based Reinforcement Learning
by Hao Wu, Wei Wang, Tong Wang and Satoshi Suzuki
Drones 2025, 9(8), 566; https://doi.org/10.3390/drones9080566 - 12 Aug 2025
Viewed by 354
Abstract
Autonomous UAV navigation in unknown and complex environments remains a core challenge, especially under limited sensing and computing resources. While most methods rely on modular pipelines involving mapping, planning, and control, they often suffer from poor real-time performance, limited adaptability, and high dependency on accurate environment models. Moreover, many deep-learning-based solutions either use RGB images prone to visual noise or optimize only a single objective. In contrast, this paper proposes a unified, model-free vision-based DRL framework that directly maps onboard depth images and UAV state information to continuous navigation commands through a single convolutional policy network. This end-to-end architecture eliminates the need for explicit mapping and modular coordination, significantly improving responsiveness and robustness. A novel multi-objective reward function is designed to jointly optimize path efficiency, safety, and energy consumption, enabling adaptive flight behavior in unknown complex environments. The trained policy demonstrates generalization in diverse simulated scenarios and transfers effectively to real-world UAV flights. Experiments show that our approach achieves stable navigation and low latency. Full article
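As a rough illustration of the multi-objective reward idea (not the authors' exact formulation), the function below combines progress toward the goal, a collision/safety penalty, and an energy term with hand-picked weights; all names and coefficients are assumptions.

```python
import numpy as np

def navigation_reward(prev_dist, curr_dist, min_obstacle_dist, action,
                      w_progress=1.0, w_safety=0.5, w_energy=0.05,
                      safe_radius=1.0, collision_radius=0.2):
    """Toy multi-objective reward: path efficiency + safety + energy (sketch only)."""
    progress = prev_dist - curr_dist                       # positive when moving toward the goal
    if min_obstacle_dist < collision_radius:               # hard penalty for (near-)collision
        safety = -10.0
    else:                                                  # soft penalty for flying close to obstacles
        safety = -max(0.0, safe_radius - min_obstacle_dist)
    energy = -float(np.sum(np.square(action)))             # discourage aggressive commands
    return w_progress * progress + w_safety * safety + w_energy * energy

# One step: the UAV got 0.3 m closer, keeps 0.8 m clearance, with a moderate command.
print(navigation_reward(5.0, 4.7, 0.8, np.array([0.4, 0.1, 0.0])))
```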

20 pages, 27328 KiB  
Article
GDVI-Fusion: Enhancing Accuracy with Optimal Geometry Matching and Deep Nearest Neighbor Optimization
by Jincheng Peng, Xiaoli Zhang, Kefei Yuan, Xiafu Peng and Gongliu Yang
Appl. Sci. 2025, 15(16), 8875; https://doi.org/10.3390/app15168875 - 12 Aug 2025
Viewed by 148
Abstract
Visual–inertial odometry (VIO) systems are not robust enough during long-duration operation. In particular, coupled visual–inertial and Global Navigation Satellite System (GNSS) systems are prone to divergence of the position estimate when either visual or GNSS information fails. To address these problems, this paper proposes GDVI-Fusion, a tightly coupled, nonlinearly optimized localization system that fuses RGBD vision, an inertial measurement unit (IMU), and global position measurements, improving the robustness of carrier position estimation and localization accuracy in environments where visual or GNSS information fails. Depth-information preprocessing during initialization is introduced to mitigate the influence of lighting and scene structure on the RGBD camera and to improve the depth accuracy of image feature points, thereby increasing the robustness of the localization system. Feature matches are processed with the K-Nearest-Neighbors (KNN) algorithm: the matched points are used to construct optimal geometric constraints, and matches whose connecting lines have abnormal length or slope are eliminated, which improves both the speed and the accuracy of feature matching and, in turn, the system’s localization accuracy. The proposed lightweight monocular GDVI-Fusion system achieves a 54.2% improvement in operational efficiency and a 37.1% improvement in positioning accuracy compared with the GVINS system. We verified the system’s operational efficiency and positioning accuracy on a public dataset and on a prototype. Full article
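To illustrate the kind of geometric match filtering described here, the numpy sketch below drops feature matches whose connecting lines have an unusual length or slope relative to the other matches; the synthetic points and thresholds are illustrative assumptions, not the paper's procedure.

```python
import numpy as np

def filter_matches(pts1, pts2, len_tol=2.0, slope_tol=0.5):
    """Keep feature matches whose connecting line (pt1 -> pt2) has a length and
    slope close to the median of all matches (illustrative geometric filter)."""
    d = pts2 - pts1
    lengths = np.hypot(d[:, 0], d[:, 1])
    slopes = np.arctan2(d[:, 1], d[:, 0])
    med_len, med_slope = np.median(lengths), np.median(slopes)
    keep = (np.abs(lengths - med_len) < len_tol * np.std(lengths) + 1e-6) & \
           (np.abs(slopes - med_slope) < slope_tol)
    return keep

# Synthetic matches: most points shift by roughly (+10, +2); two are outliers.
rng = np.random.default_rng(1)
pts1 = rng.uniform(0, 100, size=(20, 2))
pts2 = pts1 + np.array([10.0, 2.0]) + rng.normal(0, 0.3, size=(20, 2))
pts2[3] += np.array([40.0, -30.0])   # abnormal length and slope
pts2[7] += np.array([0.0, 25.0])     # abnormal slope
print(filter_matches(pts1, pts2))    # matches 3 and 7 are rejected
```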

11 pages, 608 KiB  
Case Report
Myopia in Beagles in a Family of 12 Individuals
by Juliana Giselbrecht and Barbara Nell
Animals 2025, 15(16), 2342; https://doi.org/10.3390/ani15162342 - 11 Aug 2025
Viewed by 176
Abstract
This case report investigated the cause of visual impairment at night in Beagle dogs in a family of 12 individuals. Four related adult male Beagles with impaired night vision and eight related Beagles (three females, five males) underwent a complete ophthalmological examination at the ophthalmology service. Electroretinography was performed on four dogs with impaired night vision after dark adaptation to evaluate retinal function. Retinoscopy was performed in 12 dogs in a standing or sitting position to assess refraction. Axial globe measurements were conducted using B-scan ultrasonography in nine dogs. In total, twelve adult Beagles (nine males, three females) from four generations were evaluated, with nine dogs showing impaired night vision. Ophthalmic examinations revealed no abnormalities that could explain the visual impairment. Electroretinography showed normal retinal function. In total, 83.3% (10/12) of the dogs were myopic, with refractive errors ranging from −1.25 to −6.25 diopters (D). All dogs with night vision impairment were significantly more myopic (median: −4.88 D) than those without impairment (median: −1.25 D). In two myopic dogs, the insertion of contact lenses improved navigation in the dark maze test. Myopic dogs showed a significantly greater vitreous body depth (10.1 mm; range 9.7–10.3 mm) compared to emmetropic dogs (9.5 mm; range: 9.4–9.6 mm). These findings suggest that in dogs with night vision impairment, retinoscopy should be included in the ophthalmological exam to exclude myopia as a potential cause. Further research is needed to determine the cause of myopia in the tested Beagles and to investigate possible genetic factors. Full article
(This article belongs to the Section Veterinary Clinical Studies)

19 pages, 4425 KiB  
Article
A Multi-Scale Contextual Fusion Residual Network for Underwater Image Enhancement
by Chenye Lu, Li Hong, Yan Fan and Xin Shu
J. Mar. Sci. Eng. 2025, 13(8), 1531; https://doi.org/10.3390/jmse13081531 - 9 Aug 2025
Viewed by 326
Abstract
Underwater image enhancement (UIE) is a key technology in the fields of underwater robot navigation, marine resources development, and ecological environment monitoring. Due to the absorption and scattering of different wavelengths of light in water, the quality of the original underwater images usually deteriorates. In recent years, UIE methods based on deep neural networks have made significant progress, but there still exist some problems, such as insufficient local detail recovery and difficulty in effectively capturing multi-scale contextual information. To solve the above problems, a Multi-Scale Contextual Fusion Residual Network (MCFR-Net) for underwater image enhancement is proposed in this paper. Firstly, we propose an Adaptive Feature Aggregation Enhancement (AFAE) module, which adaptively strengthens the key regions in the input images and improves the feature expression ability by fusing multi-scale convolutional features and a self-attention mechanism. Secondly, we design a Residual Dual Attention Module (RDAM), which captures and strengthens features in key regions through twice self-attention calculation and residual connection, while effectively retaining the original information. Thirdly, a Multi-Scale Feature Fusion Decoding (MFFD) module is designed to obtain rich contexts at multiple scales, improving the model’s understanding of details and global features. We conducted extensive experiments on four datasets, and the results show that MCFR-Net effectively improves the visual quality of underwater images and outperforms many existing methods in both full-reference and no-reference metrics. Compared with the existing methods, the proposed MCFR-Net can not only capture the local details and global contexts more comprehensively, but also show obvious advantages in visual quality and generalization performance. It provides a new technical route and benchmark for subsequent research in the field of underwater vision processing, which has important academic and application values. Full article
(This article belongs to the Section Ocean Engineering)
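As a generic illustration of the residual dual-attention idea (two successive self-attention passes with a residual connection back to the input), the PyTorch sketch below is a plausible stand-in, not the paper's RDAM; the dimensions are assumptions.

```python
import torch
import torch.nn as nn

class ResidualDualAttention(nn.Module):
    """Two successive self-attention passes over spatial tokens with a residual
    connection back to the input features (generic sketch, not the paper's RDAM)."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn1 = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn2 = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):                            # x: (B, C, H, W) feature map
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)             # (B, H*W, C) spatial tokens
        t = self.norm1(t + self.attn1(t, t, t)[0])   # first self-attention + residual
        t = self.norm2(t + self.attn2(t, t, t)[0])   # second self-attention + residual
        out = t.transpose(1, 2).reshape(b, c, h, w)
        return x + out                               # residual connection to the input

print(ResidualDualAttention()(torch.randn(1, 64, 16, 16)).shape)  # (1, 64, 16, 16)
```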

30 pages, 10586 KiB  
Article
Autonomous UAV-Based System for Scalable Tactile Paving Inspection
by Tong Wang, Hao Wu, Abner Asignacion, Zhengran Zhou, Wei Wang and Satoshi Suzuki
Drones 2025, 9(8), 554; https://doi.org/10.3390/drones9080554 - 7 Aug 2025
Viewed by 316
Abstract
Tactile pavings (Tenji Blocks) are prone to wear, obstruction, and improper installation, posing significant safety risks for visually impaired pedestrians. To enable scalable inspection of this infrastructure, we present an autonomous UAV-based system. The system incorporates a lightweight YOLOv8 (You Only Look Once version 8) model for real-time detection using a fisheye camera to maximize field-of-view coverage, which is highly advantageous for low-altitude UAV navigation in complex urban settings. To enable lightweight deployment, a novel Lightweight Shared Detail Enhanced Oriented Bounding Box (LSDE-OBB) head module is proposed. The design rationale of LSDE-OBB leverages the consistent structural patterns of tactile pavements, enabling parameter sharing within the detection head as an effective optimization strategy without significant accuracy compromise. The feature extraction module is further optimized using StarBlock to reduce computational complexity and model size. Integrated Contextual Anchor Attention (CAA) captures long-range spatial dependencies and refines critical feature representations, achieving an optimal speed–precision balance. The framework demonstrates a 25.13% parameter reduction (2.308 M vs. 3.083 M), 46.29% lower GFLOPs, and achieves 11.97% mAP50:95 on tactile paving datasets, enabling real-time edge deployment. Validated through public/custom datasets and actual UAV flights, the system realizes robust tactile paving detection and stable navigation in complex urban environments via hierarchical control algorithms for dynamic trajectory planning and obstacle avoidance, providing an efficient and scalable platform for automated infrastructure inspection. Full article
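The parameter-sharing idea behind the detection head can be sketched roughly as below: one small convolutional head is reused across every feature-pyramid level instead of instantiating a separate head per level. This is a generic illustration, not the LSDE-OBB module; the channel counts and the five-parameter oriented-box output are assumptions.

```python
import torch
import torch.nn as nn

class SharedOBBHead(nn.Module):
    """One detection head reused across all pyramid levels (parameter sharing).
    Generic sketch: predicts 5 oriented-box values (cx, cy, w, h, angle) + 1 class."""
    def __init__(self, in_ch=64, num_outputs=6):
        super().__init__()
        self.head = nn.Sequential(                  # these weights are shared by every level
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(in_ch, num_outputs, 1),
        )

    def forward(self, pyramid):
        return [self.head(f) for f in pyramid]      # same parameters applied to each scale

levels = [torch.randn(1, 64, s, s) for s in (80, 40, 20)]
outs = SharedOBBHead()(levels)
print([tuple(o.shape) for o in outs])
shared = sum(p.numel() for p in SharedOBBHead().parameters())
print(f"parameters in the shared head: {shared} (separate per-level heads would need ~3x)")
```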

22 pages, 6051 KiB  
Article
Research on GNSS Spoofing Detection and Autonomous Positioning Technology for Drones
by Jiawen Zhou, Mei Hu, Chao Zhou, Zongmin Liu and Chao Ma
Electronics 2025, 14(15), 3147; https://doi.org/10.3390/electronics14153147 - 7 Aug 2025
Viewed by 304
Abstract
With the rapid development of the low-altitude economy, the application of drones in both military and civilian fields has become increasingly widespread. The safety and accuracy of their positioning and navigation have become critical factors in ensuring the successful execution of missions. Currently, GNSS spoofing attack techniques are becoming increasingly sophisticated, posing a serious threat to the reliability of drone positioning. This paper proposes a GNSS spoofing detection and autonomous positioning method for drones operating in mission mode, which is based on visual sensors and does not rely on additional hardware devices. First, during the deception detection phase, the ResNet50-SE twin network is used to extract and match real-time aerial images from the drone’s camera with satellite image features obtained via GNSS positioning, thereby identifying positioning anomalies. Second, once deception is detected, during the positioning recovery phase, the system uses the SuperGlue network to match real-time aerial images with satellite image features within a specific area, enabling the drone’s absolute positioning. Finally, experimental validation using open-source datasets demonstrates that the method achieves a GNSS spoofing detection accuracy of 89.5%, with 89.7% of drone absolute positioning errors controlled within 13.9 m. This study provides a comprehensive solution for the safe operation and stable mission execution of drones in complex electromagnetic environments. Full article
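A minimal sketch of how a twin (siamese) network can flag a mismatch between the onboard camera view and the satellite tile retrieved for the GNSS-reported position. Assumptions: an untrained torchvision ResNet-50 backbone without the SE blocks the paper adds, and an arbitrary cosine-similarity threshold.

```python
import torch
import torch.nn as nn
from torchvision import models

class TwinEncoder(nn.Module):
    """Siamese encoder: both images pass through the same backbone; a low
    cosine similarity between embeddings suggests the GNSS position is spoofed."""
    def __init__(self):
        super().__init__()
        backbone = models.resnet50(weights=None)   # paper uses ResNet50-SE; SE blocks omitted here
        backbone.fc = nn.Identity()                # use the 2048-d pooled features directly
        self.backbone = backbone

    def forward(self, img_a, img_b):
        fa, fb = self.backbone(img_a), self.backbone(img_b)
        return nn.functional.cosine_similarity(fa, fb)

aerial = torch.randn(1, 3, 224, 224)      # onboard camera view (dummy)
satellite = torch.randn(1, 3, 224, 224)   # tile fetched for the GNSS-reported position (dummy)
model = TwinEncoder().eval()
with torch.no_grad():
    sim = model(aerial, satellite)
print("possible spoofing" if sim.item() < 0.5 else "position consistent")
```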

19 pages, 19040 KiB  
Article
Multi-Strategy Fusion RRT-Based Algorithm for Optimizing Path Planning in Continuous Cherry Picking
by Yi Zhang, Xinying Miao, Yifei Sun, Zhipeng He, Tianwen Hou, Zhenghan Wang and Qiuyan Wang
Agriculture 2025, 15(15), 1699; https://doi.org/10.3390/agriculture15151699 - 6 Aug 2025
Viewed by 148
Abstract
Automated cherry harvesting presents a significant opportunity to overcome the high costs and inefficiencies of manual labor in modern agriculture. However, robotic harvesting in dense canopies requires sophisticated path planning to navigate cluttered branches and selectively pick target fruits. This paper introduces a complete robotic harvesting solution centered on a novel path-planning algorithm: the Multi-Strategy Integrated RRT for Continuous Harvesting Path (MSI-RRTCHP) algorithm. Our system first employs a machine vision system to identify and locate mature cherries, distinguishing them from unripe fruits, leaves, and branches, which are treated as obstacles. Based on this visual data, the MSI-RRTCHP algorithm generates an optimal picking trajectory. Its core innovation is a synergistic strategy that enables intelligent navigation by combining probability-guided exploration, goal-oriented sampling, and adaptive step size adjustments based on the obstacle’s density. To optimize the picking sequence for multiple targets, we introduce an enhanced traversal algorithm (σ-TSP) that accounts for obstacle interference. Field experiments demonstrate that our integrated system achieved a 90% picking success rate. Compared with established algorithms, the MSI-RRTCHP algorithm reduced the path length by up to 25.47% and the planning time by up to 39.06%. This work provides a practical and efficient framework for robotic cherry harvesting, showcasing a significant step toward intelligent agricultural automation. Full article
(This article belongs to the Section Agricultural Technology)
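To ground the path-planning terminology, here is a compact 2D RRT with goal-biased sampling and a step size that shrinks near obstacles. It is a textbook-style sketch under assumed obstacle and parameter choices, not the MSI-RRTCHP algorithm.

```python
import random
import math

OBSTACLES = [(5.0, 5.0, 1.5)]            # circular obstacles: (x, y, radius)
START, GOAL = (1.0, 1.0), (9.0, 9.0)

def collides(p):
    return any(math.dist(p, (ox, oy)) <= r for ox, oy, r in OBSTACLES)

def clearance(p):
    return min(math.dist(p, (ox, oy)) - r for ox, oy, r in OBSTACLES)

def rrt(goal_bias=0.2, base_step=0.8, max_iter=5000, seed=3):
    random.seed(seed)
    nodes, parent = [START], {0: None}
    for _ in range(max_iter):
        # Goal-biased sampling: sometimes aim straight at the goal.
        sample = GOAL if random.random() < goal_bias else (random.uniform(0, 10), random.uniform(0, 10))
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
        near = nodes[i]
        # Adaptive step size: take smaller steps when close to an obstacle.
        step = base_step * min(1.0, max(0.25, clearance(near)))
        theta = math.atan2(sample[1] - near[1], sample[0] - near[0])
        new = (near[0] + step * math.cos(theta), near[1] + step * math.sin(theta))
        if collides(new):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i
        if math.dist(new, GOAL) < base_step:          # close enough: reconstruct the path
            path, k = [GOAL], len(nodes) - 1
            while k is not None:
                path.append(nodes[k])
                k = parent[k]
            return path[::-1]
    return None

path = rrt()
print(f"path with {len(path)} waypoints" if path else "no path found")
```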

22 pages, 7705 KiB  
Article
Implementation of SLAM-Based Online Mapping and Autonomous Trajectory Execution in Software and Hardware on the Research Platform Nimbulus-e
by Thomas Schmitz, Marcel Mayer, Theo Nonnenmacher and Matthias Schmitz
Sensors 2025, 25(15), 4830; https://doi.org/10.3390/s25154830 - 6 Aug 2025
Viewed by 437
Abstract
This paper presents the design and implementation of a SLAM-based online mapping and autonomous trajectory execution system for the Nimbulus-e, a concept vehicle designed for agile maneuvering in confined spaces. The Nimbulus-e uses individual steer-by-wire corner modules with in-wheel motors at all four corners. The associated eight joint variables serve as control inputs, allowing precise trajectory following. These control inputs can be derived from the vehicle’s trajectory using nonholonomic constraints. A LiDAR sensor is used to map the environment and detect obstacles. The system processes LiDAR data in real time, continuously updating the environment map and enabling localization within the environment. The inclusion of vehicle odometry data significantly reduces computation time and improves accuracy compared to a purely visual approach. The A* and Hybrid A* algorithms are used for trajectory planning and optimization, ensuring smooth vehicle movement. The implementation is validated through both full vehicle simulations using an ADAMS Car—MATLAB co-simulation and a scaled physical prototype, demonstrating the effectiveness of the system in navigating complex environments. This work contributes to the field of autonomous systems by demonstrating the potential of combining advanced sensor technologies with innovative control algorithms to achieve reliable and efficient navigation. Future developments will focus on improving the robustness of the system by implementing a robust closed-loop controller and exploring additional applications in dense urban traffic and agricultural operations. Full article
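For context, here is a compact occupancy-grid A* sketch (the standard algorithm, not the Nimbulus-e implementation); the grid, Manhattan heuristic, and 4-connectivity are assumptions, and Hybrid A* additionally accounts for vehicle kinematics.

```python
import heapq

GRID = [  # 0 = free, 1 = occupied (e.g. cells marked by the LiDAR map)
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

def astar(start, goal):
    """Plain 4-connected A* with a Manhattan-distance heuristic."""
    rows, cols = len(GRID), len(GRID[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(h(start), 0, start, None)]
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, cur, prev = heapq.heappop(open_set)
        if cur in came_from:
            continue
        came_from[cur] = prev
        if cur == goal:                        # reconstruct the path by walking parents back
            path = []
            while cur is not None:
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and GRID[nxt[0]][nxt[1]] == 0:
                if g + 1 < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = g + 1
                    heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, nxt, cur))
    return None

print(astar((0, 0), (4, 4)))
```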

29 pages, 16016 KiB  
Article
An Eye Movement Monitoring Tool: Towards a Non-Invasive Device for Amblyopia Treatment
by Juan Camilo Castro-Rizo, Juan Pablo Moreno-Garzón, Carlos Arturo Narváez Delgado, Nicolas Valencia-Jimenéz, Javier Ferney Castillo García and Alvaro Alexander Ocampo-Gonzalez
Sensors 2025, 25(15), 4823; https://doi.org/10.3390/s25154823 - 6 Aug 2025
Viewed by 388
Abstract
Amblyopia, commonly affecting children aged 0–6 years, results from disrupted visual processing during early development and often leads to reduced visual acuity in one eye. This study presents the development and preliminary usability assessment of a non-invasive ocular monitoring device designed to support oculomotor engagement and therapy adherence in amblyopia management. The system incorporates an interactive maze-navigation task controlled via gaze direction, implemented during monocular and binocular sessions. The device tracks lateral and anteroposterior eye movements and generates visual reports, including displacement metrics and elliptical movement graphs. Usability testing was conducted with a non-probabilistic adult sample (n = 15), including individuals with and without amblyopia. The System Usability Scale (SUS) yielded an average score of 75, indicating good usability. Preliminary tests with two adults diagnosed with amblyopia suggested increased eye displacement during monocular sessions, potentially reflecting enhanced engagement rather than direct therapeutic improvement. This feasibility study demonstrates the device’s potential as a supportive, gaze-controlled platform for visual engagement monitoring in amblyopia rehabilitation. Future clinical studies involving pediatric populations and integration of visual stimuli modulation are recommended to evaluate therapeutic efficacy and adaptability for early intervention. Full article
(This article belongs to the Section Biomedical Sensors)
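For intuition on the reported displacement metrics, the short numpy sketch below computes lateral and anteroposterior displacement ranges, path length, and a covariance-based ellipse (axis lengths and orientation) from a series of gaze samples; the synthetic data and the 2-sigma scaling are illustrative assumptions.

```python
import numpy as np

def movement_report(samples):
    """Displacement ranges plus a 2-sigma covariance ellipse for gaze samples
    given as (lateral, anteroposterior) coordinates (illustrative sketch)."""
    lateral, antero = samples[:, 0], samples[:, 1]
    report = {"lateral_range": float(np.ptp(lateral)),
              "anteroposterior_range": float(np.ptp(antero)),
              "path_length": float(np.sum(np.linalg.norm(np.diff(samples, axis=0), axis=1)))}
    eigvals, eigvecs = np.linalg.eigh(np.cov(samples.T))   # principal movement directions
    report["ellipse_axes"] = (2 * 2 * np.sqrt(eigvals)).round(2).tolist()  # full 2-sigma axes
    report["ellipse_angle_deg"] = float(np.degrees(np.arctan2(eigvecs[1, -1], eigvecs[0, -1])))
    return report

rng = np.random.default_rng(42)
gaze = np.cumsum(rng.normal(0, 0.5, size=(200, 2)), axis=0)   # synthetic gaze trace
print(movement_report(gaze))
```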

26 pages, 18583 KiB  
Article
Transforming Pedagogical Practices and Teacher Identity Through Multimodal (Inter)action Analysis: A Case Study of Novice EFL Teachers in China
by Jing Zhou, Chengfei Li and Yan Cheng
Behav. Sci. 2025, 15(8), 1050; https://doi.org/10.3390/bs15081050 - 3 Aug 2025
Viewed by 398
Abstract
This study investigates the evolving pedagogical strategies and professional identity development of two novice college English teachers in China through a semester-long classroom-based inquiry. Drawing on Norris’s Multimodal (Inter)action Analysis (MIA), it analyzes 270 min of video-recorded lessons across three instructional stages, supported by visual transcripts and pitch-intensity spectrograms. The analysis reveals each teacher’s transformation from textbook-reliant instruction to student-centered pedagogy, facilitated by multimodal strategies such as gaze, vocal pitch, gesture, and head movement. These shifts unfold across the following three evolving identity configurations: compliance, experimentation, and dialogic enactment. Rather than following a linear path, identity development is shown as a negotiated process shaped by institutional demands and classroom interactional realities. By foregrounding the multimodal enactment of self in a non-Western educational context, this study offers insights into how novice EFL teachers navigate tensions between traditional discourse norms and reform-driven pedagogical expectations, contributing to broader understandings of identity formation in global higher education. Full article

20 pages, 19537 KiB  
Article
Submarine Topography Classification Using ConDenseNet with Label Smoothing Regularization
by Jingyan Zhang, Kongwen Zhang and Jiangtao Liu
Remote Sens. 2025, 17(15), 2686; https://doi.org/10.3390/rs17152686 - 3 Aug 2025
Viewed by 315
Abstract
The classification of submarine topography and geomorphology is essential for marine resource exploitation and ocean engineering, with wide-ranging implications in marine geology, disaster assessment, resource exploration, and autonomous underwater navigation. Submarine landscapes are highly complex and diverse. Traditional visual interpretation methods are not only inefficient and subjective but also lack the precision required for high-accuracy classification. While many machine learning and deep learning models have achieved promising results in image classification, limited work has been performed on integrating backscatter and bathymetric data for multi-source processing. Existing approaches often suffer from high computational costs and excessive hyperparameter demands. In this study, we propose a novel approach that integrates pruning-enhanced ConDenseNet with label smoothing regularization to reduce misclassification, strengthen the cross-entropy loss function, and significantly lower model complexity. Our method improves classification accuracy by 2% to 10%, reduces the number of hyperparameters by 50% to 96%, and cuts computation time by 50% to 85.5% compared to state-of-the-art models, including AlexNet, VGG, ResNet, and Vision Transformer. These results demonstrate the effectiveness and efficiency of our model for multi-source submarine topography classification. Full article
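Label smoothing itself is easy to illustrate: PyTorch's built-in cross-entropy supports it directly, spreading a small amount of the target probability mass over the non-target classes so the network is not pushed toward over-confident predictions. The class count and smoothing factor below are arbitrary examples, not the paper's settings.

```python
import torch
import torch.nn as nn

num_classes, smoothing = 6, 0.1       # e.g. six seabed-topography classes (illustrative)
logits = torch.randn(4, num_classes)  # a batch of 4 predictions
targets = torch.tensor([0, 2, 5, 3])

hard_ce = nn.CrossEntropyLoss()(logits, targets)
smooth_ce = nn.CrossEntropyLoss(label_smoothing=smoothing)(logits, targets)

# Equivalent view: the smoothed target distribution mixes the one-hot labels
# with a uniform distribution over all classes.
one_hot = torch.nn.functional.one_hot(targets, num_classes).float()
smoothed = one_hot * (1 - smoothing) + smoothing / num_classes
manual = -(smoothed * torch.log_softmax(logits, dim=1)).sum(dim=1).mean()

print(hard_ce.item(), smooth_ce.item(), manual.item())  # smooth_ce equals the manual value
```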
