Search Results (194)

Search Parameters:
Keywords = decoding difficulties

24 pages, 4080 KB  
Article
An Unsupervised Situation Awareness Framework for UAV Sensor Data Fusion Enabled by a Stabilized Deep Variational Autoencoder
by Anxin Guo, Zhenxing Zhang, Rennong Yang, Ying Zhang, Liping Hu and Leyan Li
Sensors 2026, 26(1), 111; https://doi.org/10.3390/s26010111 - 24 Dec 2025
Viewed by 201
Abstract
Effective situation awareness relies on the robust processing of high-dimensional data streams generated by onboard sensors. However, the application of deep generative models to extract features from complex UAV sensor data (e.g., GPS, IMU, and radar feeds) faces two fundamental challenges: critical training instability and the difficulty of representing multi-modal distributions inherent in dynamic flight maneuvers. To address these challenges, this paper proposes a novel unsupervised sensor data processing framework. Our core innovation is a deep generative model, VAE-WRBM-MDN, specifically engineered for stable feature extraction from non-linear time-series sensor data. We demonstrate that while standard Variational Autoencoders (VAEs) often struggle to converge on this task, our introduction of Weighted-uncertainty Restricted Boltzmann Machines (WRBM) for layer-wise pre-training ensures stable learning. Furthermore, the integration of a Mixture Density Network (MDN) enables the decoder to accurately reconstruct the complex, multi-modal conditional distributions of sensor readings. Comparative experiments validate our approach, achieving 95.69% classification accuracy in identifying situational patterns. The results confirm that our framework provides robust enabling technology for real-time intelligent sensing and raw data interpretation in autonomous systems. Full article
(This article belongs to the Section Intelligent Sensors)
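
A minimal sketch of the mixture-density decoder head idea described above, assuming a PyTorch implementation; the latent size, sensor dimension, and component count are illustrative placeholders, not the paper's settings:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MDNHead(nn.Module):
    """Mixture Density Network head: maps a latent code to a Gaussian mixture
    over sensor readings, so multi-modal conditional distributions can be
    represented by the decoder."""
    def __init__(self, latent_dim=32, out_dim=8, n_components=5):
        super().__init__()
        self.n_components, self.out_dim = n_components, out_dim
        self.pi = nn.Linear(latent_dim, n_components)                    # mixture weights
        self.mu = nn.Linear(latent_dim, n_components * out_dim)          # component means
        self.log_sigma = nn.Linear(latent_dim, n_components * out_dim)   # log std devs

    def forward(self, z):
        pi_log = F.log_softmax(self.pi(z), dim=-1)
        mu = self.mu(z).view(-1, self.n_components, self.out_dim)
        sigma = self.log_sigma(z).view(-1, self.n_components, self.out_dim).exp()
        return pi_log, mu, sigma

def mdn_nll(pi_log, mu, sigma, target):
    """Negative log-likelihood of target sensor vectors under the mixture."""
    dist = torch.distributions.Normal(mu, sigma)
    # sum log-probs over the sensor dimension, then log-sum-exp over components
    log_prob = dist.log_prob(target.unsqueeze(1)).sum(-1) + pi_log
    return -torch.logsumexp(log_prob, dim=1).mean()
```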

34 pages, 2428 KB  
Article
An In-Depth Investigation of Eye Movement Profile of Dyslexic Readers Using a Standardized Text-Reading Aloud Task in French
by Antonin Rossier-Bisaillon, Julie Robidoux, Brigitte Stanké and Boutheina Jemel
Behav. Sci. 2026, 16(1), 18; https://doi.org/10.3390/bs16010018 - 21 Dec 2025
Viewed by 194
Abstract
(1) Background: Most eye-movement studies in dyslexia focus on silent reading in controlled laboratory settings. Yet, oral reading of standardized texts remains central for identifying this disorder. By combining eye-tracking with oral reading, we captured both fixation dynamics and eye–voice span (EVS) measures, offering a richer view of the processes underlying dyslexia. (2) Methods: We tested 10 adults with dyslexia and 14 controls as they read aloud an unpredictable diagnostic text in French. Analyses examined psycholinguistic effects of word length and lexical frequency on fixation probabilities, counts, and durations, alongside EVS measures. (3) Results: Compared to controls, adults with dyslexia read more slowly, made more errors, and showed atypical fixation patterns: persistent word length effects, reduced frequency effects, and diminished, unstable EVS. (4) Conclusions: Together, eye-movement and EVS findings converge on a key mechanism: adults with dyslexia continue to rely heavily on sublexical decoding. This reliance creates a processing bottleneck in oral reading, where difficulties in rapid word identification cascade into sounding-out behavior and disrupted eye–voice coordination. Full article
(This article belongs to the Special Issue Understanding Dyslexia and Developmental Language Disorders)

16 pages, 1945 KB  
Article
Error-Guided Multimodal Sample Selection with Hallucination Suppression for LVLMs
by Huanyu Cheng, Linjiang Shang, Xikang Chen, Tao Feng and Yin Zhang
Computers 2025, 14(12), 564; https://doi.org/10.3390/computers14120564 - 17 Dec 2025
Viewed by 231
Abstract
Building high-quality multimodal instruction datasets is often time-consuming and costly. Recent studies have shown that a small amount of carefully selected high-quality data can be more effective for improving LVLM performance than large volumes of low-quality data. Based on these observations, we propose an error-guided multimodal sample selection framework with hallucination suppression for LVLM fine-tuning. First, semantic embeddings of queries are clustered to form balanced subsets that preserve task diversity. A visual contrastive decoding module is then used to reduce hallucinations and expose genuinely difficult examples. For closed-ended tasks, such as object detection, we estimate sample value using prediction accuracy; for open-ended question answering, we use the perplexity of generated responses as a difficulty signal. Within each cluster, high-error or high-perplexity samples are preferentially selected to construct a compact yet informative training set. Experiments on the InsPLAD detection benchmark and the PowerQA visual question answering dataset show that our method consistently outperforms random sampling under the same data budget, achieving higher F1, cosine similarity, BLEU (Bilingual Evaluation Understudy), and GPT-4o-based evaluation scores. This demonstrates that hallucination-aware, uncertainty-driven data selection can improve LVLM robustness and data efficiency. Full article
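
A rough sketch of the perplexity signal used for open-ended samples, shown here with a text-only Hugging Face causal LM for simplicity; the model name, cluster, and budget variables are hypothetical stand-ins, not the authors' pipeline:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def response_perplexity(model, tokenizer, prompt, response):
    """Perplexity of a generated response given its prompt; within a cluster,
    higher-perplexity samples are treated as harder and are kept first."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + response, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100   # score only the response tokens
    with torch.no_grad():
        loss = model(full_ids, labels=labels).loss   # mean NLL over scored tokens
    return torch.exp(loss).item()

# illustrative usage (model name, cluster, and budget are placeholders):
# tokenizer = AutoTokenizer.from_pretrained("gpt2")
# model = AutoModelForCausalLM.from_pretrained("gpt2")
# scores = [response_perplexity(model, tokenizer, q, a) for q, a in cluster]
# order = sorted(range(len(cluster)), key=lambda i: -scores[i])
# selected = [cluster[i] for i in order[:budget]]
```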

22 pages, 5552 KB  
Article
MSA-UNet: Multiscale Feature Aggregation with Attentive Skip Connections for Precise Building Extraction
by Guobiao Yao, Yan Chen, Wenxiao Sun, Zeyu Zhang, Yifei Tang and Jingxue Bi
ISPRS Int. J. Geo-Inf. 2025, 14(12), 497; https://doi.org/10.3390/ijgi14120497 - 17 Dec 2025
Viewed by 223
Abstract
An accurate and reliable extraction of building structures from high-resolution (HR) remote sensing images is an important research topic in 3D cartography and smart city construction. However, despite the strong overall performance of recent deep learning models, limitations remain in handling significant variations in building scales and complex architectural forms, which may lead to inaccurate boundaries or difficulties in extracting small or irregular structures. Therefore, the present study proposes MSA-UNet, a reliable semantic segmentation framework that leverages multiscale feature aggregation and attentive skip connections for an accurate extraction of building footprints. This framework is constructed based on the U-Net architecture, incorporating VGG16 as a replacement for the original encoder structure, which enhances its ability to capture low-discriminative features. To further improve the representation of image buildings with different scales and shapes, a serial coarse-to-fine feature aggregation mechanism was used. Additionally, a novel skip connection was built between the encoder and decoder layers to enable adaptive weights. Furthermore, a dual-attention mechanism, implemented through the convolutional block attention module, was integrated to enhance the focus of the network on building regions. Extensive experiments conducted on the WHU and Inria building datasets validated the effectiveness of MSA-UNet. On the WHU dataset, the model demonstrated a state-of-the-art performance with a mean Intersection over Union (mIoU) of 94.26%, accuracy of 98.32%, F1-score of 96.57%, and mean Pixel accuracy (mPA) of 96.85%, corresponding to gains of 1.41% in mIoU over the baseline U-Net. On the more challenging Inria dataset, MSA-UNet achieved an mIoU of 85.92%, indicating a consistent improvement of up to 1.9% over the baseline U-Net. These results confirmed that MSA-UNet markedly improved the accuracy and boundary integrity of building extraction from HR data, outperforming existing classic models in terms of segmentation quality and robustness. Full article
(This article belongs to the Special Issue Spatial Data Science and Knowledge Discovery)
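
A simplified PyTorch sketch of a CBAM-style block re-weighting an encoder feature map before it enters a skip connection, in the spirit of the dual-attention mechanism described above (channel and reduction sizes are illustrative):

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention followed by
    spatial attention, used here to refine encoder features before they are
    concatenated with decoder features."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # channel attention from average- and max-pooled descriptors
        avg = self.channel_mlp(x.mean(dim=(2, 3)))
        mx = self.channel_mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # spatial attention from channel-wise average and max maps
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial_conv(s))

# attentive skip connection: refine encoder features, then fuse with the decoder
# skip = CBAM(256)(encoder_feat); fused = torch.cat([decoder_feat, skip], dim=1)
```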

24 pages, 4961 KB  
Article
U-PKAN: A Dual-Module Kolmogorov–Arnold Network for Agricultural Plant Disease Detection
by Dejun Xi, Baotong Zhang and Yi-Jia Wang
Agriculture 2025, 15(24), 2599; https://doi.org/10.3390/agriculture15242599 - 16 Dec 2025
Viewed by 258
Abstract
Crop diseases and pests have a significant impact on planting costs and crop yields and, in severe cases, can threaten food security and farmers’ incomes. Currently, most researchers employ various deep learning methods, such as the YOLO series algorithms and U-Net and its variants, for the detection of agricultural plant diseases. However, the existing algorithms suffer from insufficient interpretability and are limited to linear modeling, which can lead to issues such as trust crises in current technologies, restricted applications and difficulties in tracing and correcting errors. To address these issues, a dual-module Kolmogorov–Arnold Network (U-PKAN) is proposed for agricultural plant disease detection in this paper. A KAN encoder–decoder structure is adopted to construct the network. To ensure the network fully extracts features, two different modules, namely Patchembed-KAN (P-KAN) and Decoder-KAN (D-KAN), are designed. To enhance the network’s feature fusion capability, a KAN-based symmetrical structure for skip connections is designed. The proposed method places learnable activation functions on weights, enabling it to achieve higher accuracy with fewer parameters. Moreover, it can reveal the compositional structure and variable dependencies of synthetic datasets through symbolic formulas, thus exhibiting excellent interpretability. A field corn disease image dataset was collected and constructed. Additionally, the performance of the U-PKAN model was verified using the open plant disease dataset PlantDoc and a gear pitting dataset. To better understand the performance differences between different methods, U-PKAN was compared with U-KAN, U-Net, AttUNet, and U-Net++ models for performance benchmarking. IoU and the Dice coefficient were chosen as evaluation metrics. The experimental results demonstrate that the proposed method achieves faster convergence and higher segmentation accuracy. Overall, the proposed method demonstrates outstanding performance in aspects such as function approximation, global perception, interpretability and computational efficiency. Full article
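
A deliberately simplified sketch of the core KAN idea of placing learnable univariate functions on the edges, using a fixed grid of Gaussian basis functions with learnable coefficients rather than the B-splines of full KAN implementations; this is an illustration only, not the paper's P-KAN or D-KAN modules:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleKANLayer(nn.Module):
    """Toy KAN-style layer: every edge (input i -> output j) carries its own
    learnable univariate function, built from a fixed grid of Gaussian basis
    functions plus a SiLU residual, instead of a single scalar weight."""
    def __init__(self, in_dim, out_dim, n_basis=8):
        super().__init__()
        self.register_buffer("centers", torch.linspace(-2.0, 2.0, n_basis))
        self.coef = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, n_basis))
        self.base = nn.Parameter(0.1 * torch.randn(out_dim, in_dim))

    def forward(self, x):                                     # x: (batch, in_dim)
        # Gaussian basis response of each input coordinate: (batch, in_dim, n_basis)
        phi = torch.exp(-(x.unsqueeze(-1) - self.centers) ** 2)
        spline = torch.einsum("bin,oin->bo", phi, self.coef)  # learnable edge functions
        residual = F.silu(x) @ self.base.t()                  # smooth residual term
        return spline + residual
```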

25 pages, 1790 KB  
Article
Writing with Decoding and Spelling Difficulties—A Qualitative Perspective
by Yvonne Knospe, Nina Vandermeulen, Maria Levlin, Christian Waldmann and Eva Lindgren
Educ. Sci. 2025, 15(12), 1637; https://doi.org/10.3390/educsci15121637 - 5 Dec 2025
Viewed by 386
Abstract
Writers with decoding and/or spelling difficulties often produce short, lower-quality texts and experience less fluent writing, frequently interrupted by long pauses at the word level. Research suggests that from adolescence onward, such writers become increasingly aware of their difficulties, which influences behaviours such as avoiding difficult-to-spell words and pausing for lexical decisions. The objective of this study was to deepen the understanding of how adolescent students with decoding and spelling difficulties engage in the task of text composition. In this multiple case study, we qualitatively investigated argumentative texts and writing processes produced by three Swedish upper-secondary students with such difficulties. Data were collected through keystroke logging, and analyses of texts and keystroke logs provided detailed insights into their individual writing approaches. The results generally align with previous findings but reveal notable differences depending on the severity of the difficulties. Two students with moderate challenges paused extensively to consider spelling, formulation, and word choice, while one student with more pronounced difficulties wrote rapidly and briefly to complete the task quickly. This nuanced analysis highlights the diversity of writing profiles among students with decoding and spelling difficulties and underscores the need for tailored support to help them produce higher-quality texts. Full article
(This article belongs to the Special Issue Students with Special Educational Needs in Reading and Writing)

29 pages, 43944 KB  
Article
GPRNet: A Geometric Prior-Refined Semantic Segmentation Network for Land Use and Land Cover Mapping
by Zhuozheng Li, Zhennan Xu, Runliang Xia, Jiahao Sun, Ruihui Mu, Liang Chen, Daofang Liu and Xin Li
Remote Sens. 2025, 17(23), 3856; https://doi.org/10.3390/rs17233856 - 28 Nov 2025
Viewed by 385
Abstract
Semantic segmentation of high-resolution remote sensing images remains a challenging task due to the intricate spatial structures, scale variability, and semantic ambiguity among ground objects. Moreover, the reliable delineation of fine-grained boundaries continues to impose difficulties on existing CNN- and transformer-based models, particularly in heterogeneous urban and rural environments. In this study, we propose GPRNet, a novel geometry-aware segmentation framework that leverages geometric priors and cross-stage semantic alignment for more precise land-cover classification. Central to our approach is the Geometric Prior-Refined Block (GPRB), which learns directional derivative filters, initialized with Sobel-like operators, to generate edge-aware strength and orientation maps that explicitly encode structural cues. These maps are used to guide structure-aware attention modulation, enabling refined spatial localization. Additionally, we introduce the Mutual Calibrated Fusion Module (MCFM) to mitigate the semantic gap between encoder and decoder features by incorporating cross-stage geometric alignment and semantic enhancement mechanisms. Extensive experiments conducted on the ISPRS Potsdam and LoveDA datasets validate the effectiveness of the proposed method, with GPRNet achieving improvements of up to 1.7% mIoU on Potsdam and 1.3% mIoU on LoveDA over strong recent baselines. Furthermore, the model maintains competitive inference efficiency, suggesting a favorable balance between accuracy and computational cost. These results demonstrate the promising potential of geometric-prior integration and mutual calibration in advancing semantic segmentation in complex environments. Full article
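
A small sketch of Sobel-initialized, learnable directional-derivative filters producing edge-strength and orientation maps, in the spirit of the Geometric Prior-Refined Block described above (the depthwise layout and epsilon are illustrative choices):

```python
import torch
import torch.nn as nn

class EdgePrior(nn.Module):
    """Learnable directional-derivative filters, initialized with Sobel kernels,
    producing per-pixel edge-strength and orientation maps from a feature map."""
    def __init__(self, channels):
        super().__init__()
        sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        sobel_y = sobel_x.t().contiguous()
        # one depthwise x- and y-derivative filter per channel
        self.dx = nn.Conv2d(channels, channels, 3, padding=1, groups=channels, bias=False)
        self.dy = nn.Conv2d(channels, channels, 3, padding=1, groups=channels, bias=False)
        self.dx.weight.data.copy_(sobel_x.expand(channels, 1, 3, 3))
        self.dy.weight.data.copy_(sobel_y.expand(channels, 1, 3, 3))

    def forward(self, feat):
        gx, gy = self.dx(feat), self.dy(feat)
        strength = torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)   # edge-strength map
        orientation = torch.atan2(gy, gx)                 # edge-orientation map
        return strength, orientation
```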

20 pages, 2776 KB  
Article
AgriFusion: Multiscale RGB–NIR Fusion for Semantic Segmentation in Airborne Agricultural Imagery
by Xuechen Li, Lang Qiao and Ce Yang
AgriEngineering 2025, 7(11), 388; https://doi.org/10.3390/agriengineering7110388 - 15 Nov 2025
Cited by 1 | Viewed by 894
Abstract
The rapid development of unmanned aerial vehicles (UAVs) and deep learning has accelerated the application of semantic segmentation in precision agriculture (SSPA). A key driver of this progress lies in multimodal fusion, which leverages complementary structural, spectral, and physiological information to enhance the representation of complex agricultural scenes. Despite advancements, the efficacy of multimodal fusion in SSPA is limited by modality heterogeneity and the difficulty of simultaneously retaining fine details and capturing global context. To address these challenges, we propose AgriFusion, a dual-encoder framework based on convolutional and transformer architectures for SSPA tasks. Specifically, convolutional and transformer encoders are first used to extract crop-related local structural details and global contextual features from multimodal inputs. Then, an attention-based fusion module adaptively integrates these complementary features in a modality-aware manner. Finally, an MLP-based decoder aggregates multi-scale representations to generate accurate segmentation results efficiently. Experiments conducted on the Agriculture-Vision dataset demonstrate that AgriFusion achieves a mean Intersection over Union (mIoU) of 49.31%, Pixel Accuracy (PA) of 81.72%, and F1 score of 67.85%, outperforming competitive baselines including SegFormer, DeepLab, and AAFormer. Ablation studies further reveal that unimodal or shallow fusion strategies suffer from limited discriminative capacity, whereas AgriFusion adaptively integrates complementary multimodal features and balances fine-grained local detail with global contextual information, yielding consistent improvements in identifying planting anomalies and crop stresses. These findings validate our central claims that modality-aware spectral fusion and balanced multi-scale representation are critical to advancing agricultural semantic segmentation, and establish AgriFusion as a principled framework for enhancing remote sensing-based monitoring with practical implications for sustainable crop management and precision farming. Full article
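
A compact sketch of per-pixel gated fusion of RGB and NIR feature maps, illustrating the modality-aware fusion idea above; this is not the authors' exact module, and the channel sizes are placeholders:

```python
import torch
import torch.nn as nn

class ModalityGatedFusion(nn.Module):
    """Fuse RGB and NIR feature maps with a learned per-pixel gate so the
    network can weight whichever modality is more informative locally."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid())
        self.project = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, rgb_feat, nir_feat):
        g = self.gate(torch.cat([rgb_feat, nir_feat], dim=1))  # per-pixel weights
        fused = g * rgb_feat + (1.0 - g) * nir_feat            # convex combination
        return self.project(fused)
```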

19 pages, 4748 KB  
Article
MPCFN: A Multilevel Predictive Cross-Fusion Network for Multimodal Named Entity Recognition in Social Media
by Qinjun Qiu, Bo Tan, Yukuan Zhou, Wenjing Chen, Miao Tian and Liufeng Tao
Appl. Sci. 2025, 15(22), 11855; https://doi.org/10.3390/app152211855 - 7 Nov 2025
Viewed by 461
Abstract
The goal of the Multimodal Named Entity Recognition (MNER) task is to identify and classify named entities by combining various data modalities (such as text and images) and assigning them to specified categories. The growing prevalence of multimodal social media posts has spurred heightened interest in MNER, particularly due to its pivotal role in applications ranging from intention comprehension to personalized user recommendations. The MNER task currently faces two main difficulties: inconsistency between image and text information, and the difficulty of fully exploiting image information to complement the text. To solve these problems, this study proposes a Multilevel Predictive Cross-Fusion Network (MPCFN) approach for Multimodal Named Entity Recognition. First, textual features are extracted using BERT and visual features using ResNet; irrelevant information in the image is then filtered using the Correlation Prediction Gate. Second, the hierarchy of visual features received by each Transformer block is controlled by the Dynamic Gate, and the Cross-Fusion Module aligns the image and text features. Finally, the hidden-layer representation is fed into the CRF decoding layer, whose optimization uses Flooding. Through experiments on the TWITTER-2015, TWITTER-2017, and WuKong datasets, our method achieves F1 scores of 76.74%, 87.61%, and 82.35%, respectively, outperforming existing mainstream state-of-the-art models and demonstrating its effectiveness and superiority. Full article
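
If the "Flooding" used to optimize the CRF layer refers to the flooding regularization trick (keeping the training loss above a fixed flood level), a brief sketch might look like this; the flood level and the commented CRF call are hypothetical:

```python
import torch

def flooding_loss(loss: torch.Tensor, flood_level: float = 0.1) -> torch.Tensor:
    """Flooding keeps the training loss from dropping below a set level: once
    loss < b, gradients push it back up, discouraging over-fitting to noisy
    NER training labels."""
    return (loss - flood_level).abs() + flood_level

# illustrative usage inside a training step (CRF call is a placeholder):
# raw_loss = -crf(emissions, tags, mask=mask)   # CRF negative log-likelihood
# flooding_loss(raw_loss, 0.1).backward()
```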

25 pages, 1392 KB  
Article
Theoretical Foundation and Validation of the Record of Decision-Making (RODM)
by Emily M. Rodgers and Jerome V. D’Agostino
Educ. Sci. 2025, 15(11), 1483; https://doi.org/10.3390/educsci15111483 - 4 Nov 2025
Viewed by 568
Abstract
This study presents the development and validation of the Record of Decision-Making (RODM), a formative assessment designed to measure beginning readers’ use of phonic elements to decode unknown words while reading. Grounded in overlapping wave theory and theories of early reading development, the RODM captures adaptive strategy use during oral reading, including rereading and subword analysis. Using multifaceted Rasch modeling, the authors demonstrate that RODM scores align with a unidimensional reading proficiency scale and reflect predictable patterns of strategy use across proficiency levels. Findings indicate that as reading proficiency increases, students employ a broader range of phonic elements and shift from basic strategies (e.g., initial letter use) to more sophisticated ones (e.g., medial and final letter use). Additionally, proficient readers exhibit greater self-correction and reduced reliance on rereading. Generalizability analysis yielded strong interrater reliability and accuracy with minimal training, suggesting its practical utility for frequent classroom use. Implications for instruction include the need to teach flexible, efficient decoding strategies that adapt to task difficulty. Future research should explore score consistency with educators in classroom settings and instructional impact. Full article
(This article belongs to the Special Issue Advances in Evidence-Based Literacy Instructional Practices)

18 pages, 16806 KB  
Article
Refined Extraction of Sugarcane Planting Areas in Guangxi Using an Improved U-Net Model
by Tao Yue, Zijun Ling, Yuebiao Tang, Jingjin Huang, Hongteng Fang, Siyuan Ma, Jie Tang, Yun Chen and Hong Huang
Drones 2025, 9(11), 754; https://doi.org/10.3390/drones9110754 - 30 Oct 2025
Viewed by 427
Abstract
Sugarcane, a vital economic crop and renewable energy source, requires precise monitoring of the area in which it has been planted to ensure sugar industry security, optimize agricultural resource allocation, and allow the assessment of ecological benefits. Guangxi Zhuang Autonomous Region, leveraging its subtropical climate and abundant solar thermal resources, accounts for over 63% of China’s total sugarcane cultivation area. In this study, we constructed an enhanced RCAU-net model and developed a refined extraction framework that considers different growth stages to enable rapid identification of sugarcane planting areas. This study addresses key challenges in remote-sensing-based sugarcane extraction, namely, the difficulty of distinguishing spectrally similar objects, significant background interference, and insufficient multi-scale feature fusion. To significantly enhance the accuracy and robustness of sugarcane identification, an improved RCAU-net model based on the U-net architecture was designed. The model incorporates three key improvements: it replaces the original encoder with ResNet50 residual modules to enhance discrimination of similar crops; it integrates a Convolutional Block Attention Module (CBAM) to focus on critical features and effectively suppress background interference; and it employs an Atrous Spatial Pyramid Pooling (ASPP) module to bridge the encoder and decoder, thereby optimizing the extraction of multi-scale contextual information. A refined extraction framework that accounts for different growth stages was ultimately constructed to achieve rapid identification of sugarcane planting areas in Guangxi. The experimental results demonstrate that the RCAU-net model performed excellently, achieving an Overall Accuracy (OA) of 97.19%, a Mean Intersection over Union (mIoU) of 94.47%, a Precision of 97.31%, and an F1 Score of 97.16%. These results represent significant improvements of 7.20, 10.02, 6.82, and 7.28 percentage points in OA, mIoU, Precision, and F1 Score, respectively, relative to the original U-net. The model also achieved a Kappa coefficient of 0.9419 and a Recall rate of 96.99%. The incorporation of residual structures significantly reduced the misclassification of similar crops, while the CBAM and ASPP modules minimized holes within large continuous patches and false extractions of small patches, resulting in smoother boundaries for the extracted areas. This work provides reliable data support for the accurate calculation of sugarcane planting area and greatly enhances the decision-making value of remote sensing monitoring in modern agricultural management of sugarcane. Full article
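
A minimal PyTorch sketch of an Atrous Spatial Pyramid Pooling (ASPP) bridge of the kind described above; the dilation rates and channel widths are illustrative, not the RCAU-net configuration:

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling: parallel dilated convolutions capture
    multi-scale context at the encoder-decoder bottleneck."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
            for r in rates])
        self.merge = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        # concatenate all dilation branches, then project back to out_ch channels
        return self.merge(torch.cat([b(x) for b in self.branches], dim=1))
```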

26 pages, 7247 KB  
Article
DyslexiaNet: Examining the Viability and Efficacy of Eye Movement-Based Deep Learning for Dyslexia Detection
by Ramis İleri, Çiğdem Gülüzar Altıntop, Fatma Latifoğlu and Esra Demirci
J. Eye Mov. Res. 2025, 18(5), 56; https://doi.org/10.3390/jemr18050056 - 15 Oct 2025
Viewed by 688
Abstract
Dyslexia is a neurodevelopmental disorder that impairs reading, affecting 5–17.5% of children and representing the most common learning disability. Individuals with dyslexia experience decoding, reading fluency, and comprehension difficulties, hindering vocabulary development and learning. Early and accurate identification is essential for targeted interventions. Traditional diagnostic methods rely on behavioral assessments and neuropsychological tests, which can be time-consuming and subjective. Recent studies suggest that physiological signals, such as electrooculography (EOG), can provide objective insights into reading-related cognitive and visual processes. Despite this potential, there is limited research on how typeface and font characteristics influence reading performance in dyslexic children using EOG measurements. To address this gap, we investigated the most suitable typefaces for Turkish-speaking children with dyslexia by analyzing EOG signals recorded during reading tasks. We developed a novel deep learning framework, DyslexiaNet, using scalogram images from horizontal and vertical EOG channels, and compared it with AlexNet, MobileNet, and ResNet. Reading performance indicators, including reading time, blink rate, regression rate, and EOG signal energy, were evaluated across multiple typefaces and font sizes. Results showed that typeface significantly affects reading efficiency in dyslexic children. The BonvenoCF font was associated with shorter reading times, fewer regressions, and lower cognitive load. DyslexiaNet achieved the highest classification accuracy (99.96% for horizontal channels) while requiring lower computational load than other networks. These findings demonstrate that EOG-based physiological measurements combined with deep learning offer a non-invasive, objective approach for dyslexia detection and personalized typeface selection. This method can provide practical guidance for designing educational materials and support clinicians in early diagnosis and individualized intervention strategies for children with dyslexia. Full article
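
A short sketch of turning one EOG channel into a scalogram image with a continuous wavelet transform, assuming the PyWavelets library; the sampling rate, wavelet, and scale range are illustrative, not the study's acquisition settings:

```python
import numpy as np
import pywt

def eog_scalogram(signal: np.ndarray, fs: float = 250.0, n_scales: int = 64):
    """Continuous wavelet transform of a 1-D EOG trace; the magnitude of the
    coefficients forms the scalogram image fed to the CNN."""
    scales = np.arange(1, n_scales + 1)
    coeffs, _freqs = pywt.cwt(signal, scales, "morl", sampling_period=1.0 / fs)
    return np.abs(coeffs)            # shape: (n_scales, len(signal))

# illustrative usage with a synthetic trace:
# sig = np.sin(2 * np.pi * 2.0 * np.arange(0, 4, 1 / 250.0))
# img = eog_scalogram(sig)           # resize/save as a scalogram image
```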

32 pages, 16554 KB  
Article
A Multi-Task Fusion Model Combining Mixture-of-Experts and Mamba for Facial Beauty Prediction
by Junying Gan, Zhenxin Zhuang, Hantian Chen, Wenchao Xu, Zhen Chen and Huicong Li
Symmetry 2025, 17(10), 1600; https://doi.org/10.3390/sym17101600 - 26 Sep 2025
Viewed by 1908
Abstract
Facial beauty prediction (FBP) is a cutting-edge task in deep learning that aims to equip machines with the ability to assess facial attractiveness in a human-like manner. In human perception, facial beauty is strongly associated with facial symmetry, where balanced structures often reflect aesthetic appeal. Leveraging symmetry provides an interpretable prior for FBP and offers geometric constraints that enhance feature learning. However, existing multi-task FBP models still face challenges such as limited annotated data, insufficient frequency–temporal modeling, and feature conflicts from task heterogeneity. The Mamba model excels in feature extraction and long-range dependency modeling but encounters difficulties in parameter sharing and computational efficiency in multi-task settings. In contrast, mixture-of-experts (MoE) enables adaptive expert selection, reducing redundancy while enhancing task specialization. This paper proposes MoMamba, a multi-task decoder combining Mamba’s state-space modeling with MoE’s dynamic routing to improve multi-scale feature fusion and adaptability. A detail enhancement module fuses high- and low-frequency components from discrete cosine transform with temporal features from Mamba, and a state-aware MoE module incorporates low-rank expert modeling and task-specific decoding. Experiments on SCUT-FBP and SCUT-FBP5500 demonstrate superior performance in both classification and regression, particularly in symmetry-related perception modeling. Full article
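
A bare-bones sketch of top-k mixture-of-experts routing, illustrating the adaptive expert selection described above; the expert count, width, and routing loop are illustrative and omit the paper's state-aware and low-rank details:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Route each token to its top-k experts and combine the expert outputs
    with renormalized gate probabilities."""
    def __init__(self, dim=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 2 * dim), nn.GELU(), nn.Linear(2 * dim, dim))
            for _ in range(n_experts))

    def forward(self, x):                               # x: (tokens, dim)
        scores = self.gate(x)                           # (tokens, n_experts)
        top_vals, top_idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(top_vals, dim=-1)           # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e            # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```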

27 pages, 13123 KB  
Article
Symmetric Boundary-Enhanced U-Net with Mamba Architecture for Glomerular Segmentation in Renal Pathological Images
by Shengnan Zhang, Xinming Cui, Guangkun Ma and Ronghui Tian
Symmetry 2025, 17(9), 1506; https://doi.org/10.3390/sym17091506 - 10 Sep 2025
Cited by 1 | Viewed by 3945
Abstract
Accurate glomerular segmentation in renal pathological images is a key challenge for chronic kidney disease diagnosis and assessment. Due to the high visual similarity between pathological glomeruli and surrounding tissues in color, texture, and morphology, significant “camouflage phenomena” exist, leading to boundary identification difficulties. To address this problem, we propose BM-UNet, a novel segmentation framework that embeds boundary guidance mechanisms into a Mamba architecture with a symmetric encoder–decoder design. The framework enhances feature transmission through explicit boundary detection, incorporating four core modules designed for key challenges in pathological image segmentation. The Multi-scale Adaptive Fusion (MAF) module processes irregular tissue morphology, the Hybrid Boundary Detection (HBD) module handles boundary feature extraction, the Boundary-guided Attention (BGA) module achieves boundary-aware feature refinement, and the Mamba-based Fused Decoder Block (MFDB) completes boundary-preserving reconstruction. By introducing explicit boundary supervision mechanisms, the framework achieves significant segmentation accuracy improvements while maintaining linear computational complexity. Validation on the KPIs2024 glomerular dataset and HuBMAP renal tissue samples demonstrates that BM-UNet achieves a 92.4–95.3% mean Intersection over Union across different CKD pathological conditions, with a 4.57% improvement over the Mamba baseline and a processing speed of 113.7 FPS. Full article
(This article belongs to the Section Computer)
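
A simplified sketch of boundary-guided attention, where a predicted soft boundary map modulates the features before decoding; this illustrates the idea only and is not the BGA module's actual design:

```python
import torch
import torch.nn as nn

class BoundaryGuidedAttention(nn.Module):
    """Predict a soft boundary map from the incoming features, then use it to
    emphasize responses near tissue boundaries before decoding."""
    def __init__(self, channels):
        super().__init__()
        self.boundary_head = nn.Conv2d(channels, 1, kernel_size=3, padding=1)
        self.refine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feat):
        boundary = torch.sigmoid(self.boundary_head(feat))  # (B, 1, H, W) soft edges
        attended = feat * (1.0 + boundary)                   # boost boundary pixels
        # the boundary map can also receive direct edge supervision
        return self.refine(attended), boundary
```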

14 pages, 2024 KB  
Article
Field Robotics Education Through Educational Escape Rooms—A Design Study
by Robert Ross and Matthew Felicetti
Big Data Cogn. Comput. 2025, 9(9), 233; https://doi.org/10.3390/bdcc9090233 - 8 Sep 2025
Viewed by 850
Abstract
One challenge faced by many educators is strongly engaging students to improve their intrinsic motivation in learning. This paper describes the design and beta testing of two educational escape rooms targeted towards teaching students concepts related to field robotics—an area in which educational escape rooms have yet to be used. These table-top activities are designed to strongly engage students with robotics-centric puzzles, a fun narrative, and collaborative problem-solving, with validation provided by an electronic decoder box. The sets of puzzles were beta-tested by teams of academics with a robotics background and by undergraduate students. The results indicate that participants had a high level of enjoyment and intrinsic motivation to partake in the activities, although the difficulty and in-game dynamics of some of the tasks will need to be modified for widespread deployment in the classroom. Full article
(This article belongs to the Special Issue Field Robotics and Artificial Intelligence (AI))
