Search Results (125)

Search Parameters:
Keywords = visual word recognition

15 pages, 667 KB  
Article
Speech-to-Sign Gesture Translation for Kazakh: Dataset and Sign Gesture Translation System
by Akdaulet Mnuarbek, Akbayan Bekarystankyzy, Mussa Turdalyuly, Dina Oralbekova and Alibek Dyussemkhanov
Computers 2026, 15(3), 188; https://doi.org/10.3390/computers15030188 - 15 Mar 2026
Viewed by 320
Abstract
This paper presents the first prototype of a speech-to-sign language translation system for Kazakh Sign Language (KRSL). The proposed pipeline integrates the NVIDIA FastConformer model for automatic speech recognition (ASR) in the Kazakh language and addresses the challenges of sign language translation in a low-resource setting. Unlike American or British Sign Languages, KRSL lacks publicly available datasets and established translation systems. The pipeline follows a multi-stage process: speech input is converted into text via ASR, segmented into phrases, matched with corresponding gestures, and visualized as sign language. System performance is evaluated using word error rate (WER) for ASR and accuracy metrics for speech-to-sign translation. This study also introduces the first KRSL dataset, consisting of 1200 manually recreated signs, including 95% static images and 5% dynamic gesture videos. To improve robustness under resource-constrained conditions, a Weighted Hybrid Similarity Score (WHSS)-based gesture matching method is proposed. Experimental results show that the FastConformer model achieves an average WER of 10.55%, with 7.8% for isolated words and 13.3% for full sentences. At the phrase level, the system achieves 92.1% accuracy for unigrams, 84.6% for bigrams, and 78.3% for trigrams. The complete pipeline reaches 85% accuracy for individual words and 70% for sentences, with an average latency of 310 ms. These results demonstrate the feasibility and effectiveness of the proposed system for supporting people with hearing and speech impairments in Kazakhstan.
(This article belongs to the Special Issue Machine Learning: Innovation, Implementation, and Impact)
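The reported ASR results hinge on word error rate (WER): the word-level edit distance between a reference transcript and the ASR hypothesis, normalized by reference length. As a point of reference only (not the authors' code), a minimal WER implementation might look like this:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(substitution, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

An average WER of 10.55% therefore means roughly one word in ten is substituted, inserted, or deleted relative to the reference transcript.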

19 pages, 764 KB  
Article
FeOCR: Domain-Adaptive Chinese OCR with Visual Character Disambiguation and LLM-Based Correction for Metallurgical Documents
by Qiang Zheng, Yaxuan Sun, Lin Wang, Haoning Zhang, Fanjie Meng and Minghui Li
Electronics 2026, 15(6), 1144; https://doi.org/10.3390/electronics15061144 - 10 Mar 2026
Viewed by 294
Abstract
High-quality text corpora are essential for knowledge graph construction and domain-specific large model pre-training in technology-intensive industries, with the steel metallurgy sector serving as a representative case. However, many industrial documents remain in scanned or PDF formats, where general-purpose Optical Character Recognition (OCR) systems exhibit systematic errors when recognizing Chinese metallurgical documents. In particular, visually similar Chinese characters that differ by only minor strokes are frequently confused, leading to severe degradation of text reliability and cascading errors in downstream knowledge extraction. This paper proposes FeOCR, a general-purpose domain-adaptive framework for machine-printed Chinese characters, which is specifically evaluated within the context of the steel metallurgy industry. The framework integrates visual character disambiguation with context-aware semantic correction. We first construct a metallurgy-specific OCR dataset emphasizing high-frequency confusable Chinese word pairs and enhance data diversity through font perturbation and noise synthesis. Parameter-efficient fine-tuning (LoRA) is then applied to adapt a general OCR model to domain-specific visual patterns. Furthermore, a Large Language Model-based correction module performs semantic refinement of residual errors under domain lexical constraints. Experiments demonstrate significant reductions in character and word error rates, especially for confusable technical terms, providing a reliable foundation for industrial Chinese document digitization.
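The LoRA step described here is a standard parameter-efficient fine-tuning recipe. A generic sketch using the Hugging Face peft library follows; the base checkpoint and target module names are placeholders, since the paper's actual OCR backbone is not identified in this abstract:

```python
from peft import LoraConfig, get_peft_model
from transformers import VisionEncoderDecoderModel

# Placeholder base model, standing in for the general OCR model in the paper.
base = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")

config = LoraConfig(
    r=8,                                  # low-rank adapter dimension
    lora_alpha=16,                        # adapter scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the small LoRA adapters train
```

Freezing the backbone and training only the low-rank adapters is what makes domain adaptation feasible on a modest metallurgy-specific dataset.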

13 pages, 1494 KB  
Article
Development and Clinical Validation of an Artificial Intelligence-Based Automated Visual Acuity Testing System
by Kelvin Zhenghao Li, Hnin Hnin Oo, Kenneth Chee Wei Liang, Najah Ismail, Jasmine Ling Ling Chua, Jackson Jie Sheng Chng, Yang Wu, Daryl Wei Ren Wong, Sumaya Rani Khan, Boon Peng Yap, Rong Tong, Choon Meng Kiew, Yufei Huang, Chun Hau Chua, Alva Khai Shin Lim and Xiuyi Fan
Life 2026, 16(2), 357; https://doi.org/10.3390/life16020357 - 20 Feb 2026
Viewed by 646
Abstract
Background: To develop and validate an automated visual acuity (VA) testing system integrating artificial intelligence (AI)–driven speech and image recognition technologies, enabling self-administered, clinic-based VA assessment; Methods: The system incorporated a fine-tuned Whisper speech-recognition model with Silero voice activity detection and pose estimation through facial landmark and ArUco marker detection. A state-driven interface guided users through sequential testing with and without a pinhole. Speech recognition was enhanced using a local Singaporean English dataset. Laboratory validation assessed speech and pose recognition performance, while clinical validation compared automated and manual VA testing at a tertiary eye clinic; Results: The fine-tuned model reduced word error rates from 17.83% to 9.81% for letters and 2.76% to 1.97% for numbers. Pose detection accurately identified valid occluder states. Among 72 participants (144 eyes), automated unaided VA showed good agreement with manual VA (ICC = 0.77, 95% CI 0.62–0.85), while pinhole VA demonstrated moderate agreement (ICC = 0.63, 95% CI 0.25–0.83). Automated testing took longer (132.1 ± 47.5 s vs. 97.1 ± 47.8 s; p < 0.001), but user experience remained positive (mean Likert scale score 4.3 ± 0.8); Conclusions: The AI-based automated VA system delivered accurate, reliable, and user-friendly performance, supporting its feasibility for clinical implementation.
(This article belongs to the Section Biochemistry, Biophysics and Computational Biology)
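The speech front end pairs a Whisper model with Silero voice activity detection. A minimal sketch of that pairing using the open-source openai-whisper and silero-vad packages is shown below; the "base" checkpoint and file name are placeholders, since the authors' fine-tuned Singaporean-English model is not public here:

```python
import torch
import whisper  # pip install openai-whisper

# Silero VAD via torch.hub; utils bundles helper functions.
vad_model, utils = torch.hub.load("snakers4/silero-vad", "silero_vad")
get_speech_timestamps, _, read_audio, *_ = utils

audio = read_audio("response.wav", sampling_rate=16000)
speech = get_speech_timestamps(audio, vad_model, sampling_rate=16000)

# Transcribe only when the VAD detects speech, so silence between
# letter-naming attempts is not sent to the recognizer.
if speech:
    asr = whisper.load_model("base")  # placeholder for the fine-tuned model
    print(asr.transcribe("response.wav")["text"])
```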

35 pages, 1268 KB  
Article
Examining Local Residents’ Awareness of Tourism: The Case of Bingöl Province, Türkiye
by Zeki Gürbüz and Semra Çamuka
Tour. Hosp. 2026, 7(2), 46; https://doi.org/10.3390/tourhosp7020046 - 13 Feb 2026
Viewed by 839
Abstract
The awareness and participation of local communities play a critical role in the sustainable development of tourism. This study aims to examine, in depth and systematically, the levels of awareness among the local population regarding the natural, cultural, and gastronomic assets of Bingöl Province. The study includes structured questions, photo-based applications, word cloud analysis, and an interactive puzzle technique (word search puzzle) to measure participants' levels of recognition and awareness of touristic values. The data were collected through face-to-face structured interviews conducted with 25 participants. The findings reveal that while participants exhibited high levels of awareness regarding well-known values such as the Floating Islands, Natural Monument, Haserek Ski Center, Bingöl Honey, Kös Thermal Springs, and the 33 Martyrs' Monument, their awareness was comparatively limited for lesser known or newly registered values. The correct identification rates of the photographs indicate which local assets are more familiar to the public and which require further promotion efforts. The puzzle technique used within the study enabled participants to learn about touristic and cultural assets in a more effective, interactive, and engaging manner, thereby enhancing their visual and cognitive awareness. Overall, the use of visual materials and interactive methods in the promotion of Bingöl's touristic and cultural assets is expected to increase public awareness and contribute to the preservation of local cultural heritage. This approach also provides valuable insights for the development of regional tourism strategies and sustainable tourism planning.

17 pages, 373 KB  
Article
Exploring the Character Transposition Effect and Locus in Chinese Word Recognition: Evidence from Left–Right Visual Field Processing in Primary School Children
by Yi Song, Yuhan Jiang, Yuru Cheng, Lei Zhang and Jingxin Wang
Behav. Sci. 2026, 16(2), 251; https://doi.org/10.3390/bs16020251 - 9 Feb 2026
Viewed by 290
Abstract
Prior research has offered substantial evidence for the letter transposition effect in word reading, yet studies in logographic languages such as Chinese are scarce and have largely focused on adults. This study aimed to determine whether second-grade children show a character transposition effect in recognizing two-character Chinese words and to examine potential differences between the left and right visual fields corresponding to the two cerebral hemispheres. A lexical decision task was used across two experiments. Experiment 1 tested 56 second graders and manipulated three stimulus types—normal words, transposed pseudo-words, and substituted pseudo-words—to verify the presence of the effect. Experiment 2 recruited an independent sample of 97 second graders and applied a lateralized presentation paradigm, presenting stimuli to either the right or left visual field (RVF/LVF), which project to the left and right hemispheres (LH/RH), respectively, to assess hemispheric differences. Experiment 1 revealed a significant character transposition effect among second-grade children. Experiment 2 showed no significant differences in the magnitude of the effect between the two visual fields. These findings provide new developmental evidence for Chinese word reading and important implications for theories of position encoding. Future studies should trace the effect's developmental trajectory across a wider age range and diverse learning contexts.
(This article belongs to the Section Cognition)
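The two pseudo-word conditions differ only in how a base word is altered. An illustrative sketch (the example characters are hypothetical, not the study's stimuli):

```python
def transpose(word: str, i: int = 0) -> str:
    """Swap adjacent characters at positions i and i+1 (e.g., '朋友' -> '友朋')."""
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def substitute(word: str, i: int, replacement: str) -> str:
    """Replace the character at position i with an unrelated character."""
    chars = list(word)
    chars[i] = replacement
    return "".join(chars)
```

A transposition effect appears when transposed pseudo-words are harder to reject than substituted ones, implying that character position is coded flexibly.
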
22 pages, 2447 KB  
Article
Word-Level Motion Learning for Contactless QWERTY Typing with a Single Camera
by Sung-Sic Yoo and Heung-Shik Lee
Sensors 2026, 26(4), 1087; https://doi.org/10.3390/s26041087 - 7 Feb 2026
Viewed by 312
Abstract
Contactless text entry is increasingly important in immersive and constrained computing environments, yet most vision-based approaches rely on character-level recognition or key localization, which are fragile under monocular sensing. This study investigates the feasibility of recognizing natural QWERTY typing motions directly at the word level using only a single RGB camera, under a fixed single-user and single-camera configuration. We propose a word-level contactless typing framework that models each word as a distinctive spatiotemporal finger motion pattern derived from hand joint trajectories. Typing motions are temporally segmented, and direction-aware finger displacements are accumulated to construct compact motion representations that are relatively insensitive to absolute hand position and typing duration within the evaluated setup. Each word is represented by multiple motion prototypes that are incrementally updated through online learning with a trial-delayed adaptation protocol. Experiments with vocabularies of up to 200 words show that the proposed approach progressively learns and recalls word-level motion patterns through repeated interaction, achieving stable recognition performance within the tested configuration at realistic typing speeds. Additional evaluations demonstrate that learned motion representations can transfer from physical keyboards to flat-surface typing within the same experimental setting, even when tactile feedback and visual layout cues are reduced. These results support the feasibility of reframing contactless typing as a word-level motion recall problem, and suggest its potential role as a complementary component to character-centric camera-based input methods under constrained monocular sensing.
(This article belongs to the Topic AI Sensors and Transducers)
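The core representation accumulates direction-aware finger displacements from joint trajectories. One illustrative reading of that description in NumPy (a sketch, not the authors' exact formulation):

```python
import numpy as np

def motion_descriptor(traj: np.ndarray) -> np.ndarray:
    """traj: (T, J, 2) positions of J hand joints over T video frames.
    Accumulates positive and negative displacement separately per joint
    and axis, so the result ignores absolute hand position; normalizing
    reduces sensitivity to typing duration."""
    diffs = np.diff(traj, axis=0)              # per-frame displacement (T-1, J, 2)
    pos = np.clip(diffs, 0, None).sum(axis=0)  # accumulated motion in + direction
    neg = np.clip(diffs, None, 0).sum(axis=0)  # accumulated motion in - direction
    desc = np.concatenate([pos, neg], axis=-1).ravel()
    norm = np.linalg.norm(desc)
    return desc / norm if norm > 0 else desc
```

Word prototypes could then be compared by cosine similarity of these descriptors and updated incrementally as new typing samples arrive.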

20 pages, 2482 KB  
Article
Compression-Efficient Feature Extraction Method for a CMOS Image Sensor
by Keiichiro Kuroda, Yu Osuka, Ryoya Iegaki, Ryuichi Ujiie, Hideki Shima, Kota Yoshida and Shunsuke Okura
Sensors 2026, 26(3), 962; https://doi.org/10.3390/s26030962 - 2 Feb 2026
Viewed by 383
Abstract
To address the power constraints of the emerging Internet of Things (IoT) era, we propose a compression-efficient feature extraction method for a CMOS image sensor that can extract binary feature data. This sensor outputs six-channel binary feature data, comprising three channels of binarized luminance signals and three channels of horizontal edge signals, compressed via a run length encoding (RLE) method. This approach significantly reduces data transmission volume while maintaining image recognition accuracy. The simulation results obtained using a YOLOv7-based model designed for edge GPUs demonstrate that our approach achieves a large object recognition accuracy (APL50) of 60.7% on the COCO dataset while reducing the data size by 99.2% relative to conventional 8-bit RGB color images. Furthermore, the image classification results using MobileNetV3 tailored for mobile devices on the Visual Wake Words (VWW) dataset show that our approach reduces data size by 99.0% relative to conventional 8-bit RGB color images and achieves an image classification accuracy of 89.4%. These results are superior to the conventional trade-off between recognition accuracy and data size, thereby enabling the realization of low-power image recognition systems.
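Run-length encoding pays off here because binarized luminance and edge channels consist mostly of long runs of identical bits. A minimal sketch of RLE on one binary row (illustrative, not the sensor's on-chip encoder):

```python
import numpy as np

def rle_encode(bits: np.ndarray) -> list[tuple[int, int]]:
    """Encode a binary row as [(value, run_length), ...]."""
    runs, start = [], 0
    for i in range(1, len(bits) + 1):
        if i == len(bits) or bits[i] != bits[start]:
            runs.append((int(bits[start]), i - start))
            start = i
    return runs

row = np.array([0] * 120 + [1] * 8 + [0] * 128)
print(rle_encode(row))  # [(0, 120), (1, 8), (0, 128)] -- 3 runs for 256 pixels
```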

14 pages, 1691 KB  
Article
Phonological Neighborhood Density and Type Modulate Visual Recognition of Mandarin Chinese: Evidence from Monosyllabic Words
by Zhongyan Jiao, Xianhui Zhou and Wenjun Chen
Brain Sci. 2025, 15(12), 1304; https://doi.org/10.3390/brainsci15121304 - 2 Dec 2025
Viewed by 550
Abstract
Background: Examining the influence of phonological neighborhoods on the early stages of visual word recognition provides insights into the architecture and dynamics of lexical representation and processing. Methods: Using event-related potentials (ERPs), this investigation explored how phonological neighborhood density (PND; large vs. small) and type (PNT; tone-edit vs. constituent-edit neighbors) influence the recognition of monosyllabic words in Mandarin Chinese. Participants engaged in a priming paradigm combined with a visual lexical decision task. Results: Behavioral data demonstrated the main effect of PNT: words with tone-edit neighbors produced greater processing inhibition compared to those with constituent-edit neighbors. ERP results revealed that large PND enhanced the P200 amplitude, a frontal-mediated effect that was particularly pronounced for tone-edit neighbors. This early differentiation subsequently propelled a stronger N400 response to tone-edit neighbors, culminating in a significant interaction between PND and PNT during the N400 window. Conclusions: These findings support a cascaded competition model: early PND assessment (P200), enhanced for tone neighbors, amplifies their later N400 conflict. This neural mechanism elucidates the hierarchical organization of phonological processing in Chinese monosyllabic words, thereby clarifying a core component which underpins the recognition of more complex words in Mandarin.
(This article belongs to the Section Neurolinguistics)
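The tone-edit versus constituent-edit distinction can be made concrete with tone-numbered pinyin syllables. A simplified classifier (the numeric tone coding and single-substitution rule are assumptions for illustration; the study's actual neighbor definitions may be broader):

```python
def neighbor_type(a: str, b: str) -> str:
    """Classify two tone-numbered pinyin syllables (e.g., 'ma1' vs. 'ma3')."""
    seg_a, tone_a = a[:-1], a[-1]
    seg_b, tone_b = b[:-1], b[-1]
    if seg_a == seg_b and tone_a != tone_b:
        return "tone-edit"          # same segments, different tone
    if (tone_a == tone_b and len(seg_a) == len(seg_b)
            and sum(x != y for x, y in zip(seg_a, seg_b)) == 1):
        return "constituent-edit"   # one segmental substitution, same tone
    return "other"

print(neighbor_type("ma1", "ma3"))  # tone-edit
print(neighbor_type("ma1", "pa1"))  # constituent-edit
```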

15 pages, 255 KB  
Article
The First Shall Be First: Letter-Position Coding and Spatial Invariance in Two Cases of Attentional Dyslexia
by Jeremy J. Tree and David R. Playfoot
Brain Sci. 2025, 15(9), 967; https://doi.org/10.3390/brainsci15090967 - 6 Sep 2025
Viewed by 916
Abstract
Background/Objectives: Previous research has demonstrated that the initial letters of a word likely play a privileged role in visual word recognition, such that reading and visual recognition errors reflecting changes in this position are much less likely. For example, prior case studies of attentional dyslexia reported that participants were most accurate at rejecting nonwords formed by transposing a word's first two letters (e.g., WONER from OWNER) compared to transpositions in later positions. The current study aimed to replicate and extend this finding in patients with posterior cortical atrophy (PCA), a neurodegenerative condition associated with visuospatial and attentional impairments. Methods: Two PCA patients completed lexical decision tasks involving five-letter real words and nonwords created either by transposing adjacent letters (in positions 1 + 2, 2 + 3, 3 + 4, or 4 + 5) or using matched nonword controls. To assess robustness, tasks were repeated across test–retest sessions. Stimuli were presented in both canonical horizontal and non-canonical vertical (marquee) formats. Accuracy, response bias, and sensitivity (d′) were estimated, with 95% confidence intervals derived from a nonparametric bootstrap procedure. Within-case logistic regressions were also conducted to illustrate the findings. Results: Both patients showed significantly higher accuracy and lower response bias for 1 + 2 transposition nonwords relative to other positions. This early-letter advantage persisted across test–retest observations and was maintained when words were presented in the vertical format, suggesting orientation-invariant effects. The bootstrap and regression analyses provided convergent support for these results. Conclusions: The findings provide novel evidence in PCA that the encoding of early letter positions operates independently of visual orientation and persists despite attentional deficits. This supports models in which the initial letters serve as a key anchor point in orthographic processing, highlighting the privileged and resilient status of early letter encoding in visual word recognition.
(This article belongs to the Special Issue Language Dysfunction in Posterior Cortical Atrophy)
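The d′-with-bootstrap analysis follows standard signal detection practice: d′ = z(hit rate) - z(false-alarm rate), with confidence intervals obtained by resampling trials. An illustrative reconstruction of that style of analysis (not the authors' code; the log-linear 0.5 correction is one common choice):

```python
import numpy as np
from scipy.stats import norm

def d_prime(hits, misses, fas, crs):
    """Sensitivity: z(hit rate) - z(false-alarm rate)."""
    return norm.ppf(hits / (hits + misses)) - norm.ppf(fas / (fas + crs))

def bootstrap_dprime_ci(word_correct, nonword_correct, n_boot=10_000, seed=0):
    """95% CI for d' by resampling trial-level 0/1 accuracy arrays."""
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        w = rng.choice(word_correct, size=len(word_correct), replace=True)
        n = rng.choice(nonword_correct, size=len(nonword_correct), replace=True)
        # hits = correct "word" responses; false alarms = "word" responses
        # to nonwords; +0.5 avoids infinite z-scores at ceiling/floor
        stats.append(d_prime(w.sum() + 0.5, (1 - w).sum() + 0.5,
                             (1 - n).sum() + 0.5, n.sum() + 0.5))
    return np.percentile(stats, [2.5, 97.5])
```
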
32 pages, 9129 KB  
Article
Detection and Recognition of Bilingual Urdu and English Text in Natural Scene Images Using a Convolutional Neural Network–Recurrent Neural Network Combination with a Connectionist Temporal Classification Decoder
by Khadija Tul Kubra, Muhammad Umair, Muhammad Zubair, Muhammad Tahir Naseem and Chan-Su Lee
Sensors 2025, 25(16), 5133; https://doi.org/10.3390/s25165133 - 19 Aug 2025
Cited by 2 | Viewed by 2382
Abstract
Urdu and English are widely used for visual text communications worldwide in public spaces such as signboards and navigation boards. Text in such natural scenes contains useful information for modern-era applications such as language translation for foreign visitors, robot navigation, and autonomous vehicles, highlighting the importance of extracting these texts. Previous studies focused on Urdu alone or printed text pasted manually on images and lacked sufficiently large datasets for effective model training. Herein, a pipeline for Urdu and English (bilingual) text detection and recognition in complex natural scene images is proposed. Additionally, a unilingual dataset is converted into a bilingual dataset and augmented using various techniques. For implementations, a customized convolutional neural network is used for feature extraction, a recurrent neural network (RNN) is used for feature learning, and connectionist temporal classification (CTC) is employed for text recognition. Experiments are conducted using different RNNs and hidden units, which yield satisfactory results. Ablation studies are performed on the two best models by eliminating model components. The proposed pipeline is also compared to existing text detection and recognition methods. The proposed models achieved average accuracies of 98.5% for Urdu character recognition, 97.2% for Urdu word recognition, and 99.2% for English character recognition.
(This article belongs to the Section Sensor Networks)
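A CTC decoder's simplest form is best-path (greedy) decoding: take the argmax class at each timestep, collapse consecutive repeats, and drop blanks. A generic sketch consistent with the pipeline described (the blank index and character-set layout are assumptions):

```python
import numpy as np

def ctc_greedy_decode(logits: np.ndarray, charset: str, blank: int = 0) -> str:
    """logits: (T, C) per-timestep class scores from the RNN head."""
    best = logits.argmax(axis=1)          # most likely class per timestep
    out, prev = [], blank
    for idx in best:
        if idx != prev and idx != blank:  # collapse repeats, skip blanks
            out.append(charset[idx - 1])  # labels 1..C-1 map into charset
        prev = idx
    return "".join(out)
```

Beam-search variants score multiple paths instead of the single best one, usually buying a small accuracy gain at extra compute cost.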

14 pages, 733 KB  
Article
Investigating Foreign Language Vocabulary Recognition in Children with ADHD and Autism with the Use of Eye Tracking Technology
by Georgia Andreou and Ariadni Argatzopoulou
Brain Sci. 2025, 15(8), 876; https://doi.org/10.3390/brainsci15080876 - 18 Aug 2025
Cited by 2 | Viewed by 2161
Abstract
Background: Neurodivergent students, including those with Autism Spectrum Disorder (ASD) and Attention Deficit/Hyperactivity Disorder (ADHD), frequently encounter challenges in several areas of foreign language (FL) learning, including vocabulary acquisition. This exploratory study aimed to investigate real-time English as a Foreign Language (EFL) word recognition using eye tracking within the Visual World Paradigm (VWP). Specifically, it examined whether gaze patterns could serve as indicators of successful word recognition, how these patterns varied across three distractor types (semantic, phonological, unrelated), and whether age and vocabulary knowledge influenced visual attention during word processing. Methods: Eye-tracking data were collected from 17 children aged 6–10 years with ADHD or ASD while they completed EFL word recognition tasks. Analyses focused on gaze metrics across target and distractor images to identify percentile-based thresholds as potential data-driven markers of recognition. Group differences (ADHD vs. ASD) and the roles of age and vocabulary knowledge were also examined. Results: Children with ADHD exhibited increased fixations on phonological distractors, indicating higher susceptibility to interference, whereas children with ASD demonstrated more distributed attention, often attracted by semantic cues. Older participants and those with higher vocabulary scores showed more efficient gaze behavior, characterized by increased fixations on target images, greater attention to relevant stimuli, and reduced attention to distractors. Conclusions: Percentile-based thresholds in gaze metrics may provide useful markers of word recognition in neurodivergent learners. Findings underscore the importance of differentiated instructional strategies in EFL education for children with ADHD and ASD. The study further supports the integration of eye tracking with behavioral assessments to advance understanding of language processing in atypical developmental contexts.
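The percentile-based thresholds mentioned are data-driven cutoffs on gaze metrics rather than fixed constants. A minimal sketch (the 75th percentile and the sample values are illustrative):

```python
import numpy as np

# Per-trial proportion of fixations landing on the target image.
fix_props = np.array([0.20, 0.55, 0.70, 0.35, 0.80, 0.65])

cutoff = np.percentile(fix_props, 75)   # group-derived threshold
recognized = fix_props > cutoff         # trials flagged as likely recognition
print(cutoff, recognized)
```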

18 pages, 1268 KB  
Article
Visual Word Segmentation Cues in Tibetan Reading: Comparing Dictionary-Based and Psychological Word Segmentation
by Dingyi Niu, Zijian Xie, Jiaqi Liu, Chen Wang and Ze Zhang
J. Eye Mov. Res. 2025, 18(4), 33; https://doi.org/10.3390/jemr18040033 - 4 Aug 2025
Viewed by 986
Abstract
This study utilized eye-tracking technology to explore the role of visual word segmentation cues in Tibetan reading, with a particular focus on the effects of dictionary-based and psychological word segmentation on reading and lexical recognition. The experiment employed a 2 × 3 design, comparing six conditions: normal sentences, dictionary word segmentation (spaces), psychological word segmentation (spaces), normal sentences (green), dictionary word segmentation (color alternation), and psychological word segmentation (color alternation). The results revealed that word segmentation with spaces (whether dictionary-based or psychological) significantly improved reading efficiency and lexical recognition, whereas color alternation showed no substantial facilitative effect. Psychological and dictionary word segmentation performed similarly across most metrics, though psychological segmentation slightly outperformed in specific indicators (e.g., sentence reading time and number of fixations), and dictionary word segmentation slightly outperformed in other indicators (e.g., average saccade amplitude and number of regressions). The study further suggests that Tibetan reading may involve cognitive processes at different levels, and the basic units of different levels of cognitive processes may not be consistent. These findings hold significant implications for understanding the cognitive processes involved in Tibetan reading and for optimizing the presentation of Tibetan text.
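The 2 × 3 design crosses two segmentation schemes (dictionary-based vs. psychological) with three visual cues. A sketch of how such stimuli can be generated from pre-segmented word lists (the markup and colors are placeholders; the experiment's actual rendering is not specified here):

```python
def render(words: list[str], cue: str) -> str:
    """Apply a visual segmentation cue to a pre-segmented Tibetan sentence."""
    if cue == "space":
        return " ".join(words)              # spaces at word boundaries
    if cue == "color":
        colors = ["black", "green"]         # alternate color word by word
        return "".join(f"<span style='color:{colors[i % 2]}'>{w}</span>"
                       for i, w in enumerate(words))
    return "".join(words)                   # normal unspaced text

# The same sentence is stored under both segmentation schemes, so each
# scheme can be paired with each cue.
dictionary_words = ["བོད", "ཡིག"]  # illustrative segmentation, not a study item
print(render(dictionary_words, "space"))
```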

16 pages, 1047 KB  
Article
Measuring Adult Heritage Language Lexical Proficiency for Studies on Facilitative Processing of Gender
by Zuzanna Fuchs, Emma Kealey, Esra Eldem-Tunç, Leo Mermelstein, Linh Pham, Anna Runova, Yue Chen, Metehan Oğuz, Seoyoon Hong, Catherine Pan and JK Subramony
Languages 2025, 10(8), 189; https://doi.org/10.3390/languages10080189 - 4 Aug 2025
Viewed by 2435
Abstract
The present study analyzes individual differences in the facilitative processing of grammatical gender by heritage speakers of Spanish, asking whether these differences correlate with lexical proficiency. Results from an eye-tracking study in the Visual World Paradigm replicate prior findings that, as a group, heritage speakers of Spanish show facilitative processing of gender. Importantly, in a follow-up within-group analysis, we test whether three measures of lexical proficiency—oral picture-naming, verbal fluency, and LexTALE—predict individual performance. We find that lexical proficiency, as measured by LexTALE, predicts overall word recognition; however, we observe no effects of the other measures and no evidence that lexical proficiency modulates the strength of the facilitative effect. Our results highlight the importance of carefully selecting tools for proficiency assessment in experimental studies involving heritage speakers, underscoring that the absence of evidence for an effect of proficiency based on a single measure should not be taken as evidence of absence.
(This article belongs to the Special Issue Language Processing in Spanish Heritage Speakers)
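Of the three proficiency measures, LexTALE has a fixed published scoring rule: percentage correct on the 40 word items and the 20 nonword items, averaged, which penalizes indiscriminate yes-responding. A sketch under the assumption that the study used the standard formula (Lemhöfer & Broersma, 2012):

```python
def lextale_score(words_correct: int, nonwords_correct: int) -> float:
    """Averaged percent correct over word and nonword items."""
    return (words_correct / 40 * 100 + nonwords_correct / 20 * 100) / 2

print(lextale_score(34, 16))  # e.g., 82.5
```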

13 pages, 1996 KB  
Article
You Can Stand Under My Umbrella: Cognitive Load in Second-Language Reading
by Francisco Rocabado, Gianna Schmitz and Jon Andoni Duñabeitia
Behav. Sci. 2025, 15(8), 1051; https://doi.org/10.3390/bs15081051 - 3 Aug 2025
Viewed by 1585
Abstract
Second-language (L2) written processing has often been linked to cognitive disfluency, resembling fluency disruptions caused by perceptual challenges, such as visual degradation. This study used Virtual Reality to investigate whether cognitive disfluency in L2 mirrors perceptual disfluency by simulating adverse weather conditions (sunny vs. rainy) and applying visual masking. Spanish–English bilinguals completed a language decision task, identifying orthotactically unmarked words as either Spanish (L1) or English (L2) while experiencing these perceptual manipulations. Results showed that visual masking significantly increased reaction times, particularly for L1 words, suggesting that masking can diminish the native language advantage. Spanish words under masking elicited slower responses than unmasked ones, whereas L2 word recognition remained comparatively stable. Additionally, rainy weather conditions consistently slowed responses across both languages, indicating a general effect of environmental disfluency. A significant interaction between language and masking emerged, highlighting distinct cognitive effects for different disfluency types. These findings suggest that cognitive disfluency in L2 does not equate to perceptual disfluency; each affects processing differently. The use of Virtual Reality enabled the controlled manipulation of realistic environmental variables, offering valuable insights into how perceptual and linguistic challenges jointly influence bilingual language processing.
(This article belongs to the Section Cognition)

27 pages, 6771 KB  
Article
A Deep Neural Network Framework for Dynamic Two-Handed Indian Sign Language Recognition in Hearing and Speech-Impaired Communities
by Vaidhya Govindharajalu Kaliyaperumal and Paavai Anand Gopalan
Sensors 2025, 25(12), 3652; https://doi.org/10.3390/s25123652 - 11 Jun 2025
Cited by 1 | Viewed by 1339
Abstract
Sign language is a vital expressive medium and serves as a bridge across communication gaps for the hearing- and speech-impaired, yet recognizing hand gestures remains challenging because signs are formed from many visually similar palm configurations. This challenge is met with a novel Enhanced Convolutional Transformer with Adaptive Tuna Swarm Optimization (ECT-ATSO) recognition framework proposed for double-handed sign language. In order to improve both model generalization and image quality, preprocessing is applied to images prior to prediction, and the proposed dataset is organized to handle multiple dynamic words. Feature graining is employed to obtain local features, and the ViT transformer architecture is then utilized to capture global features from the preprocessed images. After concatenation, this generates a feature map that is then divided into various words using an Inverted Residual Feed-Forward Network (IRFFN). Using an enhanced form of the Tuna Swarm Optimization (TSO) algorithm, the Enhanced Convolutional Transformer (ECT) model is optimally tuned to handle the problem dimensionality and convergence parameters. A mutation operator is introduced into the tuna position-update process to overcome local optimization constraints. Performance of the proposed framework is measured through recognition accuracy, convergence behavior, and dataset visualization, demonstrating the best effectiveness compared to alternative cutting-edge methods.
(This article belongs to the Section Intelligent Sensors)
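The mutation operator added to the tuna position update is described only at a high level. A generic illustration of such a step (Gaussian perturbation with boundary clipping; the paper's exact ATSO operator is not reproduced here):

```python
import numpy as np

def mutate(position, lo, hi, rate=0.1, scale=0.05, rng=None):
    """With probability `rate` per dimension, perturb a candidate solution
    with Gaussian noise and clip to the search bounds, helping the swarm
    escape local optima."""
    rng = rng or np.random.default_rng()
    mask = rng.random(position.shape) < rate
    noise = rng.normal(0.0, scale * (hi - lo), size=position.shape)
    return np.clip(np.where(mask, position + noise, position), lo, hi)
```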