Search Results (133)

Search Parameters:
Keywords = lexical knowledge

20 pages, 914 KB  
Article
Exploring the “Tip of the Tongue” and “Feeling of Knowing” Phenomena During Advanced Aging: The Interplay of Age of Acquisition, Vocabulary and Verbal Fluency
by Carlos Rojas, Yasna Sandoval, Bárbara Farías, Gabriel Lagos, Álvaro Poza, Bernardo Riffo and Ernesto Guerra
Behav. Sci. 2025, 15(12), 1686; https://doi.org/10.3390/bs15121686 - 5 Dec 2025
Viewed by 353
Abstract
Background/Objectives: The “tip of the tongue” (TOT) and “feeling of knowing” (FOK) phenomena are cognitive experiences that notably affect word retrieval, particularly among older adults. The study aimed to investigate the influences of age of acquisition (AoA), vocabulary size, and verbal fluency on the frequency and nature of TOT and FOK occurrences as individuals age. Methods: A behavioral experiment was conducted based on the two-step word retrieval framework established by Gollan and Brown in 2006. Early and late acquisition words were utilized to induce tip-of-the-tongue phenomena and the feeling of knowing. Additionally, vocabulary and verbal fluency tests were administered. Sixty monolingual older adults participated in the study (35 female, 25 male; mean age: 77.66 years). Mixed-effects linear regressions were used to analyze the data. Results: The logistic regression analysis identified age of acquisition as the most significant predictor of TOT and FOK experiences (p < 0.0001), highlighting that earlier vocabulary acquisition enhanced retrieval efficiency. Notable interactions between vocabulary size and verbal fluency illustrated that increased lexical knowledge diminished reliance on age of acquisition for successful retrieval. Conclusions: The findings underscore the importance of early vocabulary acquisition as a protective factor against cognitive decline in older adults, emphasizing the necessity for interventions aimed at enhancing vocabulary and fluency. This study contributes valuable insights into the cognitive mechanisms underlying language retrieval and suggests that fostering rich linguistic environments throughout life could facilitate better cognitive health in aging populations. Full article

26 pages, 1005 KB  
Article
A Context-Aware Lightweight Framework for Source Code Vulnerability Detection
by Yousef Sanjalawe, Budoor Allehyani and Salam Al-E’mari
Future Internet 2025, 17(12), 557; https://doi.org/10.3390/fi17120557 - 3 Dec 2025
Viewed by 288
Abstract
As software systems grow increasingly complex and interconnected, detecting vulnerabilities in source code has become a critical and challenging task. Traditional static analysis methods often fall short in capturing deep, context-dependent vulnerabilities and adapting to rapidly evolving threat landscapes. Recent efforts have explored knowledge graphs and transformer-based models to enhance semantic understanding; however, these solutions frequently rely on static knowledge bases, exhibit high computational overhead, and lack adaptability to emerging threats. To address these limitations, we propose DynaKG-NER++, a novel and lightweight framework for context-aware vulnerability detection in source code. Our approach integrates lexical, syntactic, and semantic features using a transformer-based token encoder, dynamic knowledge graph embeddings, and a Graph Attention Network (GAT). We further introduce contrastive learning on vulnerability–patch pairs to improve discriminative capacity and design an attention-based fusion module to combine token and entity representations adaptively. A key innovation of our method is the dynamic construction and continual update of the knowledge graph, allowing the model to incorporate newly published CVEs and evolving relationships without retraining. We evaluate DynaKG-NER++ on five benchmark datasets, demonstrating superior performance across span-level F1 (89.3%), token-level accuracy (93.2%), and AUC-ROC (0.936), while achieving the lowest false positive rate (5.1%) among state-of-the-art baselines. Statistical significance tests confirm that these improvements are robust and meaningful. Overall, DynaKG-NER++ establishes a new standard in vulnerability detection, balancing accuracy, adaptability, and efficiency, making it highly suitable for deployment in real-world static analysis pipelines and resource-constrained environments. Full article
(This article belongs to the Topic Addressing Security Issues Related to Modern Software)

16 pages, 1176 KB  
Article
Hearing Tones, Missing Boundaries: Cross-Level Selective Transfer of Prosodic Boundaries Among Chinese–English Learners
by Lan Fang, Zilong Li, Keke Yu, John W. Schwieter and Ruiming Wang
Behav. Sci. 2025, 15(12), 1605; https://doi.org/10.3390/bs15121605 - 21 Nov 2025
Viewed by 277
Abstract
Second language (L2) learners often struggle to process prosodic boundaries, which are essential for speech comprehension. This study investigated the nature of these difficulties and how first language (L1) cue-weighting strategies transfer to L2 processing among Chinese (Mandarin)–English learners. The rising pitch that cues English phrase boundaries acoustically overlaps with functionally distinct Chinese lexical tones. Through two experiments comparing Chinese–English learners and native English speakers, we assessed sensitivity across lexical constituent, phrase, and sentence boundaries and manipulated acoustic cues (pause, lengthening, pitch) to estimate their perceptual weights during phrase-boundary identification. L2 learners showed reduced discrimination sensitivity only at the phrase level, performing comparably to native speakers at lexical constituent and sentence boundaries. For phrase boundaries, learners over-relied on pitch and under-relied on pre-boundary lengthening compared to native speakers, though both groups weighted pauses strongly. This selective deficit implicates the transfer of L1 cue-weighting strategies more than a global knowledge deficit. Our findings support a dynamic transfer model where L1 sensitivity to lexical tone transfers to L2 phrase perception, elevating the weight of pitch. While learners show partial adaptation, these results refine the Cue-Weighting Transfer Hypothesis by demonstrating that L2 prosodic acquisition involves both integrated L1 transfer and L2-driven reweighting strategies. Full article

28 pages, 20548 KB  
Article
KGGCN: A Unified Knowledge Graph-Enhanced Graph Convolutional Network Framework for Chinese Named Entity Recognition
by Xin Chen, Liang He, Weiwei Hu and Sheng Yi
AI 2025, 6(11), 290; https://doi.org/10.3390/ai6110290 - 13 Nov 2025
Viewed by 617
Abstract
Recent advances in Chinese Named Entity Recognition (CNER) have integrated lexical features and factual knowledge into pretrained language models. However, existing lexicon-based methods often inject knowledge as restricted, isolated token-level information, lacking rich semantic and structural context. Knowledge graphs (KGs), comprising relational triples, offer explicit relational semantics and reasoning capabilities, while Graph Convolutional Networks (GCNs) effectively capture complex sentence structures. We propose KGGCN, a unified KG-enhanced GCN framework for CNER. KGGCN introduces external factual knowledge without disrupting the original word order, employing a novel end-append serialization scheme and a visibility matrix to control interaction scope. The model further utilizes a two-phase GCN stack, combining a standard GCN for robust aggregation with a multi-head attention GCN for adaptive structural refinement, to capture multi-level structural information. Experiments on four Chinese benchmark datasets demonstrate KGGCN’s superior performance. It achieves the highest F1-scores on MSRA (95.96%) and Weibo (71.98%), surpassing previous bests by 0.26 and 1.18 percentage points, respectively. Additionally, KGGCN obtains the highest Recall on OntoNotes (84.28%) and MSRA (96.14%), and the highest Precision on MSRA (95.82%), Resume (96.40%), and Weibo (72.14%). These results highlight KGGCN’s effectiveness in leveraging structured knowledge and multi-phase graph modeling to enhance entity recognition accuracy and coverage across diverse Chinese texts. Full article

16 pages, 261 KB  
Article
A Qualitative Approach to EFL Postgraduates’ GenAI-Assisted Research Writing Within Social Sciences
by Alejandro Curado Fuentes
Educ. Sci. 2025, 15(11), 1521; https://doi.org/10.3390/educsci15111521 - 11 Nov 2025
Cited by 1 | Viewed by 708
Abstract
Systematic and rigorous approaches are necessary to fully understand GenAI’s (Generative AI’s) impact on L2 English/EFL (English as a Foreign Language) academic writing in higher education. In this scope, postgraduate EFL writing has been explored little. The present qualitative study examines this topic within Social Sciences at the University of Extremadura, Spain, where seven participants with a B2 English level or higher enrolled in a 10-h hybrid course about GenAI for academic English writing in October and November of 2024, focusing on AI tools and Broad Data-Driven Learning (BDDL) resources (e.g., simple online corpora tools) to assist their writing. Participants’ feedback was collected by qualitative means (in-class discussions, task writing annotation, and final survey). Overall findings indicate notably positive responses and usage of these tools for the improvement of their texts (e.g., linguistic analysis, lexical-grammatical refinement, and text style improvement). Participants’ activities also showcase miscellaneous approaches and strategies in their management of GenAI. Despite the study’s small sample, these preliminary findings reveal that these postgraduate EFL writers can exploit expert and linguistic knowledge effectively using GenAI, demonstrating meta-linguistic awareness and digital literacy-related skills. Full article
(This article belongs to the Special Issue Critical Issues of English for Academic Purposes in Higher Education)
21 pages, 979 KB  
Article
How the Stakeholders’ Perception Contributes to the Pharmaceutical Strategies: A Regional Case Study in Latin America
by Talita da Silva Ferreira, Giovanni M. Pauletti and Luis Vázquez-Suárez
J. Mark. Access Health Policy 2025, 13(4), 54; https://doi.org/10.3390/jmahp13040054 - 23 Oct 2025
Viewed by 593
Abstract
Background: Stakeholders’ perception plays a crucial role in shaping pharmaceutical strategies. Stakeholders are groups interested in pharmaceutical companies’ success and outcomes. Stakeholders’ perceptions are multifaceted and impact pharmaceutical strategies, from shaping research to enhancing market access, pricing, and corporate reputation. Understanding and actively managing stakeholders’ perceptions is vital for pharmaceutical companies to succeed in an increasingly complex and competitive industry. Methods: In this case study, knowledge contributions from stakeholders offered insights and strategies for application in the pharmaceutical sector. Results: Qualitative, exploratory research was conducted, which included the participation of sixteen stakeholders from different countries in Latin America, who responded to a semi-structured interview script, whose data were understood through lexical analysis in the Interface de R pour les Analyses Multidimensionnelles de Textes et de Questionnaires (IRaMuTeQ). Conclusions: The results of this study underscore the importance of regulatory knowledge for professionals’ support and implementation of international strategies. Regulatory knowledge provides professionals with tools and insights to navigate complex regulatory environments, make informed decisions, and enhance organizational performance in global markets. Full article

18 pages, 864 KB  
Article
Enhanced Semantic BERT for Named Entity Recognition in Education
by Ping Huang, Huijuan Zhu, Ying Wang, Lili Dai and Lei Zheng
Electronics 2025, 14(19), 3951; https://doi.org/10.3390/electronics14193951 - 7 Oct 2025
Viewed by 607
Abstract
To address the technical challenges in educational-domain named entity recognition (NER), such as ambiguous entity boundaries and difficulties with nested entity identification, this study proposes an enhanced semantic BERT model (ES-BERT). The model adopts an education-domain, vocabulary-assisted semantic enhancement strategy that (1) applies the term frequency–inverse document frequency (TF-IDF) algorithm to weight domain-specific terms, and (2) fuses the weighted lexical information with character-level features, enabling BERT to generate enriched, domain-aware, character–word hybrid representations. A complete bidirectional long short-term memory-conditional random field (BiLSTM-CRF) recognition framework was established, and a novel focal loss-based joint training method was introduced to optimize the process. The experimental design employed a three-phase validation protocol, as follows: (1) In a comparative evaluation using 5-fold cross-validation on our proprietary computer-education dataset, the proposed ES-BERT model yielded a precision of 90.38%, which is higher than that of the baseline models; (2) Ablation studies confirmed the contribution of domain-vocabulary enhancement to performance improvement; (3) Cross-domain experiments on the 2016 knowledge base question answering datasets and resume benchmark datasets demonstrated outstanding precision of 98.41% and 96.75%, respectively, verifying the model’s transfer-learning capability. These comprehensive experimental results substantiate that ES-BERT not only effectively resolves domain-specific NER challenges in education but also exhibits remarkable cross-domain adaptability. Full article

24 pages, 1991 KB  
Article
Third Languages Acquisition (TLA): Educational Multilingualism at Early Ages
by M.ª Dolores Asensio Ferreiro
Languages 2025, 10(10), 251; https://doi.org/10.3390/languages10100251 - 29 Sep 2025
Viewed by 1386
Abstract
In an increasingly globalized world, learning foreign languages (FLs) is essential, particularly in education. Multilingualism is critical due to the multicultural and interconnected nature of societies, yet early third language acquisition (TLA) is not widely adopted in schools. This study investigates how the simultaneous learning of Spanish first language (L1), a second language (L2), and a third language (L3) impacts oral language (OL) development in L1 and whether prior L2 knowledge aids L3 acquisition. The study involved bilingual (L1 + L2) and trilingual (L1 + L2 + L3) learners. Data were collected using the Navarre Oral Language Test-Revised, which evaluates phonological, morphological–syntactic, lexical–semantic, and pragmatic competencies in oral communication. Findings revealed that trilingual learners showed better OL development in L1 compared to bilingual learners. Additionally, prior L2 knowledge facilitated L3 learning, highlighting the benefits of early trilingual education. The study demonstrates that early trilingual learning positively impacts OL development in L1. These results contribute significantly to research on TLA and the advancement of multilingual education. Full article

18 pages, 318 KB  
Article
How Morphology, Context, Vocabulary and Reading Shape Lexical Inference in Typical and Dyslexic Readers
by Ifigeneia Dosi
Educ. Sci. 2025, 15(10), 1266; https://doi.org/10.3390/educsci15101266 - 23 Sep 2025
Cited by 1 | Viewed by 1305
Abstract
Children’s ability to infer meanings of unfamiliar words during reading is thought to rely on the interplay between decoding, morphological awareness, contextual support, and vocabulary knowledge, but it remains unclear how these sources operate in typically developing (TD) readers compared to those with developmental dyslexia (DD). This study examined whether morphological cues (suffixes) and/or contextual information facilitate meaning inference and which variables predict performance. Sixty children (30 TD, 30 DD; aged 9–12) completed a battery of tasks assessing pseudoword decoding, expressive vocabulary (breadth), synonyms and antonyms (depth), morphological awareness (deriving and decomposing words), and reading comprehension. The main inference task consisted of 20 short stories in which pseudowords replaced target words; in half the stories, pseudowords included derivational suffixes, while in the other half no such clues were available. Results showed that TD children performed significantly better than DD peers across all tasks. Regression analyses revealed that vocabulary depth and morphological awareness predicted inferencing in both groups, but decoding was uniquely predictive for DD children and reading comprehension only for TD children. These findings suggest that while lexical inference in both groups appears to draw on vocabulary and morphology, TD children may additionally integrate higher-order comprehension, whereas DD children seem to remain more influenced by decoding and partial lexical cues. Full article
(This article belongs to the Special Issue Students with Special Educational Needs in Reading and Writing)
13 pages, 991 KB  
Review
Speech Segmentation with Prosodic and Statistical Cues Is Language-Specific in Infancy
by Mireia Marimon, Amanda Saksida, Barbara Höhle and Alan Langus
Languages 2025, 10(9), 240; https://doi.org/10.3390/languages10090240 - 19 Sep 2025
Viewed by 1207
Abstract
Speech segmentation is one of the first tasks infants face when learning their mother tongue. It has been argued that statistical learning could function as a gateway to speech segmentation in the absence of pre-existing knowledge about the language to be acquired. However, infants also segment speech with prosodic cues, such as lexical stress. Here, we review recent evidence from studies that look at how infants weigh statistical and prosodic information when segmenting continuous speech. We argue that the idea that statistical regularities have a main role in early speech segmentation, as evidenced in English-learning infants, is not found with German-learning infants. With more natural speech stimuli, German-learning infants only become sensitive to statistical regularities in the speech signal by their first birthday. We provide further support for this hypothesis by showing that there are cross-linguistic differences in how statistical models segment child-directed speech (CDS) and that CDS changes as infants grow. This suggests that speech input to younger infants is not tailored for speech segmentation with statistical cues, but that it is subject to cross-linguistic differences like prosody. Full article
(This article belongs to the Special Issue Advances in the Acquisition of Prosody)

31 pages, 799 KB  
Article
Knowledge-Aware Arabic Question Generation: A Transformer-Based Framework
by Reham Bin Jabr and Aqil M. Azmi
Mathematics 2025, 13(18), 2975; https://doi.org/10.3390/math13182975 - 14 Sep 2025
Viewed by 1841
Abstract
In this work, we propose a knowledge-aware approach for Arabic automatic question generation (QG) that leverages the multilingual T5 (mT5) transformer augmented with a pre-trained Arabic question-answering model to address challenges posed by Arabic’s morphological richness and limited QG resources. Our system generates both subjective questions and multiple-choice questions (MCQs) with contextually relevant distractors through a dual-model pipeline that tailors the decoding strategy to each subtask: the question generator employs beam search to maximize semantic fidelity and lexical precision, while the distractor generator uses top-k sampling to enhance diversity and contextual plausibility. The QG model is fine-tuned on Arabic SQuAD, and the distractor model is trained on a curated combination of ARCD and Qudrat. Experimental results show that beam search significantly outperforms top-k sampling for fact-based question generation, achieving a BLEU-4 score of 27.49 and a METEOR score of 25.18, surpassing fine-tuned AraT5 and translated English–Arabic baselines. In contrast, top-k sampling is more effective for distractor generation, yielding higher BLEU scores and producing distractors that are more diverse yet remain pedagogically valid, with a BLEU-1 score of 20.28 establishing a strong baseline in the absence of prior Arabic benchmarks. Human evaluation further confirms the quality of the generated questions. This work advances Arabic QG by providing a scalable, knowledge-aware solution with applications in educational technology, while demonstrating the critical role of task-specific decoding strategies and setting a foundation for future research in automated assessment. Full article

16 pages, 846 KB  
Article
MMKT: Multimodal Sentiment Analysis Model Based on Knowledge-Enhanced and Text-Guided Learning
by Chengkai Shi and Yunhua Zhang
Appl. Sci. 2025, 15(17), 9815; https://doi.org/10.3390/app15179815 - 7 Sep 2025
Viewed by 1242
Abstract
Multimodal Sentiment Analysis (MSA) aims to predict subjective human emotions by leveraging multimodal information. However, existing research inadequately utilizes explicit sentiment semantic information at the lexical level in text and overlooks noise interference from non-dominant modalities, such as irrelevant movements in visual modalities and background noise in audio modalities. To address this issue, we propose a multimodal sentiment analysis model based on knowledge enhancement and text-guided learning (MMKT). The model constructs a sentiment knowledge graph for the textual modality using the SenticNet knowledge base. This graph directly annotates word-level sentiment polarity, strengthening the model’s understanding of emotional vocabulary. Furthermore, global sentiment knowledge features are generated through graph embedding computations to enhance the multimodal fusion process. Simultaneously, a dynamic text-guided learning approach is introduced, which dynamically leverages multi-scale textual features to actively suppress redundant or conflicting information in visual and audio modalities, thereby generating purer cross-modal representations. Finally, concatenated textual features, cross-modal features, and knowledge features are utilized for sentiment prediction. Experimental results on the CMU-MOSEI and Twitter2019 datasets demonstrate the superior performance of the MMKT model. Full article

18 pages, 1495 KB  
Article
Retrieval-Augmented Generation vs. Baseline LLMs: A Multi-Metric Evaluation for Knowledge-Intensive Content
by Aparna Vinayan Kozhipuram, Samar Shailendra and Rajan Kadel
Information 2025, 16(9), 766; https://doi.org/10.3390/info16090766 - 4 Sep 2025
Cited by 2 | Viewed by 4813
Abstract
(1) Background: The development of Generative Artificial Intelligence (GenAI) is transforming knowledge-intensive domains such as Education. However, Large Language Models (LLMs), which serve as the foundational components for GenAI tools, are trained on static datasets, often producing misleading, factually incorrect, or outdated responses. Our study explores the performance gains of Retrieval-Augmented LLMs over baseline LLMs while also identifying the trade-off opportunity between smaller-parameter LLMs augmented with user-specific data and larger-parameter LLMs. (2) Methods: We experimented with four different LLMs, each with a different number of parameters, to generate outputs. These outputs were then evaluated across seven lexical and semantic metrics to identify performance trends in Retrieval-Augmented Generation (RAG)-Augmented LLMs and analyze the impact of parameter size on LLM performance. (3) Results and Discussions: We have synthesized 968 different combinations to identify this trend with the help of different LLM sizes/parameters: TinyLlama 1.1B, Mistral 7B, Llama 3.1 8B, and Llama 1 13B. These studies were grouped into two themes: RAG-Augmented LLM percentage improvements over baseline LLMs and compelling trade-off possibilities of RAG-Augmented smaller-parameter LLMs against larger-parameter LLMs. Our experiments show that RAG-Augmented LLMs demonstrate high lexical and semantic scores relative to baseline LLMs. This offers RAG-Augmented LLMs as a compelling trade-off for reducing the number of parameters in LLMs and lowering overall resource demands. (4) Conclusions: The findings indicate that, by leveraging RAG, smaller-parameter LLMs can perform as well as or better than large-parameter LLMs, particularly demonstrating strong lexical improvements. They reduce the risks of hallucination and keep the output more contextualized, making them a better choice for knowledge-intensive content in academic and research sectors. Full article

24 pages, 1479 KB  
Article
Beyond L2 Learners: Evaluating LexTALE-ESP as a Proficiency Measure for Heritage Language Learners of Spanish
by Cristina Lozano-Argüelles and Alberta Gatti
Languages 2025, 10(9), 223; https://doi.org/10.3390/languages10090223 - 30 Aug 2025
Viewed by 1707
Abstract
LexTALE has emerged as a popular measure of language proficiency in research studies. While it has been widely validated for L2 learners across multiple languages, its applicability to heritage language learners (HLLs)—who often show distinct language development from L2ers—has not been established. Here, we evaluate the Spanish version of LexTALE (LexTALE-Esp) as a predictor of writing proficiency among college-aged HLLs in the United States. We show that LexTALE-Esp scores significantly correlate with ACTFL-rated functional writing levels and outperform self-assessment as a predictor of proficiency. Our results suggest that, despite concerns about HLLs’ limited experience with written texts in the heritage language, vocabulary-based tasks capture core aspects of written language ability. These findings indicate that vocabulary-based tests like LexTALE-Esp capture proficiency-relevant lexical knowledge across speaker profiles and may tap into dimensions of both core and extended language competence. Full article

20 pages, 1818 KB  
Article
Image Captioning Model Based on Multi-Step Cross-Attention Cross-Modal Alignment and External Commonsense Knowledge Augmentation
by Liang Wang, Meiqing Jiao, Zhihai Li, Mengxue Zhang, Haiyan Wei, Yuru Ma, Honghui An, Jiaqi Lin and Jun Wang
Electronics 2025, 14(16), 3325; https://doi.org/10.3390/electronics14163325 - 21 Aug 2025
Cited by 1 | Viewed by 2000
Abstract
To address the semantic mismatch between limited textual descriptions in image captioning training datasets and the multi-semantic nature of images, as well as the underutilized external commonsense knowledge, this article proposes a novel image captioning model based on multi-step cross-attention cross-modal alignment and external commonsense knowledge enhancement. The model employs a backbone architecture comprising CLIP’s ViT visual encoder, Faster R-CNN, BERT text encoder, and GPT-2 text decoder. It incorporates two core mechanisms: a multi-step cross-attention mechanism that iteratively aligns image and text features across multiple rounds, progressively enhancing inter-modal semantic consistency for more accurate cross-modal representation fusion. Moreover, the model employs Faster R-CNN to extract region-based object features. These features are mapped to corresponding entities within the dataset through entity probability calculation and entity linking. External commonsense knowledge associated with these entities is then retrieved from the ConceptNet knowledge graph, followed by knowledge embedding via TransE and multi-hop reasoning. Finally, the fused multimodal features are fed into the GPT-2 decoder to steer caption generation, enhancing the lexical richness, factual accuracy, and cognitive plausibility of the generated descriptions. In the experiments, the model achieves CIDEr scores of 142.6 on MSCOCO and 78.4 on Flickr30k. Ablations confirm both modules enhance caption quality. Full article