Search Results (673)

Search Parameters:
Keywords = character recognition

22 pages, 2867 KB  
Article
SDR-Net: A Stage-Wise Degradation-Aware Restoration Network for Robust License Plate Recognition in Complex Port Environments
by Hyungseok Kim, Sungan Yoon and Jeongho Cho
Mathematics 2026, 14(6), 934; https://doi.org/10.3390/math14060934 - 10 Mar 2026
Abstract
Port areas are core hubs for national logistics and high-risk security zones that require constant vehicle access control. However, ensuring the reliability of automatic license plate recognition (ALPR) systems in port environments is severely challenged by complex image degradations, such as dense haze, low light, and motion blur. In this study, we propose a stage-wise degradation-aware restoration network (SDR-Net), which effectively addresses harsh port conditions by sequentially restoring photometric and structural degradations. In particular, SDR-Net first secures visual cues lost to haze and low light through a photometric restoration module involving dark-channel-prior-based dehazing and adaptive brightness adjustment. Next, a structural restoration module based on a generative adversarial network, featuring edge-guided structural feature blocks and edge-aware refinement blocks, is employed to precisely reconstruct character strokes and outlines damaged by motion blur, stably restoring license plate legibility even under complex degradation conditions. Experiments across various intensities of complex degradation demonstrate that SDR-Net maintains high character recognition accuracies of over 97.35% under mild motion blur and low-concentration haze conditions, indicating its superiority over state-of-the-art models. Notably, the performance gap between SDR-Net and the comparison models widened as the degradation intensity increased, and SDR-Net achieved the highest multiscale structural similarity index scores across all intervals.
(This article belongs to the Section E1: Mathematics and Computer Science)

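The dark-channel prior used in SDR-Net's photometric module is a well-known dehazing statistic. As a rough illustration only (not the authors' implementation; the nested-list image format with RGB tuples in [0, 1] and the window size are assumptions), the dark channel can be computed as:

```python
def dark_channel(image, patch=3):
    """Per-pixel minimum over the RGB channels, then a minimum filter
    over a local square window (the 'dark channel' of the image)."""
    h, w = len(image), len(image[0])
    # channel-wise minimum at each pixel
    min_rgb = [[min(px) for px in row] for row in image]
    r = patch // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # minimum over the local window, clipped at the image border
            out[y][x] = min(
                min_rgb[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            )
    return out
```

The prior states that in haze-free patches of outdoor images this value is close to zero, so bright dark-channel regions indicate haze and drive the transmission estimate used for dehazing.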
19 pages, 764 KB  
Article
FeOCR: Domain-Adaptive Chinese OCR with Visual Character Disambiguation and LLM-Based Correction for Metallurgical Documents
by Qiang Zheng, Yaxuan Sun, Lin Wang, Haoning Zhang, Fanjie Meng and Minghui Li
Electronics 2026, 15(6), 1144; https://doi.org/10.3390/electronics15061144 - 10 Mar 2026
Abstract
High-quality text corpora are essential for knowledge graph construction and domain-specific large model pre-training in technology-intensive industries, with the steel metallurgy sector serving as a representative case. However, many industrial documents remain in scanned or PDF formats, where general-purpose Optical Character Recognition (OCR) systems exhibit systematic errors when recognizing Chinese metallurgical documents. In particular, visually similar Chinese characters that differ by only minor strokes are frequently confused, leading to severe degradation of text reliability and cascading errors in downstream knowledge extraction. This paper proposes FeOCR, a general-purpose domain-adaptive framework for machine-printed Chinese characters, which is specifically evaluated within the context of the steel metallurgy industry. The framework integrates visual character disambiguation with context-aware semantic correction. We first construct a metallurgy-specific OCR dataset emphasizing high-frequency confusable Chinese word pairs and enhance data diversity through font perturbation and noise synthesis. Parameter-efficient fine-tuning (LoRA) is then applied to adapt a general OCR model to domain-specific visual patterns. Furthermore, a Large Language Model-based correction module performs semantic refinement of residual errors under domain lexical constraints. Experiments demonstrate significant reductions in character and word error rates, especially for confusable technical terms, providing a reliable foundation for industrial Chinese document digitization.

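Character and word error rates of the kind reported for FeOCR are conventionally derived from edit distance. A minimal character-error-rate sketch (a generic illustration, not FeOCR's evaluation code):

```python
def levenshtein(a, b):
    """Dynamic-programming edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution (or match)
        prev = cur
    return prev[-1]

def cer(reference, hypothesis):
    """Character error rate: edit distance normalised by reference length."""
    return levenshtein(reference, hypothesis) / max(len(reference), 1)
```

Word error rate follows the same recurrence with token lists in place of character strings.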
25 pages, 8082 KB  
Article
A Novel Improved Whale Optimization Algorithm-Based Multi-Scale Fusion Attention Enhanced SwinIR Model for Super-Resolution and Recognition of Text Images on Electrophoretic Displays
by Xin Xiong, Zikang Feng, Peng Li, Xi Hu, Jiyan Liu and Xueqing Liu
Biomimetics 2026, 11(3), 195; https://doi.org/10.3390/biomimetics11030195 - 6 Mar 2026
Abstract
Electrophoretic Displays (EPDs) are widely adopted in e-readers and portable devices due to their ultra-low power consumption and eye-friendly reflective characteristics. However, inherent hardware limitations, such as low resolution, slow response speed, and display degradation, frequently result in blurred strokes and degraded text readability. While traditional driving waveform optimizations can mitigate these issues, they are device-dependent and require extensive manual calibration. To address these challenges, this paper proposes an Improved Whale Optimization Algorithm-based Multi-scale Fusion Attention-enhanced SwinIR (IWOA-MFA-SwinIR) model for super-resolution and recognition of text images on EPDs. Structurally, the model incorporates a multi-scale fused attention (MFA) module that synergistically integrates channel, spatial, and gated attention mechanisms to precisely capture high-frequency text details while suppressing background noise within the SwinIR architecture. Furthermore, to enhance model robustness and eliminate manual tuning, an Improved Whale Optimization Algorithm (IWOA) is employed to adaptively optimize critical hyperparameters, including embedding dimension (d), attention head count (h), learning rate (lr), and dimensionality reduction coefficient (r). Experiments conducted on the TextZoom and EPD datasets demonstrate that the proposed model achieves state-of-the-art performance. In the ablation study, it attains a Peak Signal-to-Noise Ratio (PSNR) of 24.406, a Structural Similarity Index (SSIM) of 0.8837, and a Character Recognition Accuracy (CRA) of 89.81%. In the comparative evaluation, the proposed model consistently outperforms the second-best comparison model across three difficulty levels, yielding approximately a 1% improvement in PSNR, a 0.8% improvement in SSIM, and an 8% improvement in CRA. This confirms the proposed model’s superiority over mainstream comparative models in restoring text fidelity and improving recognition rates.
(This article belongs to the Special Issue Bionics in Engineering Practice: Innovations and Applications)

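The PSNR figure quoted above has a standard definition in terms of mean squared error. A minimal sketch (flat pixel sequences and an assumed 8-bit peak value; not tied to the authors' evaluation code):

```python
import math

def psnr(reference, distorted, peak=255.0):
    """Peak Signal-to-Noise Ratio between two equal-length pixel sequences."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```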
24 pages, 1380 KB  
Article
From Reviews to Recommendations: Discovering Latent Visitor Preferences for Sustainable Wellness Templestay Management
by Min-Hwan Ko
Sustainability 2026, 18(5), 2512; https://doi.org/10.3390/su18052512 - 4 Mar 2026
Abstract
The sustainability of experience-intensive wellness tourism services increasingly depends on managers’ ability to understand heterogeneous and implicit tourist preferences that are rarely captured through traditional survey-based approaches. In the context of Korean Templestay tourism, this study develops a data-driven decision-support framework that leverages large-scale unstructured review data to address managerial challenges such as choice overload, inefficient resource allocation, and cold-start conditions. Using 74,015 user-generated reviews collected between 2020 and 2024, the framework integrates Optical Character Recognition (OCR) to extract image-embedded text, achieving a validated character-level accuracy of 96.8%. In addition, a weak supervision strategy is applied to identify latent tourist preferences in a cost-efficient and scalable manner. Preference classification is conducted using Random Forest models combined with SMOTE, followed by clustering and user-based collaborative filtering to support personalized recommendations. The findings indicate that the Templestay market is better understood as an interconnected preference network rather than a set of mutually exclusive segments. Across user groups, “rest” emerges as a shared foundational value, while differentiated sub-preferences coexist within the network. The proposed framework successfully generates recommendations for all users in the dataset, demonstrating strong applicability for mitigating cold-start risks and supporting adaptive and sustainable program design.
(This article belongs to the Section Tourism, Culture, and Heritage)

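The user-based collaborative filtering step mentioned in this abstract typically scores items a target user has not rated by the similarity-weighted ratings of other users. A minimal sketch under an assumed dense rating-vector representation (0 meaning unrated; not the authors' implementation):

```python
def cosine(u, v):
    """Cosine similarity between two rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(target, others, top_k=1):
    """Score each item the target user has not rated by the
    similarity-weighted ratings of the other users, highest first."""
    sims = [cosine(target, o) for o in others]
    scores = [
        (sum(s * o[i] for s, o in zip(sims, others)), i)
        for i in range(len(target)) if target[i] == 0
    ]
    return [i for _, i in sorted(scores, reverse=True)[:top_k]]
```

Production systems normally mean-centre ratings and restrict the sum to the k most similar neighbours; both are omitted here for brevity.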
34 pages, 4341 KB  
Article
Comparative Morphology and Generic Classification of Catfishes of the Trichomycterus Lineage (Siluriformes: Trichomycteridae)
by Wilson J. E. M. Costa
Taxonomy 2026, 6(1), 20; https://doi.org/10.3390/taxonomy6010020 - 4 Mar 2026
Abstract
Recent genomic phylogenies have generated robust new classifications of actinopterygian fishes, making greater nomenclatural stability possible, but genus-level classifications of groups like the diverse catfish subfamily Trichomycterinae are still unclear, containing ill-defined paraphyletic taxa. The focus of the present study is the Trichomycterus Lineage (TL), a clade with great morphological diversity, containing about 170 species widely distributed in South America and occurring in the most important biodiversity hotspots of the world, such as the Atlantic Forest, the Cerrado, and the Tropical Andes. Most species are small, but at least one reaches about 400 mm in total length and is used as food and depicted in pre-Hispanic Andean ceramics. Based on a comparative morphological analysis, mainly using osteological characters and supported by concordant genomic phylogenies, a new classification at the genus level is provided here. Many morphological features delimiting TL genera seem to be related to ecological adaptations. Nine genera are recognised here, of which five are new. Recognition of the new genera will allow easier descriptions of new species and consequently better biodiversity estimates.

17 pages, 6814 KB  
Article
Grindelia mutabilis (Asteraceae: Astereae), a New South American Species and a Link for Synonymizing Notopappus
by Fernando Fernandes, Bruno de Souza, João Iganci, Tatiana Teixeira de Souza-Chies and Gustavo Heiden
Plants 2026, 15(5), 760; https://doi.org/10.3390/plants15050760 - 1 Mar 2026
Abstract
Grindelia mutabilis (Asteraceae, Astereae), a new species from Brazil endemic to the Espinal Ecoregion of the Río de La Plata Grasslands Bioregion and Pampa Province of the Chaco Biogeographical Domain, is proposed and illustrated. The new species is characterized by a combination of traits: small, rosette cespitose habit, linear to linear–oblanceolate leaves, light-yellow to pastel salmon ray florets, three-winged ray floret cypselae bearing a pappus of two to four elements and two-winged disc floret cypselae bearing a pappus of two elements. It has a highly restricted habitat and is known exclusively within Parque Estadual do Espinilho in Rio Grande do Sul, Brazil. Preliminary conservation assessments classify the new species as Critically Endangered. We provide illustrations and photographs, as well as a distribution map with an identification key for the South American Grindelia species with winged cypselae. The intriguing morphology of this species combines characters traditionally regarded as diagnostic for Notopappus, a genus segregated from Haplopappus and Grindelia. Previously published phylogenetic studies of related taxa indicate that the recognition of Notopappus as monophyletic is not supported and render Grindelia non-monophyletic as well. Based on this combined morphological evidence and existing phylogenetic hypotheses, we reaffirm the non-monophyly of Notopappus and formally propose its synonymization under Grindelia s.l.
(This article belongs to the Special Issue Integrative Taxonomy, Systematics, and Morphology of Land Plants)

24 pages, 6624 KB  
Article
Application of Computer Vision to the Automated Extraction of Metadata from Natural History Specimen Labels: A Case Study on Herbarium Specimens
by Jacopo Zacchigna, Weiwei Liu, Felice Andrea Pellegrino, Adriano Peron, Francesco Roma-Marzio, Lorenzo Peruzzi and Stefano Martellos
Plants 2026, 15(4), 637; https://doi.org/10.3390/plants15040637 - 17 Feb 2026
Abstract
Metadata extraction from natural history collection labels is a pivotal task for the online publication of digitized specimens. However, given the scale of these collections—which are estimated to host more than 2 billion specimens worldwide, including ca. 400 million herbarium specimens—manual metadata extraction is an extremely time-consuming task. Automated data extraction from digital images of specimens and their labels is therefore a promising application of state-of-the-art computer vision techniques. Extracting information from herbarium specimen labels normally involves three main steps: text segmentation, multilingual and handwriting recognition, and data parsing. The primary bottleneck in this workflow lies in the limitations of Optical Character Recognition (OCR) systems. This study explores how the general knowledge embedded in multimodal Transformer models can be transferred to the specific task of herbarium specimen label digitization. The final goal is to develop an easy-to-use, end-to-end solution to mitigate the limitations of classic OCR approaches while offering greater flexibility to adapt to different label formats. Donut-base, a pre-trained visual document understanding (VDU) transformer, was the base model selected for fine-tuning. A dataset from the University of Pisa served as a test bed. The initial attempt achieved an accuracy of 85%, measured using the Tree Edit Distance (TED), demonstrating the feasibility of fine-tuning for this task. Cases with low accuracies were also investigated to identify limitations of the approach. In particular, specimens with multiple labels, especially those combining handwritten and typewritten text, proved to be the most challenging. Strategies aimed at addressing these weaknesses are discussed.

25 pages, 1558 KB  
Article
Towards Scalable Monitoring: An Interpretable Multimodal Framework for Migration Content Detection on TikTok Under Data Scarcity
by Dimitrios Taranis, Gerasimos Razis and Ioannis Anagnostopoulos
Electronics 2026, 15(4), 850; https://doi.org/10.3390/electronics15040850 - 17 Feb 2026
Abstract
Short-form video platforms such as TikTok (TikTok Pte. Ltd., Singapore) host large volumes of user-generated, often ephemeral, content related to irregular migration, where relevant cues are distributed across visual scenes, on-screen text, and multilingual captions. Automatically identifying migration-related videos is challenging due to this multimodal complexity and the scarcity of labeled data in sensitive domains. This paper presents an interpretable multimodal classification framework designed for deployment under data-scarce conditions. We extract features from platform metadata, automated video analysis (Google Cloud Video Intelligence), and Optical Character Recognition (OCR) text, and compare text-only, OCR-only, and vision-only baselines against a multimodal fusion approach using Logistic Regression, Random Forest, and XGBoost. In this pilot study, multimodal fusion consistently improves class separation over single-modality models, achieving an F1-score of 0.92 for the migration-related class under stratified cross-validation. Given the limited sample size, these results are interpreted as evidence of feature separability rather than definitive generalization. Feature importance and SHAP analyses identify OCR-derived keywords, maritime cues, and regional indicators as the most influential predictors. To assess robustness under data scarcity, we apply SMOTE to synthetically expand the training set to 500 samples and evaluate performance on a small held-out set of real videos, observing stable results that further support feature-level robustness. Finally, we demonstrate scalability by constructing a weakly labeled corpus of 600 videos using the identified multimodal cues, highlighting the suitability of the proposed feature set for weakly supervised monitoring at scale. Overall, this work serves as a methodological blueprint for building interpretable multimodal monitoring pipelines in sensitive, low-resource settings.
(This article belongs to the Special Issue Multimodal Learning for Multimedia Content Analysis and Understanding)

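SMOTE, used in this study to synthetically expand the training set, interpolates new minority-class samples between nearest neighbours. A minimal single-sample sketch of the idea (generic SMOTE, not the paper's pipeline; the neighbour count and random seed are assumptions):

```python
import random

def smote_sample(minority, k=2, rng=None):
    """SMOTE-style oversampling: pick a minority sample, find one of its
    k nearest neighbours, and interpolate a synthetic point between them."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # k nearest neighbours of the base sample, excluding the base itself
    neighbours = sorted((p for p in minority if p is not base),
                        key=lambda p: dist(base, p))[:k]
    nn = rng.choice(neighbours)
    gap = rng.random()  # interpolation factor in [0, 1)
    return [b + gap * (n - b) for b, n in zip(base, nn)]
```

Because the synthetic point lies on the segment between two real minority samples, it stays inside the minority class's convex hull rather than duplicating existing points.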
9 pages, 2031 KB  
Proceeding Paper
Leveraging LLMs and Computer Vision for Personalized Nutrition Advice
by Mateo Tokić, Časlav Livada, Tomislav Galba and Alfonzo Baumgartner
Eng. Proc. 2026, 125(1), 21; https://doi.org/10.3390/engproc2026125021 - 16 Feb 2026
Abstract
This paper investigates the application of large language models (LLMs) in the domain of dietary advice, focusing on the recognition of ingredients and nutritional values from food products and the integration of this information into a system capable of delivering personalized recommendations. The research involved the development of a mobile application utilizing React Native and Python Flask frameworks. Optical character recognition (OCR) was implemented through the docTR model to extract nutritional information and ingredients from product images. Based on the extracted data and user profiles stored in a Firestore database, the system generates tailored dietary advice employing OpenAI’s GPT-3.5-turbo model. The findings demonstrate the feasibility of using LLMs to provide personalized dietary recommendations, thereby opening new opportunities in the digital transformation of nutrition and dietary planning.

16 pages, 2339 KB  
Article
DRAG: Dual-Channel Retrieval-Augmented Generation for Hybrid-Modal Document Understanding
by Zhe Xin, Shuyuan Xia and Xin Guo
Electronics 2026, 15(4), 843; https://doi.org/10.3390/electronics15040843 - 16 Feb 2026
Abstract
Large Language Models (LLMs) have acquired vast amounts of knowledge during pre-training. However, several challenges arise when they are deployed in real-world applications, such as poor interpretability, hallucinations, and the inability to reference private data. To address these issues, Retrieval-Augmented Generation (RAG) has been proposed. Traditional RAG with text-based retrievers often converts documents using Optical Character Recognition (OCR) before retrieval, but testing reveals that this approach tends to overlook tables and images contained within the documents. RAG with vision-based retrievers, in contrast, often loses information on text-dense pages. To address these limitations, we propose DRAG: Dual-channel Retrieval-Augmented Generation for Hybrid-Modal Document Understanding, a novel retrieval paradigm. The DRAG method proposed in this paper comprises two core improvements: first, a parallel dual-channel processing architecture is adopted to separately extract and preserve the visual structural information and deep semantic information of documents, thereby effectively enhancing information integrity; second, a novel dynamic weighted fusion mechanism is proposed to integrate the retrieval results from both channels, enabling precise screening of the most relevant information segments. Empirical results demonstrate that our method achieves competitive performance across multiple general benchmarks. Furthermore, performance on biomedical datasets (e.g., BioM) specifically highlights its potential in specialized, vertical domains such as elderly care and rehabilitation, where documents are characterized by dense hybrid-modal information.
(This article belongs to the Special Issue AI-Driven Intelligent Systems in Energy, Healthcare, and Beyond)

47 pages, 668 KB  
Review
The Taxonomy of the Genus Entamoeba (Archamoebea: Endamoebidae): A Historical and Nomenclatural Review
by Lorena Esteban-Sánchez, Rafael Alberto Martínez-Díaz and Francisco Ponce-Gordo
Pathogens 2026, 15(2), 213; https://doi.org/10.3390/pathogens15020213 - 13 Feb 2026
Abstract
Throughout history, species within the genus Entamoeba have been described using criteria that were not always applied consistently, resulting in an often confusing and controversial taxonomy. Several factors contributed to this situation, including the limited number of morphological characters available for taxonomic studies, overlapping host ranges, mixed infections, and a cosmopolitan distribution associated with human and animal movements. The incorporation of genetic data as diagnostic and differential criteria during the second half of the twentieth century enabled the recognition of cryptic species and the proposal of new taxa; however, significant taxonomic issues remain unresolved. This review summarizes the historical development and major controversies in the taxonomy of Entamoeba, from its origins in the late nineteenth century, when morphology and host association were the available criteria, to the present day, in which molecular approaches provide a more realistic view of species diversity and interspecific relationships. Based on this analysis, general principles are proposed as a pragmatic synthesis to guide future taxonomic work on Entamoeba, emphasising lineage-based species delimitation, the central role of molecular evidence when diagnostic morphology is lacking, the contextual value of host data, and the need for nomenclatural decisions grounded in biological evidence and historical rigour.
(This article belongs to the Section Parasitic Pathogens)
28 pages, 21245 KB  
Article
A Comparative Study of OCR Architectures for Korean License Plate Recognition: CNN–RNN-Based Models and MobileNetV3–Transformer-Based Models
by Seungju Lee and Gooman Park
Sensors 2026, 26(4), 1208; https://doi.org/10.3390/s26041208 - 12 Feb 2026
Abstract
This paper presents a systematic comparative study of optical character recognition (OCR) architectures for Korean license plate recognition under identical detection conditions. Although recent automatic license plate recognition (ALPR) systems increasingly adopt Transformer-based decoders, it remains unclear whether performance differences arise primarily from sequence modeling strategies or from backbone feature representations. To address this issue, we employ a unified YOLOv12-based license plate detector and evaluate multiple OCR configurations, including a CNN with an Attention-LSTM decoder and a MobileNetV3 with a Transformer decoder. To ensure a fair comparison, a controlled ablation study is conducted in which the CNN backbone is fixed to ResNet-18 while varying only the sequence decoder. Experiments are performed on both static image datasets and tracking-based sequential datasets, assessing recognition accuracy, error characteristics, and processing speed across GPU and embedded platforms. The results demonstrate that the effectiveness of sequence decoders is highly dataset-dependent and strongly influenced by feature quality and region-of-interest (ROI) stability. Quantitative analysis further shows that tracking-induced error accumulation dominates OCR performance in sequential recognition scenarios. Moreover, Korean license plate–specific error patterns reveal failure modes not captured by generic OCR benchmarks. Finally, experiments on embedded platforms indicate that Transformer-based OCR models introduce significant computational and memory overhead, limiting their suitability for real-time deployment. These findings suggest that robust license plate recognition requires joint consideration of detection, tracking, and recognition rather than isolated optimization of OCR architectures.
(This article belongs to the Section Sensing and Imaging)

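As context for the CNN–RNN recognizers compared above: a common baseline decoding step in such sequence-recognition pipelines is CTC-style greedy decoding. This is an adjacent illustration only, since the decoders evaluated in this paper are attention-based; per-frame score lists and the blank index are assumptions:

```python
def ctc_greedy_decode(logits, blank=0):
    """Greedy CTC decoding: take the argmax label per frame, collapse
    consecutive repeats, then drop blank tokens."""
    path = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    decoded, prev = [], None
    for label in path:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded
```

The blank token lets the decoder emit genuinely repeated characters: a repeat survives collapsing only when a blank frame separates the two occurrences.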
24 pages, 837 KB  
Article
HDIM-JER: Modeling Higher-Order Semantic Dependencies for Joint Entity–Relation Extraction in Threat Intelligence Texts
by Siyu Zhu, Weicheng Mao, Lin Miao, Jing Yin, Chao Du, Xin Li, Xiangyun Guo, Liang Wang and Ning Li
Symmetry 2026, 18(2), 340; https://doi.org/10.3390/sym18020340 - 12 Feb 2026
Abstract
Extracting structured threat intelligence from unstructured cybersecurity texts requires accurate identification of entities together with their underlying semantic relations. However, threat reports often exhibit intricate sentence structures, long-range contextual dependencies, and tightly coupled entity–relation patterns, which pose substantial challenges for existing extraction approaches. To address these challenges, this study investigates joint entity–relation extraction from the perspective of semantic dependency modeling and develops HDIM-JER, a unified framework that captures structured interactions among heterogeneous linguistic features. HDIM-JER integrates character-level cues, contextual representations, and higher-order semantic dependency evidence to enhance structural awareness during joint inference, where different second-order dependency configurations provide an interpretable perspective on structurally symmetric and hierarchically asymmetric interaction patterns among entity–relation instances. By incorporating multi-level dependency interactions, HDIM-JER effectively alleviates error propagation associated with pipeline-based architectures and improves the modeling of complex relational dependencies. Extensive experiments on a threat intelligence corpus and a public benchmark dataset demonstrate consistent performance improvements over representative state-of-the-art methods in both entity recognition and relation extraction, confirming the effectiveness of higher-order semantic dependency interaction modeling for threat intelligence analysis.
(This article belongs to the Special Issue Symmetry and Its Applications in Computer Vision)

17 pages, 19690 KB  
Article
Multilingual Intelligent Retrieval System via Unified End-to-End OCR and Hybrid Search
by Shuo Yang, Zhandong Liu, Ke Li, Ruixia Song, Yong Li and Xiangwei Qi
Appl. Sci. 2026, 16(4), 1771; https://doi.org/10.3390/app16041771 - 11 Feb 2026
Abstract
This study addresses the limitations of current Optical Character Recognition (OCR) systems in supporting minority languages and integrating intelligent retrieval functions. We propose an integrated system that combines an advanced end-to-end OCR model with a novel hybrid search approach. First, we developed the MultiLang-OCR-30K dataset containing 30,000 annotated samples of handwritten Chinese, Tibetan, and Uyghur texts. Second, we extended the GOT model using a freeze encoder–fine-tune decoder strategy to enhance multilingual capabilities. Finally, we designed a character-level hybrid retrieval framework integrating TF-IDF efficiency with Sentence-BERT semantic strength. Experimental results show our extended GOT model achieves sentence accuracies of 82.3%, 76.5%, and 78.1% for handwritten Chinese, Tibetan, and Uyghur, respectively. The hybrid search improves F1 score by 28.7% over TF-IDF alone while maintaining 23 ms average response time. This system provides a practical solution for multilingual document digitization and management, thereby bridging the technological gap for minority languages.

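The hybrid retrieval idea described above, fusing a lexical TF-IDF channel with a dense semantic channel, can be sketched as a weighted combination of normalised scores. This is a generic illustration; the max-normalisation and linear weighting are assumptions, not the paper's exact fusion mechanism, and the semantic scores stand in for Sentence-BERT similarities:

```python
import math
from collections import Counter

def tfidf_scores(query_terms, docs):
    """Lexical channel: sum of smoothed tf-idf weights of the query terms
    for each document (docs are lists of tokens)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        scores.append(sum(tf[t] * math.log((1 + n) / (1 + df[t]))
                          for t in query_terms if t in tf))
    return scores

def hybrid_rank(lexical, semantic, alpha=0.5):
    """Fuse max-normalised lexical and semantic scores; return document
    indices ranked by the weighted combination (alpha weights lexical)."""
    def norm(xs):
        m = max(xs)
        return [x / m for x in xs] if m else [0.0] * len(xs)
    fused = [alpha * l + (1 - alpha) * s
             for l, s in zip(norm(lexical), norm(semantic))]
    return sorted(range(len(fused)), key=lambda i: -fused[i])
```

With alpha near 1 the ranking degenerates to pure TF-IDF; lowering alpha lets semantically similar but lexically disjoint documents surface.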
17 pages, 373 KB  
Article
Exploring the Character Transposition Effect and Locus in Chinese Word Recognition: Evidence from Left–Right Visual Field Processing in Primary School Children
by Yi Song, Yuhan Jiang, Yuru Cheng, Lei Zhang and Jingxin Wang
Behav. Sci. 2026, 16(2), 251; https://doi.org/10.3390/bs16020251 - 9 Feb 2026
Abstract
Prior research has offered substantial evidence for the letter transposition effect in word reading, yet studies in logographic languages such as Chinese are scarce and have largely focused on adults. This study aimed to determine whether second-grade children show a character transposition effect in recognizing two-character Chinese words and to examine potential differences between the left and right visual fields corresponding to the two cerebral hemispheres. A lexical decision task was used across two experiments. Experiment 1 tested 56 second graders and manipulated three stimulus types—normal words, transposed pseudo-words, and substituted pseudo-words—to verify the presence of the effect. Experiment 2 recruited an independent sample of 97 second graders and applied a lateralized presentation paradigm, presenting stimuli to either the right or left visual field (RVF/LVF), which project to the left and right hemispheres (LH/RH), respectively, to assess hemispheric differences. Experiment 1 revealed a significant character transposition effect among second-grade children. Experiment 2 showed no significant differences in the magnitude of the effect between the two visual fields. These findings provide new developmental evidence for Chinese word reading and important implications for theories of position encoding. Future studies should trace the effect's developmental trajectory across a wider age range and diverse learning contexts.
(This article belongs to the Section Cognition)