Search Results (84)

Search Parameters:
Keywords = dependency parsing

19 pages, 726 KB  
Article
Structural–Semantic Term Weighting for Interpretable Topic Modeling with Higher Coherence and Lower Token Overlap
by Dmitriy Rodionov, Evgenii Konnikov, Gleb Golikov and Polina Yakob
Information 2026, 17(1), 22; https://doi.org/10.3390/info17010022 - 31 Dec 2025
Viewed by 183
Abstract
Topic modeling of large news streams is widely used to reconstruct economic and political narratives, which requires coherent topics with low lexical overlap while remaining interpretable to domain experts. We propose TF-SYN-NER-Rel, a structural–semantic term weighting scheme that extends classical TF-IDF by integrating positional, syntactic, factual, and named-entity coefficients derived from morphosyntactic and dependency parses of Russian news texts. The method is embedded into a standard Latent Dirichlet Allocation (LDA) pipeline and evaluated on a large Russian-language news corpus from the online archive of Moskovsky Komsomolets (over 600,000 documents), with political, financial, and sports subsets obtained via dictionary-based expert labeling. For each subset, TF-SYN-NER-Rel is compared with standard TF-IDF under identical LDA settings, and topic quality is assessed using the C_v coherence metric. To assess robustness, we repeat model training across multiple random initializations and report aggregate coherence statistics. Quantitative results show that TF-SYN-NER-Rel improves coherence and yields smoother, more stable coherence curves across the number of topics. Qualitative analysis indicates reduced lexical overlap between topics and clearer separation of event-centered and institutional themes, especially in political and financial news. Overall, the proposed pipeline relies on CPU-based NLP tools and sparse linear algebra, providing a computationally lightweight and interpretable complement to embedding- and LLM-based topic modeling in large-scale news monitoring. Full article
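
As an illustration of how such a weighting scheme can be wired into a bag-of-words pipeline, the sketch below multiplies a plain TF-IDF score by positional, syntactic, and named-entity coefficients. The coefficient values, the token attributes, and the multiplicative combination are assumptions for illustration, not the published TF-SYN-NER-Rel formula.

```python
# Minimal sketch of structural-semantic term weighting on top of TF-IDF.
# Coefficient values and the combination rule are illustrative assumptions.
import math
from collections import Counter

def weighted_scores(docs, pos_boost=1.2, syn_boost=1.5, ner_boost=1.3):
    """docs: list of token lists; each token is a dict with keys
    'text', 'dep' (dependency label), 'is_entity', 'sent_pos' (0..1)."""
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update({t["text"] for t in doc})
    all_scores = []
    for doc in docs:
        tf = Counter(t["text"] for t in doc)
        coeff = {}
        for t in doc:
            c = 1.0
            if t["sent_pos"] < 0.3:                    # positional coefficient
                c *= pos_boost
            if t["dep"] in ("ROOT", "nsubj", "obj"):   # syntactic coefficient
                c *= syn_boost
            if t["is_entity"]:                         # named-entity coefficient
                c *= ner_boost
            coeff[t["text"]] = max(coeff.get(t["text"], 1.0), c)
        idf = {w: math.log(1 + n_docs / df[w]) for w in tf}
        all_scores.append({w: tf[w] * idf[w] * coeff[w] for w in tf})
    return all_scores

doc = [{"text": "inflation", "dep": "nsubj", "is_entity": False, "sent_pos": 0.1},
       {"text": "rose", "dep": "ROOT", "is_entity": False, "sent_pos": 0.2},
       {"text": "sharply", "dep": "advmod", "is_entity": False, "sent_pos": 0.4}]
print(weighted_scores([doc]))
```

The reweighted document-term scores would then replace raw counts before fitting LDA.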

25 pages, 69315 KB  
Article
GMGbox: A Graphical Modeling-Based Protocol Adaptation Engine for Industrial Control Systems
by Rong Zheng, Song Zheng, Chaoru Liu, Liang Yue and Hongyu Wu
Appl. Sci. 2025, 15(23), 12792; https://doi.org/10.3390/app152312792 - 3 Dec 2025
Viewed by 297
Abstract
The agility and scalability of modern industrial control systems critically depend on the seamless integration of heterogeneous field devices. However, this integration is fundamentally hindered at the communication level by the diversity of proprietary industrial protocols, which creates data silos and impedes the implementation of advanced control strategies. To overcome this communication barrier, this paper presents GMGbox, a graphical modeling-based protocol adaptation engine. GMGbox encapsulates protocol parsing and data conversion logic into reusable graphical components, effectively bridging the communication gap between diverse industrial devices and control applications. These components are orchestrated by a graphical modeling program engine that enables codeless protocol configuration and supports dynamic loading of protocol dictionary templates to integrate protocol variants, thereby ensuring high extensibility. Experimental results demonstrate that GMGbox can concurrently and reliably parse multiple heterogeneous industrial communication protocols, such as Mitsubishi MELSEC-QNA, Siemens S7-TCP, and Modbus-TCP. Furthermore, it allows engineers to visually adjust protocol algorithms and parameters online, significantly reducing development complexity and iteration time. The proposed engine provides a flexible and efficient data communication backbone for building reconfigurable industrial control systems. Full article
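
To make the idea of a reusable protocol-parsing component concrete, here is a small decoder for one of the protocols named above, Modbus-TCP. The field layout follows the public Modbus-TCP framing; the component interface and its mapping to engine tags are assumptions rather than GMGbox's actual API.

```python
# Toy illustration of a protocol parsing component: decode a Modbus-TCP
# MBAP header + PDU into named fields a graphical engine could map to tags.
import struct

def parse_modbus_tcp(frame: bytes) -> dict:
    trans_id, proto_id, length, unit_id, func = struct.unpack(">HHHBB", frame[:8])
    return {
        "transaction_id": trans_id,
        "protocol_id": proto_id,       # always 0 for Modbus-TCP
        "length": length,              # number of bytes following the length field
        "unit_id": unit_id,
        "function_code": func,
        "data": frame[8:8 + length - 2],
    }

# Read Holding Registers request: 2 registers starting at address 0x0000
frame = bytes.fromhex("000100000006010300000002")
print(parse_modbus_tcp(frame))
```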

19 pages, 1572 KB  
Article
Proximity Loses: Real-Time Resolution of Ambiguous Wh-Questions in Japanese
by Chie Nakamura, Suzanne Flynn, Yoichi Miyamoto and Noriaki Yusa
Languages 2025, 10(12), 288; https://doi.org/10.3390/languages10120288 - 26 Nov 2025
Viewed by 329
Abstract
This study investigated how Japanese speakers interpret structurally ambiguous wh-questions, testing whether filler–gap resolution is guided by syntactic resolution based on hierarchical structure or linear locality based on surface word order. We combined behavioral key-press responses with fine-grained eye-tracking data and applied cluster-based permutation analysis to capture the moment-by-moment time course of syntactic interpretation as sentences were processed in real time. Key-press responses revealed a preference for resolving the dependency at the main clause (MC) gap position. Eye-tracking data showed early predictive fixations to the MC picture, followed by shifts to the embedded clause (EC) picture as the embedded event was described. These shifts occurred prior to the appearance of syntactic cues that signal the presence of an EC structure, such as the complementizer -to, and were therefore most likely guided by referential alignment with the linguistic input rather than by syntactic reanalysis. A subsequent return of the gaze to the MC picture occurred when the clause-final question particle -ka became available, confirming the interrogative use of the wh-phrase. Both key-press and eye-tracking data showed that participants did not commit to the first grammatically available EC interpretation but instead waited until clause-final particle information confirmed the interrogative use of the wh-phrase, ultimately favoring the MC interpretation. This pattern supports the view that filler–gap resolution is guided by structural locality rather than linear locality. By using high-resolution temporal data and statistically robust analytic techniques, this study demonstrates that Japanese comprehenders engage in predictive yet structurally cautious parsing. These findings challenge earlier claims that filler–gap resolution in Japanese is primarily driven by linear locality and instead show a preference for resolving dependencies at the structurally higher MC position, consistent with parsing biases previously observed in English, despite typological differences in word order between the two languages. This preference also reflects sensitivity to language-specific morpho-syntactic cues in Japanese, such as clause-final particles. Full article

19 pages, 4893 KB  
Article
LLMs in Staging: An Orchestrated LLM Workflow for Structured Augmentation with Fact Scoring
by Giuseppe Trimigno, Gianfranco Lombardo, Michele Tomaiuolo, Stefano Cagnoni and Agostino Poggi
Future Internet 2025, 17(12), 535; https://doi.org/10.3390/fi17120535 - 24 Nov 2025
Viewed by 507
Abstract
Retrieval-augmented generation (RAG) enriches prompts with external knowledge, but it often relies on additional infrastructure that may be impractical in resource-constrained or offline settings. In addition, updating the internal knowledge of a language model through retraining is costly and inflexible. To address these limitations, we propose an explainable and structured prompt augmentation pipeline that enhances inputs using pre-trained models and rule-based extractors, without requiring external sources. We describe this approach as an orchestrated LLM workflow: a structured sequence in which lightweight LLM modules assume specialized roles. Specifically, (1) an extractor module identifies factual triples from input prompts by combining dependency parsing with a rule-based extraction algorithm; (2) a scorer module, based on a generic lightweight LLM, evaluates the importance of each triple via its self-attention patterns, leveraging internal beliefs to promote explainability and trustworthy cooperation with the downstream model; (3) a performer module processes the augmented prompt for downstream tasks in supervised fine-tuning or zero-shot settings. Much like in a theater staging, each module operates transparently behind the scenes to support and elevate the performer’s final output. We evaluate this approach across multiple performer architectures (encoder-only, encoder-decoder, and decoder-only) and NLP tasks (multiple-choice QA, open-book QA, and summarization). Our results show that this structured augmentation with scored facts yields consistent improvements compared to baseline prompting: up to a 28.78% accuracy improvement for multiple-choice QA, up to a 9.42% BLEURT improvement for open-book QA, and up to an 18.14% ROUGE-L improvement for summarization. By decoupling knowledge scoring from task execution, our method provides a practical, interpretable, and low-cost alternative to RAG in static or knowledge-limited environments. Full article
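
A rough sketch of the extractor stage described in (1): subject-verb-object triples pulled from a prompt with dependency parsing. spaCy and its en_core_web_sm model are assumed stand-ins here; the paper's rule-based extraction algorithm is more elaborate than this.

```python
# Minimal extractor sketch: harvest (subject, relation, object) triples
# from the dependency parse of each sentence in a prompt.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed model, not the paper's setup

def extract_triples(text):
    triples = []
    for sent in nlp(text).sents:
        for tok in sent:
            if tok.pos_ == "VERB":
                subjects = [c for c in tok.children if c.dep_ in ("nsubj", "nsubjpass")]
                objects = [c for c in tok.children if c.dep_ in ("dobj", "obj", "attr")]
                for s in subjects:
                    for o in objects:
                        triples.append((s.text, tok.lemma_, o.text))
    return triples

print(extract_triples("The extractor module identifies factual triples from input prompts."))
```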

18 pages, 272 KB  
Article
Measuring Narrative Complexity Among Suicide Deaths in the National Violent Death Reporting System (2003–2021 NVDRS)
by Christina Chance, Alina Arseniev-Koehler, Vickie M. Mays, Kai-Wei Chang and Susan D. Cochran
Information 2025, 16(11), 989; https://doi.org/10.3390/info16110989 - 15 Nov 2025
Viewed by 496
Abstract
A widely used repository of violent death records is the U.S. Centers for Disease Control National Violent Death Reporting System (NVDRS). The NVDRS includes narrative data, which researchers frequently utilize to go beyond its structured variables. Prior work has shown that NVDRS narratives vary in length depending on decedent and incident characteristics, including race/ethnicity. Whether these length differences reflect differences in narrative information potential is unclear. We use the 2003–2021 NVDRS to investigate narrative length and complexity measures among 300,323 suicides varying in decedent and incident characteristics. To do so, we operationalized narrative complexity using three manifest measures: word count, sentence count, and dependency tree depth. We then employed regression methods to predict word counts and narrative complexity scores from decedent and incident characteristics. Both were consistently lower for black non-Hispanic decedents compared to white non-Hispanic decedents. Although narrative complexity is just one aspect of narrative information potential, these findings suggest that the information in NVDRS narratives is more limited for some racial/ethnic minorities. Future studies, possibly leveraging large language models, are needed to develop robust measures to aid in determining whether narratives in the NVDRS have achieved their stated goal of fully describing the circumstances of suicide. Full article
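
Of the three manifest measures, dependency tree depth is the least self-explanatory; the sketch below computes all three for a narrative, using spaCy's parser as an assumed stand-in for whatever tooling the authors used.

```python
# Hedged sketch of the three complexity measures: word count, sentence count,
# and maximum dependency tree depth across sentences.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed parser

def tree_depth(token):
    """Depth of the dependency subtree rooted at `token`."""
    children = list(token.children)
    return 1 if not children else 1 + max(tree_depth(c) for c in children)

def narrative_complexity(text):
    doc = nlp(text)
    sents = list(doc.sents)
    return {
        "word_count": sum(1 for t in doc if not t.is_punct),
        "sentence_count": len(sents),
        "max_dep_depth": max(tree_depth(s.root) for s in sents),
    }

print(narrative_complexity("The decedent had a history of depression. He left a note."))
```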

22 pages, 979 KB  
Article
Multi-Modal Semantic Fusion for Smart Contract Vulnerability Detection in Cloud-Based Blockchain Analytics Platforms
by Xingyu Zeng, Qiaoyan Wen and Sujuan Qin
Electronics 2025, 14(21), 4188; https://doi.org/10.3390/electronics14214188 - 27 Oct 2025
Viewed by 754
Abstract
With the growth of trusted computing demand for big data analysis, cloud computing platforms are reshaping trusted data infrastructure by integrating Blockchain as a Service (BaaS), which uses elastic resource scheduling and heterogeneous hardware acceleration to support petabyte-level multi-institution data security exchange in medical, financial, and other fields. As the core hub of data-intensive scenarios, the BaaS platform has the dual capabilities of privacy computing and process automation. However, its deep dependence on smart contracts generates new code layer vulnerabilities, resulting in malicious contamination of analysis results. The existing detection schemes are limited to the perspective of single-source data, which makes it difficult to capture both global semantic associations and local structural details in a cloud computing environment, leading to a performance bottleneck in terms of scalability and detection accuracy. To address these challenges, this paper proposes a smart contract vulnerability detection method based on multi-modal semantic fusion for the blockchain analysis platform of cloud computing. Firstly, the contract source code is parsed into an abstract syntax tree, and the key code is accurately located based on the predefined vulnerability feature set. Then, the text features and graph structure features of key codes are extracted in parallel to realize their deep fusion. Finally, with the help of attention enhancement, the vulnerability probability is output through a fully connected network. The experiments on Ethereum benchmark datasets show that the detection accuracy of our method for re-entrancy vulnerability, timestamp vulnerability, overflow/underflow vulnerability, and delegatecall vulnerability can reach 92.2%, 96.3%, 91.4%, and 89.5%, surpassing previous methods. Additionally, our method has the potential for practical deployment in cloud-based blockchain service environments. Full article
(This article belongs to the Special Issue New Trends in Cloud Computing for Big Data Analytics)
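
The fusion-and-classification step can be pictured with a small PyTorch module: text features and graph-structure features of the located key code are concatenated, gated by an attention weight, and passed through a fully connected head that outputs a vulnerability probability. Dimensions and the gating form are assumptions, not the paper's architecture.

```python
# Rough sketch of multi-modal feature fusion with attention enhancement.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, text_dim=256, graph_dim=128, hidden=128):
        super().__init__()
        self.attn = nn.Linear(text_dim + graph_dim, text_dim + graph_dim)
        self.mlp = nn.Sequential(
            nn.Linear(text_dim + graph_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, text_feat, graph_feat):
        fused = torch.cat([text_feat, graph_feat], dim=-1)
        gated = fused * torch.sigmoid(self.attn(fused))   # attention enhancement
        return torch.sigmoid(self.mlp(gated))             # vulnerability probability

model = FusionClassifier()
prob = model(torch.randn(4, 256), torch.randn(4, 128))
print(prob.shape)  # torch.Size([4, 1])
```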

8 pages, 218 KB  
Proceeding Paper
Towards an Explainable Linguistic Approach to the Identification of Expressive Forms Within Arabic Text
by Zouheir Banou, Sanaa El Filali, El Habib Benlahmar, Fatima-Zahra Alaoui and Laila El Jiani
Eng. Proc. 2025, 112(1), 26; https://doi.org/10.3390/engproc2025112026 - 15 Oct 2025
Viewed by 466
Abstract
This paper presents a rule-based negation and litotes detection system for Modern Standard Arabic. Unlike purely statistical approaches, the proposed pipeline leverages linguistic structures, lexical resources, and dependency parsing to identify negated expressions, exception clauses, and instances of litotic inversion, where rhetorical negation conveys an implicit positive meaning. The system was applied to a large-scale subset of the Arabic OSCAR corpus, filtered by sentence length and syntactic structure. The results show the successful detection of 5193 negated expressions and 1953 litotic expressions through antonym matching. Additionally, 200 instances involving exception prepositions were identified, reflecting their syntactic specificity and rarity in Arabic. The system is fully interpretable, reproducible, and well-suited to low-resource environments where machine learning approaches may not be viable. Its ability to scale across heterogeneous data while preserving linguistic sensitivity demonstrates the relevance of rule-based systems for morphologically rich and structurally complex languages. This work contributes a practical framework for analyzing negation phenomena and offers insight into rhetorical inversion in Arabic discourse. Although coverage of rarer structures is limited, the pipeline provides a solid foundation for future refinement and domain-specific applications in figurative language processing. Full article
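
A toy version of the rule-based idea is shown below: scan for negation particles and for the negation-plus-exception pattern (e.g., لا ... إلا ...) that often carries a litotes-like positive reading. The particle lists and scope window are assumptions; the actual pipeline additionally uses dependency parses, lexical resources, and antonym matching.

```python
# Illustrative rule-based detection of negation and negation+exception patterns
# in Modern Standard Arabic. Particle sets and window size are assumptions.
NEGATION_PARTICLES = {"لا", "لم", "لن", "ما", "ليس"}
EXCEPTION_PARTICLES = {"إلا", "سوى", "غير"}

def detect_negation_spans(tokens, window=6):
    hits = []
    for i, tok in enumerate(tokens):
        if tok in NEGATION_PARTICLES:
            scope = tokens[i + 1 : i + 1 + window]
            hits.append({
                "index": i,
                "particle": tok,
                "scope": scope,
                "exception_pattern": any(t in EXCEPTION_PARTICLES for t in scope),
            })
    return hits

sample = "لا يوجد في القاعة إلا طالب واحد".split()
print(detect_negation_spans(sample))
```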

16 pages, 235 KB  
Entry
The Computational Study of Old English
by Javier Martín Arista
Encyclopedia 2025, 5(3), 137; https://doi.org/10.3390/encyclopedia5030137 - 4 Sep 2025
Viewed by 1511
Definition
This entry presents a comprehensive overview of the computational study of Old English that surveys the evolution from early digital corpora to recent artificial intelligence applications. Six interconnected domains are examined: textual resources (including the Helsinki Corpus, the Dictionary of Old English Corpus, and the York-Toronto-Helsinki Parsed Corpus), lexicographical resources (analysing approaches from Bosworth–Toller to the Dictionary of Old English), corpus lemmatisation (covering both prose and poetic texts), treebanks (particularly Universal Dependencies frameworks), and artificial intelligence applications. The paper shows that computational methodologies have transformed Old English studies because they facilitate large-scale analyses of morphology, syntax, and semantics previously impossible through traditional philological methods. Recent innovations are highlighted, including the development of lexical databases like Nerthusv5, dependency parsing methods, and the application of transformer models and NLP libraries to historical language processing. In spite of these remarkable advances, problems persist, including limited corpus size, orthographic inconsistency, and methodological difficulties in applying modern computational techniques to historical languages. The conclusion is reached that the future of computational Old English studies lies in the integration of AI capabilities with traditional philological expertise, an approach that enhances traditional scholarship and opens new avenues for understanding Anglo-Saxon language and culture. Full article
(This article belongs to the Section Arts & Humanities)
16 pages, 1328 KB  
Article
Parsing Old English with Universal Dependencies—The Impacts of Model Architectures and Dataset Sizes
by Javier Martín Arista, Ana Elvira Ojanguren López and Sara Domínguez Barragán
Big Data Cogn. Comput. 2025, 9(8), 199; https://doi.org/10.3390/bdcc9080199 - 30 Jul 2025
Viewed by 2122
Abstract
This study presents the first systematic empirical comparison of neural architectures for Universal Dependencies (UD) parsing in Old English, thus addressing central questions in computational historical linguistics and low-resource language processing. We evaluate three approaches—a baseline spaCy pipeline, a pipeline with a pretrained tok2vec component, and a MobileBERT transformer-based model—across datasets ranging from 1000 to 20,000 words. Our results demonstrate that the pretrained tok2vec model consistently outperforms alternatives, because it achieves 83.24% UAS and 74.23% LAS with the largest dataset, whereas the transformer-based approach substantially underperforms despite higher computational costs. Performance analysis reveals that basic tagging tasks reach 85–90% accuracy, while dependency parsing achieves approximately 75% accuracy. We identify critical scaling thresholds, with substantial improvements occurring between 1000 and 5000 words and diminishing returns beyond 10,000 words, which provides insights into scaling laws for historical languages. Technical analysis reveals that the poor performance of the transformer stems from parameter-to-data ratio mismatches (1250:1) and the unique orthographic and morphological characteristics of Old English. These findings defy assumptions about transformer superiority in low-resource scenarios and establish evidence-based guidelines for researchers working with historical languages. The broader significance of this study extends to enabling an automated analysis of three million words of extant Old English texts and providing a framework for optimal architecture selection in data-constrained environments. Our results suggest that medium-complexity architectures with monolingual pretraining offer superior cost–benefit trade-offs compared to complex transformer models for historical language processing. Full article
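
For readers unfamiliar with the two parsing metrics reported here, UAS is the share of tokens whose predicted head is correct, and LAS additionally requires the correct dependency label. A minimal computation looks like this (the token tuples are made up):

```python
# Unlabeled and labeled attachment scores over (head_index, deprel) tuples.
def uas_las(gold, pred):
    assert len(gold) == len(pred)
    head_ok = sum(g[0] == p[0] for g, p in zip(gold, pred))   # head only
    both_ok = sum(g == p for g, p in zip(gold, pred))         # head + label
    n = len(gold)
    return head_ok / n, both_ok / n

gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "obl")]
print(uas_las(gold, pred))  # (1.0, 0.666...)
```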

19 pages, 2564 KB  
Article
FLIP: A Novel Feedback Learning-Based Intelligent Plugin Towards Accuracy Enhancement of Chinese OCR
by Xinyue Tao, Yueyue Han, Yakai Jin and Yunzhi Wu
Mathematics 2025, 13(15), 2372; https://doi.org/10.3390/math13152372 - 24 Jul 2025
Viewed by 1185
Abstract
Chinese Optical Character Recognition (OCR) technology is essential for digital transformation in Chinese regions, enabling automated document processing across various applications. However, Chinese OCR systems struggle with visually similar characters, where subtle stroke differences lead to systematic recognition errors that limit practical deployment accuracy. This study develops FLIP (Feedback Learning-based Intelligent Plugin), a lightweight post-processing plugin designed to improve Chinese OCR accuracy across different systems without external dependencies. The plugin operates through three core components as follows: UTF-8 encoding-based output parsing that converts OCR results into mathematical representations, error correction using information entropy and weighted similarity measures to identify and fix character-level errors, and adaptive feedback learning that optimizes parameters through user interactions. The approach functions entirely through mathematical calculations at the character encoding level, ensuring universal compatibility with existing OCR systems while effectively handling complex Chinese character similarities. The plugin’s modular design enables seamless integration without requiring modifications to existing OCR algorithms, while its feedback mechanism adapts to domain-specific terminology and user preferences. Experimental evaluation on 10,000 Chinese document images using four state-of-the-art OCR models demonstrates consistent improvements across all tested systems, with precision gains ranging from 1.17% to 10.37% and overall Chinese character recognition accuracy exceeding 98%. The best performing model achieved 99.42% precision, with ablation studies confirming that feedback learning contributes additional improvements from 0.45% to 4.66% across different OCR architectures. Full article
(This article belongs to the Special Issue Crowdsourcing Learning: Theories, Algorithms, and Applications)
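
One simplified way to picture character-level post-correction of visually similar Chinese characters is a confusion-set lookup against a domain lexicon, as sketched below. The confusion pairs and lexicon are placeholders; FLIP's actual correction works on UTF-8 encodings with information entropy and weighted similarity rather than this dictionary shortcut.

```python
# Toy post-correction: swap in a visually confusable character when the
# result matches a known domain word. Confusion set and lexicon are made up.
CONFUSION = {"末": ["未"], "未": ["末"], "己": ["已", "巳"]}

def correct_word(word, lexicon):
    if word in lexicon:
        return word
    for i, ch in enumerate(word):
        for alt in CONFUSION.get(ch, []):
            candidate = word[:i] + alt + word[i + 1:]
            if candidate in lexicon:
                return candidate
    return word

print(correct_word("末来", {"未来", "已经"}))  # -> 未来
```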

26 pages, 1804 KB  
Article
Dependency-Aware Entity–Attribute Relationship Learning for Text-Based Person Search
by Wei Xia, Wenguang Gan and Xinpan Yuan
Big Data Cogn. Comput. 2025, 9(7), 182; https://doi.org/10.3390/bdcc9070182 - 7 Jul 2025
Viewed by 1147
Abstract
Text-based person search (TPS), a critical technology for security and surveillance, aims to retrieve target individuals from image galleries using textual descriptions. The existing methods face two challenges: (1) ambiguous attribute–noun association (AANA), where syntactic ambiguities lead to incorrect associations between attributes and the intended nouns; and (2) textual noise and relevance imbalance (TNRI), where irrelevant or non-discriminative tokens (e.g., ‘wearing’) reduce the saliency of critical visual attributes in the textual description. To address these aspects, we propose the dependency-aware entity–attribute alignment network (DEAAN), a novel framework that explicitly tackles AANA through dependency-guided attention and TNRI via adaptive token filtering. The DEAAN introduces two modules: (1) dependency-assisted implicit reasoning (DAIR) to resolve AANA through syntactic parsing, and (2) relevance-adaptive token selection (RATS) to suppress TNRI by learning token saliency. Experiments on CUHK-PEDES, ICFG-PEDES, and RSTPReid demonstrate state-of-the-art performance, with the DEAAN achieving a Rank-1 accuracy of 76.71% and an mAP of 69.07% on CUHK-PEDES, surpassing RDE by 0.77% in Rank-1 and 1.51% in mAP. Ablation studies reveal that DAIR and RATS individually improve Rank-1 by 2.54% and 3.42%, while their combination elevates the performance by 6.35%, validating their synergy. This work bridges structured linguistic analysis with adaptive feature selection, demonstrating practical robustness in surveillance-oriented TPS scenarios. Full article

24 pages, 27167 KB  
Article
ICT-Net: A Framework for Multi-Domain Cross-View Geo-Localization with Multi-Source Remote Sensing Fusion
by Min Wu, Sirui Xu, Ziwei Wang, Jin Dong, Gong Cheng, Xinlong Yu and Yang Liu
Remote Sens. 2025, 17(12), 1988; https://doi.org/10.3390/rs17121988 - 9 Jun 2025
Viewed by 1152
Abstract
Traditional single neural network-based geo-localization methods for cross-view imagery primarily rely on polar coordinate transformations while suffering from limited global correlation modeling capabilities. To address these fundamental challenges of weak feature correlation and poor scene adaptation, we present a novel framework termed ICT-Net (Integrated CNN-Transformer Network) that synergistically combines convolutional neural networks with Transformer architectures. Our approach harnesses the complementary strengths of CNNs in capturing local geometric details and Transformers in establishing long-range dependencies, enabling comprehensive joint perception of both local and global visual patterns. Furthermore, capitalizing on the Transformer’s flexible input processing mechanism, we develop an attention-guided non-uniform cropping strategy that dynamically eliminates redundant image patches with minimal impact on localization accuracy, thereby achieving enhanced computational efficiency. To facilitate practical deployment, we propose a deep embedding clustering algorithm optimized for rapid parsing of geo-localization information. Extensive experiments demonstrate that ICT-Net establishes new state-of-the-art localization accuracy on the CVUSA benchmark, achieving a top-1 recall rate improvement of 8.6% over previous methods. Additional validation on a challenging real-world dataset collected at Beihang University (BUAA) further confirms the framework’s effectiveness and practical applicability in complex urban environments, particularly showing 23% higher robustness to vegetation variations. Full article

19 pages, 8750 KB  
Article
FP-Deeplab: A Novel Face Parsing Network for Fine-Grained Boundary Detection and Semantic Understanding
by Borui Zeng, Can Shu, Ziqi Liao, Jingru Yu, Zhiyu Liu and Xiaoyan Chen
Appl. Sci. 2025, 15(11), 6016; https://doi.org/10.3390/app15116016 - 27 May 2025
Cited by 1 | Viewed by 1820
Abstract
Facial semantic segmentation, as a critical technology in high-level visual understanding, plays an important role in applications such as facial editing, augmented reality, and identity recognition. However, due to the complexity of facial structures, ambiguous boundaries, and inconsistent scales of facial components, traditional methods still suffer from significant limitations in detail preservation and contextual modeling. To address these challenges, this paper proposes a facial parsing network based on the Deeplabv3+ framework, named FP-Deeplab, which aims to improve segmentation performance and generalization capability through structurally enhanced modules. Specifically, two key modules are designed: (1) the Context-Channel Refine Feature Enhancement (CCR-FE) module, which integrates multi-scale contextual strip convolutions and Cross-Axis Attention and introduces a channel attention mechanism to strengthen the modeling of long-range spatial dependencies and enhances the perception and representation of boundary regions; (2) the Self-Modulation Attention Feature Integration with Regularization (SimFA) module, which combines local detail modeling and a parameter-free channel attention modulation mechanism to achieve fine-grained reconstruction and enhancement of semantic features, effectively mitigating boundary blur and information loss during the upsampling stage. The experimental results on two public facial segmentation datasets, CelebAMask-HQ and HELEN, demonstrate that FP-Deeplab improves the baseline model by 3.8% in Mean IoU and 2.3% in the overall F1-score on the HELEN dataset, and it achieves a Mean F1-score of 84.8% on the CelebAMask-HQ dataset. Furthermore, the proposed method shows superior accuracy and robustness in multiple key component categories, especially in long-tailed regions, validating its effectiveness. Full article

26 pages, 2363 KB  
Article
Generative Artificial Intelligence-Enabled Facility Layout Design Paradigm
by Fuwen Hu, Chun Wang and Xuefei Wu
Appl. Sci. 2025, 15(10), 5697; https://doi.org/10.3390/app15105697 - 20 May 2025
Cited by 4 | Viewed by 6816
Abstract
Facility layout design (FLD) is critical for optimizing manufacturing efficiency, yet traditional approaches struggle with complexity, dynamic constraints, and fragmented data integration. This study proposes a generative-AI-enabled facility layout design, a novel paradigm aligning with Industry 4.0, to address these challenges by integrating generative artificial intelligence (AI), semantic models, and data-driven optimization. The proposed method evolves from three historical paradigms: experience-based methods, operations research, and simulation-based engineering. The metamodel supporting the generative-AI-enabled facility layout design is the Asset Administration Shell (AAS), which digitizes physical assets and their relationships, enabling interoperability across systems. Domain-specific knowledge graphs, constructed by parsing AAS metadata and enriched by large language models (LLMs), capture multifaceted relationships (e.g., spatial adjacency, process dependencies, safety constraints) to guide layout generation. The convolutional knowledge graph embedding (ConvE) method is employed for link prediction, converting entities and relationships into low-dimensional vectors to infer optimal spatial arrangements while addressing data sparsity through negative sampling. The proposed reference architecture for generative-AI-enabled facility layout design supports end-to-end layout design, featuring a 3D visualization engine, AI-driven optimization, and real-time digital twins. Prototype testing demonstrates the system’s end-to-end generation ability from requirement-driven contextual prompts and extensively reduced complexity of modeling, integration, and optimization. Key innovations include the fusion of AAS with LLM-derived contextual knowledge, dynamic adaptation via big data streams, and a hybrid optimization approach balancing competing objectives. The 3D layout generation results demonstrate a scalable, adaptive solution for storage workshops, bridging gaps between isolated data models and human–AI collaboration. This research establishes a foundational framework for AI-driven facility planning, offering actionable insights for AI-enabled facility layout design adoption and highlighting future directions in the generative design of complex engineering. Full article
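
The ConvE-style link prediction mentioned here can be sketched as follows: head-entity and relation embeddings are reshaped to 2D, stacked, convolved, projected back to the embedding size, and scored against every candidate tail entity. Embedding sizes, the convolution setup, and the toy vocabulary are assumptions for illustration, not the authors' configuration.

```python
# Simplified ConvE-style scorer for knowledge-graph link prediction.
import torch
import torch.nn as nn

class ConvEScorer(nn.Module):
    def __init__(self, n_entities, n_relations, dim=200, shape=(10, 20)):
        super().__init__()
        self.ent = nn.Embedding(n_entities, dim)
        self.rel = nn.Embedding(n_relations, dim)
        self.shape = shape
        self.conv = nn.Conv2d(1, 32, kernel_size=3)
        conv_h, conv_w = 2 * shape[0] - 2, shape[1] - 2
        self.fc = nn.Linear(32 * conv_h * conv_w, dim)

    def forward(self, head_idx, rel_idx):
        h = self.ent(head_idx).view(-1, 1, *self.shape)
        r = self.rel(rel_idx).view(-1, 1, *self.shape)
        x = torch.relu(self.conv(torch.cat([h, r], dim=2)))   # stack along height
        x = torch.relu(self.fc(x.flatten(1)))
        return x @ self.ent.weight.t()                        # score every tail entity

scorer = ConvEScorer(n_entities=50, n_relations=8)
scores = scorer(torch.tensor([3]), torch.tensor([1]))  # e.g. (machine_3, adjacent_to, ?)
print(scores.shape)  # torch.Size([1, 50])
```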

31 pages, 5323 KB  
Article
Learning the Style via Mixed SN-Grams: An Evaluation in Authorship Attribution
by Juan Pablo Francisco Posadas-Durán, Germán Ríos-Toledo, Erick Velázquez-Lozada, J. A. de Jesús Osuna-Coutiño, Madaín Pérez-Patricio and Fernando Pech May
AI 2025, 6(5), 104; https://doi.org/10.3390/ai6050104 - 20 May 2025
Viewed by 2095
Abstract
This study addresses the problem of authorship attribution with a novel method for modeling writing style using dependency tree subtree parsing. This method exploits the syntactic information of sentences using mixed syntactic n-grams (mixed sn-grams). The method comprises an algorithm to generate mixed sn-grams by integrating words, POS tags, and dependency relation tags. The mixed sn-grams are used as style markers to feed machine learning methods such as an SVM. A comparative analysis was performed to evaluate the performance of the proposed mixed sn-grams method against homogeneous sn-grams with the PAN-CLEF 2012 and CCAT50 datasets. Experiments with PAN 2012 showed the potential of mixed sn-grams to model a writing style by outperforming homogeneous sn-grams. On the other hand, experiments with CCAT50 showed that training with mixed sn-grams improves accuracy over homogeneous sn-grams, with the POS-Word category showing the best result. The study’s results suggest that mixed sn-grams constitute effective stylistic markers for building a reliable writing style model, which machine learning algorithms can learn. Full article
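
A small sketch of what mixed sn-gram extraction can look like: bigrams taken along head-to-dependent arcs of a dependency parse, mixing a word with the POS tag or dependency label of the other element. spaCy is an assumed parser here, and the paper's sn-gram inventory (lengths, subtree shapes, category mixes) is broader than shown.

```python
# Illustrative mixed syntactic bigrams over head->dependent arcs.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed parser

def mixed_sn_bigrams(text):
    sn_grams = []
    for tok in nlp(text):
        for child in tok.children:
            sn_grams.append((tok.text.lower(), child.pos_))   # Word -> POS
            sn_grams.append((tok.text.lower(), child.dep_))   # Word -> deprel
            sn_grams.append((tok.pos_, child.text.lower()))   # POS -> Word
    return sn_grams

features = mixed_sn_bigrams("The author wrote the short novel quickly.")
print(features[:6])
```

Counts of such sn-grams per document would then form the feature vectors fed to the SVM.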
