Search Results (1,311)

Search Parameters:
Keywords = natural language processing (NLP)

24 pages, 27840 KB  
Article
Decoding Public Perception of Brownfield-Transformed Urban Parks: An Interpretable Machine Learning Framework Integrating XGBoost–SHAP
by Xiaomin Wang, Xiangru Chen, Chao Yang, Zhongyuan Zhao and Xinling Chen
Buildings 2026, 16(8), 1632; https://doi.org/10.3390/buildings16081632 (registering DOI) - 21 Apr 2026
Abstract
Brownfield-transformed urban parks, particularly those derived from industrial heritage, play a critical role in both cultural preservation and public-space provision. However, existing studies often rely on linear models and general urban contexts, limiting their ability to capture nonlinear, interaction-driven perception and translate analytical results into design-oriented insights. To address this gap, this study develops an interpretable data-driven framework integrating natural language processing (NLP) with explainable machine learning. Using social media reviews from Shougang Park in Beijing, built environmental elements are identified and structured into four dimensions: Accessibility, Safety, Comfort, and Enjoyment. An XGBoost model combined with SHAP analysis is employed to examine variable importance, nonlinear relationships, and interaction effects. The results reveal that visitor satisfaction is governed by heterogeneous and nonlinear relationships rather than independent additive effects. Several variables exhibit threshold-like, diminishing, and inverted-U-shaped patterns, indicating sensitivity to intensity ranges. More importantly, spatial perception emerges from the nonlinear coupling of multiple elements, forming four representative interaction types: compensatory, inverted-U-shaped, context-dependent, and threshold-like relationships. Key interactions are concentrated around industrial landscape, leisure activities, and supporting facilities. Building on these findings, the study translates interactions into design-oriented strategies, emphasizing synergistic configuration, functional balance, moderated development intensity, and context-sensitive programming. By linking interpretable machine learning with spatial design, this research advances an interaction-oriented paradigm and provides a transferable framework for satisfaction-informed evaluation and optimization of brownfields.
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

37 pages, 1538 KB  
Systematic Review
Automatic Extraction of Suppliers’ ESG Compliance Information from Textual Sources: A Literature Review
by Marco Perona and Laura Scalvini
Appl. Sci. 2026, 16(8), 4024; https://doi.org/10.3390/app16084024 (registering DOI) - 21 Apr 2026
Abstract
This paper presents a literature review on the automatic extraction of meaningful information about suppliers’ ESG and sustainability compliance from textual sources. Assessing suppliers’ ESG compliance has become a key challenge for procurement managers. Given the large number of suppliers and required data points, traditional approaches such as questionnaires and audits are inefficient, ineffective and difficult to scale. To solve this problem, we investigate whether the required information can be automatically harvested from suppliers’ textual sources. Our structured literature review identified 82 papers on which we performed a descriptive analysis, finding a rich and flourishing body of literature produced by a heterogeneous scientific community. We further reduced our sample to 73 full-text articles that supported a more in-depth content-based analysis. In particular, we investigated which data sources can be used, which technologies can be leveraged, and which types of outputs can be generated. Even though they could provide much of the required information, corporate websites are rarely utilized as data sources, partly due to the limited adoption of large language models (LLMs). LLMs are less widely adopted than traditional Natural Language Processing (NLP) techniques due to their recent introduction and some gaps that still limit their performance. This represents both a constraint and an opportunity for future research.
(This article belongs to the Special Issue Sustainability and Green Supply Chain Management in Industrial Fields)

16 pages, 2924 KB  
Article
The Impact of Artificial Intelligence Systems and Tools on Education: Comparative Social Media Analytics of Computing Versus Business Students
by Lili Yan, Hongren Wang, Zerong Xie, Dickson K. W. Chiu, Samuel Ping-Man Choi, Kevin K. W. Ho and Ruwen Tian
Systems 2026, 14(4), 451; https://doi.org/10.3390/systems14040451 (registering DOI) - 21 Apr 2026
Abstract
Artificial intelligence (AI) systems and tools are increasingly reshaping educational practices. This study examines perspectives shared in student-focused online communities on AI’s impact on education, comparing those of computer science (CS) and business students through an analysis of Reddit posts. Using natural language processing (NLP), sentiment analysis, and Latent Dirichlet Allocation (LDA) topic modeling, we analyzed 1108 posts collected from six subreddits. Results reveal distinct thematic focuses: CS students emphasize technical aspects, including programming efficiency, coding assistance, and concerns about job displacement, while business students focus on decision-making enhancement, financial analysis applications, and operational efficiency. Sentiment analysis indicates that the Business/Finance-oriented corpus is slightly more positive than the CS-oriented corpus (51.9% vs. 50.1% positive). The CS-oriented corpus also contains a higher proportion of negative posts (36.0% vs. 33.2%). These differences reflect discipline-specific epistemological frameworks shaping AI perception. The findings provide educators with guidelines for developing tailored AI integration strategies that address discipline-specific concerns and opportunities. This study contributes to understanding how academic background influences perceptions of AI in education, offering insights for curriculum design and policy development. Full article

18 pages, 1843 KB  
Article
MENARA: Medical Natural Arabic Response Assistant
by Ahmed Ibrahim, Abdullah Hosseini, Hoda Helmy, Maryam Arabi, Aya AlShareef, Wafa Lakhdhar and Ahmed Serag
Mach. Learn. Knowl. Extr. 2026, 8(4), 110; https://doi.org/10.3390/make8040110 (registering DOI) - 21 Apr 2026
Abstract
Dialectal variation presents a major challenge for deploying medical language models in real-world healthcare settings, where patient–clinician communication often occurs in regional vernaculars rather than standardized language forms. This challenge is particularly pronounced in the Arabic-speaking world, where clinical interactions frequently take place in diverse dialects that differ substantially from Modern Standard Arabic. Fine-tuning and maintaining separate models for each dialect is computationally inefficient and difficult to scale, motivating more integrated approaches. In this work, we present MENARA, an Arabic medical language model constructed by combining Egyptian Arabic, Moroccan Darija, and medical-domain specialist models through model merging. We extend prior feasibility findings through comprehensive evaluation of cross-dialect performance, medical safety, and cross-lingual knowledge retention. Specifically, we introduce a fine-grained dialect composition analysis to quantify lexical purity and structured code-switching behavior, benchmark against state-of-the-art Arabic LLMs, and conduct subject-matter-expert assessment of both dialectal fidelity and medical appropriateness. The results show that model merging preserves core medical competence while enabling robust dialectal adaptation, achieving strong cross-dialect fidelity while substantially reducing storage and deployment overhead compared to maintaining separate models. These findings establish model merging as a potentially practical and resource-efficient paradigm for dialect-aware medical NLP in linguistically fragmented healthcare environments.

27 pages, 962 KB  
Article
DMAR: Dynamic Multi-Anchor Retrieval with Structure-Aware Query Reformulation for Knowledge-Augmented Generation
by Zhou Lei, Yanqi Xu and Shengbo Chen
Appl. Sci. 2026, 16(8), 3963; https://doi.org/10.3390/app16083963 - 19 Apr 2026
Viewed by 127
Abstract
Retrieval-Augmented Generation (RAG) has become an important paradigm for knowledge-intensive natural language processing, as it enables Large Language Models (LLMs) to access external evidence beyond their parametric memory. However, existing RAG pipelines often rely on static user queries and predominantly semantic matching, which makes them less effective in data-intensive scenarios that require structured knowledge and multi-hop evidence aggregation. To address these limitations, we propose DMAR, a dynamic multi-anchor retrieval framework for retrieval refinement in knowledge-augmented generation. DMAR first identifies high-confidence anchor documents from an initial candidate pool through a dual-path evaluator that combines semantic relevance with knowledge-graph-based structural association. The selected anchors are then used to guide generative query reformulation, producing an enriched query for second-stage retrieval, followed by fidelity-controlled reranking to preserve alignment with the user’s original intent. We further model structural relevance using Subgraph Shapley Values and a learnable Siamese GNN-based similarity module. Experiments on five knowledge-intensive benchmarks, covering open-domain question answering, multi-hop reasoning, and fact verification, show that DMAR consistently improves retrieval and downstream answer quality over strong baselines. For example, DMAR achieves an F1 score of 62.5% on HotpotQA and 79.0% on TriviaQA. These results demonstrate that dynamically integrating semantic retrieval, structural knowledge, and query reformulation is an effective approach for robust knowledge-augmented NLP systems. Full article
(This article belongs to the Special Issue Natural Language Processing (NLP): Technologies and Applications)

29 pages, 409 KB  
Article
An AI-Based Security Architecture for Fraud Detection in Cloud Call Centers for Low-Resource Languages: Arabic as a Use Case
by Pinar Boluk and Hana’a Maratouq
Electronics 2026, 15(8), 1718; https://doi.org/10.3390/electronics15081718 - 18 Apr 2026
Viewed by 96
Abstract
Cloud-based telephony platforms face growing fraud risks including voice phishing (vishing), subscription abuse, and organizational impersonation, with detection being especially challenging in low-resource languages such as Arabic. We present an Artificial Intelligence (AI)-based security architecture for fraud detection in Arabic cloud call centers, combining onboarding verification, behavioral monitoring, domain-adapted Automatic Speech Recognition (ASR), semantic transcript search, and Large Language Model (LLM)-based entity verification. The domain-adapted Langa ASR model achieves a Word Error Rate (WER) of 41.0% and Character Error Rate (CER) of 18.2%, outperforming all evaluated commercial baselines. LLM-based entity extraction with multi-call consensus achieves 97.3% company-name accuracy (Generative Pre-trained Transformer 4, GPT-4) and 92.0% in the cost-effective deployed configuration (GPT-3.5 with log-probability filtering). Evaluated on production data from a Middle East and North Africa (MENA)-region provider spanning more than 1000 accounts, the pipeline flagged 47 accounts of which 41 were confirmed fraudulent (directly observed precision 87.2%, 95% confidence interval (CI): 74.3–95.2%; estimated recall 51–82% under conservative base-rate assumptions—not directly measured), providing evidence for the viability of a unified, threat-model-driven architecture for low-resource telephony fraud detection. Full article
(This article belongs to the Special Issue AI-Enhanced Security: Advancing Threat Detection and Defense)
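The precision figure in the abstract above (41 confirmed fraudulent out of 47 flagged accounts) can be reproduced with a few lines of arithmetic. The abstract does not say which interval method was used; the sketch below assumes an exact (Clopper-Pearson) binomial interval, which yields bounds close to the reported 74.3–95.2%. The function names are illustrative and not taken from the paper.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(x, n, alpha=0.05, tol=1e-10):
    """Exact two-sided CI for a binomial proportion (assumed method)."""
    def solve(f, target):
        # f is monotonically decreasing in p on [0, 1]; bisect for f(p) = target.
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= x | p) = alpha/2, i.e. P(X <= x-1 | p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper

flagged, confirmed = 47, 41           # counts stated in the abstract
precision = confirmed / flagged       # 41 / 47, about 0.872
lower, upper = clopper_pearson(confirmed, flagged)
print(f"precision={precision:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Recall cannot be computed the same way, because the number of fraudulent accounts that were never flagged is unknown; this is why the abstract reports it only as an estimate under base-rate assumptions.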
21 pages, 3068 KB  
Editorial
Artificial Intelligence in Participatory Environments: Technologies, Ethics, and Literacy Aspects
by Theodora Saridou and Charalampos A. Dimoulas
Societies 2026, 16(4), 127; https://doi.org/10.3390/soc16040127 - 15 Apr 2026
Viewed by 379
Abstract
While Artificial Intelligence (AI) approaches date back more than 60 years, there is no doubt that in the last 4 years, we have entered the era of AI. The advanced capabilities of Generative AI (GenAI) and Large Language Models (LLMs) have noticeably reshaped multiple sectors, becoming a driving force in participatory environments. Recent developments in Machine/Deep Learning (ML/DL) and Natural Language Processing (NLP) have enabled the introduction of tools and applications integrated into various professional fields. Areas ranging from education and media to art, tourism, and food science incorporate AI technologies to optimize established workflows, facilitate change, enhance creativity, and foster interaction. The current Special Issue includes nineteen multidisciplinary research works exploring AI in participatory environments, primarily focusing on technologies, ethics, and literacy aspects. Employing diverse methodologies, the research identifies various uses of AI along with the critical ethical and legal risks and challenges they entail. Concerns about inaccuracy, algorithmic bias, data infringements, and the potential erosion of transparency and interpretability need to be addressed in every phase of the design and implementation of AI technologies. Co-creative human-in-the-loop processes and human judgment need to be further strengthened and supported through digital/AI literacy initiatives. In this regard, effective regulatory frameworks, inclusive institutional strategies, and targeted training programs can ensure responsible and trustworthy AI use with a balance between technological evolution and human oversight. Full article

25 pages, 1445 KB  
Systematic Review
Deep Learning in the Architecture, Engineering, and Construction (AEC) Industry: Methods, Challenges, and Emerging Opportunities
by Muhammad Imran Khan, Abdul Waheed, Ehsan Harirchian and Bilal Manzoor
Buildings 2026, 16(8), 1546; https://doi.org/10.3390/buildings16081546 - 14 Apr 2026
Viewed by 233
Abstract
In recent years, deep learning (DL) has emerged as a transformative technology with significant potential to advance the Architecture, Engineering, and Construction (AEC) industry. DL enables automation, intelligent decision-making, and predictive analytics across various phases of construction, including design, site monitoring, safety management, and facility operations. Despite its growing adoption, research on the comprehensive methods, practical challenges and emerging opportunities of DL in the AEC industry remains limited. This study presents a state-of-the-art review of DL applications in the AEC industry by focusing on key methods, challenges, emerging opportunities and future research directions. A systematic literature review (SLR) was conducted in this study. Three major DL methods applied in the AEC industry were examined: (i) data-driven computer vision, (ii) natural language processing (NLP), and (iii) generative and simulation-based methods. Key challenges were identified: (i) data scarcity issues, (ii) high computational requirements, (iii) limited generalization across projects, (iv) human factors and resistance to adoption, and (v) lack of standardization and interoperability. Additionally, emerging opportunities and future research directions are highlighted: (i) advanced construction site monitoring and safety management, (ii) automated design and generative modeling, (iii) predictive maintenance and facility management, (iv) integration with robotics and autonomous construction systems, and (v) smart project management and decision support systems. This study advances a holistic understanding of DL in the AEC industry by systematically synthesizing current methods, challenges, and emerging trends. It establishes a structured foundation for future research to overcome technical, practical, and organizational challenges, thereby supporting the scalable, intelligent, and sustainable transformation of construction practices. Full article
19 pages, 356 KB  
Article
Screening for Superficial Oral Mucosal Lesions in Sjögren’s Disease Using Natural Language Processing (NLP) Approaches
by Jose Ramon Herrera, Balaji Kolasani, Sandeepkumar Gaddam, Aishwarya Kunam, Devon Roese, George J. Eckert, Grace Gomez Felix Gomez and Thankam P. Thyvalikakath
Oral 2026, 6(2), 44; https://doi.org/10.3390/oral6020044 - 14 Apr 2026
Viewed by 242
Abstract
Background/Objectives: Superficial oral mucosal (SOM) lesions are prevalent among patients with Sjögren’s disease (SjD) due to mucosal dryness. Given the limited evidence on screening and referral for SOMs, and the presence of relevant information only in dental clinical notes, a natural language processing (NLP) pipeline was developed to screen for SOMs among SjD patients. This retrospective study analyzed dental clinical notes from 180 linked electronic dental and health records, including patients both with and without a diagnosis of SjD. Materials and Methods: An annotation schema with four classes (SOMs, signs and symptoms of dry mouth, treatment for xerostomia, referral to specialists) was inductively created using the Extensible Human Oracle Suite of Tools (eHOST) to manually annotate clinical notes. Relevant key terms were retrieved using a rule-based approach with Python’s Natural Language Toolkit (NLTK). SjD and control groups were compared using Fisher’s Exact tests. Four annotators reviewed ninety-three records. Results: SjD patients (mean age 54.8 ± 11.7 years) had fewer total visits across 15 years but had more dental visits per year (10.2 ± 13.3) than controls. SjD patients were more likely to have oral candidiasis (p = 0.041), exhibit signs and symptoms of dry mouth (p = 0.004), receive treatments for xerostomia (p < 0.001), be treated with cholinergic agonists (p = 0.005), and be referred to a specialist (p = 0.046), but findings were not significant for all SOMs. Additionally, SjD patients had a higher proportion of sialadenitis (p = 0.045), rheumatoid arthritis (p = 0.001), systemic lupus erythematosus (p < 0.001), myalgia/myositis/fibromyalgia (p = 0.010), and anxiety/nervousness (p = 0.004). Conclusions: These findings support the feasibility of using text mining from dental clinical notes for screening and management of oral conditions.

24 pages, 527 KB  
Article
A Human–AI Collaborative Pipeline for Decision Support in Urban Development Projects Based on Large-Scale Social Media Text Analysis
by Alexander A. Kharlamov and Maria Pilgun
Technologies 2026, 14(4), 228; https://doi.org/10.3390/technologies14040228 - 14 Apr 2026
Viewed by 299
Abstract
The rapid growth of digital communication platforms has generated vast volumes of user-generated textual data and digital footprints, creating growing demand for scalable artificial intelligence systems capable of supporting evidence-based decision-making. This study proposes and evaluates a human–AI collaborative analytical pipeline for multi-class sentiment and aggression analysis of large-scale social media data (N = 15,064 messages) related to an urban infrastructure project. The proposed framework integrates standard NLP preprocessing, machine learning-based classifiers, temporal aggregation, and controlled large language model (LLM)-assisted classification within a structured analytical workflow that incorporates expert validation and oversight. A stratified manual validation procedure (n = 301) demonstrated substantial inter-annotator agreement (κ = 0.70) and stable multi-class classification accuracy (80%). The results indicate that combining sentiment polarity and aggression detection as complementary linguistic indicators improves sensitivity to shifts in discourse dynamics and enables early identification of emerging social tension. The study demonstrates the potential of human–AI collaborative analytical frameworks for transparent, interpretable, and predictive large-scale social media analysis in decision-support contexts. Full article
(This article belongs to the Special Issue Human–AI Collaboration: Emerging Technologies and Applications)

23 pages, 3252 KB  
Article
Norm-Driven Generative BIM Design: Semantic Parsing and Automated Layout for Small-Scale Power Infrastructure
by Yulong Chen, Chunli Ying, Hao Zhu, Jun Chen and Daguang Han
Appl. Sci. 2026, 16(8), 3804; https://doi.org/10.3390/app16083804 - 14 Apr 2026
Viewed by 277
Abstract
State Grid small-scale infrastructure projects are characterized by strict standards, strong constraints, and high repeatability. To address these characteristics, this research proposes a norm-driven generative design method that overcomes the low efficiency, compliance risks, and semantic discontinuity common in manual modeling. Taking standards such as Q/GDW 11382.3-2015 as the knowledge source, we construct an ALBERT-BiLSTM-CRF semantic parsing model and convert natural-language clauses into executable design constraints via normative text preprocessing, BIO sequence tagging, and rule triplet mapping. In training and evaluation, the model achieves Accuracy, Precision, Recall, and F1 of 98.05%, 95.49%, 95.88%, and 95.59%, respectively, with 100% precision for logical comparison and conjunction labels, providing a stable semantic foundation for the rule base. At the component level, a three-part coding scheme and a unit module library are built based on OmniClass and GB/T 51269, enabling semantic consistency and traceability between components and spatial functions. At the system level, a continuous workflow is implemented through the Revit API, covering scheme generation, automatic layout, and deliverable output. Validation on a real case, a digital operation center for the power system, shows that the design time for the third-floor administrative office area was cut from about 20 h to around 4 h and that the first-pass solution met all code constraints, a substantial improvement in efficiency and compliance. The results indicate that norm-driven generative design can provide deployable automation and high-quality outputs for small-scale power infrastructure, as well as a sustainable data foundation for digital twins and smart O&M.
(This article belongs to the Section Civil Engineering)
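The BIO sequence tagging step mentioned in the abstract above can be illustrated with a small decoder that groups tagged tokens into labeled spans. This is a generic sketch of BIO decoding, not the authors' pipeline: the label names (`OBJ`, `ATTR`, `CMP`, `VAL`) and the example clause are hypothetical, and the tags are supplied by hand rather than predicted by an ALBERT-BiLSTM-CRF model.

```python
def bio_to_spans(tokens, tags):
    """Group tokens into (label, text) spans from BIO tags."""
    spans, current_label, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new span, closing any open one first.
            if current_tokens:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with a matching label continues the open span.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span.
            if current_tokens:
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = None, []
    if current_tokens:
        spans.append((current_label, " ".join(current_tokens)))
    return spans

# Hypothetical clause and hand-written tags, for illustration only.
tokens = ["corridor", "width", ">=", "1.2", "m"]
tags = ["B-OBJ", "B-ATTR", "B-CMP", "B-VAL", "I-VAL"]
spans = bio_to_spans(tokens, tags)
print(spans)
# [('OBJ', 'corridor'), ('ATTR', 'width'), ('CMP', '>='), ('VAL', '1.2 m')]
```

Spans of this kind are the sort of intermediate result that a later mapping step could turn into rule triplets; how the paper defines that mapping is not stated in the abstract.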

45 pages, 6682 KB  
Article
A Multidimensional MIR Analysis of Acoustic, Linguistic and Cultural Gaps Between Maskandi and Western Music Genres
by Absolom Muzambi, Tebatso Gorgina Moape and Bester Chimbo
Appl. Sci. 2026, 16(8), 3802; https://doi.org/10.3390/app16083802 - 14 Apr 2026
Viewed by 356
Abstract
Contemporary Music Information Retrieval (MIR) and Natural Language Processing (NLP) systems are increasingly applied to diverse musical traditions, yet they are largely grounded in Western musical and linguistic assumptions. This study examines whether commonly used MIR features and multilingual NLP models adequately represent the acoustic, linguistic, and cultural structures of Maskandi music in comparison to Western music and identifies where representational gaps and biases arise. A multidimensional framework was employed, comprising acoustic and structural MIR analysis, linguistic and semantic lyrical analysis, and bias analysis. A curated dataset of 60 recordings and corresponding lyrics was analysed using rhythm and beat features, pitch contour measures, structural self-similarity, timbre embeddings, semantic similarity, lexical diversity, metaphor density, topic modelling, multilingual embeddings, and dataset-level audits. The results reveal systematic representational failures. Beat tracking showed a lower median IOI coefficient of variation for Maskandi (0.028) than for Western music (0.040, p = 0.0199) yet exhibited greater algorithmic instability; tempo averaged 131.16 BPM versus 111.69 BPM (p = 0.000262); pitch glide proportions were significantly higher in Maskandi (0.34 vs. 0.16); and on-beat energy ratios differed substantially (2.26 vs. 1.19, p < 0.0000007). Semantic similarity revealed high intra-genre coherence for Maskandi (0.73) versus Western (0.25); metaphor density approached zero in Maskandi versus up to 7 per 100 words in Western lyrics; and topic modelling produced two compact clusters for Maskandi versus six dispersed clusters for Western. Timbre embeddings achieved a silhouette score of 0.405, and dataset audits revealed 0% Maskandi representation across seven major MIR corpora, with African traditions comprising <3%. The study concludes that statistical separability does not imply representational adequacy and highlights the need for culturally grounded MIR and NLP representations to support diverse musical traditions.
(This article belongs to the Special Issue Large Language Models and Knowledge Computing)
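The inter-onset interval (IOI) coefficient of variation reported in the abstract above is a simple regularity measure: the standard deviation of the gaps between consecutive beats divided by their mean. The sketch below assumes beat times in seconds and uses the population standard deviation; the abstract does not say which estimator or beat tracker the authors used, and the beat times here are made up for illustration.

```python
from statistics import mean, pstdev

def ioi_cv(beat_times):
    """Coefficient of variation of inter-onset intervals (lower = steadier)."""
    iois = [b - a for a, b in zip(beat_times, beat_times[1:])]
    return pstdev(iois) / mean(iois)

def tempo_bpm(beat_times):
    """Average tempo in beats per minute from the mean inter-onset interval."""
    iois = [b - a for a, b in zip(beat_times, beat_times[1:])]
    return 60.0 / mean(iois)

# Illustrative beat times in seconds (not data from the study).
regular = [0.0, 0.5, 1.0, 1.5, 2.0]
jittered = [0.0, 0.48, 1.0, 1.49, 2.0]

print(ioi_cv(regular), tempo_bpm(regular))
print(ioi_cv(jittered), tempo_bpm(jittered))
```

A perfectly regular pulse has a coefficient of variation of zero, and timing jitter raises it, which is why the measure can describe rhythmic steadiness independently of tempo.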

38 pages, 7214 KB  
Article
Quantitative Mapping of Conceptual Hierarchies and Data-Driven Taxonomies of Japanese Architectural Concepts: A 28-Term Testbed
by Gledis Gjata and Satoshi Yamada
Architecture 2026, 6(2), 62; https://doi.org/10.3390/architecture6020062 - 13 Apr 2026
Viewed by 223
Abstract
Discourse on Japanese architecture relies on qualitative interpretation to link abstract concepts such as “ma” and “mu”, used here as illustrative examples of the conceptual register, with physical spaces, such as engawa, yet lacks quantitative, data-driven validation. This study addresses this gap by testing two primary hypotheses: (1) whether abstract Japanese architectural terms form a distinct, computationally recoverable conceptual layer, and (2) whether the corresponding concrete architectural devices cohere into a unified physical mesh rather than being fragmented into unrelated subclusters. We investigate this using a Natural Language Processing (NLP) framework centred on a fine-tuned BERT model, utilising an exhaustive Adjusted Rand Index (ARI) enumeration search over two-way partitions on a target vocabulary of 28 terms. Furthermore, a “definitional audit” compares a FULL corpus against a CLEAN corpus, stripped of explicit glossary-like sentences, to mitigate “shortcut learning”, allowing sensitivity at the conceptual-physical boundary to be inspected. Both hypotheses are supported. A stable two-block structure appears across all evaluations, comprising a compact conceptual pocket {aware, ma, mu, wabi, sabi, and wabi_sabi} and a larger physical mesh integrating vocabulary for room, garden, and shrine. Interface structure concentrates in a narrow boundary corridor, most consistently along the engawa–shakkei linkage, with en acting as the principal physical-side interface hub under sparsified network views. In the definitional audit (FULL versus CLEAN), ikezuishi is the only recurrently unstable item, shifting sides under small, defensible changes in corpus cleaning and Japanese-aware sentence segmentation, which is best read as a sensitivity signal rather than a substantive change in macro-structure. Removing glossary-like definitions slightly tightens dispersion while preserving the backbone split, which supports definitional audits as a practical robustness check for distributional studies of architectural vocabularies.
(This article belongs to the Special Issue Architecture in the Digital Age)
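The abstract above mentions an exhaustive ARI enumeration over two-way partitions. As a rough illustration only, and not the authors' implementation, the idea can be sketched as follows: compute the Adjusted Rand Index in its pair-counting form, enumerate every split of a vocabulary into two non-empty blocks, and keep the split that agrees best with a reference labeling. The six-term vocabulary, the three reference cluster ids, and the function names are hypothetical toy values chosen for the example; they are not results from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("labelings must have the same length")
    if n < 2:
        return 1.0  # no pairs to compare
    # Pair counts within contingency cells, rows, and columns.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one block
    return (sum_cells - expected) / (max_index - expected)


def two_way_partitions(n):
    """Yield every split of n items into two non-empty blocks, once each."""
    # Item 0 is pinned to block 0 so mirror-image splits are not repeated.
    for mask in range(1, 2 ** (n - 1)):
        yield (0,) + tuple((mask >> i) & 1 for i in range(n - 1))


def best_two_way_partition(reference):
    """Return (ARI, labels) of the two-way split closest to `reference`."""
    return max(
        (adjusted_rand_index(reference, labels), labels)
        for labels in two_way_partitions(len(reference))
    )


# Hypothetical toy input: six terms with made-up reference cluster ids.
terms = ["ma", "mu", "wabi", "engawa", "shakkei", "en"]
reference = [0, 0, 0, 1, 1, 2]

score, labels = best_two_way_partition(reference)
for term, block in zip(terms, labels):
    print(f"{term}: block {block}")
print(f"ARI = {score:.3f}")
```

Exhaustive enumeration is only practical for small vocabularies: n terms give 2^(n-1) - 1 two-way splits, which is 31 for this toy list but about 134 million for 28 terms.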

20 pages, 489 KB  
Systematic Review
Linguistic Markers in At-Risk Mental States Using Natural Language Processing: A Systematic Review
by Yuhan Zhang, Alba Carrió, Julia Sevilla-Llewellyn-Jones, Enrique Gutiérrez, Ana Calvo, Jose-Blas Navarro and Ana Barajas
Healthcare 2026, 14(8), 999; https://doi.org/10.3390/healthcare14080999 - 10 Apr 2026
Viewed by 292
Abstract
Background/Objectives: In recent years, research on psychosis has increasingly focused on prevention, aiming to implement early interventions that mitigate or reduce its impact. Within this framework, the analysis of linguistic markers in individuals with at-risk mental states (ARMS) has proven valuable for identifying those at risk and predicting psychosis onset. Artificial intelligence tools, particularly natural language processing (NLP), have emerged as effective resources for detecting these language-based indicators. This study aims to synthesize the existing scientific evidence on linguistic markers analyzed through NLP techniques in individuals with ARMS. Methods: A systematic review following the PRISMA 2020 protocol was conducted. Three databases (PubMed, PsycInfo, and Scopus) were searched for published articles from their inception to October 2025. Rayyan software was used to manage references and article downloads. Out of ninety initial search results, fifteen studies involving 1313 participants from diverse groups were included in the review. Results: The findings indicated that alterations in semantic coherence, syntactic complexity, referential cohesion, and speech/content poverty differentiated ARMS individuals from healthy controls. Several of these markers, analyzed with NLP methods, predicted the onset of psychosis with accuracy levels ranging from 79% to 100%, although these findings should be interpreted with caution due to the significant methodological heterogeneity and variability in sample sizes across the included studies. Conclusions: NLP techniques offer a powerful approach for detecting language alterations that distinguish ARMS individuals and provide meaningful predictions of psychosis onset, highlighting their potential as a complement to traditional clinical assessments for early identification and prevention.

35 pages, 2657 KB  
Article
Mitigating Metamorphic Malware Through Adversarial Learning Techniques
by Kehinde O. Babaagba and Zhiyuan Tan
Network 2026, 6(2), 22; https://doi.org/10.3390/network6020022 - 8 Apr 2026
Viewed by 248
Abstract
Antivirus (AV) solutions remain a core defence mechanism against malicious software. However, many of these engines struggle to detect metamorphic malware, which continually alters its internal form in unpredictable ways. To address this limitation, we present an adversarially oriented approach that automatically generates novel malicious variants of existing malware that evade detection by a substantial proportion of AV systems, thereby providing material for strengthening defensive techniques. In this work, an Evolutionary Algorithm (EA) is used to evolve undetectable variants, guided by three fitness criteria: the evasiveness of the produced samples, and their behavioural and structural similarity to the original malware. The proposed method is assessed across three malware families to evaluate the effectiveness of the EA-generated variants. Results indicate that the EA produces diverse mutant variants capable of evading up to 94% of AV detectors for a given malware family, significantly surpassing the evasion rate of the original malware. Furthermore, we evaluated whether the mutants produced by the EA could enhance the training of machine learning models. In this context, a pretrained Natural Language Processing (NLP) transformer was employed within a transfer learning framework to improve the classification of metamorphic malware. When the evolved variants were incorporated into the training data, the approach achieved classification accuracies of up to 93%. These results highlight the value of using diverse EA-generated samples to strengthen malware classifiers, thereby improving the robustness of security systems against evolving threats.