Search Results (392)

Search Parameters:
Keywords = corpus construction

22 pages, 1232 KB  
Article
Disaster Emotion: When Media Messages Emphasize Self-Interested Responses
by Soyoung Kim, Christopher Stream and Suyeon Lee
Behav. Sci. 2026, 16(4), 621; https://doi.org/10.3390/bs16040621 - 21 Apr 2026
Viewed by 65
Abstract
Media coverage of disasters frequently frames self-interested behavior in contrast to collective responsibility and coordinated response. This study aims to explore how such behavior is emotionally constructed in disaster-related media, using a carefully selected corpus of 12 text-centered news articles focusing on selfish behavior. The analysis combines transformer-based sentence-level emotion classification using the tweetnlp RoBERTa model, which predicts 11 emotion categories, with Latent Dirichlet Allocation topic modeling across single-sentence and three-sentence windows in a small purposively selected corpus. Emotion–topic relationships are quantified by weighting emotion probabilities by topic distributions and visualized using bar charts, network graphs, and heatmaps. The findings suggest that fear and disgust dominate portrayals of self-interested behavior, while anticipation appears in projections of harm and anger is linked to inequality and institutional accountability. Two discursive configurations emerge: Responsibility Across Individuals and Institutions, emphasizing public accountability and authority, and Collective Fear and Self-Protective Practices, reflecting affect-driven responses under uncertainty. Although negative emotions predominate, optimism appears conditionally, signaling coordination and recovery. Overall, disaster reporting constructs selfishness through integrated emotional–semantic patterns that position individual actions within broader social risk and collective responsibility.
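
To make the quantification step concrete, here is a minimal numpy sketch of weighting emotion probabilities by topic distributions, in the spirit of the abstract; the arrays E and theta and all sizes are invented stand-ins, not the authors' data or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs (invented, not the study's data):
# E[i, e]     = probability of emotion e for sentence i (11 categories)
# theta[i, k] = LDA topic distribution of sentence i over K topics
n_sentences, n_emotions, n_topics = 200, 11, 6
E = rng.dirichlet(np.ones(n_emotions), size=n_sentences)
theta = rng.dirichlet(np.ones(n_topics), size=n_sentences)

# For each topic, average sentence-level emotion probabilities weighted by
# how strongly each sentence expresses that topic.
weights = theta / theta.sum(axis=0, keepdims=True)  # column-normalize per topic
emotion_by_topic = weights.T @ E                    # (K topics, 11 emotions)

print(emotion_by_topic.round(3))  # each row is an emotion profile for one topic
```

Each row sums to one, so every topic gets a comparable emotion profile that can feed the bar charts and heatmaps the abstract mentions.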
44 pages, 2312 KB  
Article
Classification Model of Emotional Tone in Hate Speech and Its Relationship with Inequality and Gender Stereotypes, Using NLP and Machine Learning Algorithms
by Aymé Escobar Díaz, Ricardo Rivadeneira, Walter Fuertes and Washington Loza
Future Internet 2026, 18(4), 218; https://doi.org/10.3390/fi18040218 - 20 Apr 2026
Viewed by 103
Abstract
Hate speech on social media reproduces norms of inequality and gender stereotypes, disproportionately affecting women. This study proposes a hybrid approach that integrates emotional tone classification with explicit hostility detection to strengthen preventive moderation. We constructed a corpus from three open data sets (1,236,371 records; 1,003,991 after ETL) and represented the text using TF-IDF and contextual RoBERTa embeddings. We trained individual models (RoBERTa fine-tuned, Random Forest, and XGBoost) and a stacking metamodel (Gradient Boosting) that combines their probabilities. On the test set, the ensemble outperformed the base classifiers, achieving accuracy of 0.93 in hate detection and 0.90 in emotion classification, with an AUC of 0.98 for emotion classification. We implemented a RESTful API and a web client to validate the moderation flow before publication, along with an administration panel for auditing. Performance tests in a prototype deployment (Google Colab exposed through an Ngrok tunnel) provided proof-of-concept validation, revealing concurrency limitations from around 300 users due to infrastructure constraints. In general, the results indicate that incorporating emotional tone analysis improves the model’s ability to identify implicit hostility and offers a practical way to promote safer digital environments. The probabilistic outputs produced by the ensemble model were subsequently analyzed using the Bayesian Calibration and Optimal Design under Asymmetric Risk (BACON-AR) framework, which serves as a mathematical post hoc decision layer for evaluating classification behaviour under unequal error costs. Rather than modifying the trained architecture or improving its predictive performance, the framework identifies a cost-sensitive operating threshold that minimizes the total expected risk under the selected asymmetric cost configuration. The experiments were conducted using an English-language data set; therefore, the findings of this study are limited to hate speech detection in English.
(This article belongs to the Section Techno-Social Smart Systems)
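
As a hedged illustration of the ensemble design, the sketch below stacks base classifiers whose predicted probabilities feed a Gradient Boosting meta-learner, then sweeps a cost-sensitive operating threshold. It uses synthetic features in place of the paper's TF-IDF/RoBERTa representations, substitutes sklearn's GradientBoostingClassifier for XGBoost, omits the fine-tuned RoBERTa entirely, and implements only a generic expected-risk sweep rather than the BACON-AR framework itself.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the vectorized corpus (invented data).
X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Base models feed class probabilities to a Gradient Boosting metamodel.
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
                ("gb", GradientBoostingClassifier(random_state=0))],  # XGBoost stand-in
    final_estimator=GradientBoostingClassifier(random_state=0),
    stack_method="predict_proba",
)
stack.fit(X_tr, y_tr)
p = stack.predict_proba(X_te)[:, 1]

# Post hoc, cost-sensitive threshold: minimize expected risk when a missed
# positive (FN) is costlier than a false alarm (FP); costs are illustrative.
c_fp, c_fn = 1.0, 5.0
thresholds = np.linspace(0.05, 0.95, 91)
risk = [c_fp * ((p >= t) & (y_te == 0)).sum() + c_fn * ((p < t) & (y_te == 1)).sum()
        for t in thresholds]
print("risk-minimizing threshold:", round(float(thresholds[int(np.argmin(risk))]), 2))
```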

30 pages, 1706 KB  
Article
Understanding the Global Trends of 2025 Through the Defly Compass Methodology
by Mabel López Bordao, Antonia Ferrer Sapena, Carlos A. Reyes Pérez and Enrique A. Sánchez Pérez
Big Data Cogn. Comput. 2026, 10(4), 124; https://doi.org/10.3390/bdcc10040124 - 17 Apr 2026
Viewed by 451
Abstract
This study aims to identify and synthesize the major global trends that shaped 2025 by applying the DeflyCompass methodology to a curated corpus of strategic foresight reports. The study synthesizes insights from 23 strategic reports published by leading international organizations, including the World Economic Forum, Accenture, Euromonitor, and major technology firms. Methodologically, DeflyCompass operationalizes a structured hybrid human–AI pipeline comprising the deployment of multi-agent AI systems, automated knowledge graph construction, semantic clustering, and hybrid human–AI validation processes, reducing an initial set of 816 preliminary signals to a validated catalog of 50 high-priority trends across six PESTEL domains: Political, Economic, Social, Technological, Environmental, and Legal/Governance. Key findings indicate that artificial intelligence functions as a systemic enabling technology across all domains, climate and sustainability imperatives permeate multiple domains, geopolitical fragmentation introduces systemic tension, and trust deficits emerge as a critical vulnerability. The study contributes a replicable and scalable framework for global-level strategic foresight that operationalizes human–AI integration within a rigorous expert-driven validation process, complementing existing hybrid analytical approaches in the literature. Implications extend to decision-making in technology governance, sustainability strategy, social adaptation, and scenario planning, highlighting the necessity of integrating AI augmentation with human expertise for effective future-oriented planning.
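
The abstract's pipeline (multi-agent AI, knowledge graphs, semantic clustering, human validation) belongs to the methodology itself, but the semantic-clustering stage can be sketched generically: embed signal texts and group similar ones into candidate trends for expert review. The signals below are invented, and TF-IDF plus agglomerative clustering stands in for whatever representation the authors actually use.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented trend signals standing in for the 816 preliminary signals.
signals = [
    "AI agents automate enterprise workflows",
    "Generative AI reshapes software development",
    "Carbon pricing expands across major economies",
    "Extreme weather drives climate adaptation spending",
    "Supply chains regionalize amid geopolitical tension",
    "Export controls fragment semiconductor markets",
]

X = TfidfVectorizer().fit_transform(signals).toarray()

# Each cluster is a candidate trend that would then pass through
# hybrid human-AI validation before entering the final catalog.
labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)
for lab, sig in sorted(zip(labels, signals)):
    print(lab, sig)
```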

23 pages, 944 KB  
Article
When Perception Becomes Discourse: The Case of en/por lo que toca a in Spanish
by Miriam Heila Reyes Núñez
Languages 2026, 11(4), 79; https://doi.org/10.3390/languages11040079 - 15 Apr 2026
Viewed by 576
Abstract
This study examines the diachronic development of Spanish perception verbs into deverbal topic markers (DTMs), focusing on tocar (‘to touch’), e.g., en/por lo que toca a, as representative of sensory perception. While the grammaticalization of visual perception verbs into discourse markers (DMs) has been extensively documented, sensory verbs remain understudied. Drawing on data from three electronic corpora—CORDIAM, CORDE, and CORPES—this paper traces the semantic and syntactic evolution of these constructions from the 15th to the 21st century. There are three main conclusions: (a) the semantic development of tocar (‘to touch’) is driven by the interaction of metonymy and metaphor, corresponding to a process of metaphtonymy; (b) en/por lo que toca a arises through gradual grammaticalization processes, including semantic bleaching, decategorialization, increase in scope, and a positional shift toward the left periphery; (c) the corpus evidence suggests a gradual diffusion of the construction across textual genres, beginning in legal and administrative texts and later spreading to other registers.
(This article belongs to the Special Issue Recent Developments on the Semantics of Perception Verbs)
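
For readers curious what tracing a construction through dated corpora looks like in practice, here is a toy sketch: a regex tallies attestations of en/por lo que toca a by century. The documents are invented; real CORDIAM/CORDE/CORPES work would go through the corpora's own query interfaces or exports.

```python
import re
from collections import Counter

# Toy dated documents (invented); corpus records carry real years and texts.
docs = [
    (1512, "En lo que toca a las rentas del reino, se ordena lo siguiente."),
    (1687, "Por lo que toca a los oficios, el cabildo determinó otra cosa."),
    (1923, "En lo que toca a la educación, el informe es claro."),
]

pattern = re.compile(r"\b(en|por) lo que toca a\b", re.IGNORECASE)

# Tally attestations per century to trace diffusion across periods.
by_century = Counter()
for year, text in docs:
    by_century[f"{(year - 1) // 100 + 1}th c."] += len(pattern.findall(text))

print(dict(by_century))
```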

42 pages, 3547 KB  
Article
Light Verbs and Syntactic Analyzability in the History of the Galician Language
by Alexandre Rodríguez Guerra
Languages 2026, 11(4), 78; https://doi.org/10.3390/languages11040078 - 15 Apr 2026
Viewed by 350
Abstract
This contribution studies the behavior of the four main general light verbs (LVs) in the history of Galician (dar, facer, and haber/ter). The research is structured around the following three fundamental axes: first, we study the evolution, the comparison with equivalent full verbs, and the morphosyntactic behavior of 26 different LV constructions (with examples that the literature identifies with different degrees of fixation) from medieval to contemporary Galician, all of which form a corpus with 8728 occurrences. Next, we discuss the results of a survey distributed to 162 respondents, which allows an assessment of these LVs from several perspectives, especially syntactic. Finally, we offer an original proposal to measure the degree of syntactic analyzability, based on the quantified review of the various parameters analyzed (of which we also provide a scale, applied synchronically and diachronically) and the results in the specific survey question. We call it the Syntactic Analyzability Index (SAI); thanks to this index, we obtain an objective scale that places each example on a gradient reflecting how close a given LV construction stands to freely combined elements at one pole or to the most fixed phrasemes at the other.
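
The abstract does not spell out the SAI formula, so the following is only a plausible reading: score a construction on a set of analyzability parameters and average them into a 0-1 index, where 1 approaches a free combination and 0 a fully fixed phraseme. Every parameter name and value below is hypothetical, not the paper's actual operationalization.

```python
# Hypothetical analyzability parameters for one LV construction, each in
# [0, 1] with 1 = behaves like a free combination (names and values are
# illustrative only).
params = {
    "noun_can_be_modified": 1.0,
    "noun_can_be_pluralized": 0.5,
    "determiner_variation": 0.5,
    "passivization_possible": 0.0,
    "constituent_extraction": 0.0,
}

def sai(scores: dict[str, float]) -> float:
    """Mean of parameter scores: 1.0 = fully analyzable, 0.0 = fully fixed."""
    return sum(scores.values()) / len(scores)

print(f"SAI = {sai(params):.2f}")  # 0.40: partly fixed, partly analyzable
```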

37 pages, 2011 KB  
Review
Quantum-Safe Blockchain: Mapping Research Fronts in Post-Quantum Cryptography, Quantum Threat Models, and QKD Integration
by Félix Díaz, Nhell Cerna, Rafael Liza and Bryan Motta
Computers 2026, 15(4), 240; https://doi.org/10.3390/computers15040240 - 14 Apr 2026
Viewed by 441
Abstract
Quantum computing challenges the long-term security assumptions of blockchain systems that rely on classical public-key cryptography, motivating the adoption of post-quantum cryptography and quantum key distribution (QKD). This review maps research fronts at the intersection of blockchain and quantum-safe security, linking threat assumptions to post-quantum mechanisms, blockchain layers, and QKD positioning. Records were retrieved from Scopus and Web of Science using a two-block query and filtered through a PRISMA-guided workflow for bibliometric mapping. The final corpus comprises 648 journal articles and shows accelerated publication growth after 2023, with scientific production concentrated in a small set of leading countries. Keyword structures indicate that IoT-centric deployments dominate the semantic backbone, where authentication and intelligent methods co-occur with blockchain security primitives, while post-quantum and privacy-preserving constructs form a cohesive technical stream. QKD appears as a distinct but more specialized theme, typically discussed at the system level and shaped by infrastructure and scalability constraints. Overall, the literature is moving from conceptual risk articulation toward engineering integration; however, progress is limited by inconsistent reporting of threat models, post-quantum parameter sets, and ledger-level cost trade-offs, highlighting the need for auditable and reproducible evaluation.
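
A common mechanic behind bibliometric "research front" maps is a keyword co-occurrence network. The sketch below builds one with networkx from invented keyword lists; the actual review works from 648 Scopus/Web of Science records.

```python
from itertools import combinations

import networkx as nx

# Invented author-keyword lists standing in for the 648-article corpus.
records = [
    ["blockchain", "post-quantum cryptography", "IoT"],
    ["blockchain", "authentication", "IoT"],
    ["quantum key distribution", "blockchain", "scalability"],
    ["post-quantum cryptography", "lattice-based signatures", "blockchain"],
]

# Keywords are nodes; every article sharing two keywords adds edge weight.
G = nx.Graph()
for kws in records:
    for a, b in combinations(sorted(set(kws)), 2):
        w = G.get_edge_data(a, b, {"weight": 0})["weight"]
        G.add_edge(a, b, weight=w + 1)

# The heaviest edges sketch the semantic backbone the review describes.
for a, b, d in sorted(G.edges(data=True), key=lambda e: -e[2]["weight"])[:5]:
    print(d["weight"], a, "--", b)
```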

26 pages, 841 KB  
Article
LLM-Assisted Weak Supervision for Low-Resource Kazakh Sequence Labeling: Synthetic Annotation and CRF-Refined NER/POS Models
by Aigerim Aitim
Appl. Sci. 2026, 16(8), 3632; https://doi.org/10.3390/app16083632 - 8 Apr 2026
Viewed by 297
Abstract
Kazakh sequence labeling is constrained by limited annotated resources, while its agglutinative morphology and productive suffixation increase data sparsity and exacerbate label inconsistency in part-of-speech (POS) tagging and named entity recognition (NER). This paper proposes an LLM-assisted weak supervision framework in which a large language model generates synthetic token-level annotations that are subsequently filtered using confidence-based criteria and combined with a smaller manually verified subset to train Transformer-based sequence taggers with Conditional Random Field (CRF) decoding. The pipeline unifies corpus construction, weak-label generation, quality filtering, word-to-subword alignment, and CRF-refined structured prediction into a reproducible workflow. Experimental results show that contextual encoders and structured decoding provide strong performance for Kazakh POS and NER, while the proposed training design enables efficient convergence with diminishing returns beyond moderate epoch budgets. Error-slice analysis indicates that residual errors are concentrated in rare tokens, morphologically complex long words, longer sentences, and the ORG entity class. Overall, the findings support the use of LLM-assisted weak supervision as a scalable strategy for low-resource Kazakh sequence labeling when synthetic labels are controlled through filtering and refined by structured decoding.
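
Two steps of the described pipeline translate directly into code: confidence-based filtering of weak labels, and word-to-subword alignment so one label per word reaches the subword-level tagger. The sketch below uses the standard Hugging Face word_ids() alignment pattern; the weak annotations, the threshold, and the choice of xlm-roberta-base as tokenizer are all assumptions, not the paper's exact setup.

```python
from transformers import AutoTokenizer

# Hypothetical LLM-generated weak annotations: (word, label, confidence).
weak = [("Астана", "B-LOC", 0.97), ("қаласында", "I-LOC", 0.81),
        ("жиын", "O", 0.99), ("өтті", "O", 0.95)]

# 1) Confidence-based filtering: drop sentences whose weakest label is unsure.
MIN_CONF = 0.7  # illustrative threshold
print("sentence kept:", min(c for _, _, c in weak) >= MIN_CONF)

# 2) Word-to-subword alignment: copy each word label to its first subword
# and mask the rest with -100 so the loss/decoder ignores them.
tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
enc = tok([w for w, _, _ in weak], is_split_into_words=True)
labels, prev = [], None
for wid in enc.word_ids():
    labels.append(-100 if wid is None or wid == prev else weak[wid][1])
    prev = wid
print(list(zip(tok.convert_ids_to_tokens(enc["input_ids"]), labels)))
```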

31 pages, 2043 KB  
Systematic Review
Mapping and Auditing Evidence in Digital Storytelling for Industrial Heritage Transformation: A Focused Systematic Review (2011–2026)
by Xin Bian, André Brown and Bruno Marques
Sustainability 2026, 18(7), 3630; https://doi.org/10.3390/su18073630 - 7 Apr 2026
Viewed by 292
Abstract
This study presents a focused review of digital storytelling research in industrial heritage using a bounded Scopus-indexed corpus covering the period from 2011 to February 2026. It examines whether regeneration-relevant interpretive claims in urban renewal contexts are supported by traceable research structures. As post-industrial landscapes undergo restoration and urban redevelopment, digital storytelling is frequently used to frame issues of memory, responsibility, and heritage legitimacy; however, the evidentiary basis of such claims remains insufficiently scrutinized. Adopting an outcome-traceability perspective, the study evaluates whether interpretation-related outcomes are supported by traceable links between mechanisms, constructs, measurement approaches, and evaluation design. A two-stage synthesis is conducted: Stage 1 provides a bibliometric profile of the Scopus-indexed corpus, revealing a fragmented publication landscape dominated by conference papers and prototype-oriented studies, while Stage 2 audits evidence chains across the screened analytical studies to assess whether commonly cited mechanisms, such as narrative meaning-making, affective engagement, and interactive exploration, are operationalised into explicit constructs and measurable indicators. Findings suggest that reported outcomes most frequently concentrate on immediate experiential responses, while higher-level outcomes such as awareness, attitudes, and learning are less consistently supported by robust evaluation designs. The study identifies recurring traceability gaps and outlines priorities for improving evidentiary consistency and comparability in industrial heritage digital storytelling research.
(This article belongs to the Section Social Ecology and Sustainability)
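
An outcome-traceability audit of the kind described can be pictured as a completeness check over coded evidence chains: every link from mechanism to evaluation design must be explicit for a claim to count as traceable. The record and field names below are invented for illustration.

```python
# One screened study's coded evidence chain (fields invented).
study = {
    "mechanism": "affective engagement",
    "construct": "emotional response",
    "measurement": "post-visit questionnaire",
    "evaluation_design": None,  # e.g., no comparison condition reported
}

links = ["mechanism", "construct", "measurement", "evaluation_design"]
missing = [k for k in links if not study[k]]
print("traceable" if not missing else "gap at: " + ", ".join(missing))
```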

26 pages, 1451 KB  
Article
LDA Analysis of Institutional Policy Texts: A Case Study of Regulations on the Protection of Historical and Cultural Cities, Towns, and Villages in China
by Zongcheng Hu and Li Shao
Information 2026, 17(4), 350; https://doi.org/10.3390/info17040350 - 7 Apr 2026
Viewed by 308
Abstract
Against the backdrop of a multi-tiered governance system and increasingly institutionalized norms, China’s historical and cultural preservation policies have long emphasized institutional standardization and hierarchical uniformity. Local policy texts are typically viewed as localized replicas of central institutional logic, overlooking internal variations and differences in information structure. Accordingly, this study examines the Regulations on the Protection of Historical and Cultural Cities, Towns, and Villages issued by 13 provincial-level administrative regions in China. It conceptualizes provincial regulatory texts as institutionalized policy information systems, constructs a cross-regional corpus, and develops a comparative information structure analytical framework based on the Latent Dirichlet Allocation (LDA) topic model. This study operationalizes LDA-derived topic-weight distributions into a comparative analytical framework that captures structural prominence, dispersion, concentration, and priority hierarchy in provincial policy texts. The findings reveal that provincial-level historical and cultural preservation regulations in China exhibit a highly institutionalized information backbone, centered on administrative procedures, legal norms, and macro-level planning controls, and demonstrate significant institutional similarity across provinces. However, within this unified institutional framework, provinces exhibit structural differences in the distribution of thematic weights, information prioritization, and internal textual sequencing, resulting in multiple distinguishable information organization patterns. Consequently, this study highlights the coexistence of formal institutional uniformity and structural differentiation in provincial regulatory texts, providing a more precise basis for understanding variation in local policy expression within China’s historical and cultural governance field.
(This article belongs to the Section Information Theory and Methodology)
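
The structural indicators the study derives from LDA topic weights (dispersion, concentration) have simple textbook forms: Shannon entropy and a Herfindahl-style sum of squares over each text's topic distribution. A minimal sklearn sketch, on invented English toy sentences rather than the Chinese regulation corpus:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Tiny invented stand-ins for provincial regulation texts.
docs = [
    "heritage protection plan approval procedure administrative penalty",
    "historical building repair funding planning control zone",
    "cultural village protection planning approval legal liability",
    "administrative department supervision penalty legal responsibility",
]

X = CountVectorizer().fit_transform(docs)
theta = LatentDirichletAllocation(n_components=3, random_state=0).fit_transform(X)

# Per text: entropy = how dispersed attention is across topics;
# HHI = how concentrated the topic weights are.
entropy = -(theta * np.log(theta)).sum(axis=1)
hhi = (theta ** 2).sum(axis=1)
for i, (h, c) in enumerate(zip(entropy, hhi)):
    print(f"doc {i}: dispersion={h:.2f} concentration={c:.2f}")
```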

23 pages, 1329 KB  
Systematic Review
Knowledge-Informed Technology-Enabled Asset Management and Compliance Assurance in Construction: A Systematic Grey Literature Review
by Alhadi Alsaffar, Thomas Beach and Yacine Rezgui
Buildings 2026, 16(7), 1434; https://doi.org/10.3390/buildings16071434 - 4 Apr 2026
Viewed by 426
Abstract
Digital transformation is reshaping construction asset compliance, but fragmented information and weak evidence trails still constrain effective management. This systematic grey literature review (2014–2025) identifies technologies supporting asset management and compliance assurance and compares adoption maturity across the United Kingdom (UK), the United States (US), Singapore, and the Gulf Cooperation Council (GCC). Using multi-channel search strategies and the AACODS appraisal (Authority, Accuracy, Coverage, Objectivity, Date, Significance), 131 records were identified; 92 full texts reviewed; 82 eligible; and 43 sources retained. Coding identified a recurring five-technology “core digital stack”: Building Information Modelling (BIM), Digital Twins (DT), Internet of Things (IoT), Artificial Intelligence/Machine Learning (AI/ML), and Blockchain (BC). Within the retained corpus, BIM and AI/ML were the most frequently referenced technologies, whereas BC was referenced more selectively and discussed mainly for tamper-evident traceability. DT and IoT were typically discussed alongside BIM, while IoT also frequently co-occurred with AI/ML in analytics-led compliance workflows. A (Region × Technology) maturity matrix suggests higher, policy-led maturity where mandates and audit-ready information align with national frameworks (UK, Singapore), and more uneven, project-led adoption in decentralised contexts (US, GCC). Overall, the findings emphasise that effective compliance relies on integrated, evidence-focused digital stacks supported by standardised information governance rather than isolated tools.
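
The (Region × Technology) matrix is, at bottom, a frequency cross-tabulation of coded sources; a pandas sketch with invented coding records:

```python
import pandas as pd

# Invented coding records: (source_id, region, technology referenced).
codes = [
    (1, "UK", "BIM"), (1, "UK", "AI/ML"), (2, "Singapore", "DT"),
    (2, "Singapore", "BIM"), (3, "US", "IoT"), (3, "US", "AI/ML"),
    (4, "GCC", "BC"), (4, "GCC", "BIM"), (5, "UK", "DT"),
]
df = pd.DataFrame(codes, columns=["source", "region", "technology"])

# Region x Technology counts: the raw material for a maturity assessment.
print(pd.crosstab(df["region"], df["technology"]))
```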

19 pages, 357 KB  
Data Descriptor
Scrabbling Syllables into Words: Wordlikeness Norms for European Portuguese Auditory Pseudowords
by Ana Paula Soares, Alberto Lema, Diana R. Pereira, Ana Cláudia Rodrigues, Vinicius Canonici and Helena M. Oliveira
Data 2026, 11(4), 76; https://doi.org/10.3390/data11040076 - 3 Apr 2026
Viewed by 340
Abstract
Auditory pseudowords are widely used in psycholinguistics and cognitive neuroscience, but their construction requires control of sublexical familiarity and careful characterization of how acoustic cue manipulations may shift perceived lexical plausibility. Here we introduce the Minho Pseudoword Wordlikeness Ratings (MPWR), the first normative dataset of wordlikeness judgments for European Portuguese (EP) auditory trisyllabic CV pseudowords, and evaluate whether adding a localized F0-based prominence cue modulates wordlikeness beyond distributional familiarity. One hundred and twenty pseudowords were assembled from naturally produced syllables drawn from the Minho Spoken Syllable Pool (MSSP) and recorded under uniform conditions. Each item was implemented in three token types with constant segmental content: a flat baseline and two F0-enhanced versions (+15%) targeting either the penultimate or final syllable. Native EP listeners (N = 101) provided wordlikeness ratings on a 7-point scale. MSSP-derived indices quantified pseudoword syllable familiarity (SWI_All, SWI_N3) and stress-position propensity for the targeted syllable (SPP_marked). Ratings were, as intended, low overall yet showed substantial item-to-item variability. F0 enhancement produced a small but reliable decrease in wordlikeness relative to flat tokens, with no reliable difference between penultimate and final targeting positions. SWI_All robustly predicted ratings, whereas SPP_marked added little explanatory value. MPWR provides a practical EP resource for selecting and matching auditory pseudowords using normative wordlikeness ratings and transparent corpus-based descriptors.
(This article belongs to the Section Featured Reviews of Data Science Research)
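
The headline predictive claim (familiarity indices predict wordlikeness) corresponds to a simple item-level regression; here is a toy version with invented data in place of the MPWR norms.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented item-level data: a SWI-style familiarity index and the mean
# wordlikeness rating (1-7) for each of 120 pseudowords.
swi = rng.uniform(0.0, 1.0, size=120)
rating = 1.5 + 2.0 * swi + rng.normal(0.0, 0.4, size=120)

# Least-squares fit: does sublexical familiarity predict ratings?
slope, intercept = np.polyfit(swi, rating, 1)
r = np.corrcoef(swi, rating)[0, 1]
print(f"rating ~= {intercept:.2f} + {slope:.2f} * SWI (r = {r:.2f})")
```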
23 pages, 596 KB  
Article
Perceived Cognitive Assistance in LLM-Augmented Retail Trading: Construct Definition and Content Validation
by Dmitrii Gimmelberg and Iveta Ludviga
Int. J. Financial Stud. 2026, 14(4), 83; https://doi.org/10.3390/ijfs14040083 - 1 Apr 2026
Viewed by 474
Abstract
Large language models (LLMs) are increasingly used by retail traders to interpret information and design complex strategies, yet existing adoption constructs do not capture the decision-time experience of being cognitively scaffolded by an LLM. We define Perceived Cognitive Assistance (PCA) as the trader’s felt expansion of cognitive capability at the moment of a trading decision when an LLM is available, and we report initial content validation of a PCA item pool. Study 1 specified the PCA content domain using a two-tier qualitative corpus (eight interviews and 44 YouTube narratives on LLM-assisted trading, plus 24 qualitative and mixed-method studies on robo-advice and social trading). Reflexive thematic analysis yielded five facilitative assistance facets and one adjacent risk facet (over-reliance), and these were translated into a 16-item PCA pool. Study 2 used a naïve-judge sort-and-rate task with 48 retail traders to test whether items show definitional correspondence to PCA and definitional distinctiveness from similar constructs: perceived usefulness, perceived ease of use, trust in the LLM, and trading self-efficacy. The resulting nine-item set is ready for subsequent factor-analytic and predictive validation. This study advances our understanding of how large language models shape retail trading behaviour by identifying and empirically grounding Perceived Cognitive Assistance as the decision-time psychological experience through which LLMs cognitively scaffold traders, clarifying how LLM use differs from generic technology adoption, trust, or self-efficacy effects.
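
One standard way to score a naïve-judge sort-and-rate task is Anderson and Gerbing's (1991) substantive-validity indices; the sketch below computes them for a single item, without claiming this is the authors' exact procedure. The judge assignments are invented.

```python
from collections import Counter

def substantive_validity(assignments: list[str], intended: str) -> tuple[float, float]:
    """Anderson & Gerbing (1991): p_sa = share of judges assigning the item
    to its intended construct; c_sv = (n_intended - n_largest_other) / N."""
    n = len(assignments)
    counts = Counter(assignments)
    n_c = counts.get(intended, 0)
    n_o = max((v for k, v in counts.items() if k != intended), default=0)
    return n_c / n, (n_c - n_o) / n

# 48 hypothetical judges sorting one candidate PCA item.
judges = ["PCA"] * 38 + ["perceived usefulness"] * 7 + ["trust"] * 3
psa, csv = substantive_validity(judges, "PCA")
print(f"p_sa = {psa:.2f}, c_sv = {csv:.2f}")  # retain items above chosen cutoffs
```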

19 pages, 1855 KB  
Article
Clinically Aligned Long-Context Transformers for Cross-Platform Mental Health Risk Detection
by Aditya Tekale and Mohammad Masum
Electronics 2026, 15(7), 1403; https://doi.org/10.3390/electronics15071403 - 27 Mar 2026
Viewed by 320
Abstract
Social media platforms contain rich but noisy narratives of psychological distress, creating opportunities for early mental health risk detection. However, existing datasets capture heterogeneous constructs such as suicide risk severity, depression diagnosis, and DSM-5 symptom presence, and most prior models are trained and evaluated on a single corpus, limiting their clinical alignment and cross-dataset generalizability. In this study, we fine-tune a domain-specific long-document transformer, AIMH/Mental-Longformer-base-4096, for binary mental health risk detection (risk vs. no risk) using two clinically aligned Reddit datasets: the C-SSRS Reddit corpus and the eRisk 2025 depression dataset. To handle long user histories, we introduce an LLM-based summarization pipeline that compresses posts exceeding 2000 tokens while preserving mental health-relevant information. We also conduct a seven-configuration ablation study across combinations of three corpora (C-SSRS, eRisk, and ReDSM5) to examine how dataset semantics influence model performance. On a held-out C-SSRS + eRisk test set (n = 279), the proposed model achieves a mean balanced accuracy of 0.89 ± 0.01 across five random seeds, with a best run of 0.90 and a 5.74 percentage point improvement over the strongest baseline (TF-IDF + Random Forest). The model also shows strong cross-platform generalization, achieving BA = 0.78 on the depression-reddit-cleaned dataset (n = 7731) and BA = 0.85 (ROC-AUC = 0.92) on a Twitter suicidal-intention dataset (n = 9119) without additional fine-tuning. The ablation analysis shows that although a three-dataset configuration (C-SSRS + eRisk + ReDSM5) maximizes aggregate performance, the ReDSM5 labels encode symptom presence rather than clinical risk, creating a semantic mismatch. This finding highlights the importance of label compatibility when combining heterogeneous mental health corpora. Explainability analysis using Integrated Gradients and attention visualization shows that the model focuses on clinically meaningful expressions such as therapy references, diagnosis, and hopelessness rather than isolated keywords. These results demonstrate that clinically aligned long-context transformers can provide accurate and interpretable mental health risk detection from social media while emphasizing the critical role of dataset semantics in multi-corpus training.
(This article belongs to the Special Issue Role of Artificial Intelligence in Natural Language Processing)
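
The summarization gate is easy to picture: count tokens, and only histories over the threshold go through compression. The sketch below uses the base Longformer tokenizer as a stand-in for the paper's AIMH/Mental-Longformer-base-4096 and a placeholder summarizer where the paper calls an LLM.

```python
from transformers import AutoTokenizer

MAX_TOKENS = 2000  # threshold reported in the abstract

tok = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")

def summarize(text: str) -> str:
    # Placeholder: the paper prompts an LLM to compress long histories
    # while preserving mental health-relevant content.
    return text[:2000]

def prepare(post_history: str) -> str:
    """Summarize only when the history exceeds the token budget."""
    if len(tok.encode(post_history)) > MAX_TOKENS:
        return summarize(post_history)
    return post_history

print(prepare("I have been feeling hopeless since my last therapy session."))
```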

26 pages, 2135 KB  
Article
Mapping Research Trends in Road Safety: A Topic Modeling Perspective
by Iulius Alexandru Tudor and Florin Gîrbacia
Vehicles 2026, 8(4), 69; https://doi.org/10.3390/vehicles8040069 - 27 Mar 2026
Viewed by 573
Abstract
Over the past decade, road safety research has developed rapidly, driven by the expansion of large crash databases, the adoption of artificial intelligence techniques, and the demand for proactive and predictive safety solutions. This study conducts a data-driven review of recent research trends in transport safety. It focuses on main domains including crash severity analysis, human factors, vulnerable road users (VRUs), spatial modeling, and artificial intelligence applications. A systematic search of the Scopus database identified 15,599 relevant scientific papers published between 2016 and 2025. After constructing this corpus, titles, abstracts, and keywords were preprocessed using a natural language processing pipeline. The analysis employed BERTopic, a transformer-based topic modeling framework. It identified 29 distinct research topics, further synthesized into five major thematic areas: (1) crash severity and injury analysis, (2) driver behavior and human factors, (3) vulnerable road users, (4) artificial intelligence, machine learning, and computer vision in intelligent transportation systems, and (5) spatial analysis and hotspot detection. A notable increase in publications related to artificial intelligence and machine learning has been evident since 2020. The results show a transition from descriptive, post-crash studies to integrated, multimodal, predictive analysis. Overall, the findings reveal a paradigm shift in the field. This study also identifies ethical and economic issues associated with the use of artificial intelligence in intelligent transportation systems, including data management, infrastructure requirements, system security, and model transparency. The results signify a transition from intuition-based models to explainable, spatially explicit, and data-intensive models, ultimately facilitating proactive risk assessment and informed decision-making.
(This article belongs to the Special Issue Intelligent Mobility and Sustainable Automotive Technologies)
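
BERTopic's public API makes the core of such a study compact; the quickstart below runs on the 20 Newsgroups corpus as a stand-in for the 15,599 Scopus records (topic modeling needs at least a few thousand documents to be meaningful).

```python
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups

# Public corpus standing in for the Scopus titles/abstracts/keywords.
docs = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))["data"]

# Transformer embeddings -> dimensionality reduction -> clustering -> topic words.
topic_model = BERTopic(verbose=True)
topics, probs = topic_model.fit_transform(docs)

print(topic_model.get_topic_info().head(10))  # largest discovered topics
```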

21 pages, 340 KB  
Article
(Doing) Computational History: The Role of Data Work in Computational Approaches
by Sarah A. Lang
Histories 2026, 6(2), 26; https://doi.org/10.3390/histories6020026 - 27 Mar 2026
Viewed by 676
Abstract
Computational methods have become increasingly prominent within the historical sciences, generating significant enthusiasm among some scholars. Yet their practical demands, epistemic limits, and ethical implications are less often critically examined than praised. This article explores what it means to do computational history today, arguing that it is not primarily defined by algorithms but by datasets. It is methodologically specific, resource-intensive, selective in scope, labour-heavy, and dependent on pre-digitised sources, specialised infrastructure, and interdisciplinary collaboration. These dependencies limit the scope of research questions and can produce narrow outcomes despite substantial effort, lending some validity to the concern over whether the field yields sufficient historiographical return for the labour invested. Corpus construction and data work lie at the epistemic core of computational history. These often undervalued tasks are not merely technical precursors to analysis, but interpretive and epistemic acts. Data are shaped by digitisation politics, historical bias, and institutional power. They shape the questions asked, the answers produced, and the legitimacy of findings. Recognising and valuing data work is essential, both to embed critical perspectives into computational humanities and to counteract the privileging of certain forms of labour over others. Due to the association of quantification with rigour and scholarly prowess, algorithmic work receives more credit, creating a two-tier system in this division of labour in which those who develop algorithms are elevated above those who curate data, despite their symbiotic interdependence. Computational history, when done well, requires deep engagement with our sources, be they historical documents or datasets. For computational history to stabilise as a meaningful discipline, it must prioritise building better datasets over pursuing increasingly complex algorithms on an unstable basis of data.
(This article belongs to the Section Digital and Computational History)