MDPI - Publisher of Open Access Journals

32 pages, 852 KB

Open Access Article

Benchmarking the Responsiveness of Open-Source Text-to-Speech Systems

by Ha Pham Thien Dinh, Rutherford Agbeshi Patamia, Ming Liu and Akansel Cosgun

Computers 2025, 14(10), 406; https://doi.org/10.3390/computers14100406 - 23 Sep 2025

Viewed by 755

Abstract

Responsiveness, the speed at which a text-to-speech (TTS) system produces audible output, is critical for real-time voice assistants yet has received far less attention than perceptual quality metrics. Existing evaluations often touch on latency but do not establish reproducible, open-source standards that capture responsiveness as a first-class dimension. This work introduces a baseline benchmark designed to fill that gap. Our framework unifies latency distribution, tail latency, and intelligibility within a transparent and dataset-diverse pipeline, enabling a fair and replicable comparison across 13 widely used open-source TTS models. By grounding evaluation in structured input sets ranging from single words to sentence-length utterances and adopting a methodology inspired by standardized inference benchmarks, we capture both typical and worst-case user experiences. Unlike prior studies that emphasize closed or proprietary systems, our focus is on establishing open, reproducible baselines rather than ranking against commercial references. The results reveal substantial variability across architectures, with some models delivering near-instant responses while others fail to meet interactive thresholds. By centering evaluation on responsiveness and reproducibility, this study provides an infrastructural foundation for benchmarking TTS systems and lays the groundwork for more comprehensive assessments that integrate both fidelity and speed.

(This article belongs to the Special Issue Natural Language Processing (NLP) and Large Language Modelling (2nd Edition))
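
The distinction the abstract above draws between typical and worst-case (tail) latency can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `latency_summary`, the choice of the 95th and 99th percentiles, and the sample values are all assumptions made for illustration.

```python
import statistics


def latency_summary(latencies_ms):
    """Summarize latency samples (in milliseconds, a hypothetical unit choice).

    Returns the median as the "typical" case and the 95th and 99th
    percentiles as "tail" cases. The percentile choice is an assumption,
    not something stated in the abstract.
    """
    if len(latencies_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points; index 94 is p95, index 98 is p99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "median": statistics.median(latencies_ms),
        "p95": cuts[94],
        "p99": cuts[98],
    }


# Invented sample data: 101 latencies of 1 ms, 2 ms, ..., 101 ms.
summary = latency_summary(list(range(1, 102)))
print(summary)
```

Reporting the tail alongside the median matters because a model can look fast on average while still producing occasional slow responses that a user would notice.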

17 pages, 479 KB

Open Access Article

Analyzing LLM Sentencing Variability in Theft Indictments Across Gender, Family Status and the Value of the Stolen Item

by Karol Struniawski, Ryszard Kozera and Aleksandra Konopka

Appl. Sci. 2025, 15(16), 8860; https://doi.org/10.3390/app15168860 - 11 Aug 2025

Viewed by 619

Abstract

As large language models (LLMs) increasingly enter high-stakes decision-making contexts, questions arise about their suitability in domains requiring normative judgment, such as judicial sentencing. This study investigates whether LLMs exhibit bias when tasked with sentencing decisions in Polish criminal law, despite clear legal norms that prohibit considering extralegal factors. Simulated sentencing scenarios for theft offenses use two leading open-source LLMs (LLaMA and Mixtral) and systematically vary three defendant characteristics: gender, number of children, and the value of the stolen item. While none of these variables should legally affect sentence length under Polish law, our results reveal statistically significant disparities, particularly in how female defendants with children are treated. Non-parametric tests (Kruskal–Wallis and Mann–Whitney U) and correlation analysis were applied to quantify these effects. Our findings raise concerns about the normative reliability of LLMs and their alignment with principles of fairness and legality. From a jurisprudential perspective, we contrast the implicit logic of LLM sentencing with theoretical models of adjudication, including Dworkin’s moral interpretivism and Posner’s pragmatism. This work contributes to ongoing debates on the integration of AI in legal systems, highlighting both the empirical risks and the philosophical limitations of computational legal reasoning.

(This article belongs to the Special Issue Advances in Machine Learning and Data Mining: Emerging Trends and Applications)
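
As a rough illustration of one of the non-parametric tests named in the abstract above, the Mann–Whitney U statistic can be computed by direct pair counting. This is a sketch under stated assumptions, not the authors' analysis: the function name `mann_whitney_u` and the sample values are invented, the statistic is reported for the first group (some conventions report the smaller of the two U values instead), and no p-value is computed.

```python
def mann_whitney_u(xs, ys):
    """Return the Mann-Whitney U statistic for the first sample.

    U counts the (x, y) pairs in which x exceeds y; tied pairs count 0.5.
    The two U values of a comparison always sum to len(xs) * len(ys).
    """
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Invented sentence lengths (in months) for two hypothetical defendant groups.
group_a = [6, 8, 8, 12]
group_b = [4, 6, 7, 10]
u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
print(u_a, u_b, u_a + u_b == len(group_a) * len(group_b))
```

A U value far from half of `len(xs) * len(ys)` suggests that one group tends to receive larger values than the other; judging significance would additionally require a reference distribution or a statistics library.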

14 pages, 591 KB

Open Access Article

Punctuation Patterns in Finnegans Wake by James Joyce Are Largely Translation-Invariant

by Krzysztof Bartnicki, Stanisław Drożdż, Jarosław Kwapień and Tomasz Stanisz

Entropy 2025, 27(2), 177; https://doi.org/10.3390/e27020177 - 7 Feb 2025

Cited by 2 | Viewed by 1450

Abstract

The complexity characteristics of texts written in natural languages are significantly related to the rules of punctuation. In particular, the distances between punctuation marks measured by the number of words quite universally follow the family of Weibull distributions known from survival analysis. However, the values of two parameters marking specific forms of these distributions distinguish specific languages. This is such a strong constraint that the punctuation distributions of texts translated from the original language into another adopt quantitative characteristics of the target language. All these changes take place within Weibull distributions such that the corresponding hazard functions are always increasing. Recent research shows that James Joyce’s famous novel Finnegans Wake is subject to such an extreme distribution from the Weibull family that the corresponding hazard function is clearly decreasing. At the same time, the distances of sentence-ending punctuation marks, determining the sentence length variability, have an almost perfect multifractal organization to an extent found nowhere else in the literature thus far. In the present contribution, based on several available translations (Dutch, French, German, Polish, and Russian) of Finnegans Wake, it is shown that the punctuation characteristics of this work remain largely translation-invariant, contrary to the usual case. These observations may constitute further evidence that Finnegans Wake is a translinguistic work in this respect as well, in line with Joyce’s original intention.

(This article belongs to the Special Issue Complexity Characteristics of Natural Language)
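
The contrast between increasing and decreasing hazard functions in the abstract above can be made concrete with the standard continuous two-parameter Weibull distribution. The abstract does not state its parametrization, and since distances are counted in whole words the paper may well use a discrete variant, so the symbols below (shape $k$, scale $\lambda$) are assumptions for illustration only.

```latex
% Continuous Weibull with shape k > 0 and scale \lambda > 0 (assumed parametrization)
f(x) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}},
\qquad
S(x) = e^{-(x/\lambda)^{k}},
\qquad
h(x) = \frac{f(x)}{S(x)} = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1},
\quad x > 0 .
```

In this form the hazard $h(x)$ is increasing when $k > 1$, constant when $k = 1$, and decreasing when $k < 1$, so under this assumed parametrization a decreasing hazard corresponds to a shape parameter below one.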

15 pages, 15258 KB

Open Access Article

Multifractal Hopscotch in Hopscotch by Julio Cortázar

by Jakub Dec, Michał Dolina, Stanisław Drożdż, Jarosław Kwapień and Tomasz Stanisz

Entropy 2024, 26(8), 716; https://doi.org/10.3390/e26080716 - 22 Aug 2024

Cited by 4 | Viewed by 1287

Abstract

Punctuation is the main factor introducing correlations in natural language written texts and it crucially impacts their overall effectiveness, expressiveness, and readability. Punctuation marks at the end of sentences are of particular importance as their distribution can determine various complexity features of written natural language. Here, the sentence length variability (SLV) time series representing Hopscotch by Julio Cortázar are subjected to quantitative analysis with an attempt to identify their distribution type, long-memory effects, and potential multiscale patterns. The analyzed novel is an important and innovative piece of literature whose essential property is the freedom of movement between its building blocks that the author gives to the reader. The statistical consequences of this freedom are closely investigated in both the original Spanish version of the novel and its translations into English and Polish. Clear evidence of rich multifractality in the SLV dynamics, albeit with a left-sided asymmetry, is observed in all three language versions as well as in the versions with differently ordered chapters.

(This article belongs to the Special Issue Complexity Characteristics of Natural Language)

20 pages, 1861 KB

Open Access Article

Prominent User Segments in Online Consumer Recommendation Communities: Capturing Behavioral and Linguistic Qualities with User Comment Embeddings

by Apostolos Skotis and Christos Livas

Information 2024, 15(6), 356; https://doi.org/10.3390/info15060356 - 15 Jun 2024

Cited by 1 | Viewed by 1590

Abstract

Online conversation communities have become an influential source of consumer recommendations in recent years. We propose a set of meaningful user segments which emerge from user embedding representations, based exclusively on comments’ text input. Data were collected from three popular recommendation communities on Reddit, covering the domains of book and movie suggestions. We utilized two neural language model methods to produce user embeddings, namely Doc2Vec and Sentence-BERT. Embedding interpretation issues were addressed by examining latent factors’ associations with behavioral, sentiment, and linguistic variables, acquired using the VADER, LIWC, and LFTK libraries in Python. User clusters with different levels of engagement and linguistic characteristics were identified. The latent features of both approaches were strongly correlated with several user behavioral and linguistic indicators. Both approaches managed to capture significant variability in writing styles and quality, such as length, readability, use of function words, and complexity. However, the Doc2Vec features better described users by their varying levels of contribution, while S-BERT-based features were more closely adapted to users’ varying emotional engagement. Prominent segments revealed prolific users with formal, intuitive, emotionally distant, and highly analytical styles, as well as users who were less elaborate, less consistent, but more emotionally connected. The observed patterns were largely similar across communities.

(This article belongs to the Section Information Processes)

54 pages, 1086 KB

Open Access Systematic Review

Gait Biomechanical Parameters Related to Falls in the Elderly: A Systematic Review

by Jullyanne Silva, Tiago Atalaia, João Abrantes and Pedro Aleixo

Biomechanics 2024, 4(1), 165-218; https://doi.org/10.3390/biomechanics4010011 - 5 Mar 2024

Cited by 7 | Viewed by 6987

Abstract

According to the World Health Organization, one-third of elderly people aged 65 or over fall annually, and this number increases after 70. Several gait biomechanical parameters were associated with a history of falls. This study aimed to conduct a systematic review to identify and describe the gait biomechanical parameters related to falls in the elderly. MEDLINE Complete, Cochrane, Web of Science, and CINAHL Complete were searched for articles on 22 November 2023, using the following search sentence: (gait) AND (fall*) AND ((elder*) OR (old*) OR (senior*)) AND ((kinematic*) OR (kinetic*) OR (biomechanic*) OR (electromyogram*) OR (emg) OR (motion analysis*) OR (plantar pressure)). This search identified 13,988 studies. From these, 96 were selected. Gait speed, stride/step length, and double support phase are gait biomechanical parameters that differentiate fallers from non-fallers. Fallers also tended to exhibit higher variability in gait biomechanical parameters, namely the minimum foot/toe clearance variability. Although the studies were scarce, differences between fallers and non-fallers were found regarding lower limb muscular activity and joint biomechanics. Due to the scarce literature and contradictory results among studies, it is difficult to draw clear conclusions for parameters related to postural stability. Minimum foot/toe clearance, step width, and knee kinematics did not differentiate fallers from non-fallers.

(This article belongs to the Special Issue Gait and Balance Control in Typical and Special Individuals)

15 pages, 2919 KB

Open Access Article

Automatic Extraction of Flooding Control Knowledge from Rich Literature Texts Using Deep Learning

by Min Zhang and Juanle Wang

Appl. Sci. 2023, 13(4), 2115; https://doi.org/10.3390/app13042115 - 7 Feb 2023

Cited by 6 | Viewed by 2387

Abstract

Flood control is a global problem; an increasing number of flooding disasters, induced by global climate change and extreme weather events, occur annually. Flood studies are important knowledge sources for flood risk reduction and have been recorded in the academic literature. The main objective of this paper was to acquire flood control knowledge from long-tail data of the literature by using deep learning techniques. Screening was conducted to obtain 4742 flood-related academic documents from the past two decades. Machine learning was used to parse the documents, and 347 sample data points from different years were collected for sentence segmentation (approximately 61,000 sentences) and manual annotation. Traditional machine learning (NB, LR, SVM, and RF) and artificial neural network-based deep learning algorithms (Bert, Bert-CNN, Bert-RNN, and ERNIE) were implemented for model training, and complete sentence-level knowledge extraction was conducted in batches. The results revealed that artificial neural network-based deep learning methods exhibit better performance than traditional machine learning methods in terms of accuracy, but their training time is much longer. Based on comprehensive feature extraction capability and computational efficiency, the performances of deep learning methods were ranked as: ERNIE > Bert-CNN > Bert > Bert-RNN. When using Bert as the benchmark model, several of its variants showed distinct strengths. Bert, Bert-CNN, and Bert-RNN were good at acquiring global features, acquiring local features, and processing variable-length inputs, respectively. ERNIE benefited from an improved masking mechanism and corpus and therefore exhibited better performance. Finally, 124,196 usage method and 8935 quotation method sentences were obtained in batches. The proportion of method sentences in the literature showed an increasing trend over the last 20 years. Thus, as literature with more method sentences accumulates, this study lays a foundation for knowledge extraction in the future.

(This article belongs to the Special Issue State-of-the-Art Earth Sciences and Geography in China)

Search Results (7)
