19 pages, 992 KB

Open Access Article

Hybrid Music Similarity with Hypergraph and Siamese Network

by Sera Kim, Youngjun Kim, Jaewon Lee and Dalwon Jang

Big Data Cogn. Comput. 2026, 10(3), 96; https://doi.org/10.3390/bdcc10030096 - 21 Mar 2026

Viewed by 385

This paper proposes a novel method for measuring music similarity. Existing music similarity measurements have often been used for music appreciation, but this paper proposes a method for measuring the similarity between music samples which are used for music production. Conventional music recommendation approaches often rely on either metadata-based similarity or audio-based feature similarity in isolation, which limits their effectiveness in sample-based recommendation scenarios where both compositional context and acoustic characteristics are important. To address this limitation, the proposed framework combines a hypergraph-based information similarity module with a feature-based similarity module learned using Siamese networks and triplet loss. In the information-based module, metadata attributes such as beats per minute (BPM), genre, chord, key, and instrument are modeled as vertices in a hypergraph, and Random Walk–Word2Vec embeddings are learned to capture structural relationships between music samples and their attributes. In parallel, the feature-based module employs vertex-specific Siamese networks trained on instrument and key classification tasks to learn perceptual similarity directly from audio signals. The two modules are trained independently and jointly utilized at the recommendation stage to provide attribute-specific similarity results for a given query sample. Results show that the proposed system achieves high Precision@k across multiple attributes and forms stable similarity structures in the embedding space, even without relying on user interaction data. These results reflect embedding consistency evaluated over the entire dataset where training and retrieval are performed on the same sample pool, rather than generalization to unseen samples. 
These results demonstrate that the proposed hybrid framework effectively captures both structural and perceptual similarity among music samples and is well suited for sample-based music recommendation in music production environments.
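
The triplet objective and the Precision@k metric named in the abstract can be illustrated with a minimal Python sketch. The function names, the Euclidean distance, the margin value, and the toy inputs are assumptions made for illustration; the abstract does not state which distance or margin the authors used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def precision_at_k(query_label, ranked_labels, k):
    """Fraction of the top-k retrieved items that share the query's label."""
    top_k = ranked_labels[:k]
    return sum(1 for label in top_k if label == query_label) / k


# Hypothetical 2-D embeddings: the positive is close, the negative is far.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Same points with positive and negative swapped: the loss becomes positive.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
# Hypothetical retrieval result for a query labeled "piano".
p_at_4 = precision_at_k("piano", ["piano", "guitar", "piano", "drums"], 4)
```

In this sketch a separate embedding model per attribute (for example, instrument or key) would each be scored with its own Precision@k, which mirrors the attribute-specific evaluation the abstract describes.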

24 pages, 7997 KB

Open Access Article

Recognition of Partial Drawing Sequences for Constructing an AI Player in Drawing Werewolf

by Nodoka Okamoto, Sota Nishiguchi, Akari Takemoto and Shun Nishide

Electronics 2026, 15(6), 1189; https://doi.org/10.3390/electronics15061189 - 12 Mar 2026

Viewed by 385

Abstract

Drawing-based social deduction games require artificial intelligence (AI) agents to infer semantic information from incomplete and evolving visual inputs under asymmetric information conditions. In this study, we address the problem of recognizing drawing targets from partial sketch sequences toward constructing an AI player for Drawing Werewolf, a collaborative drawing game derived from the Werewolf (Mafia) genre. Using stroke-based representations from the “Quick, Draw!” dataset, we formulate incremental sketch classification as a sequence modeling task and compare a unidirectional Long Short-Term Memory (UniLSTM) model with a Transformer-based model under realistic online inference constraints. Experiments were conducted on 44 animal classes, evaluating classification accuracy at different drawing stages ranging from one stroke to completed sketches. The results demonstrate that both models improve as additional strokes are observed; however, the Transformer consistently outperforms UniLSTM across all stroke counts. The performance gap is particularly pronounced in early-stage prediction, where sketches are highly incomplete and ambiguous. Class-wise analyses further reveal that the advantage of self-attention depends on visual characteristics and drawing progression. These findings indicate that self-attention mechanisms are well suited for modeling partial sketch sequences and provide valuable insights for designing AI players capable of real-time inference in drawing-based social deduction games.
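
The stage-wise evaluation described above, accuracy measured after one stroke, two strokes, and so on, can be sketched as follows. The helper name, the toy classifier, and the toy samples are hypothetical; a real run would pass a trained UniLSTM or Transformer in place of `toy_classifier`.

```python
from collections import defaultdict


def accuracy_by_stroke_count(samples, classify):
    """Accuracy of `classify` on every stroke prefix of every sample.

    samples:  list of (strokes, label) pairs, strokes being a list.
    classify: callable taking a stroke prefix and returning a label.
    Returns {number_of_strokes_seen: accuracy}.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for strokes, label in samples:
        # Only the first n strokes are visible, as in online inference.
        for n in range(1, len(strokes) + 1):
            total[n] += 1
            if classify(strokes[:n]) == label:
                correct[n] += 1
    return {n: correct[n] / total[n] for n in sorted(total)}


def toy_classifier(prefix):
    """Stand-in model (an assumption): guesses by prefix length only."""
    return "cat" if len(prefix) >= 2 else "dog"


toy_samples = [
    (["s1", "s2", "s3"], "cat"),
    (["s1", "s2"], "dog"),
]
stage_accuracy = accuracy_by_stroke_count(toy_samples, toy_classifier)
```

Sketches with fewer strokes simply stop contributing at later stages, so each stage's accuracy is computed over the sketches that actually reach it.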

(This article belongs to the Section Artificial Intelligence)

17 pages, 698 KB

Open Access Review

What Distinguishes AI-Generated from Human Writing? A Rapid Review of the Literature

by Georgios P. Georgiou

Big Data Cogn. Comput. 2026, 10(2), 55; https://doi.org/10.3390/bdcc10020055 - 8 Feb 2026

Viewed by 2668

Abstract

Large language models (LLMs) are now routine writing tools across various domains, intensifying questions about when text should be treated as human-authored, artificial intelligence (AI)-generated, or collaboratively produced. This rapid review aims to identify cue families reported in empirical studies as distinguishing AI from human-authored text and to assess how stable these cues are across genres/tasks, text lengths, and revision conditions. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines, we searched four online databases for peer-reviewed empirical articles (1 January 2022–1 January 2026). After deduplication and screening, 40 studies were included. Evidence converged on five cue families: surface, discourse/pragmatic, epistemic/content, predictability/probabilistic, and provenance. Surface cues dominated the literature and were the most consistently operationalized. Discourse/pragmatic cues followed, particularly in discipline-bound academic genres where stance and metadiscourse differentiated AI from human writing. Predictability/probabilistic cues were central in detector-focused studies, while epistemic/content cues emerged primarily in tasks where grounding and authenticity were salient. Provenance cues were concentrated in watermarking research. Across studies, cue stability was consistently conditional rather than universal. Specifically, surface and discourse cues often remained discriminative within constrained genres, but shifted with register and discipline; probabilistic cues were powerful yet fragile under paraphrasing, post-editing, and evasion; and provenance signals required robustness to editing, mixing, and span localization. Overall, the literature indicates that AI–human distinction emerges from layered and context-dependent cue profiles rather than from any single reliable marker. 
High-stakes decisions, therefore, require condition-aware interpretation, triangulation across multiple cue families, and human oversight rather than automated classification in isolation.
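
As an illustration of the predictability/probabilistic cue family, one widely used measure is perplexity, computed from per-token log-probabilities under some language model. The sketch below is an assumption about how such a cue might be computed, not a method taken from the review; the log-probabilities are hypothetical inputs that a real detector would obtain from a model.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from natural-log token probabilities:
    exp of the negative mean log-probability."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical inputs: every token assigned probability 0.5, 0.9, or 0.1.
uniform_half = perplexity([math.log(0.5)] * 4)
predictable = perplexity([math.log(0.9)] * 3)
surprising = perplexity([math.log(0.1)] * 3)
```

Consistent with the review's finding that probabilistic cues are fragile, a score like this shifts under paraphrasing or post-editing, so it would be one signal among several rather than a standalone decision rule.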

(This article belongs to the Special Issue Machine Learning Applications in Natural Language Processing)

22 pages, 6241 KB

Open Access Article

Using Large Language Models to Detect and Debunk Climate Change Misinformation

by Zeinab Shahbazi and Sara Behnamian

Big Data Cogn. Comput. 2026, 10(1), 34; https://doi.org/10.3390/bdcc10010034 - 17 Jan 2026

Cited by 1 | Viewed by 1309

Abstract

The rapid spread of climate change misinformation across digital platforms undermines scientific literacy, public trust, and evidence-based policy action. Advances in Natural Language Processing (NLP) and Large Language Models (LLMs) create new opportunities for automating the detection and correction of misleading climate-related narratives. This study presents a multi-stage system that employs state-of-the-art large language models such as Generative Pre-trained Transformer 4 (GPT-4), Large Language Model Meta AI (LLaMA) version 3 (LLaMA-3), and RoBERTa-large (Robustly optimized BERT pretraining approach large) to identify, classify, and generate scientifically grounded corrections for climate misinformation. The system integrates several complementary techniques, including transformer-based text classification, semantic similarity scoring using Sentence-BERT, stance detection, and retrieval-augmented generation (RAG) for evidence-grounded debunking. Misinformation instances are detected through a fine-tuned RoBERTa–Multi-Genre Natural Language Inference (MNLI) classifier (RoBERTa-MNLI), grouped using BERTopic, and verified against curated climate-science knowledge sources using BM25 and dense retrieval via FAISS (Facebook AI Similarity Search). The debunking component employs RAG-enhanced GPT-4 to produce accurate and persuasive counter-messages aligned with authoritative scientific reports such as those from the Intergovernmental Panel on Climate Change (IPCC). A diverse dataset of climate misinformation categories covering denialism, cherry-picking of data, false causation narratives, and misleading comparisons is compiled for evaluation. Benchmarking experiments demonstrate that LLM-based models substantially outperform traditional machine-learning baselines such as Support Vector Machines, Logistic Regression, and Random Forests in precision, contextual understanding, and robustness to linguistic variation. 
Expert assessment further shows that generated debunking messages exhibit higher clarity, scientific accuracy, and persuasive effectiveness compared to conventional fact-checking text. These results highlight the potential of advanced LLM-driven pipelines to provide scalable, real-time mitigation of climate misinformation while offering guidelines for responsible deployment of AI-assisted debunking systems.
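
The lexical half of the retrieval step (BM25) can be sketched in plain Python. This is a minimal Okapi BM25 scorer using a common non-negative IDF variant; the parameter defaults `k1=1.5` and `b=0.75`, the whitespace tokenization, and the toy documents are assumptions, and the abstract does not say which BM25 implementation or settings the authors used.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            # "+ 1.0" inside the log keeps the IDF non-negative.
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            length_norm = 1.0 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


# Toy evidence passages (illustrative only), tokenized on whitespace.
passages = [
    "sea level rise is accelerating".split(),
    "the recipe needs two eggs".split(),
    "sea level sea level data".split(),
]
scores = bm25_scores(["sea", "level"], passages)
```

In a hybrid setup like the one the abstract describes, these lexical scores would be combined with dense similarity scores from a vector index before the top passages are handed to the generator.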

(This article belongs to the Special Issue Natural Language Processing Applications in Big Data)

32 pages, 2195 KB

Open Access Article

MUSIGAIN: Adaptive Graph Attention Network for Multi-Relationship Mining in Music Knowledge Graphs

by Mian Chen, Tinghao Wang, Chunhao Li and Yuheng Li

Electronics 2025, 14(24), 4892; https://doi.org/10.3390/electronics14244892 - 12 Dec 2025

Viewed by 1037

Abstract

With the exponential growth of digital music, efficiently identifying key music relationship nodes in large-scale music knowledge graphs is crucial for enhancing music recommendation, emotion analysis, and genre classification. To address this challenge, we propose MUSIGAIN, a GATv2-based adaptive framework that combines graph robustness metrics with advanced graph neural network mechanisms for multi-relationship mining in heterogeneous music knowledge graphs. MUSIGAIN tackles three fundamental challenges: the prohibitive computational complexity of exact graph-robustness calculations, the limitations of traditional centrality measures in capturing semantic heterogeneity, and the over-smoothing problem in deep graph neural networks. The framework introduces three key innovations: (1) a layer-wise dynamic skipping mechanism that adaptively controls propagation depth based on third-order embedding stability, reducing computation by 30–40% while preventing over-smoothing; (2) the DiGRAF adaptive activation function that enables node-specific nonlinear transformations to capture semantic heterogeneity across different entity types; and (3) ranking-based optimization supervised by graph robustness metrics, focusing on relative importance ordering rather than absolute value prediction. Experimental results on four real-world music knowledge graphs (POP-MKG, ROCK-MKG, JAZZ-MKG, CLASSICAL-MKG) demonstrate that MUSIGAIN consistently outperforms existing methods in Top-5% node identification accuracy, achieving up to 96.78% while maintaining linear scalability to graphs with hundreds of thousands of nodes. MUSIGAIN provides an efficient, accurate, and interpretable solution for key node identification in complex heterogeneous graphs.
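
A Top-k% identification score of the kind reported above can be sketched as the overlap between the predicted and reference top fractions of nodes. The abstract does not define the metric precisely, so the overlap definition, the function names, the rounding rule, and the toy scores below are all assumptions.

```python
def top_fraction(scores, fraction):
    """Set of node ids whose scores fall in the top `fraction` of nodes.
    The set size is the rounded node count times `fraction`, at least 1."""
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def top_fraction_accuracy(predicted, reference, fraction=0.05):
    """Share of the reference top-fraction nodes that the predicted
    ranking also places in its own top fraction."""
    predicted_top = top_fraction(predicted, fraction)
    reference_top = top_fraction(reference, fraction)
    return len(predicted_top & reference_top) / len(reference_top)


# Hypothetical importance scores for eight nodes.
reference_scores = {"a": 8, "b": 7, "c": 6, "d": 5, "e": 4, "f": 3, "g": 2, "h": 1}
predicted_scores = {"a": 0.9, "b": 0.2, "c": 0.8, "d": 0.1,
                    "e": 0.3, "f": 0.4, "g": 0.5, "h": 0.6}
overlap = top_fraction_accuracy(predicted_scores, reference_scores, fraction=0.25)
```

Because only membership in the top set matters, a metric like this rewards correct relative ordering rather than accurate absolute scores, which is consistent with the ranking-based training objective the abstract describes.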

(This article belongs to the Special Issue AI-Driven Data Analytics and Mining)

28 pages, 731 KB

Open Access Article

Research on an Automatic Classification Method for Art Film Scenes Based on Image and Audio Deep Features

by Zhaojun An and Heinz D. Fill

Appl. Sci. 2025, 15(23), 12603; https://doi.org/10.3390/app152312603 - 28 Nov 2025

Viewed by 863

Abstract

This paper addresses the challenging task of automatic scene classification in art films, a genre characterized by symbolic visuals, asynchronous audio, and non-linear storytelling. We propose Styloformer, a multimodal transformer architecture designed to integrate visual, auditory, textual, and curatorial signals into a unified representation space. The model combines cross-modal attention, stylistic clustering, influence prediction, and canonicality estimation to handle the semantic and historical complexity of art cinema. Additionally, we introduce a novel module called Historiographic Navigation, which embeds ontological priors and temporal logic to support interpretive reasoning. Evaluated on multiple benchmarks, Styloformer achieves state-of-the-art performance, including 91.85% accuracy and 94.31% AUC on the MovieNet dataset, outperforming baselines such as CLIP and ViT. Ablation studies further demonstrate the importance of each architectural component. Unlike general-purpose video models, our system is tailored to the aesthetic and narrative structure of art films, making it suitable for applications in digital curation and computational film analysis. Styloformer represents a scalable and interpretable approach to understanding artistic media, bridging machine learning with art historical reasoning.

(This article belongs to the Special Issue AI-Driven Computer Vision and Pattern Recognition: Challenges and Applications)

26 pages, 4013 KB

Open Access Article

Music Genre Classification Using Prosodic, Stylistic, Syntactic and Sentiment-Based Features

by Erik-Robert Kovacs and Stefan Baghiu

Big Data Cogn. Comput. 2025, 9(11), 296; https://doi.org/10.3390/bdcc9110296 - 19 Nov 2025

Viewed by 3037

Abstract

Romanian popular music has had a storied history across the last century and a half. Incorporating different influences at different times, today it boasts a wide range of both autochthonous and imported genres, such as traditional folk music, rock, rap, pop, and manele, to name a few. We aim to trace the linguistic differences between the lyrics of these genres using natural language processing and a computational linguistics approach by studying the prosodic, stylistic, syntactic, and sentiment-based features of each genre. For this purpose, we have crawled a dataset of ~14,000 Romanian songs from publicly available websites along with the user-provided genre labels, and characterized each song and each genre with regard to these features, discussing similarities and differences. We improve on existing tools for Romanian natural language processing by building a lexical analysis library, well suited to song lyrics or poetry, which encodes a set of 17 linguistic features. In addition, we build lexical analysis tools for profanity-based features and improve the SentiLex sentiment analysis library by manually rebalancing its lexemes to overcome the limitations introduced by its machine translation into Romanian. We estimate the accuracy gain using a benchmark Romanian sentiment analysis dataset and register a 25% increase in accuracy over the SentiLex baseline. The contribution is meant to describe the characteristics of the Romanian expression of autochthonous as well as international genres and provide technical support to researchers in natural language processing, musicology or the digital humanities in studying the lyrical content of Romanian music. We have released our data and code for research use.

(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))

28 pages, 6100 KB

Open Access Article

Multiplexed Integrin Detection and Cancer Cell Classification Using Multicolor Gap-Enhanced Gold Nanorods and Machine Learning Algorithm

by Suprava Shah, Reed Youngerman, Alberto Luis Rodriguez-Nieves, Mitchell Lee Taylor, William Rodney Bantom, David Thompson, Jingyi Chen, Yongmei Wang and Xiaohua Huang

Nanomaterials 2025, 15(22), 1693; https://doi.org/10.3390/nano15221693 - 8 Nov 2025

Cited by 1 | Viewed by 1007

Abstract

Integrins, cell-surface adhesion receptors involved in tumor progression, invasion, and metastasis, serve as crucial biomarkers for cancer diagnosis and therapeutic targeting. Multiplexed detection of integrins and cancer cell classification at the single-cell level allows for comprehensive profiling, facilitating precise identification and categorization of tumor cells that are heterogeneous in integrin expression and cell subtype. In this study, we developed a five-plex detection platform and demonstrated integrin profiling for cancer cell classification, leveraging surface-enhanced Raman scattering (SERS) with gap-enhanced gold nanorods (GENRs) in conjunction with advanced computational analysis. Specifically, we synthesized GENRs bearing five distinct Raman nanotags, each producing a unique spectral fingerprint upon targeting a specific integrin subtype expressed on cancer cell surfaces. SERS signals were collected from single cancer cells labeled simultaneously with the five-color SERS nanotags and subsequently analyzed with classical least squares regression to reliably deconvolute and quantify the expression levels of five different integrin monomers. Utilizing a random forest classifier trained on integrin profiles from individual cancer cell lines, we achieved simultaneous detection of three different breast cancer cell lines, with an exceptional classification accuracy of 99.9%. The feasibility of this method for multiplexed detection of circulating tumor cells was tested using peripheral blood mononuclear cells (PBMCs) spiked with mixed breast cancer cells from three cell lines. By integrating GENRs, multiplexed SERS nanotag technology, and machine learning, our platform significantly advances cancer diagnostics through accurate integrin-based cell profiling and classification. These findings highlight the potential of multiplexed integrin detection using SERS technology as a powerful diagnostic approach, ultimately supporting improved cancer subtype characterization, personalized diagnostics, and more targeted therapeutic strategies.
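
The classical least squares step, expressing a measured spectrum as a weighted sum of known reference spectra, can be sketched in plain Python by solving the normal equations. The function names, the tiny four-channel spectra, and the absence of any baseline or non-negativity handling are assumptions for illustration and are not details taken from the paper.

```python
def solve_linear(matrix, rhs):
    """Solve a square linear system by Gaussian elimination
    with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("reference spectra are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def cls_unmix(references, mixture):
    """Least-squares weights w minimizing ||sum_i w_i * references[i] - mixture||,
    obtained from the normal equations (R R^T) w = R y."""
    k = len(references)
    gram = [[sum(a * b for a, b in zip(references[i], references[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(a * y for a, y in zip(references[i], mixture)) for i in range(k)]
    return solve_linear(gram, rhs)


# Two hypothetical nanotag reference spectra over four spectral channels.
tag_a = [1.0, 0.0, 0.0, 1.0]
tag_b = [0.0, 1.0, 0.0, 1.0]
# Synthetic single-cell spectrum: 2 parts tag_a plus 3 parts tag_b.
measured = [2.0, 3.0, 0.0, 5.0]
weights = cls_unmix([tag_a, tag_b], measured)
```

In a five-plex setting the same routine would take five reference spectra, and the recovered weights per cell would form the integrin profile passed to the classifier.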

(This article belongs to the Section Biology and Medicines)

37 pages, 5181 KB

Open Access Article

Cinematic Narratives as Socio-Technical Systems: Emotion Mining and Script–Audience Emotional Fidelity

by Ayse Ocal

Systems 2025, 13(11), 994; https://doi.org/10.3390/systems13110994 - 6 Nov 2025

Cited by 5 | Viewed by 2562

Abstract

Cinema can be conceptualized as a socio-technical system in which scripts encode intended emotions, production processes transform them into multimodal experiences, and audiences generate emergent responses through reviews and ratings. This study investigates the emotional fidelity between designed affective trajectories in film scripts and perceived emotions expressed in audience reviews. A system-oriented computational framework was developed, integrating large-scale script and review data with transformer-based natural language processing models fine-tuned on the GoEmotions dataset. By applying a unified classification pipeline, we compare emotional distributions across scripts and reviews, analyze temporal and genre-specific patterns, and examine correlations with film success metrics such as profit and ratings. The results reveal both convergence and divergence between scripted intentions and audience responses, with genres functioning as semi-autonomous subsystems and historical trends reflecting context-dependent adaptation. Emotional fidelity, defined as the degree to which intended emotional expressions are preserved, transformed, or inverted in audience interpretation, is introduced as a system-level performance indicator. These findings advance theoretical perspectives on narrative communication as a feedback-driven socio-technical process and demonstrate how emotion mining can function as affective monitoring infrastructure for complex adaptive systems. The study contributes actionable insights for screenwriters, producers, and system designers seeking to enhance affective engagement.

28 pages, 1712 KB

Open Access Article

Identifying Literary Microgenres and Writing Style Differences in Romanian Novels with ReaderBench and Large Language Models

by Aura Cristina Udrea, Stefan Ruseti, Vlad Pojoga, Stefan Baghiu, Andrei Terian and Mihai Dascalu

Future Internet 2025, 17(9), 397; https://doi.org/10.3390/fi17090397 - 30 Aug 2025

Cited by 1 | Viewed by 1539

Abstract

Recent developments in natural language processing, particularly large language models (LLMs), create new opportunities for literary analysis in underexplored languages like Romanian. This study investigates stylistic heterogeneity and genre blending in 175 late 19th- and early 20th-century Romanian novels, each classified by literary historians into one of 17 genres. Our findings reveal that most novels do not adhere to a single genre label but instead combine elements of multiple (micro)genres, challenging traditional single-label classification approaches. We employed a dual computational methodology that combines an analysis based on Romanian-tailored linguistic features with general-purpose LLMs. ReaderBench, a Romanian-specific framework, was utilized to extract surface, syntactic, semantic, and discourse features, capturing fine-grained linguistic patterns. In parallel, we prompted two LLMs (Llama3.3 70B and DeepSeek-R1 70B) to predict genres at the paragraph level, leveraging their ability to detect contextual and thematic coherence across multiple narrative scales. Statistical analyses using Kruskal–Wallis and Mann–Whitney tests identified genre-defining features at both novel and chapter levels. The integration of these complementary approaches enhances microgenre detection beyond traditional classification capabilities. ReaderBench provides quantifiable linguistic evidence, while LLMs capture broader contextual patterns; together, they provide a multi-layered perspective on literary genre that reflects the complex and heterogeneous character of fictional texts. Our results argue that both language-specific and general-purpose computational tools can effectively detect stylistic diversity in Romanian fiction, opening new avenues for computational literary analysis in low-resource languages.
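
The Mann–Whitney comparison of a feature between two genres rests on the U statistic, which counts how often a value from one group exceeds a value from the other. The sketch below computes only the statistic, with ties counted as one half; the function name and the toy feature values are hypothetical, and obtaining a p-value would need an additional step such as `scipy.stats.mannwhitneyu`.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two samples.

    U_a counts pairs (x, y) with x from sample_a greater than y from
    sample_b, ties contributing 0.5; U_a + U_b = len(a) * len(b).
    """
    u_a = 0.0
    for x in sample_a:
        for y in sample_b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical values of one linguistic feature in two genres.
fully_separated = mann_whitney_u([1, 2, 3], [4, 5, 6])
identical_groups = mann_whitney_u([1, 2], [1, 2])
```

A U value near zero or near the product of the sample sizes indicates that the feature separates the two genres well, while a value near half that product indicates heavy overlap.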

(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))

18 pages, 4936 KB

Open Access Review

The Small Frontier: Trends Toward Miniaturization and the Future of Planetary Surface Rovers

by Carrington Chun, Faysal Chowdoury, Muhammad Hassan Tanveer, Sumit Chakravarty and David A. Guerra-Zubiaga

Actuators 2025, 14(7), 356; https://doi.org/10.3390/act14070356 - 20 Jul 2025

Viewed by 3065

Abstract

The robotic exploration of space began only five decades ago, and yet in the intervening years, a wide and diverse ecosystem of robotic explorers has been developed for this purpose. Such devices have greatly benefited from miniaturization trends and the increased availability of high-quality commercial off-the-shelf (COTS) components. This review outlines the specific taxonomic distinction between planetary surface rovers and other robotic space exploration vehicles, such as orbiters and landers. Additionally, arguments are made to standardize the classification of planetary rovers by mass into categories similar to those used for orbital satellites. Discussions about recent noteworthy trends toward the miniaturization of planetary rovers are also included, as well as a compilation of previous planetary rovers. This analysis compiles relevant metrics such as the mass, the distance traveled, and the locomotion or actuation technique for previous planetary rovers. Additional details are also examined about archetypal rovers that were chosen as representatives of specific small-scale rover classes. Finally, potential future trends for miniature planetary surface rovers are examined by way of comparison to similar miniaturized orbital robotic explorers known as CubeSats. Based on the existing relationship between CubeSats and their Earth-based simulation equivalents, CanSats, the importance of a potential Earth-based analog for miniature rovers is identified. This research establishes such a device, coining the new term ‘CanBot’ to refer to pathfinding systems that are deployed terrestrially to help develop future planetary surface exploration robots. Establishing this explicit genre of robotic vehicle is intended to provide a unified means for categorizing and encouraging the development of future small-scale rovers.

(This article belongs to the Special Issue Feature Papers in Actuators for Surface Vehicles)

25 pages, 2939 KB

Open Access Review

by Jakub Swacha

Multimodal Technol. Interact. 2025, 9(6), 59; https://doi.org/10.3390/mti9060059 - 11 Jun 2025

Cited by 2 | Viewed by 9320

Abstract

Video games come in many genres. Although the popularity of games that belong to different genres is the subject of various research and industry reports, so far, there have been no studies investigating their popularity in research papers. This paper addresses this gap with an analysis of bibliographic data sourced from Scopus, spanning 45 years from the emergence of the topic to the present and covering nine widely recognized genres: Action, Puzzle, Rhythm, Role-Playing, Simulation, Sports, Shooter, Strategy, and Traditional. The results reveal the current popularity of these video game genres, illustrate how it has changed over time and how it is distributed geographically, and highlight the most impactful papers referring to the respective genres and their topics. They also provide a number of footholds for future studies, including the identified disparities between the research interest in some genres and the number of available games belonging to them, the fluctuations in the relative popularity of the respective genres, and the disparities in the share of research output dedicated to video game genres in the total research output of different countries.

17 pages, 275 KB

Open Access Article

The Dark Side of Things: Praxis of Curiosity in La silva curiosa (Julián de Medrano 1583)

by Mercedes Alcalá Galán

Humanities 2025, 14(5), 100; https://doi.org/10.3390/h14050100 - 28 Apr 2025

Viewed by 1424

Abstract

Curiosity lies at the heart of the sixteenth-century miscellany books, which served as precursors to the essay genre. Among them, a truly exceptional piece stands out: La silva curiosa by Julián de Medrano, published in 1583. This work pushes the boundaries of curiosity to such an extent that it challenges its classification within the genre of miscellany owing to its unconventional and strange nature. Julián de Medrano, the author of this outlandish work, transforms himself into a character and protagonist, defining himself as an “extremely curious” individual. During his extensive travels, he curates a collection of “curious” epitaphs associated with often comical and peculiar deaths, spanning Latin, Spanish, Portuguese, French, Galician, and Italian. In addition to this, La silva curiosa includes an autobiographical narrative, a precursor to the Gothic genre, in which Medrano recounts unsettling encounters with black magic. This work offers a multifaceted exploration of curiosity, taking it to the extreme by narrating the author’s life experiences driven by a relentless pursuit of the curious, which is synonymous with the bizarre, extraordinary, marvelous, and unexpected. La silva curiosa emerges from a time marked by an almost nihilistic void, as the full force of the Baroque era has not yet arrived, and the ideals of humanism are fading away. It stands as a unique document that unveils an unexpected facet of the concept of curiosity within Spanish Renaissance culture.

(This article belongs to the Special Issue Curiosity and Modernity in Early Modern Spain)

11 pages, 877 KB

Open AccessArticle

Beyond Spectrograms: Rethinking Audio Classification from EnCodec’s Latent Space

by Jorge Perianez-Pascual, Juan D. Gutiérrez, Laura Escobar-Encinas, Álvaro Rubio-Largo and Roberto Rodriguez-Echeverria

Algorithms 2025, 18(2), 108; https://doi.org/10.3390/a18020108 - 16 Feb 2025

Cited by 2 | Viewed by 5626

Abstract

This paper presents a novel approach to audio classification leveraging the latent representation generated by Meta’s EnCodec neural audio codec. We hypothesize that the compressed latent space representation captures essential audio features more suitable for classification tasks than the traditional spectrogram-based approaches. To validate this, we train a vanilla convolutional neural network for music genre, speech/music, and environmental sound classification using EnCodec’s encoder output as input. We then compare its performance with that of the same network trained on a spectrogram-based representation. Our experiments demonstrate that this approach achieves accuracy comparable to state-of-the-art methods while exhibiting significantly faster convergence and reduced computational load during training. These findings suggest the potential of EnCodec’s latent representation for efficient, faster, and less expensive audio classification applications. We analyze the characteristics of EnCodec’s output and compare its performance against traditional spectrogram-based approaches, providing insights into this novel approach’s advantages.

(This article belongs to the Special Issue Artificial Intelligence Algorithms for Prediction, Control, Classification, Regression, and Intelligent Signal Processing in Industry)

27 pages, 10096 KB

Open AccessArticle

Comparative Analysis of Conventional CNN v’s ImageNet Pretrained ResNet in Medical Image Classification

by Christos Raptis, Efstratios Karavasilis, George Anastasopoulos and Adam Adamopoulos

Information 2024, 15(12), 806; https://doi.org/10.3390/info15120806 - 14 Dec 2024

Cited by 10 | Viewed by 4992

Abstract

Convolutional Neural Networks (CNNs) are the prevalent technology in computer vision and have become increasingly popular for medical imaging data classification and analysis. In this field, due to the scarcity of medical data, ResNets pretrained on ImageNet can be considered a suitable first approach. This paper examines the medical imaging classification accuracy of conventional basic custom CNNs compared to ImageNet pretrained ResNets on various medical datasets in an effort to give more information about the importance of medical data and its preprocessing techniques for disease studies. Microscope-extracted cytological images were examined along with chest X-rays, MRI brain scans, and melanoma photographs. The medical images were examined in various sets, class combinations, and resolutions. Augmented image datasets and asymmetrical training and validation splits among the classes were also examined. Models were developed, tested, and fine-tuned with respect to their network size, parameter values and network methods, image resolution, dataset size, and the number and type of classes. Overfitting was also examined, and comparative studies regarding the computational cost of different models were performed. The models achieved high accuracy in image classification, which varies depending on the dataset, and can be easily incorporated into future over-the-internet medical decision-supporting (telemedicine) environments. In addition, the conventional basic custom CNNs appeared to outperform the ImageNet pretrained ResNets. The obtained results indicate the importance of utilizing medical image data as a testbed for improvements in CNN classification performance and the possibility of using CNNs and data preprocessing techniques for disease studies.

(This article belongs to the Special Issue Deep Learning in Medical Image Analysis: Foundations, Techniques, and Applications)

Search Results (66)
