Information

Computer Vision, Pattern Recognition, and Machine Learning in Italy—Second Edition

Special Issue Information

Dear Colleagues,

Most modern technological innovations are made possible by recent advances in pattern recognition, machine learning, and computer vision.

The main aim of this Special Issue is to collect works from the Italian research community. These works should report the main theoretical improvements in the aforementioned research areas and their impact on different application contexts, such as video surveillance and biometrics, sports analysis, inspection, assistive and manufacturing technologies, smart agriculture, eHealth, environment monitoring, intelligent transportation and construction, and retail.

Dr. Marco Leo
Dr. Sara Colantonio
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.

Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.

Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.

External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.

Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (2 papers)

Research

30 pages, 7003 KB

Open Access Article

Facial Expression Recognition in Anime and Manga Characters: A Comparative Study of Vision Transformers and Convolutional Neural Networks

by Marco Parrillo, Elia Santoro, Luigi Laura and Valerio Rughetti

Information 2026, 17(5), 484; https://doi.org/10.3390/info17050484 - 15 May 2026

Viewed by 373

Abstract

Facial expression recognition (FER) is a well-established task in computer vision, yet its application to non-photorealistic domains, such as anime and manga, remains largely underexplored. The stylized, exaggerated, and often non-proportional facial features of illustrated characters present unique challenges for deep learning models trained predominantly on realistic imagery. In this work, we construct a balanced dataset of 3000 manga and anime face images spanning six emotion categories (Angry, Embarrassed, Happy, Manic–Euphoric, Sad, Scared) and conduct a systematic comparison of two major deep learning paradigms: Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). Specifically, we evaluate ResNet-18, ResNet-50, ViT-B/16, and ViT-S/16 under four fine-tuning strategies: linear probing, partial fine-tuning, full fine-tuning, and progressive unfreezing, enabling a controlled comparison of both architectural families and transfer learning depth. Our results show that fine-tuning strategy significantly impacts performance: the best configuration (ViT-B/16 with progressive unfreezing) achieves 81.33% test accuracy (single run, seed 42), compared to 61.33% for the weakest linear probe baseline (ViT-S/16), a gap of 20.00 percentage points. To isolate architectural differences from strategy effects, we note that under full fine-tuning, the only strategy applied identically to all four models, ViT-S/16 (76.00%) outperforms ResNet-18 (74.44%) by 1.56 percentage points and ViT-B/16 (74.22%) by 1.78 percentage points, confirming a modest but consistent architectural advantage for Transformers once backbone adaptation is permitted. Vision Transformers benefit disproportionately from fine-tuning, and the relative ranking of architectures changes across fine-tuning regimes. Confusion matrix analysis reveals persistent cross-class confusion between visually similar emotions (e.g., Happy vs. Embarrassed), while the highly distinctive Manic–Euphoric category is consistently well recognized across all architectures. To the best of our knowledge, this is the first work to conduct a controlled multi-architecture, multi-strategy transfer learning benchmark specifically for FER in anime and manga, revealing findings that are not predictable from photographic FER literature and that carry direct practical implications for model selection in non-photorealistic visual recognition tasks. The anime and manga domain provides a uniquely controlled testbed for studying transfer learning under deliberate stylization, where the domain gap from realistic imagery is not an artifact of image degradation or environmental noise but a principled artistic choice with codified visual conventions; observing that fine-tuning depth dominates architectural choice in this domain suggests the same conclusion likely holds in other non-photorealistic transfer scenarios such as medical illustrations, architectural drawings, and synthetic training data.

(This article belongs to the Special Issue Computer Vision, Pattern Recognition, and Machine Learning in Italy—Second Edition)
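
The abstract above names progressive unfreezing as the best-performing strategy but does not describe its schedule. The following is a minimal sketch of one common form of the idea, assuming the model is divided into ordered blocks and that one additional block is unfrozen every few epochs, starting from the classification head. The block names, the function `trainable_blocks`, and the parameter `epochs_per_stage` are hypothetical and are not taken from the paper.

```python
from typing import List


def trainable_blocks(epoch: int, blocks: List[str], epochs_per_stage: int = 2) -> List[str]:
    """Return the blocks that are unfrozen (trainable) at a given epoch.

    `blocks` is ordered from the input side to the output side, so the last
    entry is the classification head. Stage 0 trains only the head; each later
    stage unfreezes one more block, moving toward the input.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if epochs_per_stage < 1:
        raise ValueError("epochs_per_stage must be at least 1")
    stage = epoch // epochs_per_stage
    n_unfrozen = min(len(blocks), stage + 1)
    # Slice from the output side: the last n_unfrozen blocks are trainable.
    return blocks[len(blocks) - n_unfrozen:]


# Hypothetical four-part split of a backbone (assumption, for illustration only).
blocks = ["patch_embed", "encoder_lower", "encoder_upper", "head"]

for epoch in range(8):
    print(epoch, trainable_blocks(epoch, blocks))
```

In a real training loop, the returned names would typically be used to toggle gradient tracking on the matching parameter groups before each epoch. Under this sketch, linear probing corresponds to never leaving stage 0, and full fine-tuning corresponds to starting with every block unfrozen.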

21 pages, 2169 KB

Open Access Article

Enhancing Early Detection of Alzheimer’s Disease via Vision Transformer Machine Learning Architecture Using MRI Images

by Wided Hechkel, Marco Leo, Pierluigi Carcagnì, Marco Del-Coco and Abdelhamid Helali

Information 2026, 17(2), 163; https://doi.org/10.3390/info17020163 - 6 Feb 2026

Viewed by 1507

Abstract

Computer-aided diagnosis (CAD) systems based on deep learning have shown significant potential for Alzheimer’s disease (AD) stage classification from Magnetic Resonance Imaging (MRI). Nevertheless, challenges such as class imbalance, small sample sizes, and the presence of multiple slices per subject may lead to biased evaluation and statistically unreliable performance, particularly for minority classes. In this study, a Vision Transformer (ViT)-based framework is proposed for multi-class AD classification using a Kaggle dataset containing 6400 MRI slices across four cognitive stages. A subject-wise data-splitting strategy is employed to prevent information leakage between the training and testing sets, and the statistical unreliability of near-perfect scores in underrepresented classes is critically examined. An ablation study is conducted to assess the contribution of key architectural components, demonstrating the effectiveness of self-attention and patch embedding in capturing discriminative features. Furthermore, attention-based visualization maps are incorporated to highlight brain regions influencing the model’s decisions and to illustrate subtle anatomical differences between MildDemented and VeryMildDemented cases. The proposed approach achieves a test accuracy of 97.98%, outperforming existing methods on the same dataset while providing improved interpretability. It supports early and accurate AD stage identification.

(This article belongs to the Special Issue Computer Vision, Pattern Recognition, and Machine Learning in Italy—Second Edition)
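
The subject-wise splitting mentioned in the abstract above can be illustrated with a short sketch. It assumes each MRI slice carries a subject identifier, groups slice indices by subject, and assigns whole subjects to either the training or the test set so that no subject contributes slices to both. The function `subject_wise_split`, its parameters, and the toy subject IDs are hypothetical. The sketch does not stratify by class and is not the authors' actual procedure.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def subject_wise_split(
    slice_subjects: Sequence[str],
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Split slice indices into train/test so that no subject spans both sets.

    `slice_subjects[i]` is the subject ID of slice i (hypothetical metadata).
    """
    by_subject: Dict[str, List[int]] = defaultdict(list)
    for idx, subject in enumerate(slice_subjects):
        by_subject[subject].append(idx)

    subjects = sorted(by_subject)  # sort first so the shuffle is reproducible
    if len(subjects) < 2:
        raise ValueError("need at least two subjects to form a split")
    random.Random(seed).shuffle(subjects)

    # Hold out whole subjects, keeping at least one on each side.
    n_test = min(len(subjects) - 1, max(1, round(len(subjects) * test_fraction)))
    test_idx = sorted(i for s in subjects[:n_test] for i in by_subject[s])
    train_idx = sorted(i for s in subjects[n_test:] for i in by_subject[s])
    return train_idx, test_idx


# Toy example: ten slices from five subjects (made-up IDs).
slice_subjects = ["s1", "s1", "s2", "s2", "s2", "s3", "s4", "s4", "s5", "s5"]
train_idx, test_idx = subject_wise_split(slice_subjects, test_fraction=0.2, seed=0)
train_subjects = {slice_subjects[i] for i in train_idx}
test_subjects = {slice_subjects[i] for i in test_idx}
print(sorted(train_subjects), sorted(test_subjects))
```

A random split over individual slices could place neighbouring slices of the same brain in both sets, which is the leakage the abstract warns about. Splitting at the subject level avoids this at the cost of less control over the exact test-set size.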
