Data-Centric Artificial Intelligence: New Methods for Data Processing, 2nd Edition

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 October 2026 | Viewed by 3750

Special Issue Editor


Dr. Dawid Ewald
Guest Editor
Department of Intelligent Systems, Faculty of Telecommunications, Computer Science and Electrical Engineering, Bydgoszcz University of Science and Technology, 85-796 Bydgoszcz, Poland
Interests: bee algorithms; fuzzy logic; artificial neural networks and their applications; language models; generative AI

Special Issue Information

Dear Colleagues,

Data-centric artificial intelligence is developing rapidly thanks to advances in machine learning, natural language processing, and data visualization. These modern AI techniques enable a better understanding and processing of huge datasets. They provide companies and scientists with tools for extracting hidden patterns, discovering new knowledge, and automating complex analytical processes. In this Special Issue, we present examples of applications of these AI methods for solving real business and scientific problems.

We would like to invite you to submit a paper to our Special Issue of Electronics dedicated to data-centric artificial intelligence. This Special Issue will focus on the following topics:

  1. New methods and techniques for processing large datasets;
  2. Machine learning, natural language processing, and data visualization;
  3. Practical applications of these methods in various fields.

This Special Issue will supplement the existing literature by focusing on the latest trends and solutions in this area.

Dr. Dawid Ewald
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • machine learning
  • data processing
  • data visualization
  • natural language processing
  • fuzzy logic

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information can be found in MDPI's Special Issue policies.

Published Papers (3 papers)

Research

18 pages, 2515 KB  
Article
A Vision Transformer Model with Hyperparameter Optimization for Oral Cancer Image Classification
by Chun-Tai Huang, Ying-Lei Lin, Chung-Hui Lin and Ping-Feng Pai
Electronics 2026, 15(10), 2230; https://doi.org/10.3390/electronics15102230 - 21 May 2026
Viewed by 210
Abstract
Oral cancer is a significant public health concern and is among the most common malignant tumors of the head and neck. Its incidence and mortality rates remain persistently high, especially in regions where smoking and betel nut chewing are prevalent. Due to its high mortality rate, early detection is crucial for improving patient outcomes. However, early symptoms of oral cancer often resemble benign oral lesions, leading to delayed diagnosis. In this study, a vision transformer (ViT) model with Optuna (ViTOPT) is employed to classify oral cancer images. Optuna is used to determine the hyperparameters of the ViT. Histological images are obtained from a publicly available dataset. Three classification tasks are performed on the histological images: classifying oral squamous cell carcinoma (OSCC) and leukoplakia (LEUK), classifying the presence of dysplasia, and classifying OSCC and leukoplakia with or without dysplasia. Numerical results indicate that the proposed ViTOPT framework provides satisfactory performance in oral cancer recognition. Thus, the proposed ViTOPT model is a feasible and effective alternative for identifying oral cancer.
37 pages, 883 KB  
Article
Data-Centric AI Manifesto: How Data Quality Drives Modern AI
by Donato Malerba, Antonella Poggi, Mario Alviano, Tommaso Boccali, Maria Teresa Camerlingo, Roberto Maria Delfino, Domenico Diacono, Domenico Elia, Vincenzo Pasquadibisceglie, Mara Sangiovanni, Vincenzo Spinoso and Gioacchino Vino
Electronics 2026, 15(9), 1913; https://doi.org/10.3390/electronics15091913 - 1 May 2026
Viewed by 1080
Abstract
Artificial Intelligence (AI) has traditionally been developed according to a model-centric paradigm, in which progress is driven by increasingly sophisticated learning architectures applied to largely fixed datasets. However, this paradigm exhibits well-known limitations, including sensitivity to label noise, distribution shifts, adversarial perturbations, and limited transparency and reproducibility. These issues indicate that many of the current bottlenecks of AI systems arise from deficiencies in data rather than from model design. In this paper, we adopt and formalize the Data-Centric Artificial Intelligence (DCAI) paradigm, which places data quality, semantic consistency, and representativeness at the core of the AI lifecycle. From this perspective, performance, robustness, interpretability, and regulatory compliance are primarily achieved through systematic data engineering, including data curation, enrichment, validation, and continuous monitoring, rather than through repeated model re-engineering. The contributions of this work are threefold. First, a conceptual framework is provided to clarify the epistemic and methodological foundations of DCAI and distinguish it from traditional model-centric approaches. Second, a data-centric lifecycle is presented, covering training data development, inference data design, and data maintenance and integrating techniques such as semantic data representation, active learning, synthetic data generation, and drift-aware quality control. Third, the role of DCAI in the context of Generative AI is analyzed, showing how data-centric practices are essential to ensure robustness, accountability, and responsible deployment of large-scale generative models. Overall, this work positions DCAI as a coherent methodological and technological framework for the development of trustworthy, resilient, and sustainable AI systems, making a research contribution and providing a reference model for industrial and regulatory contexts. 
28 pages, 4737 KB  
Article
Comparative Evaluation of Perceptual Hashing and Deep Embedding Methods for Robust and Efficient Image Deduplication
by Md Firoz Mahmud, Zerin Nusrat and W. David Pan
Electronics 2026, 15(7), 1493; https://doi.org/10.3390/electronics15071493 - 2 Apr 2026
Viewed by 2184
Abstract
The rapid growth in large-scale image repositories over the past few years has made exact and near-duplicate images increasingly common, creating substantial redundancy that wastes storage resources and reduces retrieval efficiency in practical systems. Even though perceptual hashing and deep learning are promising deduplication strategies, the lack of standardized benchmarks complicates direct comparison. In this study, we conduct a unified, controlled evaluation of five commonly used methods, including four classical perceptual hashes (AHash, DHash, PHash, and WHash) and a CNN-based embedding model. We evaluate all methods on the UKBench and Amazon Berkeley Objects datasets using identical preprocessing, thresholds, and metrics, which include exact duplicates, near-duplicates, and geometrically transformed duplicates. Our experiments highlight a clear trade-off between speed and robustness. Hashing methods are computationally efficient and effective for exact matches, but perform poorly on near-duplicates and under geometric transformations, whereas the CNN model is significantly more robust across all duplicate types, but comes at a high computational cost. Based on these results, we outline practical recommendations for selecting deduplication strategies in large-scale applications. In addition, our evaluation setup serves as a reproducible baseline for future research in image similarity and large-scale deduplication.
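
For readers unfamiliar with the hashing methods named in this abstract, the following is a minimal sketch of two of them, average hashing (AHash) and difference hashing (DHash), compared by Hamming distance. It is not the authors' implementation. It assumes the image has already been reduced to a small grayscale grid (real pipelines typically resize to about 8x8 pixels first), and the names `Grid`, `average_hash`, `difference_hash`, `hamming_distance`, and the sample grids are hypothetical and chosen only for illustration.

```python
from typing import List

# Hypothetical type for this sketch: rows of grayscale values in 0..255.
Grid = List[List[int]]


def average_hash(pixels: Grid) -> int:
    """AHash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def difference_hash(pixels: Grid) -> int:
    """DHash: one bit per horizontal neighbor pair, set when the left pixel is brighter."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# Toy 2x3 "images" (assumed data, far smaller than a real resized image).
original = [[10, 20, 200], [220, 40, 30]]
brighter = [[v + 5 for v in row] for row in original]   # uniform brightness shift
mirrored = [list(reversed(row)) for row in original]    # horizontal flip

# A small distance suggests a duplicate; a large one suggests different images.
shift_distance = hamming_distance(average_hash(original), average_hash(brighter))
flip_distance = hamming_distance(average_hash(original), average_hash(mirrored))
```

In this toy case the uniform brightness shift leaves the hash unchanged, while the horizontal flip changes most of its bits, which is consistent with the abstract's observation that hashing methods struggle under geometric transformations. The duplicate threshold applied to the distance is a tuning choice and is not specified here.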