Artificial Intelligence and Data Science, 2nd Edition

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "E1: Mathematics and Computer Science".

Deadline for manuscript submissions: 30 April 2026

Special Issue Editors

Guest Editor
Dr. Shuo Yu
School of Computer Science and Technology, Dalian University of Technology, Dalian 116078, China
Interests: data science; network science; knowledge science; anomaly detection

Guest Editor
Prof. Dr. Feng Xia
School of Computing Technologies, RMIT University, Melbourne, VIC 3000, Australia
Interests: data science; artificial intelligence; graph learning; anomaly detection; systems engineering

Special Issue Information

Dear Colleagues,

Building on the success of the first edition, we are pleased to announce this Special Issue, "Artificial Intelligence and Data Science, 2nd Edition".

Data science serves as the core theory and methodology for extracting valuable insights from data. The rapid evolution of artificial intelligence (AI) technologies has significantly expanded and enriched the field of data science, driving transformative impacts across domains including cybersecurity, healthcare, fraud detection, and transportation. By integrating advanced AI methodologies with data science, researchers and practitioners have developed hybrid approaches that enable the seamless transition from data to information, knowledge, and actionable decisions.

Despite the remarkable progress in big data and AI technologies, the theoretical frameworks and technical mechanisms that underpin their success remain in a nascent stage. Isolated advancements in either AI or data science are insufficient to sustain the growth of intelligent, data-driven applications. Therefore, a deeper exploration of the fundamental theories and interdisciplinary approaches is urgently needed to propel both fields forward and unlock their full potential in addressing real-world challenges.

This Special Issue invites submissions that aim to address the following critical questions:

  • How can interdisciplinary approaches break the barriers between methodologies and theories to further advance AI and data science?
  • What will the new paradigms of AI and data science look like?
  • How can AI and data science technologies achieve greater impact in practical applications?

We welcome original research articles and reviews that explore innovative theories, methodologies, and applications at the intersection of AI and data science. Topics of interest include, but are not limited to, the following:

  • Knowledge-driven AI technologies;
  • Advanced deep learning approaches, such as fairness-aware learning;
  • Security, trust, and privacy in AI and data science;
  • Few-shot, one-shot, and zero-shot learning methodologies;
  • Data governance strategies and frameworks;
  • Intelligent computing paradigms, including automated machine learning (AutoML) and lifelong learning;
  • Applications in critical domains, such as anomaly detection;
  • Complexity theory and its implications for AI and data science;
  • High-performance computing for large-scale AI models;
  • Big data technologies and their applications;
  • Data analytics and visualization techniques;
  • Real-world applications, including healthcare, transportation, and beyond.

We look forward to your contributions and encourage you to submit your high-quality work to this Special Issue. Together, let us continue to push the boundaries of AI and data science, fostering innovation and real-world impact.

Dr. Shuo Yu
Prof. Dr. Feng Xia
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • intelligent computing, such as automated machine learning (AutoML) and lifelong learning
  • complexity theory
  • high-performance computing
  • big data technologies and applications

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found on the MDPI website.

Published Papers (6 papers)


Research

34 pages, 3112 KB  
Article
Artificial Intelligence Applied to Soil Compaction Control for the Light Dynamic Penetrometer Method
by Jorge Rojas-Vivanco, José García, Gabriel Villavicencio, Miguel Benz, Antonio Herrera, Pierre Breul, German Varas, Paola Moraga, Jose Gornall and Hernan Pinto
Mathematics 2025, 13(21), 3359; https://doi.org/10.3390/math13213359 - 22 Oct 2025
Abstract
Compaction quality control in earthworks and pavements still relies mainly on density-based acceptance referenced to laboratory Proctor tests, which are costly, time-consuming, and spatially sparse. The lightweight dynamic cone penetrometer (LDCP) provides rapid indices, such as q_d0 and q_d1, yet acceptance thresholds commonly depend on ad hoc, site-specific calibrations. This study develops and validates a supervised machine learning framework that estimates q_d0, q_d1, and Z_c directly from readily available soil descriptors (gradation, plasticity/activity, moisture/state variables, and GTR class) using a multi-campaign dataset of n = 360 observations. While the framework does not remove the need for the standard soil characterization performed during design (e.g., W, γ_d,field, and RC_SPC), it reduces reliance on additional LDCP calibration campaigns to obtain device-specific reference curves. Models compared under a unified pipeline include regularized linear baselines, support vector regression, Random Forest, XGBoost, and a compact multilayer perceptron (MLP). The evaluation used a fixed 80/20 train–test split with 5-fold cross-validation on the training set and multiple error metrics (R², RMSE, MAE, and MAPE). Interpretability combined SHAP with permutation importance, 1D partial dependence (PDP), and accumulated local effects (ALE); calibration diagnostics and split-conformal prediction intervals connected the predictions to QA/QC decisions. A naïve GTR-average baseline was added for reference. Computation was lightweight. On the test set, the MLP attained the best accuracy for q_d1 (R² = 0.794, RMSE = 5.866), with XGBoost close behind (R² = 0.773, RMSE = 6.155). Paired bootstrap contrasts with Holm correction indicated that the MLP–XGBoost difference was not statistically significant. Explanations consistently highlighted density- and moisture-related variables (γ_d,field, RC_SPC, and W) as dominant, with gradation/plasticity contributing second-order adjustments; these attributions are model-based and associational rather than causal. The results support interpretable, computationally efficient surrogates of LDCP indices that can complement density-based acceptance and enable risk-aware QA/QC via conformal prediction intervals.
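
For readers unfamiliar with split-conformal intervals, the short sketch below illustrates the general technique referenced in the abstract. It uses synthetic data and an off-the-shelf gradient-boosting regressor, not the authors' dataset, models, or pipeline; the coverage level and feature dimensions are assumptions.

    # Minimal sketch of split-conformal prediction intervals for a regression
    # surrogate, in the spirit of the QA/QC use described above. Data, model
    # choice, and coverage level are placeholders, not the authors' setup.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(360, 8))                        # stand-in for soil descriptors
    y = X[:, 0] * 5 + rng.normal(scale=2, size=360)      # stand-in for an LDCP index

    # Split-conformal requires a held-out calibration set next to the training set.
    X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.25, random_state=0)
    model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

    # Nonconformity scores on the calibration set: absolute residuals.
    scores = np.abs(y_cal - model.predict(X_cal))
    alpha = 0.1                                          # target 90% coverage
    n = len(scores)
    q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")  # NumPy >= 1.22

    # Prediction interval for a new observation.
    x_new = rng.normal(size=(1, 8))
    y_hat = model.predict(x_new)[0]
    print(f"point estimate {y_hat:.2f}, 90% interval [{y_hat - q:.2f}, {y_hat + q:.2f}]")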

15 pages, 2373 KB  
Article
LLM-Empowered Kolmogorov-Arnold Frequency Learning for Time Series Forecasting in Power Systems
by Zheng Yang, Yang Yu, Shanshan Lin and Yue Zhang
Mathematics 2025, 13(19), 3149; https://doi.org/10.3390/math13193149 - 2 Oct 2025
Abstract
With the rapid evolution of artificial intelligence technologies in power systems, data-driven time-series forecasting has become instrumental in enhancing their stability and reliability, allowing operators to anticipate demand fluctuations and optimize energy distribution. Despite the notable progress made by current methods, they are still hindered by two major limitations: most existing models are relatively small in architecture, failing to fully leverage the potential of large-scale models, and they are based on fixed nonlinear mapping functions that cannot adequately capture complex patterns, leading to information loss. To this end, an LLM-empowered Kolmogorov–Arnold frequency learning (LKFL) framework is proposed for time-series forecasting in power systems, which consists of LLM-based prompt representation learning, KAN-based frequency representation learning, and entropy-oriented cross-modal fusion. Specifically, LKFL first transforms multivariable time-series data into text prompts and leverages a pre-trained LLM to extract semantically rich prompt representations. It then applies the Fast Fourier Transform to convert the time-series data into the frequency domain and employs Kolmogorov–Arnold networks (KANs) to capture multi-scale periodic structures and complex frequency characteristics. Finally, LKFL integrates the prompt and frequency representations through an entropy-oriented cross-modal fusion strategy, which minimizes the semantic gap between the modalities and ensures full integration of complementary information. This comprehensive approach enables LKFL to achieve superior forecasting performance in power systems. Extensive evaluations on five benchmarks verify that LKFL sets a new standard for time-series forecasting in power systems compared with baseline methods.
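
The frequency-domain step described above can be illustrated in a few lines. The sketch below applies the Fast Fourier Transform to a synthetic load series and reports its dominant periods; it is a generic illustration only, and the LLM prompt branch, the KAN networks, and the fusion strategy of LKFL are not reproduced.

    # Minimal sketch: move a (synthetic) hourly load series into the frequency
    # domain with the FFT and keep the strongest components as features.
    import numpy as np

    rng = np.random.default_rng(1)
    t = np.arange(24 * 14)                                   # two weeks of hourly samples
    series = (10 * np.sin(2 * np.pi * t / 24)                # daily cycle
              + 3 * np.sin(2 * np.pi * t / (24 * 7))         # weekly cycle
              + rng.normal(scale=0.5, size=t.size))          # noise

    spectrum = np.fft.rfft(series)                           # one-sided FFT of a real signal
    freqs = np.fft.rfftfreq(series.size, d=1.0)              # cycles per hour
    amplitude = np.abs(spectrum) / series.size

    top_k = np.argsort(amplitude[1:])[::-1][:3] + 1          # skip the DC component
    for idx in top_k:
        print(f"period ~ {1.0 / freqs[idx]:.1f} h, amplitude {amplitude[idx]:.2f}")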

15 pages, 1839 KB  
Article
Cluster Complementarity and Consistency Mining for Multi-View Representation Learning
by Yanyan Wen and Haifeng Li
Mathematics 2025, 13(15), 2521; https://doi.org/10.3390/math13152521 - 5 Aug 2025
Abstract
With the advent of the big data era, multi-view clustering (MVC) methods have attracted considerable attention for their ability to handle the multifaceted nature of data, achieving impressive results across various fields. However, two significant challenges persist in MVC methods: (1) they learn view-invariant information about samples to bridge the heterogeneity gap between views, which may discard view-specific information that contributes to pattern mining; and (2) they rely on fusion strategies that are sensitive to the discriminability of individual views, i.e., concatenation or weighted fusion of cross-view representations, making it difficult to guarantee the semantic robustness of the fused representations. To this end, a simple yet effective cluster complementarity and consistency learning framework (CommonMVC) is proposed for mining patterns in multi-view data. Specifically, cluster complementarity learning is devised to endow the fused representations with discriminative information by nonlinearly aggregating view-specific information. Meanwhile, cluster consistency learning is introduced by modeling instance-level and cluster-level partition invariance to coordinate the clustering partitions of the various views, ensuring the robustness of multi-view pattern mining. The seamless collaboration between the two components effectively enhances multi-view clustering performance. Finally, comprehensive experiments on four real-world datasets demonstrate that CommonMVC establishes a new state-of-the-art baseline for the MVC task.
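
To make the complementarity and consistency vocabulary concrete, the sketch below clusters two synthetic "views" of the same samples, measures how well their partitions agree, and fuses the views with a simple nonlinearity. It illustrates the general idea only and is not the CommonMVC objective or implementation; the data and fusion choice are assumptions.

    # Minimal sketch: cluster-level consistency across views and a simple
    # nonlinear fusion of view-specific features (complementarity idea).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import adjusted_rand_score

    # Two synthetic views of the same samples: one latent cluster structure
    # observed through two different random linear maps.
    X, y = make_blobs(n_samples=300, centers=4, n_features=6, random_state=0)
    rng = np.random.default_rng(0)
    view1 = X @ rng.normal(size=(6, 5))
    view2 = X @ rng.normal(size=(6, 8))

    labels1 = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(view1)
    labels2 = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(view2)

    # High agreement indicates cluster-level consistency between the views.
    print("cross-view partition agreement (ARI):", adjusted_rand_score(labels1, labels2))

    # Nonlinear aggregation of view-specific features before clustering.
    fused = np.tanh(np.concatenate([view1, view2], axis=1))
    labels_fused = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(fused)
    print("fused partition vs. ground truth (ARI):", adjusted_rand_score(labels_fused, y))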

33 pages, 7261 KB  
Article
Comparative Analysis of Explainable AI Methods for Manufacturing Defect Prediction: A Mathematical Perspective
by Gabriel Marín Díaz
Mathematics 2025, 13(15), 2436; https://doi.org/10.3390/math13152436 - 29 Jul 2025
Abstract
The increasing complexity of manufacturing processes demands accurate defect prediction and interpretable insights into the causes of quality issues. This study proposes a methodology integrating machine learning, clustering, and Explainable Artificial Intelligence (XAI) to support defect analysis and quality control in industrial environments. Using a dataset based on empirical industrial distributions, we train an XGBoost model to classify high- and low-defect scenarios from multidimensional production and quality metrics. The model demonstrates high predictive performance and is analyzed using five XAI techniques (SHAP, LIME, ELI5, PDP, and ICE) to identify the most influential variables linked to defective outcomes. In parallel, we apply Fuzzy C-Means and K-means to segment production data into latent operational profiles, which are also interpreted using XAI to uncover process-level patterns. This approach provides both global and local interpretability, revealing consistent variables across predictive and structural perspectives. A thorough literature review found no prior studies that combine supervised learning, unsupervised clustering, and XAI within a unified framework for manufacturing defect analysis. The results demonstrate that this integration enables a transparent, data-driven understanding of production dynamics. The proposed hybrid approach supports the development of intelligent, explainable Industry 4.0 systems.
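
For readers who want to experiment with the same class of tools, the sketch below trains an XGBoost classifier on synthetic data and inspects it with SHAP and a partial dependence/ICE view. The feature names and data are hypothetical placeholders, not the industrial dataset or the full XAI suite used in the paper.

    # Minimal sketch: XGBoost defect classifier explained with SHAP and PDP/ICE.
    import numpy as np
    import pandas as pd
    import shap
    import xgboost as xgb
    from sklearn.inspection import PartialDependenceDisplay
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 1000
    df = pd.DataFrame({
        "production_volume": rng.normal(1000, 150, n),   # hypothetical process metrics
        "defect_rate": rng.normal(2.0, 0.6, n),
        "maintenance_hours": rng.normal(12, 4, n),
    })
    # Hypothetical binary target: "high-defect" scenarios.
    y = (df["defect_rate"] + 0.01 * df["maintenance_hours"]
         + rng.normal(0, 0.3, n) > 2.2).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(df, y, test_size=0.2, random_state=0)
    model = xgb.XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss").fit(X_tr, y_tr)

    # Global attributions via SHAP (mean absolute value per feature) ...
    shap_values = shap.TreeExplainer(model).shap_values(X_te)
    print(dict(zip(df.columns, np.abs(shap_values).mean(axis=0).round(3))))

    # ... and a partial dependence / ICE view of one driver.
    PartialDependenceDisplay.from_estimator(model, X_te, ["defect_rate"], kind="both")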

16 pages, 3735 KB  
Article
A Novel Trustworthy Toxic Text Detection Method with Entropy-Oriented Invariant Representation Learning for Portuguese Community
by Wenting Fan, Haoyan Song and Jun Zhang
Mathematics 2025, 13(13), 2136; https://doi.org/10.3390/math13132136 - 30 Jun 2025
Abstract
With the rapid development of digital technologies, data-driven methods have demonstrated commendable performance in the toxic text detection task. However, several challenges remain unresolved, including the inability to fully capture the nuanced semantic information embedded in text, the lack of robust mechanisms to handle the inherent uncertainty of language, and the reliance on static fusion strategies for multi-view information. To address these issues, this paper proposes a comprehensive and dynamic toxic text detection method. Specifically, we design a multi-view feature augmentation module that combines a bidirectional long short-term memory (BiLSTM) network and BERT in a dual-stream framework. This module captures a more holistic representation of semantic information by learning both local and global features of texts. Next, we introduce an entropy-oriented invariant learning module that minimizes the conditional entropy between view-specific representations to align consistent information, thereby enhancing representation generalization. Meanwhile, we devise a trustworthy text recognition module that uses the Dirichlet distribution to model the uncertainty of text predictions. We then apply an evidence-based information fusion strategy to dynamically aggregate decision information across views with the help of the Dirichlet distribution. Through these components, the proposed method aims to overcome the limitations of traditional methods and provide a more accurate and reliable solution for toxic language detection. Finally, extensive experiments on two real-world datasets show the effectiveness and superiority of the proposed method in comparison with seven existing methods.
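
The Dirichlet-based uncertainty idea can be illustrated independently of the text encoders. The sketch below follows the common evidential formulation (evidence e_k ≥ 0, α_k = e_k + 1, S = Σ α_k, belief b_k = e_k / S, uncertainty u = K / S) with made-up logits; it is a generic illustration, not the paper's trained model or exact formulation.

    # Minimal sketch of Dirichlet-based uncertainty for a two-class
    # (toxic / non-toxic) decision. Logits are invented for illustration;
    # the BiLSTM/BERT branches and the cross-view fusion are not reproduced.
    import numpy as np

    def dirichlet_decision(logits):
        evidence = np.log1p(np.exp(logits))       # softplus keeps evidence non-negative
        alpha = evidence + 1.0                    # Dirichlet concentration parameters
        S = alpha.sum()
        belief = evidence / S                     # per-class belief mass
        uncertainty = len(alpha) / S              # overall uncertainty in (0, 1]
        prob = alpha / S                          # expected class probabilities
        return prob, belief, uncertainty

    for logits in (np.array([4.0, -1.0]),         # confidently "toxic"
                   np.array([0.2, 0.1])):         # ambiguous text
        prob, belief, u = dirichlet_decision(logits)
        print(f"p(toxic)={prob[0]:.2f}  belief={belief.round(2)}  uncertainty={u:.2f}")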

36 pages, 9139 KB  
Article
On the Synergy of Optimizers and Activation Functions: A CNN Benchmarking Study
by Khuraman Aziz Sayın, Necla Kırcalı Gürsoy, Türkay Yolcu and Arif Gürsoy
Mathematics 2025, 13(13), 2088; https://doi.org/10.3390/math13132088 - 25 Jun 2025
Abstract
In this study, we present a comparative analysis of gradient descent-based optimizers frequently used in Convolutional Neural Networks (CNNs), including SGD, mSGD, RMSprop, Adadelta, Nadam, Adamax, Adam, and the recent EVE optimizer. To explore the interaction between optimization strategies and activation functions, we systematically evaluate all combinations of these optimizers with four activation functions (ReLU, LeakyReLU, Tanh, and GELU) across three benchmark image classification datasets: CIFAR-10, Fashion-MNIST (F-MNIST), and Labeled Faces in the Wild (LFW). Each configuration was assessed using multiple evaluation metrics, including accuracy, precision, recall, F1-score, mean absolute error (MAE), and mean squared error (MSE). All experiments were performed using k-fold cross-validation to ensure statistical robustness. Additionally, two-way ANOVA was employed to validate the significance of differences across optimizer–activation combinations. This study aims to highlight the importance of jointly selecting optimizers and activation functions to enhance training dynamics and generalization in CNNs. We also consider the role of critical hyperparameters, such as learning rate and regularization methods, in influencing optimization stability. This work provides valuable insights into the optimizer–activation interplay and offers practical guidance for improving architectural and hyperparameter configurations in CNN-based deep learning models.
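
The optimizer–activation grid at the core of such a benchmark can be set up compactly. The PyTorch sketch below builds a small CNN for each pairing and runs one illustrative update on a dummy batch; the architecture, learning rates, and data are assumptions, k-fold cross-validation and the ANOVA analysis are omitted, and the EVE optimizer is left out because it is not part of torch.optim.

    # Minimal sketch of an optimizer x activation grid for a small CNN (PyTorch).
    import torch
    import torch.nn as nn

    def make_cnn(activation: nn.Module, n_classes: int = 10) -> nn.Sequential:
        return nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), activation,
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), activation,
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(32 * 8 * 8, n_classes),      # assumes 32x32 inputs (e.g., CIFAR-10)
        )

    activations = {"ReLU": nn.ReLU(), "LeakyReLU": nn.LeakyReLU(),
                   "Tanh": nn.Tanh(), "GELU": nn.GELU()}
    optimizers = {"SGD": lambda p: torch.optim.SGD(p, lr=1e-2),
                  "mSGD": lambda p: torch.optim.SGD(p, lr=1e-2, momentum=0.9),
                  "RMSprop": lambda p: torch.optim.RMSprop(p, lr=1e-3),
                  "Adadelta": lambda p: torch.optim.Adadelta(p),
                  "Nadam": lambda p: torch.optim.NAdam(p, lr=1e-3),
                  "Adamax": lambda p: torch.optim.Adamax(p, lr=1e-3),
                  "Adam": lambda p: torch.optim.Adam(p, lr=1e-3)}

    x = torch.randn(8, 3, 32, 32)                  # dummy batch in place of a real loader
    y = torch.randint(0, 10, (8,))
    loss_fn = nn.CrossEntropyLoss()

    for act_name, act in activations.items():
        for opt_name, make_opt in optimizers.items():
            model = make_cnn(act)
            opt = make_opt(model.parameters())
            opt.zero_grad()
            loss = loss_fn(model(x), y)            # forward pass on the dummy batch
            loss.backward()
            opt.step()                             # one illustrative update per pairing
            print(f"{opt_name:>8} + {act_name:<9} first-batch loss: {loss.item():.3f}")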
