Advanced Software and Machine Learning Techniques for System Architectures and Big Data

A special issue of Big Data and Cognitive Computing (ISSN 2504-2289). This special issue belongs to the section "Big Data".

Deadline for manuscript submissions: 1 July 2026 | Viewed by 6270

Special Issue Editors

School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China.
Interests: anomaly detection; time series analysis; deep generative networks; computer networks

Guest Editor
Department of Information Engineering (DII), Polytechnic University of Marche, Via Brecce Bianche 12, 60121 Ancona, Italy
Interests: social and complex network analysis; data mining and data science; Internet of Things; logic programming and methods for coupling inductive and deductive reasoning; advanced algorithms for sequence comparison

Special Issue Information

Dear Colleagues,

In recent years, the explosive growth of data and the increasing complexity of computing systems have posed new challenges to traditional system architectures. At the same time, significant advances in software engineering and machine learning have opened up new opportunities for building intelligent, adaptive, and efficient systems. Leveraging artificial intelligence to enhance system design, monitoring, optimization, and scalability has become a critical research direction, especially in the context of big data and distributed environments. This rapidly evolving interdisciplinary field plays a pivotal role in the development of next-generation computing infrastructures, from cloud and edge computing to autonomous systems and intelligent analytics platforms.

This Special Issue aims to bring together original research and comprehensive reviews that explore the convergence of advanced software methodologies and machine learning techniques in the context of system architectures and big data. The Issue emphasizes system-level innovation—how software and AI/ML can enhance architectural design, automate resource management, optimize performance, and support large-scale deployment, especially in big data and distributed settings. Contributions that bridge practical system design with intelligent algorithmic techniques—providing solutions that are not only theoretically novel but also applicable to real-world computing infrastructures such as cloud, edge, and hybrid environments—are especially encouraged. 

In this Special Issue, original research articles and reviews are welcome. Research areas may include (but are not limited to) the following:

  • Machine learning techniques for optimizing system architecture design;
  • Advanced software engineering methods in big data environments;
  • Intelligent data processing and analytics frameworks;
  • Automated and adaptive resource management in distributed systems;
  • Integration of AI/ML with cloud and edge computing infrastructures;
  • Security, privacy, and reliability in intelligent system architectures;
  • Real-world applications of smart system designs in industry and society;
  • Hybrid models combining forecasting, anomaly detection, and automated response mechanisms within intelligent architectures;
  • Time series analysis, modeling, and forecasting in dynamic environments;
  • Machine learning approaches for anomaly detection in system logs, network traffic, or operational metrics.

We look forward to receiving your contributions. 

Dr. Yan Qiao
Dr. Francesco Cauteruccio
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Big Data and Cognitive Computing is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • system architectures
  • big data
  • software engineering
  • artificial intelligence (AI)
  • distributed systems
  • cloud and edge computing
  • anomaly detection
  • adaptive systems
  • intelligent analytics

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)

Research

17 pages, 2120 KB  
Article
Reliability of LLM Inference Engines from a Static Perspective: Root Cause Analysis and Repair Suggestion via Natural Language Reports
by Hongwei Li and Yongjun Wang
Big Data Cogn. Comput. 2026, 10(2), 60; https://doi.org/10.3390/bdcc10020060 - 13 Feb 2026
Viewed by 770
Abstract
Large Language Model (LLM) inference engines are becoming critical system infrastructure, yet their increasing architectural complexity makes defects difficult to diagnose and repair. Existing reliability studies predominantly focus on model behavior or training frameworks, leaving inference engine bugs underexplored, especially in settings where execution-based debugging is impractical. We present a static, issue-centric approach for automated root cause analysis and repair suggestion generation for LLM inference engines. Based solely on issue reports and developer discussions, we construct a real-world defect dataset and annotate each issue with a semantic root cause category and affected system module. Leveraging text-based representations, our framework performs root cause classification and coarse-grained module localization without requiring code execution or specialized runtime environments. We further integrate structured repair patterns with a large language model to generate interpretable and actionable repair suggestions. Experiments on real-world vLLM issues demonstrate that our approach achieves effective root cause identification and module localization under limited and imbalanced data. A cross-engine evaluation further shows promising generalization to TensorRT-LLM. Human evaluation confirms that the generated repair suggestions are correct, useful, and clearly expressed. Our results indicate that static, issue-level analysis is a viable foundation for scalable debugging assistance in LLM inference engines. This work highlights the feasibility of static, issue-level defect analysis for complex LLM inference engines and explores automated debugging assistance techniques. The dataset and implementation will be publicly released to facilitate future research.

41 pages, 5751 KB  
Article
Efficient Scheduling for GPU-Based Neural Network Training via Hybrid Reinforcement Learning and Metaheuristic Optimization
by Nana Du, Chase Wu, Aiqin Hou, Weike Nie and Ruiqi Song
Big Data Cogn. Comput. 2025, 9(11), 284; https://doi.org/10.3390/bdcc9110284 - 10 Nov 2025
Viewed by 2479
Abstract
On GPU-based clusters, the training workloads of machine learning (ML) models, particularly neural networks (NNs), are often structured as Directed Acyclic Graphs (DAGs) and typically deployed for parallel execution across heterogeneous GPU resources. Efficient scheduling of these workloads is crucial for optimizing performance metrics such as execution time under various constraints, including GPU heterogeneity, network capacity, and data dependencies. DAG-structured ML workload scheduling can be modeled as a Nonlinear Integer Program (NIP) and is shown to be NP-complete. By leveraging a positive correlation between Scheduling Plan Distance (SPD) and Finish Time Gap (FTG) identified through an empirical study, we propose a Running Time Gap Strategy for scheduling based on the Whale Optimization Algorithm (WOA) and Reinforcement Learning, referred to as WORL-RTGS. The proposed method integrates the global search capabilities of WOA with the adaptive decision-making of Double Deep Q-Networks (DDQN). In particular, we derive a novel function to generate effective scheduling plans using DDQN, enhancing adaptability to complex DAG structures. Comprehensive evaluations on practical ML workload traces collected from Alibaba on simulated GPU-enabled platforms demonstrate that WORL-RTGS significantly improves WOA’s stability for DAG-structured ML workload scheduling and reduces completion time by up to 66.56% compared with five state-of-the-art scheduling algorithms.

26 pages, 1895 KB  
Article
A Pattern-Based Framework for Automated Migration of Monolithic Applications to Microservices
by Hossam Hassan, Manal A. Abdel-Fattah and Wael Mohamed
Big Data Cogn. Comput. 2025, 9(10), 253; https://doi.org/10.3390/bdcc9100253 - 6 Oct 2025
Cited by 2 | Viewed by 2285
Abstract
Over the past decade, many software enterprises have migrated from monolithic to microservice architectures to enhance scalability, maintainability, and performance. However, this transition presents significant challenges, requiring considerable development effort, research, customization, and resource allocation over extended periods. Furthermore, the success of migration is not guaranteed, highlighting the complexities organizations face in modernizing their software systems. To address these challenges, this study introduces Mono2Micro, a comprehensive framework designed to automate the migration process while preserving structural integrity and optimizing service boundaries. The framework focuses on three core patterns: database patterns, service decomposition, and communication patterns. It leverages machine learning algorithms, including Random Forest and Louvain clustering, to analyze database query patterns along with static and dynamic database model analysis. This analysis identifies relationships between models and supports the systematic decomposition of microservices while ensuring efficient inter-service communication. To validate its effectiveness, Mono2Micro was applied to a student information system for faculty management, demonstrating its ability to streamline the migration process while maintaining functional integrity. The proposed framework offers a systematic and scalable solution for organizations and researchers seeking efficient migration from monolithic systems to microservices.