Topic Editors

Prof. Dr. Xujuan Zhou
School of Business, University of Southern Queensland, Springfield, QLD 4300, Australia
Prof. Dr. Yuefeng Li
School of Computer Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
Prof. Dr. Raj Gururajan
School of Business, University of Southern Queensland, Springfield, QLD 4300, Australia
Prof. Dr. Ji Zhang
School of Mathematics, Physics and Computing, University of Southern Queensland, Toowoomba, QLD 4350, Australia
Prof. Dr. Revathi Venkataraman
School of Computing, SRM Institute of Science and Technology, Chennai 603203, India

New Applications of Big Data Technology: Integration of Data Mining and Artificial Intelligence

Abstract submission deadline: 31 December 2025
Manuscript submission deadline: 31 March 2026

Topic Information

Dear Colleagues,

The landscape of data mining and machine learning is rapidly evolving, fuelled by advancements in algorithms, computational power, and the availability of vast datasets. This Topic will explore the latest trends and innovations shaping the future of these fields. Key areas of interest include, but are not limited to, deep learning architectures, reinforcement learning, unsupervised and semi-supervised learning techniques, federated learning, and the integration of machine learning with big data technologies.

We invite contributions that address novel approaches and methodologies, including improvements in model interpretability, the development of more efficient algorithms, and the application of machine learning in diverse domains such as healthcare, finance, engineering, materials science, and social networks. Special emphasis will be placed on emerging topics such as generative AI, explainable AI (XAI), edge AI, and the ethical implications of AI deployment. In the realm of data mining, we are particularly interested in new techniques for anomaly detection, pattern recognition, and predictive analytics. Papers exploring the convergence of data mining with AI technologies, such as using deep learning for feature extraction or leveraging generative models for data augmentation, are highly encouraged.

By bringing together cutting-edge research and practical applications, this Topic will provide a comprehensive overview of the current state and future directions of data mining and machine learning. We encourage submissions that offer theoretical insights, empirical studies, and case studies demonstrating the transformative impacts of these technologies. Join us in contributing to this exciting discourse and advancing our field through collaborative knowledge-sharing.

Prof. Dr. Xujuan Zhou
Prof. Dr. Yuefeng Li
Prof. Dr. Raj Gururajan
Prof. Dr. Ji Zhang
Prof. Dr. Revathi Venkataraman
Topic Editors

Keywords

  • data and text mining
  • graph data mining
  • machine and deep learning
  • reinforcement learning
  • supervised and unsupervised learning
  • semi-supervised learning
  • federated learning
  • generative AI and explainable AI (XAI)
  • edge AI
  • pattern recognition and anomaly detection
  • predictive analytics
  • natural language processing (NLP)
  • computer vision
  • big data technologies
  • AI applications in diverse domains

Participating Journals

Journal (abbreviation)        Impact Factor   CiteScore   Launched   First Decision (median)   APC
Applied Sciences (applsci)    2.5             5.3         2011       18.4 days                 CHF 2400
Data (data)                   2.2             4.3         2016       26.8 days                 CHF 1600
Electronics (electronics)     2.6             5.3         2012       16.4 days                 CHF 2400
Information (information)     2.4             6.9         2010       16.4 days                 CHF 1600
Mathematics (mathematics)     2.3             4.0         2013       18.3 days                 CHF 2600

Preprints.org is a multidisciplinary platform offering a preprint service designed to facilitate the early sharing of your research. It supports and empowers your research journey from the very beginning.

MDPI Topics is collaborating with Preprints.org and has established a direct connection between MDPI journals and the platform. Authors are encouraged to take advantage of this opportunity by posting their preprints at Preprints.org prior to publication:

  1. Share your research immediately: disseminate your ideas prior to publication and establish priority for your work.
  2. Safeguard your intellectual contribution: protect your ideas with a time-stamped preprint that serves as proof of your research timeline.
  3. Boost visibility and impact: increase the reach and influence of your research by making it accessible to a global audience.
  4. Gain early feedback: receive valuable input and insights from peers before submitting to a journal.
  5. Ensure broad indexing: your preprint is indexed by Web of Science (Preprint Citation Index), Google Scholar, Crossref, SHARE, PrePubMed, Scilit, and Europe PMC.

Published Papers (10 papers)

19 pages, 7259 KiB  
Article
A Novel Fuzzy Kernel Extreme Learning Machine Algorithm in Classification Problems
by Asli Kaya Karakutuk and Ozer Ozdemir
Appl. Sci. 2025, 15(8), 4506; https://doi.org/10.3390/app15084506 - 19 Apr 2025
Abstract
Today, numerous methods have been developed to address various problems, each with its own advantages and limitations. To overcome these limitations, hybrid structures that integrate multiple techniques have emerged as effective computational methods, offering superior performance and efficiency compared to single-method solutions. In this paper, we introduce a basic method that combines the strengths of fuzzy logic, wavelet theory, and kernel-based extreme learning machines to efficiently classify facial expressions. We call this method the Fuzzy Wavelet Mexican Hat Kernel Extreme Learning Machine. To evaluate the classification performance of this mathematically defined hybrid method, we apply it to both an original dataset and the JAFFE dataset. The method is enhanced with various feature extraction methods. On the JAFFE dataset, the algorithm achieved an average classification accuracy of 94.55% when supported with local binary patterns and 94.27% with a histogram of oriented gradients. Moreover, these results outperform those of previous studies conducted on the same dataset. On the original dataset, the proposed method was compared with an extreme learning machine and a wavelet neural network, and it was found to be markedly more efficient than the other two methods.
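To make the kernel extreme learning machine component concrete, here is a minimal NumPy sketch of KELM classification with a Mexican hat (Ricker) wavelet kernel; the kernel form, the regularization constant C, and all names are illustrative assumptions and do not reproduce the authors' Fuzzy Wavelet Mexican Hat KELM or its fuzzy-logic component.

```python
import numpy as np

def mexican_hat_kernel(X, Y, a=1.0):
    """Mexican hat (Ricker) wavelet kernel: per-dimension (1 - d^2/a^2) * exp(-d^2 / (2 a^2))
    with d = x_i - y_i, combined multiplicatively (illustrative form)."""
    D = X[:, None, :] - Y[None, :, :]          # pairwise differences, shape (n, m, dim)
    term = (1.0 - (D / a) ** 2) * np.exp(-(D ** 2) / (2.0 * a ** 2))
    return term.prod(axis=2)

class KernelELM:
    """Kernel ELM: output weights beta = (K + I/C)^-1 T with one-hot targets T."""
    def __init__(self, C=100.0, a=1.0):
        self.C, self.a = C, a

    def fit(self, X, y):
        self.X_train = X
        self.classes_ = np.unique(y)
        T = (y[:, None] == self.classes_[None, :]).astype(float)   # one-hot targets
        K = mexican_hat_kernel(X, X, self.a)
        self.beta = np.linalg.solve(K + np.eye(len(X)) / self.C, T)
        return self

    def predict(self, X):
        K = mexican_hat_kernel(X, self.X_train, self.a)
        return self.classes_[np.argmax(K @ self.beta, axis=1)]

# Toy usage on random features standing in for LBP/HOG descriptors
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8)); y = rng.integers(0, 3, size=60)
print(KernelELM().fit(X, y).predict(X[:5]))
```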

18 pages, 7299 KiB  
Article
Unsupervised Contrastive Learning for Time Series Data Clustering
by Bo Cao, Qinghua Xing, Ke Yang, Xuan Wu and Longyue Li
Electronics 2025, 14(8), 1660; https://doi.org/10.3390/electronics14081660 - 19 Apr 2025
Abstract
Aiming at the problems of existing time series data clustering methods, such as the lack of similarity metric universality, the curse of dimensionality, and limited feature expression ability, a time series data clustering method based on unsupervised contrastive learning (UCL-TSC) is proposed. The method first utilizes Residual, TCN, and CNN-TCN to construct multi-view representations of spatial, temporal, and spatial–temporal features of time series data, and adaptively fuses complementary information to enhance feature extraction capabilities. Subsequently, positive and negative sample pairs are constructed based on nearest neighbor and pseudo-clustering label information. Finally, a contrastive loss function consisting of feature loss, clustering loss, and a regularization term is designed to guide the model toward compact intra-cluster and well-separated inter-cluster structure. The experimental results on the UCR dataset show that UCL-TSC performs well with respect to several evaluation indexes, such as clustering accuracy, normalized information degree, and purity, and is more effective in learning time series data features and achieving accurate clustering compared to traditional clustering and deep clustering methods.
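As a rough illustration of the contrastive component, the PyTorch sketch below computes an InfoNCE-style loss over two noise-augmented views of time-series embeddings; the encoder, augmentations, and temperature are generic stand-ins, not UCL-TSC's multi-view fusion or pseudo-label pairing.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss: each embedding in z1 treats its counterpart in z2 as the
    positive and all other samples in the batch as negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature          # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0))          # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: a 1-D CNN encoder producing embeddings for two augmented views
encoder = torch.nn.Sequential(
    torch.nn.Conv1d(1, 16, kernel_size=5, padding=2),
    torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool1d(1),
    torch.nn.Flatten(),
    torch.nn.Linear(16, 32),
)
series = torch.randn(8, 1, 128)                     # batch of univariate series
view1 = series + 0.05 * torch.randn_like(series)    # simple noise augmentations
view2 = series + 0.05 * torch.randn_like(series)
loss = info_nce(encoder(view1), encoder(view2))
loss.backward()
print(float(loss))
```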

22 pages, 6364 KiB  
Article
Multi-Frame Joint Detection Approach for Foreign Object Detection in Large-Volume Parenterals
by Ziqi Li, Dongyao Jia, Zihao He and Nengkai Wu
Mathematics 2025, 13(8), 1333; https://doi.org/10.3390/math13081333 - 18 Apr 2025
Abstract
Large-volume parenterals (LVPs), as essential medical products, are widely used in healthcare settings, making their safety inspection crucial. Current methods for detecting foreign particles in LVP solutions through image analysis primarily rely on single-frame detection or simple temporal smoothing strategies, which fail to effectively utilize spatiotemporal correlations across multiple frames. Factors such as occlusion, motion blur, and refractive distortion can significantly impact detection accuracy. To address these challenges, this paper proposes a multi-frame object detection framework based on spatiotemporal collaborative learning, incorporating three key innovations: a YOLO network optimized with deformable convolution, a differentiable cross-frame association module, and an uncertainty-aware feature fusion and re-identification module. Experimental results demonstrate that our method achieves a 97% detection rate for contaminated LVP solutions on the LVPD dataset. Furthermore, the proposed method enables end-to-end training and processes five bottles per second, meeting the requirements for real-time pipeline applications.
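The cross-frame association step can be approximated, very loosely, by classical IoU-based matching with the Hungarian algorithm, as in the sketch below; this is not the paper's differentiable association module, and the box coordinates and threshold are made up for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda box: (box[2] - box[0]) * (box[3] - box[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def associate(prev_boxes, curr_boxes, min_iou=0.3):
    """Match previous-frame detections to current-frame detections by maximum IoU."""
    cost = np.array([[1.0 - iou(p, c) for c in curr_boxes] for p in prev_boxes])
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= min_iou]

prev_boxes = [(10, 10, 30, 30), (50, 50, 70, 70)]
curr_boxes = [(52, 49, 72, 71), (12, 11, 32, 31)]
print(associate(prev_boxes, curr_boxes))   # [(0, 1), (1, 0)]
```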

19 pages, 2604 KiB  
Article
Quantifying Relational Exploration in Cultural Heritage Knowledge Graphs with LLMs: A Neuro-Symbolic Approach for Enhanced Knowledge Discovery
by Mohammed Maree
Data 2025, 10(4), 52; https://doi.org/10.3390/data10040052 - 10 Apr 2025
Abstract
This paper introduces a neuro-symbolic approach for relational exploration in cultural heritage knowledge graphs, exploiting Large Language Models (LLMs) for explanation generation and a mathematically grounded model to quantify the interestingness of relationships. We demonstrate the importance of the proposed interestingness measure through a quantitative analysis, highlighting its significant impact on system performance, particularly in terms of precision, recall, and F1-score. Utilizing the Wikidata Cultural Heritage Linked Open Data (WCH-LOD) dataset, our approach achieves a precision of 0.70, recall of 0.68, and an F1-score of 0.69, outperforming both graph-based (precision: 0.28, recall: 0.25, F1-score: 0.26) and knowledge-based (precision: 0.45, recall: 0.42, F1-score: 0.43) baselines. Furthermore, the proposed LLM-powered explanations exhibit better quality, as evidenced by higher BLEU (0.52), ROUGE-L (0.58), and METEOR (0.63) scores compared to baseline approaches. We further demonstrate a strong correlation (0.65) between the interestingness measure and the quality of generated explanations, validating its ability to guide the system towards more relevant discoveries. This system offers more effective exploration by achieving more diverse and human-interpretable relationship explanations compared to purely knowledge-based and graph-based methods, contributing to the knowledge-based systems field by providing a personalized and adaptable relational exploration framework.
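For context on the reported scores, the sketch below shows a set-based precision/recall/F1 evaluation of retrieved relationships against a gold standard, with a toy interestingness threshold standing in (purely hypothetically) for the paper's measure; the scored pairs and threshold are invented.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for retrieved vs. gold relationships."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical scored relationships (entity pair, interestingness score)
scored = {("Mona Lisa", "Louvre"): 0.91, ("Mona Lisa", "poplar panel"): 0.34,
          ("Leonardo", "Florence"): 0.72}
retrieved = [pair for pair, score in scored.items() if score >= 0.5]   # toy threshold
gold = [("Mona Lisa", "Louvre"), ("Leonardo", "Florence"), ("Leonardo", "Milan")]
print(precision_recall_f1(retrieved, gold))   # (1.0, 0.666..., 0.8)
```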

24 pages, 7335 KiB  
Article
An Interpretable Hybrid Deep Learning Model for Molten Iron Temperature Prediction at the Iron-Steel Interface Based on Bi-LSTM and Transformer
by Zhenzhong Shen, Weigang Han, Yanzhuo Hu, Ye Zhu and Jingjing Han
Mathematics 2025, 13(6), 975; https://doi.org/10.3390/math13060975 - 15 Mar 2025
Abstract
Hot metal temperature is a key factor affecting the quality and energy consumption of iron and steel smelting. Accurate prediction of the temperature drop in a hot metal ladle is very important for optimizing transport, improving efficiency, and reducing energy consumption. Most existing studies focus on predicting molten iron temperature in torpedo tanks, leaving a significant research gap in predicting the temperature drop in hot metal ladles; as ladles increasingly replace torpedo tanks in transportation, this gap has not been fully addressed in the existing literature. This paper proposes an interpretable hybrid deep learning model combining Bi-LSTM and Transformer to address the complexity of temperature drop prediction. By leveraging CatBoost-RFECV, the most influential variables are selected, and the model captures both local features with Bi-LSTM and global dependencies with Transformer. Hyperparameters are optimized automatically using Optuna, enhancing model performance. Furthermore, SHAP analysis provides valuable insights into the key factors influencing temperature drops, enabling more accurate prediction of molten iron temperature. The experimental results demonstrate that the proposed model outperforms each individual model in the ensemble in terms of R2, RMSE, MAE, and other evaluation metrics. Additionally, SHAP analysis identifies the key factors contributing to the temperature drop.
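A minimal PyTorch sketch of a Bi-LSTM-plus-Transformer-encoder regressor is shown below to illustrate the hybrid architecture; the layer sizes, feature count, and pooling choice are assumptions and do not reproduce the authors' model or its Optuna-tuned hyperparameters.

```python
import torch
import torch.nn as nn

class BiLSTMTransformer(nn.Module):
    """Bi-LSTM captures local sequential features; a Transformer encoder models
    global dependencies; a linear head predicts the temperature drop."""
    def __init__(self, n_features=8, hidden=32, heads=4, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True, bidirectional=True)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=2 * hidden, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, x):                # x: (batch, time, n_features)
        out, _ = self.lstm(x)            # (batch, time, 2*hidden)
        out = self.encoder(out)          # global attention over time steps
        return self.head(out.mean(dim=1)).squeeze(-1)   # mean-pool, then regress

model = BiLSTMTransformer()
batch = torch.randn(16, 24, 8)           # 16 ladles, 24 time steps, 8 process features
print(model(batch).shape)                # torch.Size([16])
```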

20 pages, 8383 KiB  
Article
Self-Supervised Time-Series Preprocessing Framework for Maritime Applications
by Shengli Dong, Jilong Liu, Bing Han, Shengzheng Wang, Hong Zeng and Meng Zhang
Electronics 2025, 14(4), 765; https://doi.org/10.3390/electronics14040765 - 16 Feb 2025
Abstract
This study proposes a novel self-supervised data-preprocessing framework for time-series forecasting in complex ship systems. The framework integrates an improved Learnable Wavelet Packet Transform (L-WPT) for adaptive denoising and a correlation-based Uniform Manifold Approximation and Projection (UMAP) approach for dimensionality reduction. The enhanced L-WPT incorporates Reversible Instance Normalization to improve training efficiency while preserving denoising performance, especially for low-frequency sporadic noise. The UMAP dimensionality reduction, combined with a modified K-means clustering using correlation coefficients, enhances the computational efficiency and interpretability of the reduced data. Experimental results validate that state-of-the-art time-series models can effectively forecast the data processed by this framework, achieving promising MSE and MAE metrics.
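As a rough, classical analogue of the preprocessing pipeline, the sketch below applies (non-learnable) wavelet thresholding with PyWavelets followed by UMAP dimensionality reduction; the learnable wavelet packet transform, Reversible Instance Normalization, and correlation-based K-means of the paper are omitted, and the data shapes are invented.

```python
import numpy as np
import pywt
import umap   # umap-learn

def wavelet_denoise(signal, wavelet="db4", level=3):
    """Soft-threshold the detail coefficients of a plain DWT (a classical stand-in
    for the paper's learnable wavelet packet transform)."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745           # noise-level estimate
    thresh = sigma * np.sqrt(2 * np.log(len(signal)))         # universal threshold
    coeffs = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(signal)]

rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 128))                # 200 time windows x 128 samples
denoised = np.apply_along_axis(wavelet_denoise, 1, raw)
embedded = umap.UMAP(n_components=5, random_state=0).fit_transform(denoised)
print(embedded.shape)                            # (200, 5)
```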

22 pages, 4481 KiB  
Article
A Clustering Algorithm Based on Local Relative Density
by Yujuan Zou, Zhijian Wang, Xiangchen Wang and Taizhi Lv
Electronics 2025, 14(3), 481; https://doi.org/10.3390/electronics14030481 - 24 Jan 2025
Cited by 1
Abstract
DBSCAN and DPC are typical density-based clustering algorithms. These two algorithms have their drawbacks, such as difficulty in clustering when there are significant differences in density between clusters. This study proposes a clustering algorithm, RDBSCAN, which is based on local relative density, drawing on the extension strategy of DBSCAN and the allocation mechanism of DPC. The algorithm first uses k-nearest neighbors to calculate the original local density, then sorts the points in descending order of this density. It then selects the point with the highest original local density from the unprocessed points as the local center of the next cluster. Based on this local center, RDBSCAN calculates the local relative density, determines the core objects, and performs cluster expansion. Drawing on the allocation mechanism of DPC, the algorithm performs a secondary allocation for points in clusters that are too small to complete the final clustering. Comparative experiments using RDBSCAN and eight other clustering algorithms were conducted, and the test results show that RDBSCAN ranks first in clustering performance metrics among all algorithms on synthetic datasets and second on real-world datasets.
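To illustrate the kind of density estimate RDBSCAN builds on, the scikit-learn sketch below computes a k-nearest-neighbor local density and a simple relative density; the reference-point definition, core-object threshold, and expansion/allocation steps are illustrative guesses rather than the paper's exact rules.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors

def knn_local_density(X, k=10):
    """Local density as the inverse of the mean distance to the k nearest neighbors."""
    dists, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    density = 1.0 / (dists[:, 1:].mean(axis=1) + 1e-12)   # column 0 is the self-distance
    return density, idx[:, 1:]

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=[0.3, 0.8, 1.5], random_state=0)
density, neighbors = knn_local_density(X)

# Relative density: each point's density divided by that of a reference point,
# here (illustratively) the densest point among its k neighbors.
relative = density / density[neighbors].max(axis=1)
core_mask = relative >= 0.5          # toy core-object rule, threshold chosen arbitrarily
print(core_mask.sum(), "core candidates out of", len(X))
```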

24 pages, 2674 KiB  
Article
Achieving Excellence in Cyber Fraud Detection: A Hybrid ML+DL Ensemble Approach for Credit Cards
by Eyad Btoush, Xujuan Zhou, Raj Gururajan, Ka Ching Chan and Omar Alsodi
Appl. Sci. 2025, 15(3), 1081; https://doi.org/10.3390/app15031081 - 22 Jan 2025
Cited by 2
Abstract
The rapid advancement of technology has increased the complexity of cyber fraud, presenting a growing challenge for the banking sector to efficiently detect fraudulent credit card transactions. Conventional detection approaches face challenges in adapting to the continuously evolving tactics of fraudsters. This study addresses these limitations by proposing an innovative hybrid model that integrates Machine Learning (ML) and Deep Learning (DL) techniques through a stacking ensemble and resampling strategies. The hybrid model leverages ML techniques including Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), eXtreme Gradient Boosting (XGBoost), Categorical Boosting (CatBoost), and Logistic Regression (LR) alongside DL techniques such as Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory Network (BiLSTM) with attention mechanisms. By utilising the stacking ensemble method, the model consolidates predictions from multiple base models, resulting in improved predictive accuracy compared to individual models. The methodology incorporates robust data pre-processing techniques. Experimental evaluations demonstrate the superior performance of the hybrid ML+DL model, particularly in handling class imbalances, achieving an F1 score of 94.63%. This result underscores the effectiveness of the proposed model in delivering reliable cyber fraud detection, highlighting its potential to enhance financial transaction security.
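A compact scikit-learn sketch of the stacking idea follows, using a few of the listed base learners and a logistic-regression meta-learner on synthetic imbalanced data; the CNN/BiLSTM-with-attention components, resampling strategy, and tuned hyperparameters from the paper are not included.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data standing in for credit-card transactions
X, y = make_classification(n_samples=4000, n_features=20, weights=[0.95, 0.05],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(class_weight="balanced", random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, class_weight="balanced",
                                      random_state=0)),
        ("svm", SVC(class_weight="balanced", probability=True, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),   # meta-learner over base predictions
    cv=5,
)
stack.fit(X_tr, y_tr)
print("F1 (minority class):", round(f1_score(y_te, stack.predict(X_te)), 3))
```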

21 pages, 10348 KiB  
Article
A Learning Resource Recommendation Method Based on Graph Contrastive Learning
by Jiu Yong, Jianguo Wei, Xiaomei Lei, Jianwu Dang, Wenhuan Lu and Meijuan Cheng
Electronics 2025, 14(1), 142; https://doi.org/10.3390/electronics14010142 - 1 Jan 2025
Abstract
The existing learning resource recommendation systems suffer from data sparsity and missing data labels, leading to insufficient mining of the correlation between users and courses. To address these issues, we propose a learning resource recommendation method based on graph contrastive learning, which uses graph contrastive learning to construct an auxiliary recommendation task combined with a main recommendation task, achieving the joint recommendation of learning resources. Firstly, the user–course interaction bipartite graph is input into a lightweight graph convolutional network, and the embedded representation of each node in the graph is obtained after encoding. Then, noise vectors are randomly added to each node of the input bipartite graph in the embedding space to perturb the graph encoder's node embeddings, forming perturbed embedding representations that augment the data. Subsequently, the graph contrastive learning method is used to construct auxiliary recommendation tasks. Finally, the main recommendation supervision task and the constructed auxiliary graph contrastive learning task are jointly learned to alleviate data sparsity. The experimental results show that the proposed method improves Recall@5 by 5.7% and 11.2% and NDCG@5 by 0.1% and 6.4% on the MOOCCube and Amazon-Book datasets, respectively, compared with node enhancement methods. Therefore, the proposed method can significantly improve the mining of user–course correlations through the graph contrastive auxiliary task and has better noise immunity and robustness.
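The noise-perturbation idea can be sketched as below: node embeddings are perturbed with small random directional noise, and the two perturbed views are pulled together by a contrastive auxiliary loss added to a (placeholder) main recommendation loss; the perturbation form, temperature, and loss weighting are assumptions, and the lightweight graph convolution is omitted.

```python
import torch
import torch.nn.functional as F

def perturb(emb, eps=0.1):
    """Add random directional noise of magnitude eps to node embeddings, with the
    noise sign aligned to each embedding component (illustrative choice)."""
    noise = F.normalize(torch.rand_like(emb), dim=1) * torch.sign(emb)
    return emb + eps * noise

def contrastive_aux_loss(emb, temperature=0.2):
    """Auxiliary loss: two noise-perturbed views of the same nodes are positives."""
    z1, z2 = F.normalize(perturb(emb), dim=1), F.normalize(perturb(emb), dim=1)
    logits = z1 @ z2.t() / temperature
    return F.cross_entropy(logits, torch.arange(emb.size(0)))

# Joint objective: main recommendation loss (placeholder) plus weighted auxiliary task
node_emb = torch.randn(64, 32, requires_grad=True)       # e.g. user/course embeddings
main_loss = node_emb.norm() * 0.01                       # stand-in for a ranking/BPR loss
loss = main_loss + 0.2 * contrastive_aux_loss(node_emb)  # lambda = 0.2 (illustrative)
loss.backward()
print(float(loss))
```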

16 pages, 3708 KiB  
Article
Suppression of Strong Cultural Noise in Magnetotelluric Signals Using Particle Swarm Optimization-Optimized Variational Mode Decomposition
by Zhongda Shang, Xinjun Zhang, Shen Yan and Kaiwen Zhang
Appl. Sci. 2024, 14(24), 11719; https://doi.org/10.3390/app142411719 - 16 Dec 2024
Abstract
To effectively separate strong cultural noise in Magnetotelluric (MT) signals under strong interference conditions and restore the true forms of apparent resistivity and phase curves, this paper proposes an improved method for suppressing strong cultural noise based on Particle Swarm Optimization (PSO) and Variational Mode Decomposition (VMD). First, the effects of two initial parameters, the decomposition scale K and penalty factor α, on the performance of variational mode decomposition are studied. Subsequently, using the PSO algorithm, the optimal combination of influential parameters in the VMD is determined. This optimal parameter set is applied to decompose electromagnetic signals, and Intrinsic Mode Functions (IMFs) are selected for signal reconstruction based on correlation coefficients, resulting in denoised electromagnetic signals. The simulation results show that, compared to traditional algorithms such as Empirical Mode Decomposition (EMD), Intrinsic Time Decomposition (ITD), and VMD, the Normalized Cross-Correlation (NCC) and signal-to-noise ratio (SNR) of the PSO-optimized VMD method for suppressing strong cultural noise increased by 0.024, 0.035, 0.019, and 2.225, 2.446, 1.964, respectively. The processing of field data confirms that this method effectively suppresses strong cultural noise in strongly interfering environments, leading to significant improvements in the apparent resistivity and phase curve data, thereby enhancing the authenticity and reliability of underground electrical structure interpretations.
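A bare-bones particle swarm optimization loop over the two VMD parameters (K, α) is sketched below; the fitness function is a placeholder that would, in practice, run VMD with the candidate parameters on the MT signal and score the resulting modes (e.g., by envelope entropy), and all constants and bounds are illustrative.

```python
import numpy as np

def objective(K, alpha):
    """Placeholder fitness; in practice this would run VMD with (K, alpha) on the
    MT signal and return e.g. the minimum envelope entropy of the modes."""
    return (K - 6) ** 2 + ((alpha - 2000) / 500) ** 2

def pso(bounds, n_particles=20, n_iter=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard PSO: velocities are pulled toward personal and global bests."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array([b[0] for b in bounds]), np.array([b[1] for b in bounds])
    pos = rng.uniform(lo, hi, size=(n_particles, len(bounds)))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([objective(*p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(*p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest

K_opt, alpha_opt = pso(bounds=[(2, 10), (200, 4000)])
print(f"best K ~ {K_opt:.1f}, best alpha ~ {alpha_opt:.0f}")   # K would be rounded in practice
```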
