Machine Learning in Big Data

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (15 October 2022) | Viewed by 12094

Special Issue Editor

Dr. Hejun Wu
Department of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
Interests: wireless sensor networks; Internet of Things; artificial intelligence and robotics

Special Issue Information

Dear Colleagues,

This Special Issue, titled “Machine Learning in Big Data”, welcomes papers on machine learning for the exploration and analytics of Big Data from all fields of everyday life, e.g., medical care, transportation, environment, economy, government, ocean, tourism, agriculture, and industry. The aim of this Special Issue is to encourage original papers presenting high-quality research on machine learning techniques for the difficult problems observed in practical applications of Big Data.

Disruptive machine learning technologies such as deep learning, reinforcement learning, transfer learning, distributed machine learning, federated learning, and multi-agent reinforcement learning are being applied across virtually every Big Data application of present-day systems. Machine learning is currently the most effective means of confronting massive data. In this context, this Special Issue calls for papers that explore innovative and appropriate machine learning techniques for Big Data in specific application domains.

Topics are invited from a wide range of disciplines and perspectives, including, but not restricted to, the following:

  • Advances in machine learning frameworks for Big Data open platforms;
  • Efficient search of Big Data through machine learning;
  • Effective techniques for data cleaning; 
  • Graph neural networks for graph mining;
  • Semantic mining of Big Data;
  • Parallel/distributed machine learning paradigms for Big Data;
  • Machine learning for applications of medical care, agriculture, industry, and business;
  • Federated learning or swarm learning in edge computing of Big Data applications;
  • Distributed machine learning techniques for IoT (Internet of Things) related Big Data applications;
  • Privacy-preserving secure Big Data analytics;
  • Models for multimedia Big Data analytics;
  • Deep learning for visual analytics of Big Data;
  • Intelligent scheduling for energy-efficient processing of Big Data;
  • Reinforcement learning/multi-agent reinforcement learning in IoT for continuous and accurate generation of Big Data.

Dr. Hejun Wu
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • transfer learning
  • parallel/distributed learning
  • federated learning
  • swarm learning
  • reinforcement learning
  • deep learning
  • Big Data mining
  • data cleaning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)


Research

13 pages, 2038 KiB  
Article
Mammographic Classification of Breast Cancer Microcalcifications through Extreme Gradient Boosting
by Haobang Liang, Jiao Li, Hejun Wu, Li Li, Xinrui Zhou and Xinhua Jiang
Electronics 2022, 11(15), 2435; https://doi.org/10.3390/electronics11152435 - 4 Aug 2022
Cited by 10 | Viewed by 2696
Abstract
In this paper, we propose an effective and efficient approach to the classification of breast cancer microcalcifications and evaluate the mathematical model for calcification on mammography with a large medical dataset. We employed several semi-automatic segmentation algorithms to extract 51 calcification features from mammograms, including morphologic and textural features. We adopted extreme gradient boosting (XGBoost) to classify the microcalcifications and compared it with other machine learning techniques, including k-nearest neighbor (kNN), AdaBoostM1, decision tree, random decision forest (RDF), and gradient boosting decision tree (GBDT). XGBoost achieved the highest accuracy (90.24%, with AUC = 0.89) for classifying microcalcifications, while kNN showed the lowest accuracy. This result demonstrates that careful feature engineering, i.e., selecting the best composition of features, is essential for microcalcification classification. One contribution of this study is to present the best composition of features for efficient classification of breast cancers, selecting the most discriminative features as a collection to improve accuracy. Moreover, we analyzed the performance contribution of the various features in the dataset and identified suitable parameters for classifying microcalcifications. Furthermore, we found that the XGBoost model is suitable, both in theory and in practice, for the classification of calcifications on mammography. Full article
(This article belongs to the Special Issue Machine Learning in Big Data)
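
To make the described workflow concrete, the short Python sketch below illustrates the kind of pipeline the abstract outlines: training an XGBoost classifier on pre-extracted calcification features and comparing it with kNN on accuracy and AUC. It is a minimal sketch, not the authors' implementation; the synthetic feature matrix, hyperparameters, and variable names are hypothetical placeholders, and only standard scikit-learn/xgboost APIs are used.

```python
# Minimal sketch: XGBoost vs. kNN on a stand-in for the 51 calcification features.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.neighbors import KNeighborsClassifier
from xgboost import XGBClassifier

# Synthetic stand-in for 51 morphologic/textural features (benign=0, malignant=1).
X, y = make_classification(n_samples=1000, n_features=51, n_informative=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "XGBoost": XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1,
                             eval_metric="logloss"),
    "kNN": KNeighborsClassifier(n_neighbors=5),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]        # probability of the malignant class
    print(name,
          "accuracy=%.4f" % accuracy_score(y_te, model.predict(X_te)),
          "AUC=%.4f" % roc_auc_score(y_te, proba))
```

In practice, the feature-selection step emphasized in the abstract would be inserted before training, e.g., by ranking feature importances from a fitted booster and retraining on the best-performing subset.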

18 pages, 1877 KiB  
Article
ASPDC: Accelerated SPDC Regularized Empirical Risk Minimization for Ill-Conditioned Problems in Large-Scale Machine Learning
by Haobang Liang, Hao Cai, Hejun Wu, Fanhua Shang, James Cheng and Xiying Li
Electronics 2022, 11(15), 2382; https://doi.org/10.3390/electronics11152382 - 29 Jul 2022
Viewed by 1782
Abstract
This paper aims to improve the response speed of SPDC (stochastic primal–dual coordinate ascent) in large-scale machine learning, as the per-iteration complexity of SPDC is not satisfactory. We propose an accelerated stochastic primal–dual coordinate ascent called ASPDC and its further accelerated variant, ASPDC-i. Our proposed ASPDC methods achieve a good balance between low per-iteration computation complexity and fast convergence speed, even when the condition number becomes very large. A large condition number leads to ill-conditioned problems, which usually require many more iterations before convergence and longer per-iteration times when training machine learning models. We performed experiments on various machine learning problems. The experimental results demonstrate that ASPDC and ASPDC-i converge faster than their counterparts and enjoy low per-iteration complexity as well. Full article
(This article belongs to the Special Issue Machine Learning in Big Data)
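
For orientation, the block below states the standard regularized empirical risk minimization problem and the saddle-point reformulation that SPDC-type primal–dual methods operate on, together with the condition number that governs iteration counts. This is the generic setting under the usual assumptions (each loss smooth, the regularizer strongly convex), not the paper's specific ASPDC update rules.

```latex
% Generic SPDC setting, stated for orientation: each loss \phi_i is assumed
% (1/\gamma)-smooth and the regularizer g is assumed \lambda-strongly convex.
\min_{w \in \mathbb{R}^d} P(w)
  = \frac{1}{n}\sum_{i=1}^{n} \phi_i\!\left(x_i^{\top} w\right) + g(w)
\;\;\Longleftrightarrow\;\;
\min_{w \in \mathbb{R}^d}\,\max_{y \in \mathbb{R}^n}\;
  \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i\, x_i^{\top} w - \phi_i^{*}(y_i)\bigr) + g(w),
\qquad
\kappa = \frac{R^{2}}{\lambda\gamma},\quad R = \max_i \lVert x_i \rVert .
```

The "ill-conditioned" regime discussed in the abstract corresponds to large κ, i.e., weak strong convexity (small λ) or low smoothness (small γ) relative to the data radius R.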

15 pages, 5675 KiB  
Article
Electrocardiogram Signal Classification Based on Mix Time-Series Imaging
by Hao Cai, Lingling Xu, Jianlong Xu, Zhi Xiong and Changsheng Zhu
Electronics 2022, 11(13), 1991; https://doi.org/10.3390/electronics11131991 - 24 Jun 2022
Cited by 7 | Viewed by 4007
Abstract
Arrhythmia is a significant cause of death, and it is essential to analyze electrocardiogram (ECG) signals, as these are usually used to diagnose arrhythmia. However, traditional time-series classification methods for ECG ignore the nonlinearity, temporality, and other characteristics inside these signals. This paper proposes an electrocardiogram classification method that encodes one-dimensional ECG signals into three-channel images, named ECG Classification based on Mix Time-Series Imaging (EC-MTSI). Specifically, this hybrid transformation method combines the Gramian angular field (GAF), recurrence plot (RP), and tiling, preserving the time dependence and correlation of the original ECG time series. We use a variety of neural networks to extract features and perform feature fusion and classification, which retains sufficient detail while emphasizing local information. To demonstrate the effectiveness of EC-MTSI, we conduct abundant experiments on a commonly used dataset. In our experiments, the overall accuracy reached 93.23%, and the accuracies for identifying the high-risk arrhythmias of ventricular beats and supraventricular beats alone are as high as 97.4% and 96.3%, respectively. The results reveal that the proposed method significantly outperforms existing approaches. Full article
(This article belongs to the Special Issue Machine Learning in Big Data)
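
The Python sketch below shows, from first principles, how a one-dimensional ECG beat can be encoded as a multi-channel image using a Gramian angular field and a recurrence plot, which is the general idea behind the time-series-imaging step described above. It is a minimal sketch, not the authors' EC-MTSI pipeline; in particular, the "tiling" channel here is a simple illustrative stand-in for whatever tiling scheme the paper uses.

```python
# Minimal sketch: encode a 1-D signal as an (L, L, 3) image via GAF, RP, and tiling.
import numpy as np

def gramian_angular_field(x):
    """Gramian angular summation field of a 1-D signal rescaled to [-1, 1]."""
    x = np.asarray(x, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min() + 1e-12) - 1
    phi = np.arccos(np.clip(x, -1, 1))
    return np.cos(phi[:, None] + phi[None, :])

def recurrence_plot(x, eps=0.1):
    """Thresholded pairwise-distance (recurrence) matrix of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])
    return (dist < eps).astype(float)

def ecg_to_image(beat):
    """Stack GAF, RP, and a tiled copy of the beat into a three-channel image."""
    beat = np.asarray(beat, dtype=float)
    gaf = gramian_angular_field(beat)
    rp = recurrence_plot(beat)
    tiled = np.tile(beat, (beat.size, 1))          # illustrative third channel
    return np.stack([gaf, rp, tiled], axis=-1)

img = ecg_to_image(np.sin(np.linspace(0, 6 * np.pi, 128)))   # toy "beat"
print(img.shape)   # (128, 128, 3), ready for a CNN feature extractor
```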

21 pages, 1495 KiB  
Article
A Graph Attention Mechanism-Based Multiagent Reinforcement-Learning Method for Task Scheduling in Edge Computing
by Yinong Li, Jianbo Li and Junjie Pang
Electronics 2022, 11(9), 1357; https://doi.org/10.3390/electronics11091357 - 24 Apr 2022
Cited by 9 | Viewed by 2716
Abstract
Multi-access edge computing (MEC) enables end devices with limited computing power to provide effective solutions when dealing with computationally challenging tasks. When each end device in an MEC scenario generates multiple tasks, how to reasonably and effectively schedule these tasks is a large-scale discrete action space problem. In addition, how to exploit the objectively existing spatial structure relationships in the given scenario is also an important factor to be considered in task-scheduling algorithms. In this work, we consider indivisible, time-sensitive tasks under this scenario and formalize the task-scheduling problem to minimize the long-term losses. We propose a multiagent collaborative deep reinforcement learning (DRL)-based distributed scheduling algorithm built on graph attention networks (GATs) to solve task-scheduling problems in the MEC scenario. Each end device creates a graph representation agent to extract potential spatial features in the scenario and a scheduling agent to extract the timing-related features of the tasks and make scheduling decisions using a gated recurrent unit (GRU). The simulation results show that, compared with several baseline algorithms, our proposed algorithm can take advantage of the spatial positional relationships of devices in the environment, significantly reduce the average delay and drop rate, and improve link utilization. Full article
(This article belongs to the Special Issue Machine Learning in Big Data)
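
The graph attention mechanism at the heart of the graph-representation agent can be illustrated with a single GAT layer, as in the PyTorch sketch below. It is a generic, minimal sketch under assumed dimensions and a toy adjacency matrix, not the authors' scheduler or network architecture.

```python
# Minimal sketch: one graph attention (GAT) layer for embedding a device topology.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)   # shared linear transform
        self.a = nn.Linear(2 * out_dim, 1, bias=False)    # attention scoring vector

    def forward(self, h, adj):
        # h: (N, in_dim) node features, adj: (N, N) adjacency with self-loops
        z = self.W(h)                                      # (N, out_dim)
        N = z.size(0)
        pairs = torch.cat([z.unsqueeze(1).expand(N, N, -1),
                           z.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1), 0.2)   # (N, N) raw scores
        e = e.masked_fill(adj == 0, float("-inf"))         # attend to neighbors only
        alpha = torch.softmax(e, dim=-1)                   # per-node attention weights
        return F.elu(alpha @ z)                            # aggregated node embeddings

# Toy usage: 4 devices with 8-dim features on a fully connected graph with self-loops.
h = torch.randn(4, 8)
adj = torch.ones(4, 4)
print(GATLayer(8, 16)(h, adj).shape)   # torch.Size([4, 16])
```

In a scheduling pipeline of the kind described above, the per-node embeddings produced by such a layer would then be fed, together with task features, into a recurrent scheduling agent (e.g., a GRU) that outputs the scheduling decision.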
