From Edge Devices to Cloud Computing and Datacenters: Emerging Machine Learning Applications, Algorithms, and Optimizations

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: 31 August 2024 | Viewed by 17147

Special Issue Editor


Prof. Dr. Freddy Gabbay
Guest Editor
Faculty of Engineering, Ruppin Academic Center, Emek Hefer 4025000, Israel
Interests: artificial intelligence; machine learning and deep neural network algorithms; deep compression of machine learning; explainable AI (XAI); high performance computing algorithms and acceleration; computation systems and algorithm modeling and simulations

Special Issue Information

Dear Colleagues,

Over the last decade, machine learning has emerged as an important tool for a wide range of applications, including computer vision, medicine, fintech, autonomous systems, and speech recognition. Machine learning models offer state-of-the-art accuracy and robustness in many of these applications. The increasing deployment of machine learning algorithms, from edge and IoT devices to high-end computational infrastructures such as supercomputers, the cloud, and datacenters, introduces major computational challenges because of the growing amount of data and the rapid growth in model size and complexity. This Special Issue seeks novel developments in emerging machine learning applications, algorithms, and optimizations across diverse computational platforms, including:

  • Novel machine learning applications for IoT and edge devices;
  • High-performance computing machine learning algorithms;
  • Machine learning applications in cloud and fog computing;
  • Fusion of machine learning models between edge and cloud;
  • Machine learning optimization methods such as pruning and deep compression.

Prof. Dr. Freddy Gabbay
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • deep neural network
  • deep compression
  • machine learning optimizations
  • machine learning under constrained resources

Published Papers (8 papers)

Research

21 pages, 594 KiB  
Article
AMED: Automatic Mixed-Precision Quantization for Edge Devices
by Moshe Kimhi, Tal Rozen, Avi Mendelson and Chaim Baskin
Mathematics 2024, 12(12), 1810; https://doi.org/10.3390/math12121810 - 11 Jun 2024
Viewed by 588
Abstract
Quantized neural networks are well known for reducing the latency, power consumption, and model size without significant harm to the performance. This makes them highly appropriate for systems with limited resources and low power capacity. Mixed-precision quantization offers better utilization of customized hardware that supports arithmetic operations at different bitwidths. Quantization methods either aim to minimize the compression loss given a desired reduction or optimize a dependent variable for a specified property of the model (such as FLOPs or model size); both make the performance inefficient when deployed on specific hardware, but more importantly, quantization methods assume that the loss manifold holds a global minimum for a quantized model that copes with the global minimum of the full precision counterpart. Challenging this assumption, we argue that the optimal minimum changes as the precision changes, and thus, it is better to look at quantization as a random process, placing the foundation for a different approach to quantize neural networks, which, during the training procedure, quantizes the model to a different precision, looks at the bit allocation as a Markov Decision Process, and then, finds an optimal bitwidth allocation for measuring specified behaviors on a specific device via direct signals from the particular hardware architecture. By doing so, we avoid the basic assumption that the loss behaves the same way for a quantized model. Automatic Mixed-Precision Quantization for Edge Devices (dubbed AMED) demonstrates its superiority over current state-of-the-art schemes in terms of the trade-off between neural network accuracy and hardware efficiency, backed by a comprehensive evaluation.
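
As a rough illustration of why the choice of bitwidth matters, the sketch below applies plain symmetric uniform quantization to synthetic weights at several bitwidths and compares the resulting error. It is not the AMED method described in the abstract: the helper names (`quantize_uniform`, `mean_squared_error`) and the Gaussian test weights are assumptions made only for this example.

```python
import random


def quantize_uniform(values, bits):
    """Symmetric uniform quantization of floats onto a signed `bits`-bit grid.

    Hypothetical helper for illustration; not taken from the AMED paper.
    """
    levels = 2 ** (bits - 1) - 1            # e.g. 127 positive levels for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / levels if max_abs > 0 else 1.0
    # Snap each value to the nearest grid point, then map back to float.
    return [round(v / scale) * scale for v in values]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Synthetic stand-in for one layer's weights (assumption: roughly Gaussian).
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

errors = {
    bits: mean_squared_error(weights, quantize_uniform(weights, bits))
    for bits in (2, 4, 8)
}
for bits, err in errors.items():
    print(f"{bits}-bit quantization MSE: {err:.6f}")
```

Lower bitwidths give a coarser grid and a larger reconstruction error, which is the trade-off a mixed-precision scheme has to balance against hardware cost when it assigns a bitwidth to each layer.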

25 pages, 16257 KiB  
Article
Detection and Prediction of Chipping in Wafer Grinding Based on Dicing Signal
by Bao Rong Chang, Hsiu-Fen Tsai and Hsiang-Yu Mo
Mathematics 2022, 10(24), 4631; https://doi.org/10.3390/math10244631 - 7 Dec 2022
Cited by 2 | Viewed by 1712
Abstract
Simple regression cannot wholly analyze large-scale wafer backside wall chipping because the wafer grinding process encounters many problems, such as collected data missing, data showing a non-linear distribution, and correlated hidden parameters lost. The objective of this study is to propose a novel approach to solving this problem. First, this study uses time series, random forest, importance analysis, and correlation analysis to analyze the signals of wafer grinding to screen out key grinding parameters. Then, we use PCA and Barnes-Hut t-SNE to reduce the dimensionality of the key grinding parameters and compare their corresponding heat maps to find out which dimensionality reduction method is more sensitive to the chipping phenomenon. Finally, this study imported the more sensitive dimensionality reduction data into the Data Driven-Bidirectional LSTM (DD-BLSTM) model for training and predicting the wafer chipping. It can adjust the key grinding parameters in time to reduce the occurrence of large-scale wafer chipping and can effectively improve the degree of deterioration of the grinding blade. As a result, the blades can initially grind three pieces of the wafers without replacement and successfully expand to more than eight pieces of the wafer. The accuracy of wafer chipping prediction using DD-BLSTM with Barnes-Hut t-SNE dimensionality reduction can achieve 93.14%.

14 pages, 1803 KiB  
Article
An Evaluation of Modern Accelerator-Based Edge Devices for Object Detection Applications
by Pilsung Kang and Athip Somtham
Mathematics 2022, 10(22), 4299; https://doi.org/10.3390/math10224299 - 16 Nov 2022
Cited by 10 | Viewed by 2521
Abstract
Edge AI is one of the newly emerged application domains where networked IoT (Internet of Things) devices are deployed to perform AI computations at the edge of the cloud environments. Today’s edge devices are typically equipped with powerful accelerators within their architecture to efficiently process the vast amount of data generated in place. In this paper, we evaluate major state-of-the-art edge devices in the context of object detection, which is one of the principal applications of modern AI technology. For our evaluation study, we choose recent devices with different accelerators to compare performance behavior depending on different architectural characteristics. The accelerators studied in this work include the GPU and the edge version of the TPU, and these accelerators can be used to boost the performance of deep learning operations. By performing a set of major object detection neural network benchmarks on the devices and by analyzing their performance behavior, we assess the effectiveness and capability of the modern edge devices accelerated by a powerful parallel hardware. Based on the benchmark results in the perspectives of detection accuracy, inference latency, and energy efficiency, we provide a latest report of comparative evaluation for major modern edge devices in the context of the object detection application of the AI technology.

16 pages, 730 KiB  
Article
Using an Artificial Neural Network for Improving the Prediction of Project Duration
by Itai Lishner and Avraham Shtub
Mathematics 2022, 10(22), 4189; https://doi.org/10.3390/math10224189 - 9 Nov 2022
Cited by 9 | Viewed by 3050
Abstract
One of the most challenging tasks in project management is estimating the duration of a project. The unknowns that accompany projects, the different risks, the uniqueness of each project, and the differences between organizations’ culture and management techniques, hinder the ability to build one project duration prediction tool that can fit all types of projects and organizations. When machine learning (ML) techniques are used for project duration prediction, the challenge is even greater, as each organization has a different dataset structure, different features, and different quality of data. This hinders the ability to create one ML model that fits all types of organizations. This paper presents a new dynamic ML tool for improving the prediction accuracy of project duration. The tool is based on an artificial neural network (ANN) which is automatically adapted and optimized to different types of prediction methods and different datasets. The tool trains the ANN model multiple times with different architectures and uses a genetic algorithm to eventually choose the architecture which gives the most accurate prediction results. The validation process of the prediction accuracy is performed by using real-life project datasets supplied by two different organizations which have different project management approaches, different project types, and different project features. The results show that the proposed tool significantly improved the prediction accuracy for both organizations despite the major differences in the size, type, and structure of their datasets.

20 pages, 4279 KiB  
Article
Deep Neural Network Memory Performance and Throughput Modeling and Simulation Framework
by Freddy Gabbay, Rotem Lev Aharoni and Ori Schweitzer
Mathematics 2022, 10(21), 4144; https://doi.org/10.3390/math10214144 - 6 Nov 2022
Cited by 2 | Viewed by 2551
Abstract
Deep neural networks (DNNs) are widely used in various artificial intelligence applications and platforms, such as sensors in internet of things (IoT) devices, speech and image recognition in mobile systems, and web searching in data centers. While DNNs achieve remarkable prediction accuracy, they introduce major computational and memory bandwidth challenges due to the increasing model complexity and the growing amount of data used for training and inference. These challenges introduce major difficulties not only due to the constraints of system cost, performance, and energy consumption, but also due to limitations in currently available memory bandwidth. The recent advances in semiconductor technologies have further intensified the gap between computational hardware performance and memory systems bandwidth. Consequently, memory systems are, today, a major performance bottleneck for DNN applications. In this paper, we present DRAMA, a deep neural network memory simulator. DRAMA extends the SCALE-Sim simulator for DNN inference on systolic arrays with a detailed, accurate, and extensive modeling and simulation environment of the memory system. DRAMA can simulate in detail the hierarchical main memory components—such as memory channels, modules, ranks, and banks—and related timing parameters. In addition, DRAMA can explore tradeoffs for memory system performance and identify bottlenecks for different DNNs and memory architectures. We demonstrate DRAMA’s capabilities through a set of experimental simulations based on several use cases.

14 pages, 2211 KiB  
Article
Bimodal-Distributed Binarized Neural Networks
by Tal Rozen, Moshe Kimhi, Brian Chmiel, Avi Mendelson and Chaim Baskin
Mathematics 2022, 10(21), 4107; https://doi.org/10.3390/math10214107 - 3 Nov 2022
Cited by 2 | Viewed by 1601
Abstract
Binary neural networks (BNNs) are an extremely promising method for reducing deep neural networks’ complexity and power consumption significantly. Binarization techniques, however, suffer from ineligible performance degradation compared to their full-precision counterparts. Prior work mainly focused on strategies for sign function approximation during the forward and backward phases to reduce the quantization error during the binarization process. In this work, we propose a bimodal-distributed binarization method (BD-BNN). The newly proposed technique aims to impose a bimodal distribution of the network weights by kurtosis regularization. The proposed method consists of a teacher–trainer training scheme termed weight distribution mimicking (WDM), which efficiently imitates the full-precision network weight distribution to their binary counterpart. Preserving this distribution during binarization-aware training creates robust and informative binary feature maps and thus it can significantly reduce the generalization error of the BNN. Extensive evaluations on CIFAR-10 and ImageNet demonstrate that our newly proposed BD-BNN outperforms current state-of-the-art schemes.
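
To make the link between kurtosis and bimodality concrete, the sketch below computes the Pearson kurtosis of two synthetic weight sets and a simple squared-distance penalty toward a target value. It is only an illustration under stated assumptions: the function names, the target of 1.0, and the penalty form are hypothetical and are not taken from the BD-BNN paper.

```python
import random


def kurtosis(values):
    """Pearson kurtosis: fourth central moment divided by the squared variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    fourth_moment = sum((v - mean) ** 4 for v in values) / n
    return fourth_moment / (variance ** 2)


def kurtosis_penalty(values, target=1.0):
    """Squared distance between the sample kurtosis and a target value.

    The target of 1.0 is an assumption for this sketch: it is the kurtosis of
    a symmetric two-point distribution, the most strongly bimodal case.
    """
    return (kurtosis(values) - target) ** 2


random.seed(0)
gaussian_weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]
bimodal_weights = [-1.0, 1.0] * 5_000   # all mass at the two binary values

print(f"Gaussian kurtosis: {kurtosis(gaussian_weights):.3f}")
print(f"Bimodal kurtosis:  {kurtosis(bimodal_weights):.3f}")
print(f"Gaussian penalty:  {kurtosis_penalty(gaussian_weights):.3f}")
print(f"Bimodal penalty:   {kurtosis_penalty(bimodal_weights):.3f}")
```

A Gaussian has kurtosis close to 3, while weights concentrated at two symmetric values have kurtosis 1, so a regularizer that pushes kurtosis down moves the weight distribution toward two modes, which is closer to what binarization produces.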

17 pages, 2758 KiB  
Article
Knowledge Graph-Based Framework for Decision Making Process with Limited Interaction
by Sivan Albagli-Kim and Dizza Beimel
Mathematics 2022, 10(21), 3981; https://doi.org/10.3390/math10213981 - 26 Oct 2022
Cited by 4 | Viewed by 2429
Abstract
In this work, we present an algorithmic framework that supports a decision process in which an end user is assisted by a domain expert to solve a problem. In addition, the communication between the end user and the domain expert is characterized by a limited number of questions and answers. The framework we have developed helps the domain expert to pinpoint a small number of questions to the end user to increase the likelihood of their insights being correct. The proposed framework is based on the domain expert’s knowledge and includes an interaction with both the domain expert and the end user. The domain expert’s knowledge is represented by a knowledge graph, and the end user’s information related to the problem is entered into the graph as evidence. This triggers the inference algorithm in the graph, which suggests to the domain expert the next question for the end user. The paper presents a detailed proposed framework in a medical diagnostic domain; however, it can be adapted to additional domains with a similar setup. The software framework we have developed makes the decision-making process accessible in an interactive and explainable manner, which includes the use of semantic technology and is, therefore, innovative.

19 pages, 4411 KiB  
Article
Structured Compression of Convolutional Neural Networks for Specialized Tasks
by Freddy Gabbay, Benjamin Salomon and Gil Shomron
Mathematics 2022, 10(19), 3679; https://doi.org/10.3390/math10193679 - 8 Oct 2022
Viewed by 1362
Abstract
Convolutional neural networks (CNNs) offer significant advantages when used in various image classification tasks and computer vision applications. CNNs are increasingly deployed in environments from edge and Internet of Things (IoT) devices to high-end computational infrastructures, such as supercomputers, cloud computing, and data centers. The growing amount of data and the growth in their model size and computational complexity, however, introduce major computational challenges. Such challenges present entry barriers for IoT and edge devices as well as increase the operational expenses of large-scale computing systems. Thus, it has become essential to optimize CNN algorithms. In this paper, we introduce the S-VELCRO compression algorithm, which exploits value locality to trim filters in CNN models utilized for specialized tasks. S-VELCRO uses structured compression, which can save costs and reduce overhead compared with unstructured compression. The algorithm runs in two steps: a preprocessing step identifies the filters with a high degree of value locality, and a compression step trims the selected filters. As a result, S-VELCRO reduces the computational load of the channel activation function and avoids the convolution computation of the corresponding trimmed filters. Compared with typical CNN compression algorithms that run heavy back-propagation training computations, S-VELCRO has significantly fewer computational requirements. Our experimental analysis shows that S-VELCRO achieves a compression-saving ratio between 6% and 30%, with no degradation in accuracy for ResNet-18, MobileNet-V2, and GoogLeNet when used for specialized tasks.