Optimization Models and Algorithms in Data Science

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "E1: Mathematics and Computer Science".

Deadline for manuscript submissions: 20 July 2025 | Viewed by 9437

Special Issue Editors

Dr. Ming Yang
College of Mathematical Sciences, Harbin Engineering University, Harbin 150001, China
Interests: tensors; low-rank models; multi-view clustering; signal processing; image processing; sparse coding; machine learning; data science

Dr. Liqun Shan
1. School of Computing and Informatics, University of Louisiana at Lafayette, Lafayette, LA 70503, USA
2. School of Physical and Electrical Engineering, Northeast Petroleum University, Daqing 163318, China
Interests: data mining with cross-domain data; unconventional oil and gas reservoir development; machine learning; computer vision

Special Issue Information

Dear Colleagues,

This Special Issue focuses on the optimization and applications of models and algorithms in data science. The papers in this Special Issue cover various aspects of data science, including novel algorithms and models, theoretical analysis, and applications in real-world problems. Some of the topics covered include tensor decomposition, tensor robust principal component analysis, tensor completion, low-rank models, multi-view clustering, and sparse coding. Tensor decomposition is a powerful tool for modeling high-dimensional data and has applications in a wide range of fields, including image processing, signal processing, and machine learning. The papers also showcase the latest developments in low-rank models and their application in data science.
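
As a concrete illustration of the low-rank modeling theme, the following minimal Python/NumPy sketch computes the best rank-r approximation of a noisy matrix via the truncated SVD (by the Eckart–Young theorem); the function and data are illustrative and not drawn from any of the papers below.

```python
import numpy as np

def low_rank_approx(X, r):
    """Best rank-r approximation of X in the least-squares sense (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# Noisy low-rank data: a rank-3 signal plus dense Gaussian noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 80))
X = signal + 0.1 * rng.normal(size=(100, 80))

X_hat = low_rank_approx(X, r=3)
print("relative error:", np.linalg.norm(X_hat - signal) / np.linalg.norm(signal))
```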

Dr. Ming Yang
Dr. Liqun Shan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • tensor decomposition
  • low-rank models
  • multi-view clustering
  • sparse coding
  • tensor completion
  • tensor robust principal component analysis
  • machine learning
  • data science
  • signal processing
  • image processing
  • high-dimensional data

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (8 papers)

Research

27 pages, 1142 KiB  
Article
Multiview Deep Autoencoder-Inspired Layerwise Error-Correcting Non-Negative Matrix Factorization
by Yuan Liu, Yuan Wan, Zaili Yang and Huanhuan Li
Mathematics 2025, 13(9), 1422; https://doi.org/10.3390/math13091422 - 26 Apr 2025
Viewed by 73
Abstract
Multiview Clustering (MVC) plays a crucial role in the holistic analysis of complex data by leveraging complementary information from multiple perspectives, a necessity in the era of big data. Non-negative Matrix Factorization (NMF)-based methods have demonstrated their effectiveness and broad applicability in clustering tasks, as they generate meaningful attribute distributions and cluster assignments. However, existing shallow NMF approaches fail to capture the hierarchical structures inherent in real-world data, while deep NMF methods overlook the accumulation of reconstruction errors across layers by focusing solely on a global loss function. To address these limitations, this study develops a novel method that integrates an autoencoder-inspired structure into the deep NMF framework, incorporating layerwise error-correcting constraints. This approach facilitates the extraction of hierarchical features while effectively mitigating reconstruction error accumulation in deep architectures. Additionally, repulsion-attraction manifold learning is incorporated at each layer to preserve intrinsic geometric structures within the data. The proposed model is evaluated on five real-world multiview datasets, with experimental results demonstrating its effectiveness in capturing hierarchical representations and improving clustering performance.
(This article belongs to the Special Issue Optimization Models and Algorithms in Data Science)
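
For readers new to the underlying machinery, here is a minimal sketch of classical single-layer NMF with Lee–Seung multiplicative updates (NumPy); it is the standard baseline the paper generalizes, not the authors' deep, layerwise error-correcting model.

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=200, eps=1e-9):
    """Classical NMF via Lee-Seung multiplicative updates: X ~ W @ H, all non-negative."""
    rng = np.random.default_rng(0)
    m, n = X.shape
    W, H = rng.random((m, k)), rng.random((k, n))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update H with W fixed
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update W with H fixed
    return W, H

X = np.abs(np.random.default_rng(1).normal(size=(60, 40)))
W, H = nmf_multiplicative(X, k=5)
print("reconstruction error:", np.linalg.norm(X - W @ H))
```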

14 pages, 314 KiB  
Article
RMT: Real-Time Multi-Level Transformer for Detecting Downgrades of User Experience in Live Streams
by Wei Jiang, Jian-Ping Li, Xin-Yan Li and Xuan-Qi Lin
Mathematics 2025, 13(5), 834; https://doi.org/10.3390/math13050834 - 2 Mar 2025
Viewed by 493
Abstract
Live-streaming platforms such as TikTok have recently been experiencing exponential growth, attracting millions of daily viewers. This surge in network traffic often increases latency, even on resource-rich nodes at peak times, degrading the Quality of Experience (QoE) for users. This study predicts QoE downgrade events by leveraging cross-layer device data through real-time prediction and monitoring. We propose a Real-time Multi-level Transformer (RMT) model that predicts the QoE of live streaming by integrating time-series data from multiple network layers. Unlike existing approaches, which primarily assess the immediate impact of network conditions on video quality, our method introduces a device-mask pretraining (DMP) technique that pretrains on cross-layer device data to capture correlations among devices, thereby improving the accuracy of QoE predictions. To facilitate the training of RMT, we further built a Live Stream Quality of Experience (LSQE) dataset by collecting 5,000,000 records from over 300,000 users over a 7-day period. By analyzing the temporal evolution of network conditions in real time, the RMT model provides more accurate predictions of user experience. The experimental results demonstrate that the proposed pretraining task significantly enhances the model's prediction accuracy, and that the overall method outperforms baseline approaches.
(This article belongs to the Special Issue Optimization Models and Algorithms in Data Science)
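
As a rough sketch of the model class involved, the following hypothetical PyTorch snippet pools a Transformer encoding of cross-layer device time series into a binary downgrade logit; the layer sizes, mean pooling, and classification head are assumptions, not the RMT architecture or the DMP pretraining task.

```python
import torch
import torch.nn as nn

class QoEDowngradeClassifier(nn.Module):
    """Hypothetical sketch: a Transformer encoder over cross-layer device
    time series, pooled into a binary QoE-downgrade prediction."""
    def __init__(self, n_features=16, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):                      # x: (batch, time, n_features)
        h = self.encoder(self.proj(x))
        return self.head(h.mean(dim=1)).squeeze(-1)   # downgrade logit

model = QoEDowngradeClassifier()
logits = model(torch.randn(8, 30, 16))   # 8 sessions, 30 time steps each
print(logits.shape)                      # torch.Size([8])
```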

23 pages, 13392 KiB  
Article
Incorporation of Histogram Intersection and Semantic Information into Non-Negative Local Laplacian Sparse Coding for Image Classification
by Ying Shi, Yuan Wan, Xinjian Wang and Huanhuan Li
Mathematics 2025, 13(2), 219; https://doi.org/10.3390/math13020219 - 10 Jan 2025
Viewed by 579
Abstract
Traditional sparse coding has proven to be an effective method for image feature representation in recent years, yielding promising results in image classification. However, it faces several challenges, such as sensitivity to feature variations, code instability, and inadequate distance measures. Additionally, image representation and classification often operate independently, potentially resulting in the loss of semantic relationships. To address these issues, a new method is proposed, called Histogram intersection and Semantic information-based Non-negativity Local Laplacian Sparse Coding (HS-NLLSC), for image classification. This method integrates non-negativity and locality into Laplacian Sparse Coding (NLLSC) optimisation, enhancing coding stability and ensuring that similar features are encoded into similar codewords. In addition, histogram intersection is introduced to redefine the distance between feature vectors and codebooks, effectively preserving their similarity. By jointly considering the image representation and classification stages, more semantic information is retained, leading to a more effective image representation. Finally, a multi-class linear Support Vector Machine (SVM) is employed for image classification. Experimental results on four standard and three maritime image datasets demonstrate superior performance over six existing algorithms, with classification accuracy improvements of 5% to 19%. This research provides valuable guidance for stakeholders in selecting the most suitable method for specific circumstances.
(This article belongs to the Special Issue Optimization Models and Algorithms in Data Science)
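
The histogram intersection measure at the heart of the distance redefinition is easy to state: the similarity of two histograms is the sum of their bin-wise minima. A minimal NumPy sketch (illustrative, not the authors' code):

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Histogram intersection similarity: sum of element-wise minima.
    For L1-normalised histograms the value lies in [0, 1]."""
    return np.minimum(h1, h2).sum()

rng = np.random.default_rng(0)
a = rng.random(64); a /= a.sum()   # two L1-normalised 64-bin histograms
b = rng.random(64); b /= b.sum()
print(histogram_intersection(a, a))  # 1.0 for identical histograms
print(histogram_intersection(a, b))  # < 1.0 for differing histograms
```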

19 pages, 3327 KiB  
Article
Mixed Multi-Strategy Improved Aquila Optimizer and Its Application in Path Planning
by Tianyue Bao, Jiaxin Zhao, Yanchang Liu, Xusheng Guo and Tianshuo Chen
Mathematics 2024, 12(23), 3818; https://doi.org/10.3390/math12233818 - 2 Dec 2024
Viewed by 766
Abstract
With the growing prevalence of drone technology across various sectors, efficient and safe path planning has emerged as a critical research priority. The traditional Aquila Optimizer, while effective, suffers from uneven population initialization, a tendency to become trapped in local optima, and slow convergence. This study presents a mixed multi-strategy improved Aquila Optimizer that integrates diverse optimization techniques, particularly in the context of path planning. Key enhancements include Bernoulli chaotic mapping to improve initial population diversity, a spiral stepping strategy to boost search precision and diversity, and a "stealing" mechanism from the Dung Beetle Optimization algorithm to strengthen global search and convergence. Additionally, a nonlinear balance factor dynamically manages the exploration–exploitation trade-off, improving both optimization speed and accuracy. The effectiveness of the mixed multi-strategy improved Aquila Optimizer is validated through simulations on benchmark test functions, the CEC2017 complex functions, and path planning scenarios. Comparative analysis with seven other optimization algorithms shows that the proposed method significantly improves both convergence speed and optimization accuracy. These findings highlight its potential to advance drone path planning, offering enhanced safety and efficiency.
(This article belongs to the Special Issue Optimization Models and Algorithms in Data Science)
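
To illustrate chaotic initialization, the sketch below seeds a population with a piecewise Bernoulli map, one common form in the metaheuristics literature; the specific map, its parameter a, and the bound mapping are assumptions rather than the paper's exact specification.

```python
import numpy as np

def bernoulli_chaotic_init(pop_size, dim, lower, upper, a=0.4, seed=0):
    """Chaotic population initialisation with a piecewise Bernoulli map
    (an assumed common form; the paper may use a different variant).
    Chaotic values in (0, 1) are rescaled onto [lower, upper]."""
    rng = np.random.default_rng(seed)
    x = rng.random(dim)                    # random chaotic seed per dimension
    pop = np.empty((pop_size, dim))
    for i in range(pop_size):
        x = np.where(x <= 1 - a, x / (1 - a), (x - (1 - a)) / a)
        pop[i] = lower + x * (upper - lower)
    return pop

pop = bernoulli_chaotic_init(pop_size=30, dim=2, lower=-10.0, upper=10.0)
print(pop.shape)  # (30, 2)
```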

38 pages, 8511 KiB  
Article
Robust Parameter Optimisation of Noise-Tolerant Clustering for DENCLUE Using Differential Evolution
by Omer Ajmal, Humaira Arshad, Muhammad Asad Arshed, Saeed Ahmed and Shahzad Mumtaz
Mathematics 2024, 12(21), 3367; https://doi.org/10.3390/math12213367 - 27 Oct 2024
Viewed by 1155
Abstract
Clustering samples based on similarity remains a significant challenge, especially when the goal is to accurately capture underlying data clusters of complex, arbitrary shapes. Density-based clustering techniques are known to be best suited for capturing arbitrarily shaped clusters, but a key limitation of these methods is the difficulty of automatically finding the optimal set of parameters adapted to dataset characteristics, which becomes even more challenging when the data contain inherent noise. In our recent work, we proposed Differential Evolution-based DENsity CLUstEring (DE-DENCLUE) to optimise DENCLUE parameters. This study evaluates the robustness of DE-DENCLUE in finding accurate clusters in the presence of noise. Across several synthetic and real datasets, its performance is compared against three other density-based clustering algorithms: DPC based on weighted local density sequence and nearest neighbour assignment (DPCSA), Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and Variable Kernel Density Estimation-based DENCLUE (VDENCLUE). DE-DENCLUE consistently achieved superior Silhouette Index (SI), Adjusted Rand Index (ARI), and Adjusted Mutual Information (AMI) values across most datasets at different noise levels, although DPCSA achieved better Davies–Bouldin Index (DBI) values in some cases. In conclusion, the proposed method offers a reliable, noise-resilient clustering solution for complex datasets.
(This article belongs to the Special Issue Optimization Models and Algorithms in Data Science)
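
For orientation, a minimal DE/rand/1/bin loop in NumPy is shown below; in DE-DENCLUE such an outer loop searches DENCLUE's density parameters against a clustering-quality objective, but the objective, bounds, and hyperparameters here are placeholders.

```python
import numpy as np

def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9,
                           n_gen=100, seed=0):
    """Minimal DE/rand/1/bin: evolves a population toward the minimum of f."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    pop = lo + rng.random((pop_size, len(lo))) * (hi - lo)
    fit = np.array([f(p) for p in pop])
    for _ in range(n_gen):
        for i in range(pop_size):
            # mutation: three distinct individuals other than i
            a, b, c = pop[rng.choice([j for j in range(pop_size) if j != i],
                                     size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            # binomial crossover, keeping at least one mutant gene
            cross = rng.random(len(lo)) < CR
            cross[rng.integers(len(lo))] = True
            trial = np.where(cross, mutant, pop[i])
            f_trial = f(trial)
            if f_trial <= fit[i]:          # greedy selection
                pop[i], fit[i] = trial, f_trial
    return pop[fit.argmin()], fit.min()

best, val = differential_evolution(lambda x: (x ** 2).sum(),
                                   np.array([[-5.0, 5.0], [-5.0, 5.0]]))
print(best, val)   # near [0, 0], near 0
```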

12 pages, 972 KiB  
Article
Regularized Discrete Optimal Transport for Class-Imbalanced Classifications
by Jiqiang Chen, Jie Wan and Litao Ma
Mathematics 2024, 12(4), 524; https://doi.org/10.3390/math12040524 - 7 Feb 2024
Cited by 1 | Viewed by 1298
Abstract
Imbalanced class data are commonly observed in pattern analysis, machine learning, and various real-world applications. Conventional approaches often resort to resampling techniques to address the imbalance, which inevitably alters the original data distribution. This paper proposes a novel classification method that leverages optimal transport for handling imbalanced data. Specifically, drawing on the principles of optimal transport theory, we establish a transport plan between training and testing data without modifying the original data distribution. Additionally, we introduce a non-convex interclass regularization term that connects testing samples to training samples with the same class labels. This regularization term forms the basis of a regularized discrete optimal transport model for imbalanced classification, which is solved with a maximum-minimization algorithm. Experiments on 17 KEEL datasets with varying levels of imbalance demonstrate the superior performance of the proposed approach compared to 11 other widely used techniques for class-imbalanced classification. An application to water quality evaluation further confirms its effectiveness.
(This article belongs to the Special Issue Optimization Models and Algorithms in Data Science)
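
For orientation, the sketch below computes an entropy-regularised transport plan with standard Sinkhorn iterations; this convex surrogate is shown only as a reference point and differs from the paper's non-convex interclass regularizer and its maximum-minimization solver.

```python
import numpy as np

def sinkhorn(mu, nu, C, reg=0.05, n_iter=500):
    """Entropy-regularised discrete OT via Sinkhorn iterations.
    mu, nu: source/target marginals; C: cost matrix; returns the plan."""
    K = np.exp(-C / reg)
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)        # alternate scaling of the two marginals
        u = mu / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
src, tgt = rng.random((5, 2)), rng.random((7, 2))
C = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)   # squared distances
P = sinkhorn(np.full(5, 1 / 5), np.full(7, 1 / 7), C)
print(P.sum())   # ~1.0; row/column sums match mu and nu
```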

20 pages, 2711 KiB  
Article
A Novel Memory Concurrent Editing Model for Large-Scale Video Streams in Edge Computing
by Haitao Liu, Qingkui Chen and Puchen Liu
Mathematics 2023, 11(14), 3175; https://doi.org/10.3390/math11143175 - 19 Jul 2023
Cited by 1 | Viewed by 1243
Abstract
Efficient management and utilization of edge server memory buffers are crucial for improving the efficiency of concurrent editing of large-scale video in edge computing. To raise editing efficiency and user satisfaction under the constraint of limited memory buffer resources, the allocation of memory buffers on concurrent editing servers is formulated as a bin-packing problem and solved with an ant colony algorithm to achieve the least-loaded batch utilization. Meanwhile, a new distributed online concurrent editing algorithm for video streams is designed for the conflict problem of large-scale video editing in an edge computing environment; it incorporates dual-buffer read-and-write technology to overcome the inefficiency of concurrent editing and disk writes. Simulation results show that the scheme not only performs well in scheduling concurrent edits but also allocates editing resources efficiently and reasonably. Compared with the traditional single-exclusive editing baseline, the proposed scheme enhances both editing efficiency and user satisfaction given the same memory buffer resources. The model applies broadly to real-time video processing scenarios in edge computing.
(This article belongs to the Special Issue Optimization Models and Algorithms in Data Science)
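
The bin-packing formulation itself is straightforward to illustrate; the sketch below uses the classical first-fit-decreasing heuristic as a compact stand-in (the paper solves the problem with an ant colony algorithm, and the buffer sizes here are invented).

```python
def first_fit_decreasing(sizes, capacity):
    """First-fit-decreasing heuristic for the bin-packing formulation of
    buffer allocation: place each request (largest first) into the first
    bin with enough free capacity, opening a new bin when none fits."""
    bins = []                      # each entry is a bin's remaining capacity
    assignment = {}
    for idx in sorted(range(len(sizes)), key=lambda i: -sizes[i]):
        for b, free in enumerate(bins):
            if sizes[idx] <= free:
                bins[b] -= sizes[idx]
                assignment[idx] = b
                break
        else:                      # no existing bin fits: open a new one
            bins.append(capacity - sizes[idx])
            assignment[idx] = len(bins) - 1
    return assignment, len(bins)

# Video-edit buffer requests (MB) packed into 256 MB memory buffers.
assignment, n_bins = first_fit_decreasing([120, 60, 200, 90, 30, 150], 256)
print(n_bins, assignment)
```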

20 pages, 5406 KiB  
Article
An Optimization Method of Large-Scale Video Stream Concurrent Transmission for Edge Computing
by Haitao Liu, Qingkui Chen and Puchen Liu
Mathematics 2023, 11(12), 2622; https://doi.org/10.3390/math11122622 - 8 Jun 2023
Cited by 4 | Viewed by 2533
Abstract
Concurrent access to large-scale video data streams in edge computing is an important application scenario that currently faces high network access equipment costs and high packet loss rates. To solve this problem, a low-cost link aggregation method for concurrent video stream transmission is proposed. Data Plane Development Kit (DPDK) technology is used to support concurrent receiving and forwarding across multiple Network Interface Cards (NICs). A Q-learning data stream scheduling model is proposed to handle load scheduling over the multiple queues of multiple NICs: the Central Processing Unit (CPU) transmission processing unit is selected dynamically through data stream classification and a reward function, achieving dynamic load balancing of data stream transmission. Experiments demonstrate that this method expands bandwidth by 3.6 times over the single-network-port benchmark scheme and reduces the average CPU load ratio by 18%. Compared to the UDP and DPDK schemes, it lowers average system latency by 21%, reduces the packet loss rate by 0.48%, and improves overall system transmission throughput. This transmission optimization scheme can be applied in data centers and edge computing clusters to improve the communication performance of big data processing.
(This article belongs to the Special Issue Optimization Models and Algorithms in Data Science)
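
A tabular Q-learning update of the kind underlying such a scheduler is sketched below; the state and action encodings, reward, and stub environment are placeholders, not the paper's design.

```python
import numpy as np

# Hypothetical sketch of the tabular Q-learning update behind data-stream
# scheduling: states could encode NIC-queue load levels and actions could
# pick which CPU core handles the next flow (both assumptions here).
n_states, n_actions = 8, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1
rng = np.random.default_rng(0)

def step(state, action):
    """Stub environment: reward favours balanced load (placeholder logic)."""
    reward = -abs(action - state % n_actions)
    return rng.integers(n_states), reward

state = 0
for _ in range(10_000):
    # epsilon-greedy action selection
    action = (rng.integers(n_actions) if rng.random() < epsilon
              else int(Q[state].argmax()))
    next_state, reward = step(state, action)
    # Q-learning temporal-difference update
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max()
                                 - Q[state, action])
    state = next_state

print(Q.round(2))
```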