MDPI - Publisher of Open Access Journals

43 pages, 6462 KiB

Open Access Article

An Integrated Mechanical Fault Diagnosis Framework Using Improved GOOSE-VMD, RobustICA, and CYCBD

by Jingzong Yang and Xuefeng Li

Machines 2025, 13(7), 631; https://doi.org/10.3390/machines13070631 - 21 Jul 2025

Viewed by 259

Rolling element bearings serve as critical transmission components in industrial automation systems, yet their fault signatures are susceptible to interference from strong background noise, complex operating conditions, and nonlinear impact characteristics. Addressing the limitations of conventional methods in adaptive parameter optimization and weak feature enhancement, this paper proposes an innovative diagnostic framework integrating Improved Goose optimized Variational Mode Decomposition (IGOOSE-VMD), RobustICA, and CYCBD. First, to mitigate modal aliasing issues caused by empirical parameter dependency in VMD, we fuse a refraction-guided reverse learning mechanism with a dynamic mutation strategy to develop the IGOOSE. By employing an energy-feature-driven fitness function, this approach achieves synergistic optimization of the mode number and penalty factor. Subsequently, a multi-channel observation model is constructed based on optimal component selection. Noise interference is suppressed through the robust separation capabilities of RobustICA, while CYCBD introduces cyclostationarity-based prior constraints to formulate a blind deconvolution operator with periodic impact enhancement properties. This significantly improves the temporal sparsity of fault-induced impact components. Experimental results demonstrate that, compared to traditional time–frequency analysis techniques (e.g., EMD, EEMD, LMD, ITD) and deconvolution methods (including MCKD, MED, OMEDA), the proposed approach exhibits superior noise immunity and higher fault feature extraction accuracy under high background noise conditions. Full article

(This article belongs to the Special Issue Advances in Bearing Modeling, Fault Diagnosis, RUL Prediction (2nd Edition))
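The abstract above does not spell out the refraction-guided reverse (opposition-based) learning step. As a rough sketch only, assuming the refraction opposition formula often used in swarm optimizers, a candidate x in [lo, hi] can be mapped to (lo + hi)/2 + (lo + hi)/(2k) - x/k. The scaling factor `k`, the function name, the clipping step, and the search ranges below are all hypothetical and are not taken from the paper.

```python
def refraction_opposite(x, lo, hi, k=1.0):
    """Refraction-based opposite of x within [lo, hi] (assumed formula, not the paper's)."""
    mid = (lo + hi) / 2.0
    candidate = mid + mid / k - x / k
    # Clip so the candidate stays inside the search range.
    return min(max(candidate, lo), hi)


# Hypothetical search ranges for the VMD mode number K and penalty factor alpha.
bounds = {"K": (2, 10), "alpha": (200, 4000)}
position = {"K": 3, "alpha": 3500}
opposite = {name: refraction_opposite(position[name], *bounds[name]) for name in position}
```

With `k = 1` the mapping reduces to ordinary opposition, `lo + hi - x`; larger `k` pulls the candidate toward the middle of the range.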

22 pages, 3431 KiB

Open Access Article

Safety–Efficiency Balanced Navigation for Unmanned Tracked Vehicles in Uneven Terrain Using Prior-Based Ensemble Deep Reinforcement Learning

by Yiming Xu, Songhai Zhu, Dianhao Zhang, Yinda Fang and Mien Van

World Electr. Veh. J. 2025, 16(7), 359; https://doi.org/10.3390/wevj16070359 - 27 Jun 2025

Viewed by 330

Abstract

This paper proposes a novel navigation approach for Unmanned Tracked Vehicles (UTVs) using prior-based ensemble deep reinforcement learning, which fuses the policy of the ensemble Deep Reinforcement Learning (DRL) and Dynamic Window Approach (DWA) to enhance both exploration efficiency and deployment safety in unstructured off-road environments. First, by integrating kinematic analysis, we introduce a novel state and an action space that account for rugged terrain features and track–ground interactions. Local elevation information and vehicle pose changes over consecutive time steps are used as inputs to the DRL model, enabling the UTVs to implicitly learn policies for safe navigation in complex terrains while minimizing the impact of slipping disturbances. Then, we introduce an ensemble Soft Actor–Critic (SAC) learning framework, which introduces the DWA as a behavioral prior, referred to as the SAC-based Hybrid Policy (SAC-HP). Ensemble SAC uses multiple policy networks to effectively reduce the variance of DRL outputs. We combine the DRL actions with the DWA method by reconstructing the hybrid Gaussian distribution of both. Experimental results indicate that the proposed SAC-HP converges faster than traditional SAC models, which enables efficient large-scale navigation tasks. Additionally, a penalty term in the reward function about energy optimization is proposed to reduce velocity oscillations, ensuring fast convergence and smooth robot movement. Scenarios with obstacles and rugged terrain have been considered to prove the SAC-HP’s efficiency, robustness, and smoothness when compared with the state of the art. Full article
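The abstract does not state how the hybrid Gaussian of the DRL action and the DWA prior is reconstructed. One common choice, assumed here purely for illustration, is the normalized product of two Gaussians, which weights each mean by its precision (inverse variance). The function and argument names are hypothetical.

```python
def fuse_gaussians(mu_rl, sigma_rl, mu_prior, sigma_prior):
    """Precision-weighted product of two 1-D Gaussians; returns (mean, std)."""
    w_rl = 1.0 / sigma_rl ** 2
    w_prior = 1.0 / sigma_prior ** 2
    var = 1.0 / (w_rl + w_prior)
    mean = var * (w_rl * mu_rl + w_prior * mu_prior)
    return mean, var ** 0.5
```

Under this assumption, whichever policy is more certain (smaller standard deviation) dominates the fused action, so a confident DWA prior can steer an uncertain learned policy early in training.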

21 pages, 807 KiB

Open Access Article

Multi-Source Data-Driven Terrestrial Multi-Algorithm Fusion Path Planning Technology

by Xiao Ji, Peng Liu, Meng Zhang, Chengchun Zhang, Shuang Yu, Bing Qi and Man Zhao

Sensors 2025, 25(12), 3595; https://doi.org/10.3390/s25123595 - 7 Jun 2025

Viewed by 444

Abstract

This paper presents a multi-source data-driven hybrid path planning framework that integrates global A* search with local Deep Q-Network (DQN) optimization to address complex terrestrial routing challenges. By fusing ASTER GDEM terrain data with OpenStreetMap (OSM) road networks, we construct a standardized geospatial database encompassing elevation, traffic, and road attributes. A dynamic-heuristic A* algorithm is proposed, incorporating traffic signals and congestion penalties, and is enhanced by a DQN-based local decision module to improve adaptability to dynamic environments. Experimental results on a realistic urban dataset demonstrate that the proposed method achieves superior performance in risk avoidance, travel time reduction, and dynamic obstacle handling compared to traditional models. This study contributes a unified architecture that enhances planning robustness and lays the foundation for real-time applications in emergency response and smart logistics. Full article

(This article belongs to the Special Issue Industrial Soft-Sensing Technology Based on Data-Driven and Artificial Intelligence Technologies)
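To make the idea of A* search with congestion penalties concrete, here is a minimal sketch on a 4-connected grid. It assumes each cell carries a non-negative penalty that is added to a unit step cost, with `None` marking a blocked cell; this grid encoding is a hypothetical simplification and does not reproduce the paper's dynamic heuristic, traffic-signal model, or DQN module.

```python
import heapq


def a_star(grid_penalty, start, goal):
    """A* on a 4-connected grid; entering a cell costs 1 plus its penalty (None = blocked)."""
    rows, cols = len(grid_penalty), len(grid_penalty[0])

    def heuristic(cell):
        # Manhattan distance stays admissible because every step costs at least 1.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best = {start: 0.0}
    parent = {}
    frontier = [(heuristic(start), start)]
    while frontier:
        _, cell = heapq.heappop(frontier)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return best[goal], path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid_penalty[nr][nc] is None:
                continue
            cost = best[cell] + 1.0 + grid_penalty[nr][nc]
            if cost < best.get(nxt, float("inf")):
                best[nxt] = cost
                parent[nxt] = cell
                heapq.heappush(frontier, (cost + heuristic(nxt), nxt))
    return float("inf"), []


# Hypothetical map: the middle column of the first two rows is congested.
penalty = [
    [0, 5, 0],
    [0, 5, 0],
    [0, 0, 0],
]
cost, path = a_star(penalty, (0, 0), (0, 2))
```

In this toy map the planner detours around the congested cells because the six-step detour is cheaper than the two-step route through a penalized cell.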

26 pages, 422 KiB

Open Access Article

Incorporating Prior Information in Latent Structures Identification for Panel Data Models

by Yi Li, Xingxing Luo and Mengqi Liao

Mathematics 2025, 13(9), 1505; https://doi.org/10.3390/math13091505 - 2 May 2025

Viewed by 288

Abstract

In this paper, we explore the latent structures of panel data models in the presence of prior information. The latent structure in panel models allows individuals to be classified into several distinct groups, where the individuals within the same group share the same slope parameters, while the group-specific parameters are heterogeneous. To incorporate the prior information, we design a new alternating direction method of multipliers (ADMM) algorithm based on the pairwise group fused Lasso penalty approach. The asymptotic properties and the convergence of the ADMM algorithm are established. Simulation studies demonstrate the advantages of the proposed method over existing methods in terms of both estimation efficiency and detection accuracy. We illustrate the practical utility of the proposed procedure by analyzing the relationship between electricity consumption and GDP in China.
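As an illustration of what a pairwise fused Lasso penalty measures, the sketch below sums the absolute differences between every pair of individual slopes. It is simplified to scalar slopes, and the way prior information enters (a hypothetical per-pair weight dictionary) is an assumption for illustration, not the paper's formulation.

```python
from itertools import combinations


def pairwise_fused_penalty(betas, lam, prior_weights=None):
    """lam * sum over i<j of w_ij * |beta_i - beta_j| for scalar slopes."""
    prior_weights = prior_weights or {}
    total = 0.0
    for i, j in combinations(range(len(betas)), 2):
        # Pairs without a prior weight default to 1.0 (hypothetical convention).
        weight = prior_weights.get((i, j), 1.0)
        total += weight * abs(betas[i] - betas[j])
    return lam * total
```

Individuals whose slopes coincide contribute nothing to the penalty, which is what drives the estimator to merge them into one group.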

17 pages, 2167 KiB

Open Access Article

Enhanced TSMixer Model for the Prediction and Control of Particulate Matter

by Chaoqiong Yang, Haoru Li, Yue Ma, Yubin Huang and Xianghua Chu

Sustainability 2025, 17(7), 2933; https://doi.org/10.3390/su17072933 - 26 Mar 2025

Viewed by 587

Abstract

This study presents an improved deep-learning model, termed Enhanced Time Series Mixer (E-TSMixer), for the prediction of particulate matter. By analyzing the temporal evolution of PM2.5 concentrations from multivariate monitoring data, the model demonstrates significant prediction capabilities while maintaining consistency with observed pollutant transport characteristics in the urban boundary layer. In E-TSMixer, a fully connected output layer is proposed to enhance the predictive capability for complex spatiotemporal dependencies. The relevant data on air quality and traffic flow are fused to achieve high-precision predictions of PM2.5 concentrations through a multivariate time-series forecasting model. An asymmetric penalty mechanism is added to dynamically optimize the loss function. Experimental results indicate that the proposed E-TSMixer model achieves higher accuracy in the prediction of PM2.5, significantly outperforming traditional models. Additionally, an intelligent dual-regulation model with fixed and dynamic thresholds is introduced and combined with E-TSMixer to support decisions on real-time adjustments to the frequency, routes, and timing of water truck operation in practice.
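The abstract does not give the form of the asymmetric penalty. A minimal sketch, assuming a squared-error loss in which under-predicting a concentration is weighted more heavily than over-predicting it, could look like this; the weights and their direction are hypothetical.

```python
def asymmetric_mse(y_true, y_pred, under_weight=2.0, over_weight=1.0):
    """Mean squared error that weights under-predictions differently from over-predictions."""
    total = 0.0
    for truth, pred in zip(y_true, y_pred):
        err = truth - pred
        # err > 0 means the model predicted less than the observed value.
        weight = under_weight if err > 0 else over_weight
        total += weight * err * err
    return total / len(y_true)
```

With equal weights this reduces to the ordinary mean squared error, so the asymmetry can be tuned or switched off with the two parameters.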

26 pages, 3007 KiB

Open Access Article

EDRNet: Edge-Enhanced Dynamic Routing Adaptive for Depth Completion

by Fuyun Sun, Baoquan Li and Qiaomei Zhang

Mathematics 2025, 13(6), 953; https://doi.org/10.3390/math13060953 - 13 Mar 2025

Viewed by 705

Abstract

Depth completion is a technique to densify the sparse depth maps acquired by depth sensors (e.g., RGB-D cameras, LiDAR) to generate complete and accurate depth maps. This technique has important application value in autonomous driving, robot navigation, and virtual reality. Currently, deep learning has become a mainstream method for depth completion. Building on this, we propose an edge-enhanced dynamically routed adaptive depth completion network, EDRNet, to achieve efficient and accurate depth completion through lightweight design and boundary optimisation. Firstly, we introduce the Canny operator (a classical image processing technique) to explicitly extract object contour information, and we fuse the acquired edge maps with the RGB images and sparse depth map inputs to provide the network with clear edge-structure information. Secondly, we design a Sparse Adaptive Dynamic Routing Transformer block called SADRT, which can effectively combine the global modelling capability of the Transformer and the local feature extraction capability of CNN. The dynamic routing mechanism introduced in this block can dynamically select key regions for efficient feature extraction, and the amount of redundant computation is significantly reduced compared with the traditional Transformer. In addition, we design a loss function with additional penalties for the depth error of the object edges, which further enhances the constraints on the edges. The experimental results demonstrate that the method presented in this paper achieves significant performance improvements on the public datasets KITTI DC and NYU Depth v2, especially in depth prediction accuracy in edge regions and in computational efficiency.

(This article belongs to the Special Issue Recent Advances in Artificial Intelligence and Machine Learning, 2nd Edition)

26 pages, 3823 KiB

Open Access Article

Enhanced Conformer-Based Speech Recognition via Model Fusion and Adaptive Decoding with Dynamic Rescoring

by Junhao Geng, Dongyao Jia, Zihao He, Nengkai Wu and Ziqi Li

Appl. Sci. 2024, 14(24), 11583; https://doi.org/10.3390/app142411583 - 11 Dec 2024

Viewed by 2014

Abstract

Speech recognition is widely applied in fields like security, education, and healthcare. While its development drives global information infrastructure and AI strategies, current models still face challenges such as overfitting, local optima, and inefficiencies in decoding accuracy and computational cost. These issues cause instability and long response times, hindering AI’s competitiveness. Therefore, addressing these technical bottlenecks is critical for advancing national scientific progress and global information infrastructure. In this paper, we propose improvements to the model structure fusion and decoding algorithms. First, based on the Conformer network and its variants, we introduce a weighted fusion method using training loss as an indicator, adjusting the weights, thresholds, and other related parameters of the fused models to balance the contributions of different model structures, thereby creating a more robust and generalized model that alleviates overfitting and local optima. Second, for the decoding phase, we design a dynamic adaptive decoding method that combines traditional decoding algorithms such as connectionist temporal classification and attention-based models. This ensemble approach enables the system to adapt to different acoustic environments, improving its robustness and overall performance. Additionally, to further optimize the decoding process, we introduce a penalty function mechanism as a regularization technique to reduce the model’s dependence on a single decoding approach. The penalty function limits the weights of decoding strategies to prevent over-reliance on any single decoder, thus enhancing the model’s generalization. Finally, we validate our model on the Librispeech dataset, a large-scale English speech corpus containing approximately 1000 h of audio data. 
Experimental results demonstrate that the proposed method achieves word error rates (WERs) of 3.92% and 4.07% on the development and test sets, respectively, significantly improving over single-model and traditional decoding methods. Notably, the method reduces WER by approximately 0.4% on complex datasets compared to several advanced mainstream models, underscoring its superior robustness and adaptability in challenging acoustic environments. The effectiveness of the proposed method in addressing overfitting and improving accuracy and efficiency during the decoding phase was validated, highlighting its significance in advancing speech recognition technology. Full article

(This article belongs to the Special Issue Deep Learning for Speech, Image and Language Processing)
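The abstract describes weighting fused models by their training loss but gives no formula. One simple way to do this, assumed here only as an illustration, is a softmax over negative losses so that lower-loss models receive larger weights. The `temperature` parameter and both function names are hypothetical.

```python
import math


def loss_based_weights(train_losses, temperature=1.0):
    """Softmax over negative training losses: lower loss -> larger fusion weight."""
    scores = [-loss / temperature for loss in train_losses]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(model_scores, weights):
    """Weighted sum of per-model scores for each candidate hypothesis."""
    return [sum(w * s for w, s in zip(weights, per_model)) for per_model in zip(*model_scores)]
```

Here `model_scores` holds one list of hypothesis scores per model, and the fused list has one score per hypothesis. A larger temperature flattens the weights, which is one way to limit reliance on a single model.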

24 pages, 22137 KiB

Open Access Article

Feature Extraction and Classification of Motor Imagery EEG Signals in Motor Imagery for Sustainable Brain–Computer Interfaces

by Yuyi Lu, Wenbo Wang, Baosheng Lian and Chencheng He

Sustainability 2024, 16(15), 6627; https://doi.org/10.3390/su16156627 - 2 Aug 2024

Cited by 4 | Viewed by 3382

Abstract

Motor imagery brain–computer interface (MI-BCI) systems hold the potential to restore motor function and offer the opportunity for sustainable autonomous living for individuals with a range of motor and sensory impairments. The feature extraction and classification of motor imagery EEG signals for MI-BCI systems have become a research hotspot. To address the difficulty of feature extraction and the low recognition rates caused by individual variations in EEG signals, a classification algorithm based on multi-feature fusion and the SVM-AdaBoost algorithm is proposed to improve the recognition accuracy of motor imagery EEG signals. Initially, the electroencephalography (EEG) signals are preprocessed using Finite Impulse Response (FIR) filters, and a multi-wavelet framework is constructed based on the Morlet wavelet and the Haar wavelet. Subsequently, the preprocessed signals undergo multi-wavelet decomposition to extract energy features, Common Spatial Patterns (CSP) features, Autoregressive (AR) features, and Power Spectral Density (PSD) features. The extracted features are then fused, and the fused feature vector is normalized. Following that, classification is implemented with the SVM-AdaBoost algorithm. To enhance the adaptability of SVM-AdaBoost, the Grid Search method is employed to optimize the penalty parameter and kernel function parameter of the SVM. Concurrently, the Whale Optimization Algorithm is utilized to optimize the learning rate and number of weak learners within the AdaBoost ensemble, thereby refining the overall performance. In addition, the classification performance of the algorithm is validated using a brain-computer interface (BCI) dataset, on which the classification accuracy reached 95.37%.
Via the analysis of motor imagery electroencephalography (EEG) signals, the activation patterns in different regions of the brain can be detected and identified, enabling the inference of user intentions and facilitating communication and control between the human brain and external devices. Full article

25 pages, 2999 KiB

Open Access Article

GFLASSO-LR: Logistic Regression with Generalized Fused LASSO for Gene Selection in High-Dimensional Cancer Classification

by Ahmed Bir-Jmel, Sidi Mohamed Douiri, Souad El Bernoussi, Ayyad Maafiri, Yassine Himeur, Shadi Atalla, Wathiq Mansoor and Hussain Al-Ahmad

Computers 2024, 13(4), 93; https://doi.org/10.3390/computers13040093 - 6 Apr 2024

Cited by 2 | Viewed by 3514

Abstract

Advancements in genomic technologies have paved the way for significant breakthroughs in cancer diagnostics, with DNA microarray technology standing at the forefront of identifying genetic expressions associated with various cancer types. Despite its potential, the vast dimensionality of microarray data presents a formidable challenge, necessitating efficient dimension reduction and gene selection methods to accurately identify cancerous tumors. In response to this challenge, this study introduces an innovative strategy for microarray data dimension reduction and crucial gene set selection, aiming to enhance the accuracy of cancerous tumor identification. Leveraging DNA microarray technology, our method focuses on pinpointing significant genes implicated in tumor development, aiding the development of sophisticated computerized diagnostic tools. Our technique synergizes gene selection with classifier training within a logistic regression framework, utilizing a generalized Fused LASSO (GFLASSO-LR) regularizer. This regularization incorporates two penalties: one for selecting pertinent genes and another for emphasizing adjacent genes of importance to the target class, thus achieving an optimal trade-off between gene relevance and redundancy. The optimization challenge posed by our approach is tackled using a sub-gradient algorithm, designed to meet specific convergence prerequisites. We establish that our algorithm’s objective function is convex, Lipschitz continuous, and possesses a global minimum, ensuring reliability in the gene selection process. A numerical evaluation of the method’s parameters further substantiates its effectiveness. Experimental outcomes affirm the GFLASSO-LR methodology’s high efficiency in processing high-dimensional microarray data for cancer classification. It effectively identifies compact gene subsets, significantly enhancing classification performance and demonstrating its potential as a powerful tool in cancer research and diagnostics. 

(This article belongs to the Special Issue Machine and Deep Learning in the Health Domain 2024)

18 pages, 6533 KiB

Open Access Article

Rotating Machinery Fault Diagnosis with Limited Multisensor Fusion Samples by Fused Attention-Guided Wasserstein GAN

by Wenlong Fu, Ke Yang, Bin Wen, Yahui Shan, Shuai Li and Bo Zheng

Symmetry 2024, 16(3), 285; https://doi.org/10.3390/sym16030285 - 1 Mar 2024

Cited by 17 | Viewed by 1931

Abstract

As vital equipment in modern industry, the health state of rotating machinery influences the production process and equipment safety. However, rotating machinery generally operates in a normal state most of the time, which results in limited fault data, thus greatly constraining the performance of intelligent fault diagnosis methods. To solve this problem, this paper proposes a novel fault diagnosis method for rotating machinery with limited multisensor fusion samples based on the fused attention-guided Wasserstein generative adversarial network (WGAN). Firstly, the dimensionality of collected multisensor data is reduced to three channels by principal component analysis, and then the one-dimensional data of each channel are converted into a two-dimensional pixel matrix, of which the RGB images are obtained by fusing the three-channel two-dimensional images. Subsequently, the limited RGB samples are augmented to obtain sufficient samples utilizing the fused attention-guided WGAN combined with the gradient penalty (FAWGAN-GP) method. Lastly, the augmented samples are applied to train a residual convolutional neural network for fault diagnosis. The effectiveness of the proposed method is demonstrated by two case studies. When training samples per class are 50, 35, 25, and 15 on the KAT-bearing dataset, the average classification accuracy is 99.9%, 99.65%, 99.6%, and 98.7%, respectively. Meanwhile, the methods of multisensor fusion and the fused attention mechanism have an average improvement of 1.51% and 1.09%, respectively, by ablation experiments on the WT gearbox dataset. Full article
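The conversion from one-dimensional channel data to a fused RGB image can be sketched as follows. The abstract does not specify the mapping, so this example assumes min-max scaling to the 0..255 range followed by row-by-row reshaping, and it treats each of the three reduced channels as one colour plane. Function names and the scaling rule are hypothetical.

```python
def to_pixel_matrix(signal, size):
    """Min-max scale the first size*size samples to 0..255 and reshape row by row."""
    window = signal[: size * size]  # assumes the signal has at least size*size samples
    lo, hi = min(window), max(window)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant signal
    pixels = [round(255 * (v - lo) / span) for v in window]
    return [pixels[r * size:(r + 1) * size] for r in range(size)]


def fuse_rgb(red, green, blue):
    """Stack three equally sized grayscale matrices into one matrix of (R, G, B) tuples."""
    return [list(zip(r_row, g_row, b_row)) for r_row, g_row, b_row in zip(red, green, blue)]
```

Each principal-component channel would be passed through `to_pixel_matrix` separately, and the three resulting matrices would then be combined with `fuse_rgb` to form one training image.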

16 pages, 504 KiB

Open Access Article

Regularized Mislevy-Wu Model for Handling Nonignorable Missing Item Responses

by Alexander Robitzsch

Information 2023, 14(7), 368; https://doi.org/10.3390/info14070368 - 28 Jun 2023

Cited by 3 | Viewed by 1820

Abstract

Missing item responses are frequently found in educational large-scale assessment studies. In this article, the Mislevy-Wu item response model is applied for handling nonignorable missing item responses. This model allows the missingness of an item to depend on the item itself and a further latent variable. However, with low to moderate amounts of missing item responses, model parameters for the missingness mechanism are difficult to estimate. Hence, regularized estimation using a fused ridge penalty is applied to the Mislevy-Wu model to stabilize estimation. The fused ridge penalty function is separately defined for multiple-choice and constructed response items because previous research indicated that the missingness mechanisms strongly differed for the two item types. A simulation study showed that regularized estimation improves the stability of item parameter estimation. The method is also illustrated using international data from the Progress in International Reading Literacy Study (PIRLS) 2011.

(This article belongs to the Topic Soft Computing)

15 pages, 1078 KiB

Open Access Article

Research Based on High-Dimensional Fused Lasso Partially Linear Model

by Aifen Feng, Jingya Fan, Zhengfen Jin, Mengmeng Zhao and Xiaogai Chang

Mathematics 2023, 11(12), 2726; https://doi.org/10.3390/math11122726 - 16 Jun 2023

Viewed by 1455

Abstract

In this paper, a partially linear model based on the fused lasso method is proposed to solve the problem of high correlation between adjacent variables, and a two-stage estimation method is used to study the solution of this model. Firstly, the non-parametric part of the partially linear model is estimated using the kernel function method, which transforms the semiparametric model into a parametric model. Secondly, the fused lasso regularization term is introduced into the model to construct the least squares parameter estimation based on the fused lasso penalty. Then, because the non-smooth terms of the model mean that the subproblems may not have closed-form solutions, the linearized alternating direction multiplier method (LADMM) is used to solve the model, and the convergence of the algorithm and the asymptotic properties of the model are analyzed. Finally, the applicability of this model is demonstrated through two types of simulation data and a practical problem of predicting worker wages.
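In ADMM-type solvers for lasso-style penalties, the update for an L1 term typically reduces to the soft-thresholding operator, which is the proximal operator of the absolute value. The paper's exact LADMM updates are not given in the abstract, so the following is only a sketch of that standard building block.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |x|: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0
```

Applied to a coefficient, it produces sparsity; applied to the difference between adjacent coefficients, it produces the piecewise-constant solutions that the fused lasso is known for.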

19 pages, 361 KiB

Open Access Article

Bicluster Analysis of Heterogeneous Panel Data via M-Estimation

by Weijie Cui and Yong Li

Mathematics 2023, 11(10), 2333; https://doi.org/10.3390/math11102333 - 17 May 2023

Cited by 1 | Viewed by 1500

Abstract

This paper investigates the latent block structure in the heterogeneous panel data model. It is assumed that the regression coefficients have group structures across individuals and structural breaks over time, where change points can cause changes to the group structures and structural breaks can vary between subgroups. To recover the latent block structure, we propose a robust biclustering approach that utilizes M-estimation and concave fused penalties. An algorithm based on local quadratic approximation is developed to optimize the objective function, which is more compact and efficient than the ADMM algorithm. Moreover, we establish the oracle property of the penalized M-estimators and prove that the proposed estimator recovers the latent block structure with a probability approaching one. Finally, simulation studies on multiple datasets demonstrate the good finite sample performance of the proposed estimators. Full article

(This article belongs to the Special Issue Advances in Statistics: Theory, Methodology, Applications and Data Analysis)
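The abstract names neither the M-estimation loss nor the concave penalty. As a hedged illustration, the sketch below uses two common choices: the Huber loss as a robust loss and the minimax concave penalty (MCP) as a concave penalty. Both are assumptions here, and the default tuning constants are conventional values rather than the paper's.

```python
def mcp_penalty(t, lam, gamma=3.0):
    """Minimax concave penalty evaluated at t."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    # Beyond gamma * lam the penalty is flat, so large differences are not shrunk further.
    return 0.5 * gamma * lam * lam


def huber_loss(residual, delta=1.345):
    """Huber loss: quadratic near zero, linear in the tails."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)
```

In a fused setting the penalty would be applied to differences between coefficients, so that small differences are shrunk to zero while genuinely distinct groups or breaks are left nearly unpenalized.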

13 pages, 1249 KiB

Open Access Article

A Dual-Path Cross-Modal Network for Video-Music Retrieval

by Xin Gu, Yinghua Shen and Chaohui Lv

Sensors 2023, 23(2), 805; https://doi.org/10.3390/s23020805 - 10 Jan 2023

Cited by 5 | Viewed by 2900

Abstract

In recent years, with the development of the internet, video has become more and more widely used in life. Adding harmonious music to a video is gradually becoming an artistic task. However, manually adding music takes a lot of time and effort, so we propose a method to recommend background music for videos. The emotional message of music is rarely taken into account in current work, but it is crucial for video music retrieval. To achieve this, we design two paths to process content information and emotional information across modalities. Based on the characteristics of video and music, we design various feature extraction schemes and common representation spaces. In the content path, a pre-trained network is used as the feature extraction network. As these features contain some redundant information, we use an encoder–decoder structure for dimensionality reduction, with the encoder weights shared to obtain shared content features for video and music. In the emotion path, an emotion key frames scheme is used for video and a channel attention mechanism is used for music in order to obtain the emotion information effectively. We also add an emotion distinguish loss to guarantee that the network acquires the emotion information effectively. More importantly, we propose a way to combine content information with emotional information. That is, content features are first concatenated with emotion features and then passed through a fused shared space structured as an MLP to obtain more effective fused shared features. In addition, a polarity penalty factor is added to the classical metric loss function to make it more suitable for this task. Experiments show that this dual-path video music retrieval network can effectively merge information. Compared with existing methods, the proposed method increases Recall@1 by 3.94.

(This article belongs to the Special Issue Multimodal Data Fusion Technologies and Applications in Intelligent System)
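For readers unfamiliar with the Recall@1 metric quoted in the abstract above, the sketch below computes Recall@K as the percentage of queries whose correct item appears among the top K retrieved items. The data layout (one ranked list of music identifiers per video query) is a hypothetical choice for illustration.

```python
def recall_at_k(ranked_lists, ground_truth, k=1):
    """Percentage of queries whose correct item appears in the top-k retrieved items."""
    hits = sum(1 for ranked, truth in zip(ranked_lists, ground_truth) if truth in ranked[:k])
    return 100.0 * hits / len(ground_truth)


# Hypothetical retrieval output: each inner list is ranked from best to worst match.
ranked = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"]]
truth = ["a", "d", "e", "x"]
r1 = recall_at_k(ranked, truth, k=1)
r2 = recall_at_k(ranked, truth, k=2)
```

Because the metric is a percentage, an increase such as the one reported is naturally read in percentage points.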

23 pages, 6896 KiB

Open Access Article

Recognition of Corrosion State of Water Pipe Inner Wall Based on SMA-SVM under RF Feature Selection

by Qian Zhao, Lu Li, Lihua Zhang and Man Zhao

Coatings 2023, 13(1), 26; https://doi.org/10.3390/coatings13010026 - 23 Dec 2022

Cited by 9 | Viewed by 1964

Abstract

To solve the problem of low detection accuracy of water supply pipeline internal wall damage, a detection method combining a random forest algorithm for feature simplification with a support vector machine optimized by the slime mold algorithm was proposed. Firstly, the color statistical characteristics, gray level co-occurrence matrix features, and gray level run length matrix features of the pipeline image are extracted for multi-feature fusion. The contribution of the fused features is analyzed using the feature-simplified random forest algorithm, and the feature set with the strongest feature expression ability is selected for classification and recognition. The global search ability of the slime mold optimization algorithm is used to find the optimal kernel function parameters and penalty factors of the support vector machine model. Finally, the optimal parameters are applied to the support vector machine model for classification prediction. The experimental results show that the recognition accuracy of the classification model proposed in this paper reaches 94.710% on the data sets of different corrosion forms on the inner wall of the pipeline. Compared with the traditional Support Vector Machine (SVM) classification model, the SVM model based on differential pollination optimization, the SVM model based on particle swarm optimization, and the back propagation (BP) neural network classification model, this is an improvement of 4.786%, 3.023%, 4.030%, and 0.503%, respectively.

(This article belongs to the Special Issue Recent Advances in Innovative Surface and Materials)

Search Results (26)
