MDPI - Publisher of Open Access Journals

21 pages, 2657 KB

Open Access Article

Research on ATT-BiLSTM-Based Restoration Method for Deflection Monitoring Data of a Steel Truss Bridge

by Yongjian Chen, Rongzhen Liu, Jianlin Wang, Fan Pan, Fei Lian and Hui Cheng

Appl. Sci. 2025, 15(15), 8622; https://doi.org/10.3390/app15158622 - 4 Aug 2025

Viewed by 227

Given the intricate operating environment of steel truss bridges, data anomalies are frequently caused by faults in the sensor monitoring system itself during the monitoring process. This paper uses a steel truss bridge as an engineering case study, with a primary focus on the deflection of the main girder. The paper establishes an Attention Mechanism-based Bidirectional Long Short-Term Memory Neural Network (ATT-BiLSTM) model, with the objective of accurately repairing abnormal monitoring data. Firstly, correlation heat maps and Gray correlation are employed to detect anomalies in key measurement point data. Subsequently, the ATT-BiLSTM and Support Vector Regression (SVR) models are established to repair the anomalous monitoring data. Finally, various evaluation indexes, including Pearson’s correlation coefficient, mean squared error, and coefficient of determination, are utilized to validate the repairing accuracy of the ATT-BiLSTM model. The findings indicate that the repair efficacy of ATT-BiLSTM on anomalous data surpasses that of SVR. The repaired data exhibited a tendency to decrease in amplitude at the anomalous position, while maintaining the prominence of the data at abrupt deflection change points, thereby preserving the characteristics of the data. The repair rate of anomalous data attained 93.88%, and the mean squared error with respect to the actual complete data was only 0.0226, leading to substantial enhancement in the integrity and reliability of the data.
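The three evaluation indexes named in this abstract have standard definitions, so a minimal sketch may help readers compare repaired data against reference data. The function names and the sample values below are illustrative assumptions, not taken from the paper.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def mean_squared_error(actual, repaired):
    """Average squared difference between reference and repaired values."""
    return sum((a - r) ** 2 for a, r in zip(actual, repaired)) / len(actual)

def r_squared(actual, repaired):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - r) ** 2 for a, r in zip(actual, repaired))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up deflection values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
repaired = [1.1, 1.9, 3.2, 3.8]
print(pearson_r(actual, repaired))
print(mean_squared_error(actual, repaired))
print(r_squared(actual, repaired))
```

A correlation close to 1, a small mean squared error, and a coefficient of determination close to 1 would all point to a repair that tracks the reference series closely.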

22 pages, 696 KB

Open Access Article

Domain Knowledge-Driven Method for Threat Source Detection and Localization in the Power Internet of Things

by Zhimin Gu, Jing Guo, Jiangtao Xu, Yunxiao Sun and Wei Liang

Electronics 2025, 14(13), 2725; https://doi.org/10.3390/electronics14132725 - 7 Jul 2025

Viewed by 412

Although the Power Internet of Things (PIoT) significantly improves operational efficiency by enabling real-time monitoring, intelligent control, and predictive maintenance across the grid, its inherently open and deeply interconnected cyber-physical architecture concurrently introduces increasingly complex and severe security threats. Existing IoT security solutions are not fully adapted to the specific requirements of power systems, such as safety-critical reliability, protocol heterogeneity, physical/electrical context awareness, and the incorporation of domain-specific operational knowledge unique to the power sector. These limitations often lead to high false positives (flagging normal operations as malicious) and false negatives (failing to detect actual intrusions), ultimately compromising system stability and security response. To address these challenges, we propose a domain knowledge-driven threat source detection and localization method for the PIoT. The proposed method combines multi-source features, including electrical-layer measurements, network-layer metrics, and behavioral-layer logs, into a unified representation through a multi-level PIoT feature engineering framework. Building on advances in multimodal data integration and feature fusion, our framework employs a hybrid neural architecture that combines the TabTransformer, to model structured physical and network-layer features, with BiLSTM, to capture temporal dependencies in behavioral log sequences. This design enables comprehensive threat detection while supporting interpretable and fine-grained source localization. Experiments on a real-world PIoT dataset demonstrate that the proposed method achieves high detection accuracy and enables the actionable attribution of attack stages aligned with the MITRE Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) framework. The proposed approach offers a scalable and domain-adaptable foundation for security analytics in cyber-physical power systems.

24 pages, 26805 KB

Open Access Article

Estimating NOx Emissions in China via Multisource Satellite Data and Deep Learning Model

by Kun Cai, Yanfang Shao, Yinghao Lin, Shenshen Li and Minghu Fan

Remote Sens. 2025, 17(7), 1231; https://doi.org/10.3390/rs17071231 - 30 Mar 2025

Viewed by 805

Nitrogen oxides (NOx) are known to be irritant gases, which present considerable risks to human health. TROPOMI NO₂ vertical column density (VCD) is commonly employed to estimate NOx emissions through the integration of complex models. However, satellite data often suffer from incompleteness, hindering the ability to achieve long-term and comprehensive estimates. In this study, we propose a reconstruction method to achieve comprehensive coverage of NO₂ VCD in China by leveraging the relationship between satellite data and meteorological variables. In addition, the CNN-BiLSTM-ATT model was developed to estimate China’s monthly NOx emissions from 2021 to 2023 in combination with other ancillary data, such as ERA5 meteorological data, topographic data, and nighttime light data, achieving a correlation coefficient (R) of 0.83 and a root mean squared error (RMSE) of 9.05 tons (T). The factors influencing NO₂ VCD were assessed using SHAP values, and the spatiotemporal characteristics and density distribution of NOx emissions were analyzed. Additionally, annual emission trends were evaluated. This study offers valuable insights for air quality management and policymaking, contributing to efforts focused on mitigating the adverse health and environmental impacts of NOx emissions.

(This article belongs to the Special Issue Remote Sensing Applications for Trace Gases and Air Quality)

22 pages, 2728 KB

Open Access Article

Hybrid Dynamic Galois Field with Quantum Resilience for Secure IoT Data Management and Transmission in Smart Cities Using Reed–Solomon (RS) Code

by Abdullah Aljuhni, Amer Aljaedi, Adel R. Alharbi, Ahmed Mubaraki and Moahd K. Alghuson

Symmetry 2025, 17(2), 259; https://doi.org/10.3390/sym17020259 - 8 Feb 2025

Cited by 1 | Viewed by 1149

The Internet of Things (IoT), which is characteristic of the current industrial revolutions, is the connection of physical devices through different protocols and sensors to share information. Even though the IoT provides revolutionary opportunities, its connection to the current Internet for smart cities brings new opportunities for security threats, especially with the appearance of new threats like quantum computing. Current approaches to protect IoT data are not immune to quantum attacks and are not designed to offer the best data management for smart city applications. Thus, post-quantum cryptography (PQC), which is still in its research stage, aims to solve these problems. To this end, this research introduces the Dynamic Galois Reed–Solomon with Quantum Resilience (DGRS-QR) system to improve the secure management and communication of data in IoT smart cities. The data preprocessing includes K-Nearest Neighbors (KNN) and min–max normalization and then applying the Galois Field Adaptive Expansion (GFAE). Optimization of the quantum-resistant keys is accomplished by applying Artificial Bee Colony (ABC) and Moth Flame Optimization (MFO) algorithms. Also, role-based access control provides strong cloud data security, and quantum resistance is maintained by refreshing keys every five minutes of the active session. For error correction, Reed–Solomon (RS) codes are used which provide data reliability. Data management is performed using an attention-based Bidirectional Long Short-Term Memory (Att-Bi-LSTM) model with skip connections to provide optimized city management. The proposed approach was evaluated using key performance metrics: a key generation time of 2.34 s, encryption time of 4.56 s, decryption time of 3.56 s, PSNR of 33 dB, and SSIM of 0.99. The results show that the proposed system is capable of protecting IoT data from quantum threats while also ensuring optimal data management and processing.

(This article belongs to the Special Issue New Advances in Symmetric Cryptography)

30 pages, 8556 KB

Open Access Article

Optimization of Microgrid Dispatching by Integrating Photovoltaic Power Generation Forecast

by Tianrui Zhang, Weibo Zhao, Quanfeng He and Jianan Xu

Sustainability 2025, 17(2), 648; https://doi.org/10.3390/su17020648 - 15 Jan 2025

Cited by 16 | Viewed by 1519

In order to address the impact of the uncertainty and intermittency of a photovoltaic power generation system on the smooth operation of the power system, a microgrid scheduling model incorporating photovoltaic power generation forecast is proposed in this paper. Firstly, the factors affecting the accuracy of photovoltaic power generation prediction are analyzed by classifying the photovoltaic power generation data using cluster analysis, analyzing its important features using Pearson correlation coefficients, and reducing the dimensionality of the high-dimensional data using PCA. Then, based on the sparrow search algorithm (SSA), convolutional neural network (CNN), and bidirectional long short-term memory (BiLSTM) network, a combined SSA-CNN-BiLSTM prediction model is established, and the attention mechanism is used to improve the prediction accuracy. Secondly, a multi-temporal dispatch optimization model of the microgrid power system, which aims at the economic optimization of the system operation cost and the minimization of the environmental cost, is constructed based on the prediction results. Further, differential evolution is introduced into the quantum particle swarm optimization (QPSO) algorithm, and the model is solved using this improved QPSO algorithm. Finally, the feasibility of the photovoltaic power generation forecasting model and the microgrid power system dispatch optimization model, as well as the validity of the solution algorithms, are verified through real case simulation experiments. The results show that the model in this paper has high prediction accuracy. In terms of scheduling strategy, the generation method with the lowest cost is selected to obtain an effective way to interact with the main grid and realize the stable and economically optimized scheduling of the microgrid system.

30 pages, 8653 KB

Open Access Article

CGAOA-AttBiGRU: A Novel Deep Learning Framework for Forecasting CO₂ Emissions

by Haijun Liu, Yang Wu, Dongqing Tan, Yi Chen and Haoran Wang

Mathematics 2024, 12(18), 2956; https://doi.org/10.3390/math12182956 - 23 Sep 2024

Cited by 1 | Viewed by 985

Accurately predicting carbon dioxide (CO₂) emissions is crucial for environmental protection. Currently, there are two main issues with predicting CO₂ emissions: (1) existing CO₂ emission prediction models mainly rely on Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models, which can only model unidirectional temporal features, resulting in insufficient accuracy; (2) existing research on CO₂ emissions mainly focuses on designing predictive models, without paying attention to model optimization, resulting in models being unable to achieve their optimal performance. To address these issues, this paper proposes a framework for predicting CO₂ emissions, called CGAOA-AttBiGRU. In this framework, the Attentional-Bidirectional Gated Recurrent Unit (AttBiGRU) is a prediction model that uses BiGRU units to extract bidirectional temporal features from the data, and adopts an attention mechanism to adaptively weight the bidirectional temporal features, thereby improving prediction accuracy. CGAOA is an improved Arithmetic Optimization Algorithm (AOA) used to optimize the five key hyperparameters of the AttBiGRU. We first validated the optimization performance of the improved CGAOA algorithm on 24 benchmark functions. Then, CGAOA was used to optimize AttBiGRU and compared with 12 optimization algorithms. The results indicate that the AttBiGRU optimized by CGAOA has the best predictive performance.
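The idea of adaptively weighting bidirectional temporal features can be illustrated with a small attention-pooling sketch. The abstract does not give the paper's scoring function, so the dot product against a query vector used here is an assumption, as are all names and values; each hidden state is taken to be the concatenation of a forward and a backward recurrent state.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Weight each time step's hidden state and sum them into one context vector.

    hidden_states: list of T vectors (forward and backward states concatenated).
    query: vector of the same length; dot-product scoring is an assumed choice.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context

# Two time steps with toy two-dimensional states (illustrative values).
weights, context = attention_pool([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
print(weights, context)
```

Time steps whose states align with the query receive larger weights, so they contribute more to the context vector that a final prediction layer would consume.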

(This article belongs to the Special Issue Advanced Analyses and Algorithms for Trustworthy AI Systems and Applications)

21 pages, 6438 KB

Open Access Article

Weighted Averages and Polynomial Interpolation for PM2.5 Time Series Forecasting

by Anibal Flores, Hugo Tito-Chura, Victor Yana-Mamani, Charles Rosado-Chavez and Alejandro Ecos-Espino

Computers 2024, 13(9), 238; https://doi.org/10.3390/computers13090238 - 18 Sep 2024

Cited by 1 | Viewed by 1161

This article describes a novel method for the multi-step forecasting of PM2.5 time series based on weighted averages and polynomial interpolation. Multi-step prediction models enable decision makers to build an understanding of longer future terms than the one-step-ahead prediction models, allowing for more timely decision-making. As the cases for this study, hourly data from three environmental monitoring stations from Ilo City in Southern Peru were selected. The results show average RMSEs of between 1.60 and 9.40 µg/m³ and average MAPEs of between 17.69% and 28.91%. Comparing the results with those derived using the presently implemented benchmark models (such as LSTM, BiLSTM, GRU, BiGRU, and LSTM-ATT) in different prediction horizons, in the majority of environmental monitoring stations, the proposed model outperformed them by between 2.40% and 17.49% in terms of the average MAPE derived. It is concluded that the proposed model constitutes a good alternative for multi-step PM2.5 time series forecasting, presenting similar or superior results to the benchmark models. Aside from the good results, one of the main advantages of the proposed model is that it requires less data in comparison with the benchmark models.
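For readers less familiar with the two error measures reported here, the following sketch shows their usual definitions. RMSE keeps the unit of the series (µg/m³ in this abstract), while MAPE is a percentage. The function names and sample values are illustrative assumptions.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error, in the same unit as the series."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, forecast):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)

# Made-up hourly concentrations, for illustration only.
observed = [10.0, 20.0]
predicted = [12.0, 18.0]
print(rmse(observed, predicted))
print(mape(observed, predicted))
```

Because MAPE divides by the observed value, it can become large when concentrations are close to zero, which is worth keeping in mind when comparing stations with different pollution levels.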

18 pages, 1106 KB

Open Access Article

MKDAT: Multi-Level Knowledge Distillation with Adaptive Temperature for Distantly Supervised Relation Extraction

by Jun Long, Zhuoying Yin, Yan Han and Wenti Huang

Information 2024, 15(7), 382; https://doi.org/10.3390/info15070382 - 30 Jun 2024

Cited by 2 | Viewed by 2071

Distantly supervised relation extraction (DSRE), first used to address the limitations of manually annotated data via automatically annotating the data with triplet facts, is prone to issues such as mislabeled annotations due to the interference of noisy annotations. To address the interference of noisy annotations, we leveraged a novel knowledge distillation (KD) method which was different from the conventional models on DSRE. More specifically, we proposed a model-agnostic KD method, Multi-Level Knowledge Distillation with Adaptive Temperature (MKDAT), which mainly involves two modules: Adaptive Temperature Regulation (ATR) and Multi-Level Knowledge Distilling (MKD). ATR allocates adaptive entropy-based distillation temperatures to different training instances for providing a moderate softening supervision to the student, in which label hardening is possible for instances with great entropy. MKD combines the bag-level and instance-level knowledge of the teacher as supervisions of the student, and trains the teacher and student at the bag and instance levels, respectively, which aims at mitigating the effects of noisy annotation and improving the sentence-level prediction performance. In addition, we implemented three MKDAT models based on the CNN, PCNN, and ATT-BiLSTM neural networks, respectively, and the experimental results show that our distillation models outperform the baseline models on bag-level and instance-level evaluations.
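The role of an entropy-based distillation temperature can be sketched in a few lines. The temperature-scaled softmax and the entropy below are standard; the linear mapping from entropy to temperature, the bounds `t_min` and `t_max`, and all function names are hypothetical and are not the formula used by MKDAT. The mapping is chosen only so that high-entropy instances get a temperature below 1, which sharpens (hardens) the teacher's labels as the abstract describes.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature; temperature > 1 softens, < 1 hardens."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def adaptive_temperature(teacher_logits, t_min=0.5, t_max=2.0):
    """HYPOTHETICAL mapping: higher teacher entropy gives a lower temperature."""
    probs = softmax_with_temperature(teacher_logits, 1.0)
    normalized = entropy(probs) / math.log(len(probs))  # scaled into [0, 1]
    return t_max - (t_max - t_min) * normalized

teacher_logits = [2.0, 0.5, 0.1]  # illustrative values
temperature = adaptive_temperature(teacher_logits)
soft_targets = softmax_with_temperature(teacher_logits, temperature)
print(temperature, soft_targets)
```

The student would then be trained against `soft_targets` instead of the raw teacher distribution; how the paper combines this with bag-level and instance-level supervision is not shown here.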

(This article belongs to the Section Artificial Intelligence)

17 pages, 6521 KB

Open Access Article

Predict Future Transient Fire Heat Release Rates Based on Fire Imagery and Deep Learning

by Lei Xu, Jinyuan Dong and Delei Zou

Fire 2024, 7(6), 200; https://doi.org/10.3390/fire7060200 - 14 Jun 2024

Cited by 5 | Viewed by 2811

The fire heat release rate (HRR) is a crucial parameter for describing the combustion process and its thermal effects. In recent years, some studies have employed fire scene images and deep learning algorithms to predict real-time fire HRR, which has advanced HRR prediction in terms of both lightweight models and real-time monitoring. Nevertheless, the development of an early-stage monitoring system for fires and the ability to predict future HRR based on current moment data represent a crucial foundation for evaluating the scale of indoor fires and enhancing the capacity to prevent and control such incidents. This paper proposes a deep learning model based on continuous fire scene images (containing both flame and smoke features) and their time-series information to predict the future transient fire HRR. The model (Att-BiLSTM) comprises three bidirectional long short-term memory (Bi-LSTM) layers and one attention layer. The model employs a bidirectional feature extraction approach, followed by the introduction of an attention mechanism to highlight the image features that have a critical impact on the prediction results. In this paper, a large-scale dataset is constructed by collecting 27,231 fire scene images with instantaneous HRR annotations from 40 different fire trials from the NIST database. The experimental results demonstrate that Att-BiLSTM is capable of effectively utilizing fire scene image features and temporal information to accurately predict future transient HRR, including those in high-brightness fire environments and complex fire source situations. The research presented in this paper offers novel insights and methodologies for fire monitoring and emergency response.

(This article belongs to the Special Issue The Use of Remote Sensing Technology for Forest Fire)

15 pages, 2228 KB

Open Access Article

Wind Power Prediction Based on EMD-KPCA-BiLSTM-ATT Model

by Zhiyan Zhang, Aobo Deng, Zhiwen Wang, Jianyong Li, Hailiang Zhao and Xiaoliang Yang

Energies 2024, 17(11), 2568; https://doi.org/10.3390/en17112568 - 26 May 2024

Cited by 7 | Viewed by 1548

In order to improve wind power utilization efficiency and reduce wind power prediction errors, a combined prediction model of EMD-KPCA-BiLSTM-ATT is proposed, which includes a data processing method combining empirical mode decomposition (EMD) and kernel principal component analysis (KPCA), and a prediction model combining bidirectional long short-term memory (BiLSTM) and an attention mechanism (ATT). Firstly, the influencing factors of wind power are analyzed. The quartile method is used to identify and eliminate the original abnormal data of wind power, and the linear interpolation method is used to replace the abnormal data. Secondly, EMD is used to decompose the preprocessed wind power data into Intrinsic Mode Function (IMF) components and residual components, revealing the changes in data signals at different time scales. Subsequently, KPCA is employed to screen the key components as the input of the BiLSTM-ATT prediction model. Finally, a prediction is made taking an actual wind farm in Anhui Province as an example, and the results show that the EMD-KPCA-BiLSTM-ATT combined model has higher prediction accuracy compared to the comparative model.
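The preprocessing step described here (flag abnormal values with the quartile method, then replace them by linear interpolation) can be sketched as follows. The 1.5 × IQR fence is a common convention and an assumption on our part; the abstract does not state the threshold the authors used. Function names and sample values are illustrative.

```python
import statistics

def quartile_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is an assumed fence."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]

def replace_by_linear_interpolation(values, mask):
    """Replace flagged points using the nearest unflagged neighbours on each side."""
    good = [i for i, bad in enumerate(mask) if not bad]
    if not good:
        raise ValueError("no valid points to interpolate from")
    repaired = list(values)
    for i, bad in enumerate(mask):
        if not bad:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:        # flagged point at the start: copy the next valid value
            repaired[i] = values[right]
        elif right is None:     # flagged point at the end: copy the previous valid value
            repaired[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            repaired[i] = values[left] + frac * (values[right] - values[left])
    return repaired

# Made-up wind power readings with one obvious spike.
power = [10.0, 11.0, 12.0, 100.0, 13.0, 14.0, 15.0]
mask = quartile_outlier_mask(power)
cleaned = replace_by_linear_interpolation(power, mask)
print(mask)
print(cleaned)
```

In this toy series only the spike is flagged, and it is replaced by the midpoint of its two neighbours before any decomposition step would run.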

(This article belongs to the Special Issue Advances in AI Methods for Wind Power Forecasting and Monitoring)

15 pages, 3839 KB

Open Access Article

Knowledge Graph Construction and Representation Method for Potato Diseases and Pests

by Wanxia Yang, Sen Yang, Guanping Wang, Yan Liu, Jing Lu and Weiwei Yuan

Agronomy 2024, 14(1), 90; https://doi.org/10.3390/agronomy14010090 - 29 Dec 2023

Cited by 7 | Viewed by 2262

Potato diseases and pests have a serious impact on the quality and yield of potatoes, and timely prevention and control of potato diseases and pests is essential. A rich knowledge reserve of potato diseases and pests is one of the most important prevention and control measures; however, valuable knowledge is buried in the massive data of potato diseases and pests, making it difficult for potato growers and managers to obtain and use it in a timely manner and to develop the potential of knowledge. Therefore, this paper explores the construction method of a knowledge graph for automatic knowledge extraction, which extracts the knowledge of potato diseases and pests scattered in heterogeneous data from multiple sources, organises it into a semantically related knowledge base, and provides potato growers with professional knowledge and timely guidance to effectively prevent and control potato diseases and pests. In this paper, a data corpus on potato diseases and pests, called PotatoRE, is first constructed. Then, a model of ALBert-BiLSTM-Self_Att-CRF is designed to extract knowledge from the corpus to form a triplet structure, which is imported into the Neo4j graph database for storage and visualisation. Furthermore, the performance of the model constructed in this paper is compared and verified using the datasets PotatoRE and People’s Daily. The results show that compared to the SOTA models of ALBert BiLSTM-CRF and ALBert BiGRU-CRF, the accuracy of our model has been improved by 2.92% and 3.12%, respectively, using PotatoRE. Compared to the Bert BiLSTM-CRF model on two datasets, our model not only improves the accuracy, recall, and F1 values, but also has a higher efficiency. The model in this paper solves the problem of the difficult recognition of nested entities. On this basis, through comparative experiments, the TransH model is used to effectively represent the constructed knowledge graph, which lays the foundation for achieving inference, extension, and automatic updating of the knowledge base. The achievements of this paper have made certain contributions to the automatic construction of large-scale knowledge bases.
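As background on the TransH representation mentioned above, the sketch below shows its scoring function in the standard formulation: head and tail embeddings are projected onto a relation-specific hyperplane, and a triplet is more plausible when the projected head plus the relation's translation vector lands near the projected tail. The embedding values and entity roles are made up for illustration and do not come from the paper's knowledge graph.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def project_to_hyperplane(vec, normal):
    """Project vec onto the hyperplane whose unit normal is `normal`."""
    scale = dot(vec, normal)
    return [v - scale * n for v, n in zip(vec, normal)]

def transh_score(head, tail, normal, translation):
    """Squared distance between (projected head + translation) and projected tail.

    Lower scores indicate more plausible (head, relation, tail) triplets.
    `normal` is assumed to be unit length.
    """
    h_proj = project_to_hyperplane(head, normal)
    t_proj = project_to_hyperplane(tail, normal)
    return sum((h + d - t) ** 2 for h, d, t in zip(h_proj, translation, t_proj))

# Toy three-dimensional embeddings (illustrative values only).
normal = [0.0, 0.0, 1.0]        # relation hyperplane normal
translation = [1.0, 0.0, 0.0]   # relation translation vector on the hyperplane
disease = [1.0, 0.0, 5.0]       # hypothetical head entity
symptom = [2.0, 0.0, -3.0]      # hypothetical tail entity that fits the relation
unrelated = [0.0, 0.0, 0.0]     # hypothetical tail entity that does not fit

print(transh_score(disease, symptom, normal, translation))
print(transh_score(disease, unrelated, normal, translation))
```

Because the projection discards the component along the normal, the same entity can take different positions under different relations, which is the property that distinguishes TransH from a plain translation model.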

(This article belongs to the Topic Applications of Big Data and Machine Learning in Smart Agriculture)

22 pages, 16394 KB

Open Access Article

Attention-Based Hybrid Deep Learning Network for Human Activity Recognition Using WiFi Channel State Information

by Sakorn Mekruksavanich, Wikanda Phaphan, Narit Hnoohom and Anuchit Jitpattanakul

Appl. Sci. 2023, 13(15), 8884; https://doi.org/10.3390/app13158884 - 1 Aug 2023

Cited by 30 | Viewed by 3578

The recognition of human movements is a crucial aspect of AI-related research fields. Although methods using vision and sensors provide more valuable data, they come at the expense of inconvenience to users and social limitations including privacy issues. WiFi-based sensing methods are increasingly being used to collect data on human activity due to their ubiquity, versatility, and high performance. Channel state information (CSI), a characteristic of WiFi signals, can be employed to identify various human activities. Traditional machine learning approaches depend on manually designed features, so recent studies propose leveraging deep learning capabilities to automatically extract features from raw CSI data. This research introduces a versatile framework for recognizing human activities by utilizing CSI data and evaluates its effectiveness on different deep learning networks. A hybrid deep learning network called CNN-GRU-AttNet is proposed to automatically extract informative spatial-temporal features from raw CSI data and efficiently classify activities. The effectiveness of a hybrid model is assessed by comparing it with five conventional deep learning models (CNN, LSTM, BiLSTM, GRU, and BiGRU) on two widely recognized benchmark datasets (CSI-HAR and StanWiFi). The experimental results demonstrate that the CNN-GRU-AttNet model surpasses previous state-of-the-art techniques, leading to an average accuracy improvement of up to 4.62%. Therefore, the proposed hybrid model is suitable for identifying human actions using CSI data.

(This article belongs to the Special Issue Human Activity Recognition (HAR) in Healthcare)

17 pages, 5406 KB

Open Access Article

Bi-LS-AttM: A Bidirectional LSTM and Attention Mechanism Model for Improving Image Captioning

by Tian Xie, Weiping Ding, Jinbao Zhang, Xusen Wan and Jiehua Wang

Appl. Sci. 2023, 13(13), 7916; https://doi.org/10.3390/app13137916 - 6 Jul 2023

Cited by 16 | Viewed by 4575

The discipline of automatic image captioning represents an integration of two pivotal branches of artificial intelligence, namely computer vision (CV) and natural language processing (NLP). The principal functionality of this technology lies in transmuting the extracted visual features into semantic information of a higher order. The bidirectional long short-term memory (Bi-LSTM) has garnered wide acceptance in executing image captioning tasks. Of late, scholarly attention has been focused on modifying suitable models for innovative and precise subtitle captions, although tuning the parameters of the model does not invariably yield optimal outcomes. Given this, the current research proposes a model that effectively employs the bidirectional LSTM and attention mechanism (Bi-LS-AttM) for image captioning endeavors. This model exploits the contextual comprehension from both anterior and posterior aspects of the input data, synergistically with the attention mechanism, thereby augmenting the precision of visual language interpretation. The distinctiveness of this research is embodied in its incorporation of Bi-LSTM and the attention mechanism to engender sentences that are both structurally innovative and accurately reflective of the image content. To enhance temporal efficiency and accuracy, this study substitutes convolutional neural networks (CNNs) with fast region-based convolutional networks (Fast RCNNs). Additionally, it refines the process of generation and evaluation of common space, thus fostering improved efficiency. Our model was tested for its performance on Flickr30k and MSCOCO datasets (80 object categories). Comparative analyses of performance metrics reveal that our model, leveraging the Bi-LS-AttM, surpasses unidirectional and Bi-LSTM models. When applied to caption generation and image-sentence retrieval tasks, our model manifests time economies of approximately 36.5% and 26.3% vis-a-vis the Bi-LSTM model and the deep Bi-LSTM model, respectively.

(This article belongs to the Special Issue Recent Trends in Automatic Image Captioning Systems)

17 pages, 5625 KB

Open Access Article

Sentiment Analysis of Comment Data Based on BERT-ETextCNN-ELSTM

by Lujuan Deng, Tiantian Yin, Zuhe Li and Qingxia Ge

Electronics 2023, 12(13), 2910; https://doi.org/10.3390/electronics12132910 - 3 Jul 2023

Cited by 11 | Viewed by 4554

With the rapid popularity and continuous development of social networks, users’ communication and interaction through platforms such as microblogs and forums have become more and more frequent. The comment data on these platforms reflect users’ opinions and sentiment tendencies, and sentiment analysis of comment data has become one of the hot spots and difficulties in current research. In this paper, we propose a BERT-ETextCNN-ELSTM (Bidirectional Encoder Representations from Transformers–Enhanced Convolutional Neural Networks–Enhanced Long Short-Term Memory) model for sentiment analysis. The model takes text after word embedding and BERT encoder processing and feeds it to an optimized CNN layer for convolutional operations in order to extract local features of the text. The features from the CNN layer are then fed into the LSTM layer for time-series modeling to capture long-term dependencies in the text. The experimental results showed that, compared with TextCNN (Convolutional Neural Networks), LSTM (Long Short-Term Memory), TextCNN-LSTM (Convolutional Neural Networks–Long Short-Term Memory), and BiLSTM-ATT (Bidirectional Long Short-Term Memory Network–Attention), the model proposed in this paper was more effective in sentiment analysis. In the experimental data, the model reached a maximum of 0.89, 0.88, and 0.86 in terms of accuracy, F1 value, and macro-average F1 value, respectively, on both datasets. The proposed model achieved better performance in the review sentiment analysis task and significantly outperformed the other comparable models.

(This article belongs to the Special Issue AI-Driven Network Security and Privacy)

15 pages, 3510 KB

Open Access Article

Text Emotion Recognition Based on XLNet-BiGRU-Att

by Tian Han, Zhu Zhang, Mingyuan Ren, Changchun Dong, Xiaolin Jiang and Quansheng Zhuang

Electronics 2023, 12(12), 2704; https://doi.org/10.3390/electronics12122704 - 16 Jun 2023

Cited by 16 | Viewed by 3206

Text emotion recognition (TER) is an important natural language processing (NLP) task which is widely used in human–computer interaction, public opinion analysis, mental health analysis, and social network analysis. In this paper, a deep learning model based on XLNet with a bidirectional gated recurrent unit and an attention mechanism (XLNet-BiGRU-Att) is proposed in order to improve TER performance. XLNet is used to build bidirectional language models which can learn contextual information simultaneously. The bidirectional gated recurrent unit (BiGRU) helps to extract more effective features which can pay attention to current and previous states using hidden layers. The attention mechanism (Att) provides different weights to enhance the ‘attention’ paid to important information, thereby improving the quality of word vectors and the accuracy of sentiment analysis model judgments. The proposed model composed of XLNet, BiGRU, and Att improves performance on the whole TER task. Experiments on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database and the Chinese Academy of Sciences Institute of Automation (CASIA) dataset were carried out to compare XLNet-BiGRU-Att, XLNet, BERT, and BERT-BiLSTM, and the results show that the model proposed in this paper has superior performance compared to the others.

(This article belongs to the Special Issue Deep Learning in Image Processing and Pattern Recognition)

Search Results (33)
