Search Results (28)

Search Parameters:
Keywords = character representation technique

28 pages, 16728 KB  
Article
Deep Learning-Based DNA Methylation Detection in Cervical Cancer Using the One-Hot Character Representation Technique
by Apoorva, Vikas Handa, Shalini Batra and Vinay Arora
Diagnostics 2025, 15(17), 2263; https://doi.org/10.3390/diagnostics15172263 - 7 Sep 2025
Viewed by 76
Abstract
Background: Cervical cancer is among the most prevalent malignancies in women worldwide, and early detection of epigenetic alterations such as deoxyribonucleic acid (DNA) methylation is of utmost significance for improving clinical outcomes. This study introduces a novel deep learning-based framework for predicting DNA methylation in cervical cancer, utilizing a UNet architecture integrated with an innovative one-hot character encoding technique. Methods: Two encoding strategies, monomer and dimer, were systematically evaluated for their ability to capture discriminative features from DNA sequences. Experiments were conducted on Cytosine–Guanine (CG) sites using varying sequence window sizes of 100 bp, 200 bp, and 300 bp, and sample sizes of 5000, 10,000, and 20,000. Model validation was performed on promoter regions of five cervical cancer-associated genes: miR-100, miR-138, miR-484, hTERT, and ERVH48-1. Results: The dimer encoding strategy, combined with a 300-base pair window and 5000 CG sites, emerged as the optimal configuration. The proposed framework demonstrated superior predictive performance, with an accuracy of 91.60%, sensitivity of 96.71%, specificity of 87.32%, and an Area Under the Receiver Operating Characteristic (AUROC) score of 96.53, significantly outperforming benchmark deep learning models, including Convolutional Neural Networks and MobileNet. Validation on promoter regions further confirmed the robustness of the model, as it accurately identified 86.27% of methylated CG sites and maintained a strong AUROC of 83.99, demonstrating its precision–recall balance and practical relevance in promoter-region genes. Conclusions: These findings establish the potential of the proposed UNet-based approach as a reliable and scalable tool for early detection of epigenetic modifications, contributing to improved biomarker discovery and diagnostics in cervical cancer research. Full article
(This article belongs to the Special Issue Diagnosis and Management of Gynecological Cancers: Third Edition)
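The monomer/dimer one-hot encoding the abstract describes can be sketched in a few lines. This is an illustrative encoding only, not the authors' implementation; treating ambiguous bases (e.g. 'N') as an all-zero vector is an assumption:

```python
from itertools import product

def one_hot_encode(seq, k=1):
    """One-hot encode a DNA sequence over k-mers (k=1: monomer, k=2: dimer)."""
    alphabet = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(alphabet)}
    vectors = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        vec = [0] * len(alphabet)
        if kmer in index:          # ambiguous bases such as 'N' stay all-zero
            vec[index[kmer]] = 1
        vectors.append(vec)
    return vectors

mono = one_hot_encode("ACGT", k=1)   # 4 positions x 4 channels
di = one_hot_encode("ACGT", k=2)     # 3 positions x 16 channels
```

Dimer encoding widens the channel dimension from 4 to 16, which lets a convolutional model see dinucleotide context (such as CpG steps) directly in the input.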

16 pages, 1007 KB  
Article
Learning SMILES Semantics: Word2Vec and Transformer Embeddings for Molecular Property Prediction
by Saya Hashemian, Zak Khan, Pulkit Kalhan and Yang Liu
Algorithms 2025, 18(9), 547; https://doi.org/10.3390/a18090547 - 1 Sep 2025
Viewed by 290
Abstract
This paper investigates the effectiveness of Word2Vec-based molecular representation learning on SMILES (Simplified Molecular Input Line Entry System) strings for a downstream prediction task related to the market approvability of chemical compounds. Here, market approvability is treated as a proxy classification label derived from approval status, where only the molecular structure is analyzed. We train character-level embeddings using Continuous Bag of Words (CBOW) and Skip-Gram with Negative Sampling architectures and apply the resulting embeddings in a downstream classification task using a multi-layer perceptron (MLP). To evaluate the utility of these lightweight embedding techniques, we conduct experiments on a curated SMILES dataset labeled by approval status under both imbalanced and SMOTE-balanced training conditions. In addition to our Word2Vec-based models, we include a ChemBERTa-based baseline using the pretrained ChemBERTa-77M model. Our findings show that while ChemBERTa achieves a higher performance, the Word2Vec-based models offer a favorable trade-off between accuracy and computational efficiency. This efficiency is especially relevant in large-scale compound screening, where rapid exploration of the chemical space can support early-stage cheminformatics workflows. These results suggest that traditional embedding models can serve as viable alternatives for scalable and interpretable cheminformatics pipelines, particularly in resource-constrained environments. Full article
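For intuition, the character-level CBOW setup builds training pairs of each SMILES character with its surrounding characters. A minimal sketch of pair extraction, with tokenization simplified to single characters (real SMILES tokenizers also handle two-character atoms such as Cl and Br):

```python
def cbow_pairs(smiles, window=2):
    """Build (context characters, target character) training pairs for CBOW."""
    chars = list(smiles)
    pairs = []
    for i, target in enumerate(chars):
        # characters up to `window` positions on either side of the target
        context = chars[max(0, i - window):i] + chars[i + 1:i + 1 + window]
        pairs.append((context, target))
    return pairs

pairs = cbow_pairs("CCO", window=1)  # ethanol: three (context, target) pairs
```

A CBOW model is then trained to predict each target character from the averaged embeddings of its context, yielding the lightweight character embeddings the paper compares against ChemBERTa.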

29 pages, 2570 KB  
Article
Detecting Zero-Day Web Attacks with an Ensemble of LSTM, GRU, and Stacked Autoencoders
by Vahid Babaey and Hamid Reza Faragardi
Computers 2025, 14(6), 205; https://doi.org/10.3390/computers14060205 - 26 May 2025
Cited by 4 | Viewed by 2218
Abstract
The increasing sophistication of web-based services has intensified the risk of zero-day attacks, exposing critical vulnerabilities in user information security. Traditional detection systems often rely on labeled attack data and struggle to identify novel threats without prior knowledge. This paper introduces a novel one-class ensemble method for detecting zero-day web attacks, combining the strengths of Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and stacked autoencoders through latent representation concatenation and compression. Additionally, a structured tokenization strategy based on character-level analysis is employed to enhance input consistency and reduce feature dimensionality. The proposed method was evaluated using the CSIC 2012 dataset, achieving 97.58% accuracy, 97.52% recall, 99.76% specificity, and 99.99% precision, with a false positive rate of just 0.2%. Compared to conventional ensemble techniques like majority voting, our approach demonstrates superior anomaly detection performance by fusing diverse feature representations at the latent level rather than the output level. These results highlight the model’s effectiveness in accurately detecting unknown web attacks with low false positives, addressing major limitations of existing detection frameworks. Full article
(This article belongs to the Special Issue Using New Technologies in Cyber Security Solutions (2nd Edition))
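The structured character-level tokenization mentioned above can be illustrated with a character-class scheme; the abstract does not give the exact rules, so this mapping (letters to 'a', digits to '0', structural symbols kept) is an assumption:

```python
def tokenize_request(path):
    """Collapse an HTTP path/query into character classes so that structurally
    similar requests produce identical token streams (reduces dimensionality)."""
    out = []
    for ch in path:
        if ch.isalpha():
            out.append("a")
        elif ch.isdigit():
            out.append("0")
        else:
            out.append(ch)   # keep '/', '?', '=', '&' etc. verbatim
    return "".join(out)

tokenize_request("/item?id=42")  # -> "/aaaa?aa=00"
```

Normalizing benign traffic this way shrinks the input space the LSTM/GRU/autoencoder ensemble must model, so a zero-day payload that breaks the learned structure stands out.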

29 pages, 18881 KB  
Article
A Novel Entropy-Based Approach for Thermal Image Segmentation Using Multilevel Thresholding
by Thaweesak Trongtirakul, Karen Panetta, Artyom M. Grigoryan and Sos S. Agaian
Entropy 2025, 27(5), 526; https://doi.org/10.3390/e27050526 - 14 May 2025
Viewed by 1049
Abstract
Image segmentation is a fundamental challenge in computer vision, transforming complex image representations into meaningful, analyzable components. While entropy-based multilevel thresholding techniques, including Otsu, Shannon, fuzzy, Tsallis, Renyi, and Kapur approaches, have shown potential in image segmentation, they encounter significant limitations when processing thermal images, such as poor spatial resolution, low contrast, lack of color and texture information, and susceptibility to noise and background clutter. This paper introduces a novel adaptive unsupervised entropy algorithm (A-Entropy) to enhance multilevel thresholding for thermal image segmentation. Our key contributions include (i) an image-dependent thermal enhancement technique specifically designed for thermal images to improve visibility and contrast in regions of interest, (ii) a so-called A-Entropy concept for unsupervised thermal image thresholding, and (iii) a comprehensive evaluation using the Benchmarking IR Dataset for Surveillance with Aerial Intelligence (BIRDSAI). Experimental results demonstrate the superiority of our proposal compared to other state-of-the-art methods on the BIRDSAI dataset, which comprises both real and synthetic thermal images with substantial variations in scale, contrast, background clutter, and noise. Comparative analysis indicates improved segmentation accuracy and robustness compared to traditional entropy-based methods. The framework’s versatility suggests promising applications in brain tumor detection, optical character recognition, thermal energy leakage detection, and face recognition. Full article
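As a reference point for the entropy-based family the paper extends, single-level Kapur thresholding picks the histogram split that maximizes the summed entropies of the two resulting classes. This is a textbook sketch, not the paper's A-Entropy, which adds thermal-specific enhancement and adaptivity on top of ideas like this:

```python
import math

def kapur_threshold(hist):
    """Single-level Kapur entropy threshold over a grayscale histogram."""
    total = sum(hist)
    p = [h / total for h in hist]
    best_t, best_h = 0, float("-inf")
    for t in range(1, len(p)):
        w0 = sum(p[:t])          # class probabilities of the two sides
        w1 = 1.0 - w0
        if w0 <= 0 or w1 <= 0:
            continue
        h0 = -sum(q / w0 * math.log(q / w0) for q in p[:t] if q > 0)
        h1 = -sum(q / w1 * math.log(q / w1) for q in p[t:] if q > 0)
        if h0 + h1 > best_h:
            best_t, best_h = t, h0 + h1
    return best_t                # pixels with intensity < t form class 0
```

Multilevel variants search for several thresholds jointly; the quality of the result degrades on low-contrast thermal histograms, which is the failure mode this paper targets.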

21 pages, 2611 KB  
Article
Deep Learning-Based Short Text Summarization: An Integrated BERT and Transformer Encoder–Decoder Approach
by Fahd A. Ghanem, M. C. Padma, Hudhaifa M. Abdulwahab and Ramez Alkhatib
Computation 2025, 13(4), 96; https://doi.org/10.3390/computation13040096 - 12 Apr 2025
Viewed by 2253
Abstract
The field of text summarization has evolved from basic extractive methods that identify key sentences to sophisticated abstractive techniques that generate contextually meaningful summaries. In today’s digital landscape, where an immense volume of textual data is produced every day, the need for concise and coherent summaries is more crucial than ever. However, summarizing short texts, particularly from platforms like Twitter, presents unique challenges due to character constraints, informal language, and noise from elements such as hashtags, mentions, and URLs. To overcome these challenges, this paper introduces a deep learning framework for automated short text summarization on Twitter. The proposed approach combines bidirectional encoder representations from transformers (BERT) with a transformer-based encoder–decoder architecture (TEDA), incorporating an attention mechanism to improve contextual understanding. Additionally, long short-term memory (LSTM) networks are integrated within BERT to effectively capture long-range dependencies in tweets and their summaries. This hybrid model ensures that generated summaries remain informative, concise, and contextually relevant while minimizing redundancy. The performance of the proposed framework was assessed using three benchmark Twitter datasets—Hagupit, SHShoot, and Hyderabad Blast—with ROUGE scores serving as the evaluation metric. Experimental results demonstrate that the model surpasses existing approaches in accurately capturing key information from tweets. These findings underscore the framework’s effectiveness in automated short text summarization, offering a robust solution for efficiently processing and summarizing large-scale social media content. Full article
(This article belongs to the Section Computational Engineering)
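The ROUGE scores used above for evaluation reduce to n-gram overlap counting; a minimal ROUGE-1 sketch (real evaluations typically add stemming and report ROUGE-2 and ROUGE-L as well):

```python
from collections import Counter

def rouge1(candidate, reference):
    """ROUGE-1: unigram overlap between a candidate summary and a reference."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())          # clipped unigram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"p": precision, "r": recall, "f1": f1}

rouge1("storm hits city", "storm hits the city at night")
```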

13 pages, 1802 KB  
Article
The Representation of Orientation Semantics in Visual Sensory Memory
by Jingjing Hu, Xutao Zheng and Haokui Xu
Behav. Sci. 2025, 15(1), 1; https://doi.org/10.3390/bs15010001 - 24 Dec 2024
Viewed by 911
Abstract
Visual sensory memory constructs representations of the physical information of visual objects. However, few studies have investigated whether abstract information, such as semantic information, is also involved in these representations. This study utilized a masking technique combined with the partial report paradigm to examine whether visual sensory memory representation contains semantic information. Here, we regarded the concept of orientation carried by the visual stimulus as semantic information. In three experiments, participants were asked to remember the orientation of arrows. Visual stimuli with orientation information (triangles, rectangles, and Chinese characters) and without orientation information (circles, squares, and different Chinese characters) were used as masks. The results showed that memory performance was worse when masks contained orientation information compared to when they did not, as similar orientation semantic information between masks and targets created visual representation conflicts. These findings suggest that visual sensory memory representation includes the semantic information of orientation. Full article

15 pages, 1505 KB  
Article
SVTR-SRNet: A Deep Learning Model for Scene Text Recognition via SVTR Framework and Spatial Reduction Mechanism
by Ming Zhao, Yalong Li, Chaolin Zhang, Quan Du and Shenglung Peng
Electronics 2024, 13(23), 4756; https://doi.org/10.3390/electronics13234756 - 2 Dec 2024
Viewed by 2007
Abstract
Most deep learning models suffer from the problems of large computational complexity and insufficient feature extraction. To achieve a dynamic balance and tradeoff between computational complexity and performance, an enhanced SVTR-based scene text recognition model (SVTR-SRNet) was designed in this paper. In the SVTR-SRNet, we first created a bottom-up jump connection network that increases the number of information transfer pathways between the top and bottom features and improves the accuracy of information extraction. Second, we modified the attention mechanism by adding a new intermediate parameter called SR(Q) (Spatial Reduction (Q)), which finds a suitable compromise between the representational power and computing efficiency. In contrast to the conventional attention mechanism, the novel technique maintains the ability to model the global context while also enhancing efficiency. Ultimately, we developed a novel adaptive hybrid loss function to mitigate the shortcomings of a singular loss function’s inadequate generalization capacity and enhance the model’s resilience in handling a variety of challenging scenarios. Our technique outperforms existing standard models in terms of recognition performance on both the English and Chinese datasets, which deal with a high number of similar characters. As the model possesses great efficiency and outstanding cross-linguistic adaptability, it has a wide range of practical applications. Full article
(This article belongs to the Special Issue Signal and Image Processing Applications in Artificial Intelligence)

29 pages, 1792 KB  
Article
AbstractTrace: The Use of Execution Traces to Cluster, Classify, Prioritize, and Optimize a Bloated Test Suite
by Ziad A. Al-Sharif and Clinton L. Jeffery
Appl. Sci. 2024, 14(23), 11168; https://doi.org/10.3390/app142311168 - 29 Nov 2024
Cited by 1 | Viewed by 888
Abstract
Due to the incremental and iterative nature of the software testing process, a test suite may become bloated with redundant, overlapping, and similar test cases. This paper aims to optimize a bloated test suite by employing an execution trace that encodes runtime events into a sequence of characters forming a string. A dataset of strings, each of which represents the code coverage and execution behavior of a test case, is analyzed to identify similarities between test cases. This facilitates the de-bloating process by providing a formal mechanism to identify, remove, and reduce extra test cases without compromising software quality. This form of analysis allows for the clustering and classification of test cases based on their code coverage and similarity score. This paper explores three levels of execution traces and evaluates different techniques to measure their similarities. Test cases with the same code coverage should generate the exact string representation of runtime events. Various string similarity metrics are assessed to find the similarity score, which is used to classify, detect, and rank test cases accordingly. Additionally, this paper demonstrates the validity of the approach with two case studies. The first shows how to classify the execution behavior of various test cases, which can provide insight into each test case’s internal behavior. The second shows how to identify similar test cases based on their code coverage. Full article
(This article belongs to the Special Issue Artificial Intelligence in Software Engineering)
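The string-similarity scoring of execution traces described above can be prototyped with the standard library. A sketch using difflib's ratio, one of several metrics such a comparison could use; the 0.9 threshold is illustrative, not the paper's:

```python
from difflib import SequenceMatcher

def trace_similarity(trace_a, trace_b):
    """Similarity in [0, 1] between two execution traces encoded as strings."""
    return SequenceMatcher(None, trace_a, trace_b).ratio()

def redundant_pairs(traces, threshold=0.9):
    """Flag test-case pairs whose traces are near-duplicates (de-bloating candidates)."""
    names = sorted(traces)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if trace_similarity(traces[a], traces[b]) >= threshold]

traces = {"t1": "ABCABD", "t2": "ABCABD", "t3": "XYZQRS"}
redundant_pairs(traces)  # t1 and t2 encode identical runtime behavior
```

Pairwise scores like these also serve as the distance matrix for the clustering and ranking steps the abstract mentions.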

16 pages, 4090 KB  
Article
Enhancing Chinese Dialogue Generation with Word–Phrase Fusion Embedding and Sparse SoftMax Optimization
by Shenrong Lv, Siyu Lu, Ruiyang Wang, Lirong Yin, Zhengtong Yin, Salman A. AlQahtani, Jiawei Tian and Wenfeng Zheng
Systems 2024, 12(12), 516; https://doi.org/10.3390/systems12120516 - 24 Nov 2024
Cited by 3 | Viewed by 947
Abstract
Chinese dialogue generation faces multiple challenges, such as semantic understanding, information matching, and response fluency. Generative dialogue systems for Chinese conversation are somehow difficult to construct because of the flexible word order, the great impact of word replacement on semantics, and the complex implicit context. Existing methods still have limitations in addressing these issues. To tackle these problems, this paper proposes an improved Chinese dialogue generation model based on transformer architecture. The model uses a multi-layer transformer decoder as the backbone and introduces two key techniques, namely incorporating pre-trained language model word embeddings and optimizing the sparse Softmax loss function. For word-embedding fusion, we concatenate the word vectors from the pre-trained model with character-based embeddings to enhance the semantic information of word representations. The sparse Softmax optimization effectively mitigates the overfitting issue by introducing a sparsity regularization term. Experimental results on the Chinese short text conversation (STC) dataset demonstrate that our proposed model significantly outperforms the baseline models on automatic evaluation metrics, such as BLEU and Distinct, with an average improvement of 3.5 percentage points. Human evaluations also validate the superiority of our model in generating fluent and relevant responses. This work provides new insights and solutions for building more intelligent and human-like Chinese dialogue systems. Full article
(This article belongs to the Section Artificial Intelligence and Digital Systems Engineering)
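The sparse Softmax above can be illustrated with the common top-k formulation, in which probability mass is confined to the k largest logits. This is a sketch of the general idea; the paper's exact sparsity regularization term may differ:

```python
import math

def sparse_softmax(logits, k=2):
    """Softmax restricted to the top-k logits; all other probabilities are exactly 0."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)                      # subtract max for stability
    exps = {i: math.exp(logits[i] - m) for i in top}
    z = sum(exps.values())
    return [exps[i] / z if i in exps else 0.0 for i in range(len(logits))]

sparse_softmax([3.0, 1.0, 0.2, 2.5], k=2)  # mass only on indices 0 and 3
```

Zeroing the tail keeps the training signal off implausible vocabulary items, which is how sparsity variants mitigate the overfitting the abstract describes.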

23 pages, 315 KB  
Article
Lucidity of Space and Gendered Performativity in Arabic Digital Literature
by Manal al-Natour
Humanities 2024, 13(5), 112; https://doi.org/10.3390/h13050112 - 2 Sep 2024
Viewed by 2049
Abstract
This article seeks to examine a new trend in Arabic women’s literature that not only aims to forge women’s communities but also creates resistance. Digital media is the mechanism that some Arab women authors employ to implement and foster a self-authority that acknowledges flexible identities in an age of revolutions and search for freedom. As a case study, I examine Ahlam Mosteghanemi’s Nessayne com and Rajaa Alsanea’s novel Girls of Riyadh, both of which originally appeared as compendiums, and Ibrahim Alsaqir’s novel Girls of Riyadh: The Complete Picture, which comes as a literary response to the resistance of cultural and gender establishments. I suggest that the digital realm provides an arena for women to resist oppressive social establishments and that literary works and digital practices like Alsanea’s create spaces of and for resistance. Moreover, Alsanea’s and Mosteghanemi’s works are committed to promoting change in Arab societies, bridging the public and the private sphere by means of digital content. Arab women writers’ sites and blogs address subjects that challenge prevalent gendered structures in the Arab world, deconstruct cultural norms, give visibility to, and focus on, the implications of gender for memory, love, masculinity and femininity, and sexuality. They do so by employing chats as a narrative technique that engages readers and women’s communities in the characters’ experiences, thereby inviting them to participate in making their work a site of challenge to gender and cultural establishments. As Alsanea’s representations of women’s subjectivities are uncommon and her characters defy the notion of the universality of woman as a shared gender, they are prohibited, criticized, and challenged. Those who defy gender performativity, such as Alsanea and Mosteghanemi, enact feminist resistance. The study engages with MENA gender and masculinity literature. It is also informed by Judith Butler’s notion of performativity, the construction of gender, and the demystification of the universalistic notion of “woman”. Full article
21 pages, 2246 KB  
Article
A Novel Rational Medicine Use System Based on Domain Knowledge Graph
by Chaoping Qin, Zhanxiang Wang, Jingran Zhao, Luyi Liu, Feng Xiao and Yi Han
Electronics 2024, 13(16), 3156; https://doi.org/10.3390/electronics13163156 - 9 Aug 2024
Cited by 2 | Viewed by 1512
Abstract
Medication errors, which could often be detected in advance, are a significant cause of patient deaths each year, highlighting the critical importance of medication safety. The rapid advancement of data analysis technologies has made intelligent medication assistance applications possible, and these applications rely heavily on medical knowledge graphs. However, current knowledge graph construction techniques are predominantly focused on general domains, leaving a gap in specialized fields, particularly in the medical domain for medication assistance. The specialized nature of medical knowledge and the distinct distribution of vocabulary between general and biomedical texts pose challenges. Applying general natural language processing techniques directly to the medical domain often results in lower accuracy due to the inadequate utilization of contextual semantics and entity information. To address these issues and enhance knowledge graph production, this paper proposes an optimized model for named entity recognition and relationship extraction in the Chinese medical domain. Key innovations include utilizing Medical Bidirectional Encoder Representations from Transformers (MCBERT) for character-level embeddings pre-trained on Chinese biomedical corpora, employing Bi-directional Gated Recurrent Unit (BiGRU) networks for extracting enriched contextual features, integrating a Conditional Random Field (CRF) layer for optimal label sequence output, using the Piecewise Convolutional Neural Network (PCNN) to capture comprehensive semantic information and fusing it with entity features for better classification accuracy, and implementing a microservices architecture for the medication assistance review system. These enhancements significantly improve the accuracy of entity relationship classification in Chinese medical texts. The model achieved good performance in recognizing most entity types, with an accuracy of 88.3%, a recall rate of 85.8%, and an F1 score of 87.0%. In the relationship extraction stage, the accuracy reached 85.7%, the recall rate 82.5%, and the F1 score 84.0%. Full article
(This article belongs to the Section Computer Science & Engineering)
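The entity scores reported above follow the standard exact-match convention, in which a predicted entity counts only if both its span and its type match the gold annotation. A minimal sketch; the Chinese example entities are invented for illustration:

```python
def prf(gold, pred):
    """Entity-level precision/recall/F1 over sets of (text, type) annotations."""
    tp = len(gold & pred)                       # exact span+type matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

gold = {("阿司匹林", "drug"), ("头痛", "symptom")}   # aspirin, headache
pred = {("阿司匹林", "drug"), ("发热", "symptom")}   # aspirin, fever
prf(gold, pred)  # -> (0.5, 0.5, 0.5)
```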

18 pages, 2938 KB  
Article
Facial Animation Strategies for Improved Emotional Expression in Virtual Reality
by Hyewon Song and Beom Kwon
Electronics 2024, 13(13), 2601; https://doi.org/10.3390/electronics13132601 - 2 Jul 2024
Cited by 4 | Viewed by 3695
Abstract
The portrayal of emotions by virtual characters is crucial in virtual reality (VR) communication. Effective communication in VR relies on a shared understanding, which is significantly enhanced when virtual characters authentically express emotions that align with their spoken words. While human emotions are often conveyed through facial expressions, existing facial animation techniques have mainly focused on lip-syncing and head movements to improve naturalness. This study investigates the influence of various factors in facial animation on the emotional representation of virtual characters. We conduct a comparative and analytical study using an audio-visual database, examining the impact of different animation factors. To this end, we utilize a total of 24 voice samples, representing 12 different speakers, with each emotional voice segment lasting approximately 4–5 s. Using these samples, we design six perceptual experiments to investigate the impact of facial cues—including facial expression, lip movement, head motion, and overall appearance—on the expression of emotions by virtual characters. Additionally, we engaged 20 participants to evaluate and select appropriate combinations of facial expressions, lip movements, head motions, and appearances that align with the given emotion and its intensity. Our findings indicate that emotional representation in virtual characters is closely linked to facial expressions, head movements, and overall appearance. Conversely, lip-syncing, which has been a primary focus in prior studies, seems less critical for conveying emotions, as its accuracy is difficult to perceive with the naked eye. The results of our study can significantly benefit the VR community by aiding in the development of virtual characters capable of expressing a diverse range of emotions. Full article

24 pages, 545 KB  
Article
Neural Architecture Comparison for Bibliographic Reference Segmentation: An Empirical Study
by Rodrigo Cuéllar Hidalgo, Raúl Pinto Elías, Juan-Manuel Torres-Moreno, Osslan Osiris Vergara Villegas, Gerardo Reyes Salgado and Andrea Magadán Salazar
Data 2024, 9(5), 71; https://doi.org/10.3390/data9050071 - 18 May 2024
Viewed by 2204
Abstract
In the realm of digital libraries, efficiently managing and accessing scientific publications necessitates automated bibliographic reference segmentation. This study addresses the challenge of accurately segmenting bibliographic references, a task complicated by the varied formats and styles of references. Focusing on the empirical evaluation of Conditional Random Fields (CRF), Bidirectional Long Short-Term Memory with CRF (BiLSTM + CRF), and Transformer Encoder with CRF (Transformer + CRF) architectures, this research employs Byte Pair Encoding and Character Embeddings for vector representation. The models underwent training on the extensive Giant corpus and subsequent evaluation on the Cora Corpus to ensure a balanced and rigorous comparison, maintaining uniformity across embedding layers, normalization techniques, and Dropout strategies. Results indicate that the BiLSTM + CRF architecture outperforms its counterparts by adeptly handling the syntactic structures prevalent in bibliographic data, achieving an F1-Score of 0.96. This outcome highlights the necessity of aligning model architecture with the specific syntactic demands of bibliographic reference segmentation tasks. Consequently, the study establishes the BiLSTM + CRF model as a superior approach within the current state-of-the-art, offering a robust solution for the challenges faced in digital library management and scholarly communication. Full article
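Byte Pair Encoding, one of the two vector-representation inputs used above, repeatedly merges the most frequent adjacent symbol pair in a corpus. A toy sketch; production BPE operates on word frequency tables and reserves end-of-word markers:

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merges: repeatedly fuse the most frequent adjacent symbol pair."""
    vocab = [list(w) for w in words]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for w in vocab:
            for a, b in zip(w, w[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merges.append(a + b)
        for w in vocab:                       # apply the merge in place
            i = 0
            while i < len(w) - 1:
                if w[i] == a and w[i + 1] == b:
                    w[i:i + 2] = [a + b]
                else:
                    i += 1
    return merges, vocab

merges, vocab = learn_bpe(["low", "lower", "lowest"], num_merges=2)
# after two merges, "low" is a single symbol shared by all three words
```

The learned subword units give reference strings like author names and journal titles compact, reusable token representations before the CRF-topped sequence models segment them.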

13 pages, 2651 KB  
Article
Speech Recognition for Air Traffic Control Utilizing a Multi-Head State-Space Model and Transfer Learning
by Haijun Liang, Hanwen Chang and Jianguo Kong
Aerospace 2024, 11(5), 390; https://doi.org/10.3390/aerospace11050390 - 14 May 2024
Cited by 1 | Viewed by 1987
Abstract
In the present study, a novel end-to-end automatic speech recognition (ASR) framework, namely, ResNeXt-Mssm-CTC, has been developed for air traffic control (ATC) systems. This framework is built upon the Multi-Head State-Space Model (Mssm) and incorporates transfer learning techniques. Residual Networks with Cardinality (ResNeXt) employ multi-layered convolutions with residual connections to augment the extraction of intricate feature representations from speech signals. The Mssm is endowed with specialized gating mechanisms, which incorporate parallel heads that acquire knowledge of both local and global temporal dynamics in sequence data. Connectionist temporal classification (CTC) is utilized in the context of sequence labeling, eliminating the requirement for forced alignment and accommodating labels of varying lengths. Moreover, the utilization of transfer learning has been shown to improve performance on the target task by leveraging knowledge acquired from a source task. The experimental results indicate that the model proposed in this study exhibits superior performance compared to other baseline models. Specifically, when pretrained on the Aishell corpus, the model achieves a minimum character error rate (CER) of 7.2% and 8.3%. Furthermore, when applied to the ATC corpus, the CER is reduced to 5.5% and 6.7%. Full article
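The character error rate (CER) figures above are edit-distance ratios: the minimum number of character substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the reference length. A compact sketch (the ATC phrase in the example is invented):

```python
def levenshtein(a, b):
    """Minimum single-character substitutions, insertions, and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def cer(ref, hyp):
    """Character error rate: edit distance normalized by reference length."""
    return levenshtein(ref, hyp) / len(ref)

cer("turn left heading 210", "turn left heading 270")  # 1 substitution over 21 chars
```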

34 pages, 7984 KB  
Article
Assessment of Coastal Cultural Ecosystem Services and Well-Being for Integrating Stakeholder Values into Coastal Planning
by Kristina Veidemane, Agnese Reke, Anda Ruskule and Ivo Vinogradovs
Land 2024, 13(3), 362; https://doi.org/10.3390/land13030362 - 13 Mar 2024
Cited by 5 | Viewed by 2446
Abstract
Coastal areas provide ecosystem services (ES), including a wide range of cultural ecosystem services (CES). This study aims to operationalize the ES approach for integrated assessment and mapping of coastal CES through the case of the eastern Baltic Sea coast in Latvia. It explores an interdisciplinary approach to enhance coastal planning, leveraging the strengths of plural disciplines to ensure a more holistic representation of coastal CES. A set of methods and techniques from landscape ecology (e.g., landscape characterization, quality assessment, biophysical mapping) and social sciences (participatory GIS, stakeholder engagement events, nationwide survey) are developed and tested, particularly demonstrating links and correlations between landscape character and CES values and well-being dimensions. The results illuminate the main perceived well-being benefits that people gain from the coastal areas, highlighting the different perspectives of stakeholders. Finally, the integrated assessment results helped to construct proposals for sustainable tourism development in the area. The outcomes of the study are intended to assist planners and decision-makers in evaluating the potential for development and trade-offs in coastal regions. This research contributes to the advancement of coastal spatial planning methodologies, emphasizing the importance of stakeholder engagement and ES assessment for informed decision-making. Full article
(This article belongs to the Special Issue Ecological and Cultural Ecosystem Services in Coastal Areas)
