Search Results (6)

Search Parameters:
Keywords = SqueezeBERT

27 pages, 4269 KB  
Article
A Self-Supervised Method for Speaker Recognition in Real Sound Fields with Low SNR and Strong Reverberation
by Xuan Zhang, Jun Tang, Huiliang Cao, Chenguang Wang, Chong Shen and Jun Liu
Appl. Sci. 2025, 15(6), 2924; https://doi.org/10.3390/app15062924 - 7 Mar 2025
Cited by 1 | Viewed by 3034
Abstract
Speaker recognition is essential in smart voice applications for personal identification. Current state-of-the-art techniques primarily focus on ideal acoustic conditions. However, the traditional spectrogram struggles to differentiate between noise, reverberation, and speech. To overcome this challenge, MFCC features can be replaced with the output of a self-supervised learning model. This study introduces a TDNN enhanced with a pre-trained model for robust performance in noisy and reverberant environments, referred to as PNR-TDNN. The PNR-TDNN employs HuBERT as its backbone, while the TDNN is an improved ECAPA-TDNN. The pre-trained model employs the Canopy/Mini Batch k-means++ strategy. In the TDNN architecture, several enhancements are implemented, including a cross-channel fusion mechanism based on Res2Net. Additionally, a non-average attention mechanism is applied to the pooling operation, focusing on the weight information of each channel within the Squeeze-and-Excitation Net. Furthermore, the contribution of individual channels to the pooling of time-domain frames is enhanced by substituting attentive statistics with multi-head attention statistics. Validated on the zhvoice corpus in noisy conditions, the minimized PNR-TDNN demonstrates a 5.19% improvement in EER compared to CAM++. In more challenging environments with noise and reverberation, the minimized PNR-TDNN further improves EER by 3.71% and 9.6%, respectively, and MinDCF by 3.14% and 3.77%, respectively. The proposed method has also been validated on the VoxCeleb1 and cn-celeb_v2 datasets, representing a significant step forward for speaker recognition under challenging conditions. This advancement is particularly crucial for enhancing safety and protecting personal identification in voice-enabled applications.
(This article belongs to the Section Computing and Artificial Intelligence)
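
As a rough illustration of the feature-replacement idea in this abstract (not the authors' implementation), the sketch below extracts frame-level HuBERT representations with Hugging Face Transformers and feeds them to a toy TDNN in place of MFCCs. The checkpoint name, layer sizes, and the average-pooling stand-in for the paper's multi-head attentive statistics pooling are all assumptions.

```python
# Minimal sketch: self-supervised HuBERT features in place of MFCCs,
# feeding a toy TDNN speaker-embedding network. Illustrative only.
import torch
import torch.nn as nn
from transformers import AutoFeatureExtractor, HubertModel

extractor = AutoFeatureExtractor.from_pretrained("facebook/hubert-base-ls960")
hubert = HubertModel.from_pretrained("facebook/hubert-base-ls960")

class TinyTDNN(nn.Module):
    """Toy stand-in for the paper's improved ECAPA-TDNN."""
    def __init__(self, in_dim=768, emb_dim=192):
        super().__init__()
        self.frame = nn.Conv1d(in_dim, 512, kernel_size=5, padding=2)
        # The paper uses multi-head attentive statistics pooling here;
        # average pooling is a simplification.
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.emb = nn.Linear(512, emb_dim)

    def forward(self, feats):                    # feats: (batch, frames, in_dim)
        x = torch.relu(self.frame(feats.transpose(1, 2)))
        return self.emb(self.pool(x).squeeze(-1))

wav = torch.randn(16000)                         # 1 s of fake 16 kHz audio
inputs = extractor(wav.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    feats = hubert(**inputs).last_hidden_state   # (1, frames, 768)
embedding = TinyTDNN()(feats)                    # speaker embedding for cosine scoring
```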

22 pages, 1165 KB  
Article
Advanced Comparative Analysis of Machine Learning and Transformer Models for Depression and Suicide Detection in Social Media Texts
by Biodoumoye George Bokolo and Qingzhong Liu
Electronics 2024, 13(20), 3980; https://doi.org/10.3390/electronics13203980 - 10 Oct 2024
Cited by 5 | Viewed by 4313
Abstract
Depression detection through social media analysis has emerged as a promising approach for early intervention and mental health support. This study evaluates the performance of various machine learning and transformer models in identifying depressive content from tweets on X. Utilizing the Sentiment140 and Suicide-Watch datasets, we built several models, including logistic regression, Bernoulli Naive Bayes, Random Forest, and transformer models such as RoBERTa, DeBERTa, DistilBERT, and SqueezeBERT, to detect this content. Our findings indicate that transformer models outperform traditional machine learning algorithms, with RoBERTa and DeBERTa performing best when predicting depression and suicide risk. This performance is attributed to the transformers’ ability to capture contextual nuances in language. On the other hand, logistic regression outperforms the transformers on another dataset with more accurate information; this is attributed to the traditional model’s ability to learn simple patterns, especially when the classes are straightforward. We employed a comprehensive cross-validation approach to ensure robustness, with the transformers demonstrating higher stability and reliability across splits. Despite limitations such as dataset scope and computational constraints, the findings contribute significantly to mental health monitoring and suggest promising directions for future research and real-world applications in early depression detection and mental health screening tools. Overall, all of the evaluated models performed strongly.
(This article belongs to the Special Issue Information Retrieval and Cyber Forensics with Data Science)
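
For readers who want to reproduce the transformer side of such a comparison, a minimal sketch of fine-tuning SqueezeBERT as a binary classifier with the Hugging Face Trainer follows. The file name, column names, and hyperparameters are placeholders, not the paper's settings.

```python
# Illustrative fine-tuning of SqueezeBERT for binary text classification.
# "tweets.csv" with columns text,label is an assumed placeholder dataset.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("squeezebert/squeezebert-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "squeezebert/squeezebert-uncased", num_labels=2)

ds = load_dataset("csv", data_files="tweets.csv")
ds = ds.map(lambda b: tokenizer(b["text"], truncation=True,
                                padding="max_length", max_length=128),
            batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=32),
    train_dataset=ds["train"],
)
trainer.train()
```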

20 pages, 1421 KB  
Article
Deep Learning-Based Depression Detection from Social Media: Comparative Evaluation of ML and Transformer Techniques
by Biodoumoye George Bokolo and Qingzhong Liu
Electronics 2023, 12(21), 4396; https://doi.org/10.3390/electronics12214396 - 24 Oct 2023
Cited by 34 | Viewed by 16905
Abstract
Detecting depression from user-generated content on social media platforms has garnered significant attention due to its potential for the early identification and monitoring of mental health issues. This paper presents a comprehensive approach for depression detection from user tweets using machine learning techniques. The study utilizes a dataset of 632,000 tweets and employs data preprocessing, feature selection, and model training with logistic regression, Bernoulli Naive Bayes, random forest, DistilBERT, SqueezeBERT, DeBERTa, and RoBERTa models. Evaluation metrics such as accuracy, precision, recall, and F1 score are employed to assess the models’ performance. The results indicate that the RoBERTa model achieves the highest accuracy of 0.981 and the highest mean accuracy of 0.97 across 10 cross-validation folds in detecting depression from tweets. This research demonstrates the effectiveness of machine learning and advanced transformer-based models in leveraging social media data for mental health analysis. The findings offer valuable insights into the potential for early detection and monitoring of depression using online platforms, contributing to the growing field of mental health analysis based on user-generated content.
(This article belongs to the Special Issue AI in Knowledge-Based Information and Decision Support Systems)
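
The classical baselines in this line of work are straightforward to reproduce. Below is an illustrative TF-IDF plus logistic regression pipeline evaluated with 10-fold cross-validation, mirroring the study's evaluation protocol in spirit; the data file and its columns are assumed.

```python
# Illustrative classical baseline: TF-IDF features + logistic regression,
# scored with 10-fold cross-validation. "tweets.csv" is a placeholder.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

df = pd.read_csv("tweets.csv")           # assumed columns: text, label
pipe = make_pipeline(
    TfidfVectorizer(max_features=50000, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
scores = cross_val_score(pipe, df["text"], df["label"], cv=10,
                         scoring="accuracy")
print(f"mean accuracy over 10 folds: {scores.mean():.3f}")
```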

18 pages, 2828 KB  
Article
Automatic Construction of Educational Knowledge Graphs: A Word Embedding-Based Approach
by Qurat Ul Ain, Mohamed Amine Chatti, Komlan Gluck Charles Bakar, Shoeb Joarder and Rawaa Alatrash
Information 2023, 14(10), 526; https://doi.org/10.3390/info14100526 - 27 Sep 2023
Cited by 25 | Viewed by 7065
Abstract
Knowledge graphs (KGs) are widely used in the education domain to offer learners a semantic representation of domain concepts from educational content and their relations, termed educational knowledge graphs (EduKGs). Previous studies on EduKGs have incorporated concept extraction and weighting modules; however, these studies face limitations in terms of accuracy and performance. To address these challenges, this work improves the concept extraction and weighting mechanisms by leveraging state-of-the-art word and sentence embedding techniques. Concretely, we enhance the SIFRank keyphrase extraction method by using SqueezeBERT, and we propose a concept-weighting strategy based on SBERT. Furthermore, we conduct extensive experiments on different datasets, demonstrating significant improvements over several state-of-the-art keyphrase extraction and concept-weighting techniques.
(This article belongs to the Special Issue Semantic Interoperability and Knowledge Building)
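
A minimal sketch of the SBERT-based concept-weighting idea (not the paper's code): each extracted concept is scored by the cosine similarity between its sentence embedding and the embedding of the learning material. The model checkpoint and example texts are illustrative assumptions.

```python
# Illustrative SBERT concept weighting: rank candidate concepts by embedding
# similarity to the source educational content. Model and texts are stand-ins.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # stand-in SBERT model

slide_text = "Gradient descent iteratively updates parameters to minimize a loss."
concepts = ["gradient descent", "loss function", "photosynthesis"]

doc_emb = model.encode(slide_text, convert_to_tensor=True)
con_embs = model.encode(concepts, convert_to_tensor=True)
weights = util.cos_sim(con_embs, doc_emb).squeeze(-1)    # one weight per concept

for concept, w in sorted(zip(concepts, weights.tolist()), key=lambda t: -t[1]):
    print(f"{concept}: {w:.3f}")
```

Off-topic concepts (here, "photosynthesis") receive low weights, which is what lets the weighting step filter noisy keyphrase candidates.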

22 pages, 724 KB  
Article
Semantic Interest Modeling and Content-Based Scientific Publication Recommendation Using Word Embeddings and Sentence Encoders
by Mouadh Guesmi, Mohamed Amine Chatti, Lamees Kadhim, Shoeb Joarder and Qurat Ul Ain
Multimodal Technol. Interact. 2023, 7(9), 91; https://doi.org/10.3390/mti7090091 - 15 Sep 2023
Cited by 2 | Viewed by 4103
Abstract
The fast growth of data in the academic field has contributed to making recommendation systems for scientific papers more popular. Content-based filtering (CBF), a pivotal technique in recommender systems (RS), holds particular significance in the realm of scientific publication recommendations. In a content-based scientific publication RS, recommendations are generated by matching the features of users and papers. Content-based recommendation encompasses three primary steps, namely item representation, user modeling, and recommendation generation. A crucial part of generating recommendations is the user modeling process; nevertheless, this step is often neglected in existing content-based scientific publication RS, and most existing approaches do not capture the semantics of user models and papers. To address these limitations, in this paper we present the transparent Recommendation and Interest Modeling Application (RIMA), a content-based scientific publication RS that implicitly derives user interest models from the papers users have authored. To address the semantic issues, RIMA combines word embedding-based keyphrase extraction techniques with knowledge bases to generate semantically enriched user interest models, and additionally leverages pre-trained transformer sentence encoders to represent user models and papers and compute their similarities. The effectiveness of our approach was assessed through an offline evaluation with extensive experiments on various datasets, along with a user study (N = 22), demonstrating that (a) combining SIFRank and SqueezeBERT as an embedding-based keyphrase extraction method with DBpedia as a knowledge base improved the quality of the user interest modeling step, and (b) using the msmarco-distilbert-base-tas-b sentence transformer model achieved better results in the recommendation generation step.
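
The recommendation-generation step can be approximated in a few lines of sentence-transformers code: encode the user interest model and candidate papers with the msmarco-distilbert-base-tas-b model named in the abstract and rank by dot-product similarity, the function TAS-B was trained with. The interest string and paper list below are made up for illustration.

```python
# Illustrative recommendation scoring with the TAS-B sentence encoder
# named in the abstract. Interest text and papers are fabricated examples.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-tas-b")

user_interests = "knowledge graphs, keyphrase extraction, recommender systems"
papers = [
    "A survey of content-based filtering for scientific literature.",
    "Deep reinforcement learning for robotic grasping.",
]

user_emb = model.encode(user_interests, convert_to_tensor=True)
paper_embs = model.encode(papers, convert_to_tensor=True)
# TAS-B is trained for dot-product similarity rather than cosine.
scores = util.dot_score(user_emb, paper_embs).squeeze(0)

for paper, score in sorted(zip(papers, scores.tolist()), key=lambda t: -t[1]):
    print(f"{score:.2f}  {paper}")
```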

10 pages, 403 KB  
Article
Performance Study on Extractive Text Summarization Using BERT Models
by Shehab Abdel-Salam and Ahmed Rafea
Information 2022, 13(2), 67; https://doi.org/10.3390/info13020067 - 28 Jan 2022
Cited by 78 | Viewed by 14444
Abstract
The task of summarization can be categorized into two approaches: extractive and abstractive. Extractive summarization selects the salient sentences from the original document to form a summary, while abstractive summarization interprets the original document and generates the summary in its own words. The task of generating a summary, whether extractive or abstractive, has been studied with different approaches in the literature, including statistical-, graph-, and deep learning-based approaches. Deep learning has achieved promising performance in comparison to the classical approaches, and with the advancement of neural architectures such as the attention network (commonly known as the transformer), there are potential areas of improvement for the summarization task. The introduction of the transformer architecture and its encoder model “BERT” produced improved performance on downstream NLP tasks. BERT is a bidirectional encoder representation from a transformer, modeled as a stack of encoders. BERT comes in different sizes, such as BERT-base with 12 encoders and BERT-large with 24, but we focus on BERT-base for the purpose of this study. The objective of this paper is to study the performance of variants of BERT-based models on text summarization through a series of experiments, and to propose “SqueezeBERTSum”, a summarization model fine-tuned with the SqueezeBERT encoder variant, which achieved competitive ROUGE scores, retaining 98% of the BERTSum baseline model’s performance with 49% fewer trainable parameters.
(This article belongs to the Special Issue Novel Methods and Applications in Natural Language Processing)
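
Fine-tuning a BERTSum-style model is beyond a short snippet, but a minimal extractive baseline in the same spirit embeds each sentence with SqueezeBERT (mean pooling over tokens) and keeps the sentences closest to the document centroid. Everything below is illustrative, not the paper's pipeline.

```python
# Illustrative extractive baseline: SqueezeBERT sentence embeddings,
# centroid-similarity scoring, top-k sentence selection.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("squeezebert/squeezebert-uncased")
enc = AutoModel.from_pretrained("squeezebert/squeezebert-uncased")

sentences = [
    "The committee approved the budget on Monday.",
    "Spending on roads will rise by ten percent.",
    "The mayor thanked the volunteers afterwards.",
]

batch = tok(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    hidden = enc(**batch).last_hidden_state            # (sents, tokens, dim)
mask = batch["attention_mask"].unsqueeze(-1)
sent_embs = (hidden * mask).sum(1) / mask.sum(1)       # mean over real tokens
centroid = sent_embs.mean(0, keepdim=True)
scores = torch.cosine_similarity(sent_embs, centroid)  # relevance per sentence
summary = [sentences[i] for i in scores.topk(2).indices.sort().values]
print(" ".join(summary))
```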
