
1,303 Results Found

  • Article
  • Open Access
430 Views
34 Pages

Integration of Road Data Collected Using LSB Audio Steganography

  • Adam Stančić,
  • Ivan Grgurević,
  • Marko Matulin and
  • Marko Periša

Modern traffic-monitoring systems increasingly rely on supplemental analytical data to complement video recordings, yet such data are rarely integrated into video containers without altering the original footage. This paper proposes a lightweight aud...

  • Article
  • Open Access
16 Citations
3,786 Views
23 Pages

Direct Spread Spectrum Technology for Data Hiding in Audio

  • Alexandr Kuznetsov,
  • Alexander Onikiychuk,
  • Olga Peshkova,
  • Tomasz Gancarczyk,
  • Kornel Warwas and
  • Ruslana Ziubina

19 April 2022

Direct spread spectrum technology is traditionally used in radio communication systems with multiple access, for example, in CDMA standards, in global satellite navigation systems, in Wi-Fi network wireless protocols, etc. It ensures high security an...

  • Article
  • Open Access
57 Citations
8,074 Views
24 Pages

5 January 2022

Audio-visual emotion recognition is the study of identifying human emotional states by combining the audio modality and the visual modality simultaneously, which plays an important role in intelligent human-machine interactions. With the help of d...

  • Communication
  • Open Access
8 Citations
4,016 Views
9 Pages

13 January 2022

This paper proposes an audio data augmentation method based on deep learning in order to improve the performance of dereverberation. Conventionally, audio data are augmented using a room impulse response, which is artificially generated by some metho...

  • Article
  • Open Access
4 Citations
2,774 Views
16 Pages

12 October 2019

In the data-hiding field, it is mandatory that proposed schemes are key-secured as required by Kerckhoffs’s principle. Moreover, perceptual transparency must be guaranteed. On the other hand, volumetric attack is of special interest in audio...

  • Article
  • Open Access
2 Citations
4,362 Views
13 Pages

This article describes the development of the digital infrastructure at a research data centre for audio-visual linguistic research data, the Hamburg Centre for Language Corpora (HZSK) at the University of Hamburg in Germany, over the past ten years....

  • Article
  • Open Access
5 Citations
2,894 Views
13 Pages

29 June 2022

Understanding of the perception of emotions or affective states in humans is important to develop emotion-aware systems that work in realistic scenarios. In this paper, the perception of emotions in naturalistic human interaction (audio–visual...

  • Article
  • Open Access
21 Citations
4,612 Views
22 Pages

Discovering Speed Changes of Vehicles from Audio Data

  • Elżbieta Kubera,
  • Alicja Wieczorkowska,
  • Andrzej Kuranc and
  • Tomasz Słowik

11 July 2019

In this paper, we focus on detection of speed changes from audio data, representing recordings of cars passing a microphone placed near the road. The goal of this work is to observe the behavior of drivers near control points, in order to check wheth...

  • Article
  • Open Access
1 Citation
2,544 Views
17 Pages

Secure Audio-Visual Data Exchange for Android In-Vehicle Ecosystems

  • Alfred Anistoroaei,
  • Adriana Berdich,
  • Patricia Iosif and
  • Bogdan Groza

6 October 2021

Mobile device pairing inside vehicles is a ubiquitous task which requires easy-to-use and secure solutions. In this work we exploit the audio-video domain for pairing devices inside vehicles. In principle, we rely on the widely used elliptical curve...

  • Article
  • Open Access
16 Citations
8,470 Views
19 Pages

Open Set Audio Classification Using Autoencoders Trained on Few Data

  • Javier Naranjo-Alcazar,
  • Sergi Perez-Castanos,
  • Pedro Zuccarello,
  • Fabio Antonacci and
  • Maximo Cobos

3 July 2020

Open-set recognition (OSR) is a challenging machine learning problem that appears when classifiers are faced with test instances from classes not seen during training. It can be summarized as the problem of correctly identifying instances from a know...

  • Article
  • Open Access
6 Citations
4,248 Views
19 Pages

Parallel Simulation of Audio- and Radio-Magnetotelluric Data

  • Nikolay Yavich,
  • Mikhail Malovichko and
  • Arseny Shlykov

31 December 2019

This paper presents a novel numerical method for simulating controlled-source audio-magnetotellurics (CSAMT) and radio-magnetotellurics (CSRMT) data. These methods are widely used in mineral exploration. Interpretation of the CSAMT and CSRMT data col...

  • Article
  • Open Access
14 Citations
6,358 Views
15 Pages

Web Radio Automation for Audio Stream Management in the Era of Big Data

  • Nikolaos Vryzas,
  • Nikolaos Tsipas and
  • Charalampos Dimoulas

11 April 2020

Radio is evolving in a changing digital media ecosystem. Audio-on-demand has shaped the landscape of big unstructured audio data available online. In this paper, a framework for knowledge extraction is introduced, to improve discoverability and enric...

  • Article
  • Open Access
985 Views
17 Pages

Assessment of the Interdependencies Between High-Speed Videoendoscopy and Simultaneously Recorded Audio Data in Various Glottal Pathologies

  • Magdalena M. Pietrzak,
  • Wioletta Pietruszewska,
  • Magda Barańska,
  • Aleksander Rycerz,
  • Konrad Stawiski and
  • Ewa Niebudek-Bogusz

Background: This study aimed to investigate the relationships between kymographic parameters derived from high-speed videoendoscopy (HSV) and simultaneously recorded acoustic signals. The research provides insights into the vibratory dynamics of vari...

  • Article
  • Open Access
1 Citation
2,398 Views
11 Pages

24 December 2020

The high-dimensional data recognition problem based on the Gaussian mixture model has useful applications in many areas, such as audio signal recognition, image analysis, and biological evolution. The expectation-maximization algorithm is a popular approa...

  • Article
  • Open Access
26 Citations
9,885 Views
19 Pages

Multimodal Sensing for Depression Risk Detection: Integrating Audio, Video, and Text Data

  • Zhenwei Zhang,
  • Shengming Zhang,
  • Dong Ni,
  • Zhaoguo Wei,
  • Kongjun Yang,
  • Shan Jin,
  • Gan Huang,
  • Zhen Liang,
  • Li Zhang and
  • Jianhong Wang
  • + 3 authors

7 June 2024

Depression is a major psychological disorder with a growing impact worldwide. Traditional methods for detecting the risk of depression, predominantly reliant on psychiatric evaluations and self-assessment questionnaires, are often criticized for thei...

  • Article
  • Open Access
15 Citations
4,058 Views
18 Pages

Evolutionary Detection Accuracy of Secret Data in Audio Steganography for Securing 5G-Enabled Internet of Things

  • Mohammed J. Alhaddad,
  • Monagi H. Alkinani,
  • Mohammed Salem Atoum and
  • Alaa Abdulsalm Alarood

14 December 2020

With the unprecedented growing demand for communication between Internet of Things (IoT) devices, the upcoming 5G and 6G technologies will pave the path to a widespread use of ultra-reliable low-latency applications in such networks. However, with mo...

  • Article
  • Open Access
8 Citations
5,934 Views
20 Pages

Whispered Speech Recognition Based on Audio Data Augmentation and Inverse Filtering

  • Jovan Galić,
  • Branko Marković,
  • Đorđe Grozdić,
  • Branislav Popović and
  • Slavko Šajić

12 September 2024

Modern Automatic Speech Recognition (ASR) systems are primarily designed to recognize normal speech. Due to a considerable acoustic mismatch between normal speech and whisper, ASR systems suffer from a significant loss of performance in whisper recog...

  • Article
  • Open Access
12 Citations
3,229 Views
14 Pages

28 January 2023

Chronic obstructive pulmonary disease (COPD) concerns the serious decline of human lung functions. It has emerged as one of the most concerning health conditions around the world over the last two decades, after cancer. The early diagnosis of COP...

  • Communication
  • Open Access
10 Citations
7,654 Views
12 Pages

3 November 2021

Distinguishing between a dangerous audio event like a gun firing and other non-life-threatening events, such as a plastic bag bursting, can mean the difference between life and death and, therefore, the necessary and unnecessary deployment of public...

  • Article
  • Open Access
1,970 Views
18 Pages

Neural audio reconstruction is an important subtopic of Neural Audio Synthesis (NAS), which is a current emerging topic of modern Artificial Intelligence (AI) applications. The objective of a neural audio reconstruction model is to achieve a viable a...

  • Article
  • Open Access
2 Citations
2,400 Views
25 Pages

14 August 2023

During earthworks, monitoring and controlling the actual productivity of construction machines enables insight into the progress of tasks, calculation of expected duration and costs, favorable use and allocation of machines, and the application of ap...

  • Article
  • Open Access
869 Views
28 Pages

AI-based audio generation has advanced rapidly, enabling deepfake audio to reach levels of naturalness that closely resemble real recordings and complicate the distinction between authentic and synthetic signals. While numerous CNN- and Transformer-b...

  • Data Descriptor
  • Open Access
6,596 Views
18 Pages

9 December 2024

The detection of human activities is an important step in automated systems to understand the context of given situations. It can be useful for applications like healthcare monitoring, smart homes, and energy management systems for buildings. To achi...

  • Article
  • Open Access
126 Citations
12,735 Views
14 Pages

Automatic Detection and Recognition of Pig Wasting Diseases Using Sound Data in Audio Surveillance Systems

  • Yongwha Chung,
  • Seunggeun Oh,
  • Jonguk Lee,
  • Daihee Park,
  • Hong-Hee Chang and
  • Suk Kim

25 September 2013

Automatic detection of pig wasting diseases is an important issue in the management of group-housed pigs. Further, respiratory diseases are one of the main causes of mortality among pigs and loss of productivity in intensive pig farming. In this stud...

  • Article
  • Open Access
32 Citations
8,966 Views
16 Pages

18 August 2018

In training a deep learning system to perform audio transcription, two practical problems may arise. Firstly, most datasets are weakly labelled, having only a list of events present in each recording without any temporal information for training. Sec...

  • Article
  • Open Access
7 Citations
3,781 Views
21 Pages

Energy-Efficient Audio Processing at the Edge for Biologging Applications

  • Jonathan Miquel,
  • Laurent Latorre and
  • Simon Chamaillé-Jammes

Biologging refers to the use of animal-borne recording devices to study wildlife behavior. In the case of audio recording, such devices generate large amounts of data over several months, and thus require some level of processing automation for the r...

  • Article
  • Open Access
682 Views
20 Pages

17 August 2025

We propose a novel Bayesian wavelet regression approach using a three-component spike-and-slab prior for wavelet coefficients, combining a point mass at zero, a moment (MOM) prior, and an inverse moment (IMOM) prior. This flexible prior supports smal...

  • Data Descriptor
  • Open Access
1 Citation
2,319 Views
8 Pages

31 August 2024

Despite the widespread development and use of chatbots, there is a lack of audio-based interruption datasets. This study provides a dataset of 200 manually annotated interruptions from a broader set of 355 data points of overlapping utterances. The d...

  • Article
  • Open Access
10 Citations
3,491 Views
20 Pages

Novel Method for Detecting Coughing Pigs with Audio-Visual Multimodality for Smart Agriculture Monitoring

  • Heechan Chae,
  • Junhee Lee,
  • Jonggwan Kim,
  • Sejun Lee,
  • Jonguk Lee,
  • Yongwha Chung and
  • Daihee Park

12 November 2024

While the pig industry is crucial in global meat consumption, accounting for 34% of total consumption, respiratory diseases in pigs can cause substantial economic losses to pig farms. To alleviate this issue, we propose an advanced audio-visual monit...

  • Article
  • Open Access
8 Citations
2,429 Views
16 Pages

DCNN for Pig Vocalization and Non-Vocalization Classification: Evaluate Model Robustness with New Data

  • Vandet Pann,
  • Kyeong-seok Kwon,
  • Byeonghyeon Kim,
  • Dong-Hwa Jang and
  • Jong-Bok Kim

9 July 2024

Since pig vocalization is an important indicator of monitoring pig conditions, pig vocalization detection and recognition using deep learning play a crucial role in the management and welfare of modern pig livestock farming. However, collecting pig s...

  • Article
  • Open Access
136 Citations
16,641 Views
12 Pages

Fault Detection and Diagnosis of Railway Point Machines by Sound Analysis

  • Jonguk Lee,
  • Heesu Choi,
  • Daihee Park,
  • Yongwha Chung,
  • Hee-Young Kim and
  • Sukhan Yoon

16 April 2016

Railway point devices act as actuators that provide different routes to trains by driving switchblades from the current position to the opposite one. Point failure can significantly affect railway operations, with potentially disastrous consequences....

  • Review
  • Open Access
17 Citations
7,523 Views
25 Pages

Recent Advances in Synthesis and Interaction of Speech, Text, and Vision

  • Laura Orynbay,
  • Bibigul Razakhova,
  • Peter Peer,
  • Blaž Meden and
  • Žiga Emeršič

In recent years, there has been increasing interest in the conversion of images into audio descriptions. This is a field that lies at the intersection of Computer Vision (CV) and Natural Language Processing (NLP), and it involves various tasks, inclu...

  • Article
  • Open Access
6 Citations
3,155 Views
22 Pages

Addressing Power Issues in Biologging: An Audio/Inertial Recorder Case Study

  • Jonathan Miquel,
  • Laurent Latorre and
  • Simon Chamaillé-Jammes

26 October 2022

In the past decades, biologging, i.e., the development and deployment of animal-borne loggers, has revolutionized ecology. Despite recent advances, power consumption and battery size however remain central issues and limiting factors, constraining th...

  • Article
  • Open Access
11 Citations
9,486 Views
18 Pages

Social Network Extraction and Analysis Based on Multimodal Dyadic Interaction

  • Sergio Escalera,
  • Xavier Baró,
  • Jordi Vitrià,
  • Petia Radeva and
  • Bogdan Raducanu

7 February 2012

Social interactions are a very important component in people’s lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore...

  • Article
  • Open Access
94 Citations
10,692 Views
18 Pages

An Ensemble of Convolutional Neural Networks for Audio Classification

  • Loris Nanni,
  • Gianluca Maguolo,
  • Sheryl Brahnam and
  • Michelangelo Paci

22 June 2021

Research in sound classification and recognition is rapidly advancing in the field of pattern recognition. One important area in this field is environmental sound recognition, whether it concerns the identification of endangered species in different...

  • Article
  • Open Access
25 Citations
4,517 Views
18 Pages

19 May 2021

In 2014, we designed and implemented BeePi, a multi-sensor electronic beehive monitoring system. Since then we have been using BeePi monitors deployed at different apiaries in northern Utah to design audio, image, and video processing algorithms to a...

  • Article
  • Open Access
877 Views
23 Pages

28 November 2025

The increasing use of machine learning models has amplified the demand for high-quality, large-scale multimodal datasets. However, the availability of such datasets, especially those combining acoustic, visual, and textual data, remains limited. This...

  • Article
  • Open Access
7 Citations
4,719 Views
25 Pages

Bat2Web: A Framework for Real-Time Classification of Bat Species Echolocation Signals Using Audio Sensor Data

  • Taslim Mahbub,
  • Azadan Bhagwagar,
  • Priyanka Chand,
  • Imran Zualkernan,
  • Jacky Judas and
  • Dana Dghaym

1 May 2024

Bats play a pivotal role in maintaining ecological balance, and studying their behaviors offers vital insights into environmental health and aids in conservation efforts. Determining the presence of various bat species in an environment is essential...

  • Article
  • Open Access
6 Citations
3,536 Views
19 Pages

24 May 2018

Over the past decades, hardware and software technologies for wireless sensor networks (WSNs) have significantly progressed, and WSNs are widely used in various areas including Internet of Things (IoT). In general, existing WSNs are mainly used for a...

  • Article
  • Open Access
14 Citations
4,584 Views
17 Pages

5 November 2021

Adding network connectivity to any “thing” can certainly provide great value, but it also brings along potential cybersecurity risks. To fully benefit from the Internet of Things “IoT” system’s capabilities, the validity and accuracy of transmitted d...

  • Article
  • Open Access
2 Citations
4,215 Views
26 Pages

An Ensemble of Convolutional Neural Networks for Sound Event Detection

  • Abdinabi Mukhamadiyev,
  • Ilyos Khujayarov,
  • Dilorom Nabieva and
  • Jinsoo Cho

Sound event detection tasks are rapidly advancing in the field of pattern recognition, and deep learning methods are particularly well suited for such tasks. One of the important directions in this field is to detect the sounds of emotional events ar...

  • Feature Paper
  • Article
  • Open Access
34 Citations
4,936 Views
14 Pages

28 November 2022

In this paper, an automatic speech emotion recognition (SER) task of classifying eight different emotions was carried out using parallel-based networks trained on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). A combination...

  • Review
  • Open Access
104 Citations
20,786 Views
32 Pages

Data Augmentation and Deep Learning Methods in Sound Classification: A Systematic Review

  • Olusola O. Abayomi-Alli,
  • Robertas Damaševičius,
  • Atika Qazi,
  • Mariam Adedoyin-Olowe and
  • Sanjay Misra

18 November 2022

The aim of this systematic literature review (SLR) is to identify and critically evaluate current research advancements with respect to small data and the use of data augmentation methods to increase the amount of data available for deep learning cla...

  • Article
  • Open Access
1,163 Views
14 Pages

12 September 2025

Introduction: Advanced identification and intervention for Congenital Heart Defects (CHDs) in pediatric populations are crucial, as approximately 1% of neonates worldwide present with these conditions. Traditional methods of diagnosing CHDs often rel...

  • Article
  • Open Access
14 Citations
3,300 Views
14 Pages

WINkNN: Windowed Intervals’ Number kNN Classifier for Efficient Time-Series Applications

  • Chris Lytridis,
  • Anna Lekova,
  • Christos Bazinas,
  • Michail Manios and
  • Vassilis G. Kaburlasos

13 March 2020

Our interest is in time-series classification regarding cyber–physical systems (CPSs) with emphasis on human-robot interaction. We propose an extension of the k nearest neighbor (kNN) classifier to time-series classification using intervals’ numbers...

  • Article
  • Open Access
33 Citations
5,835 Views
23 Pages

Comparison of Smoothing Filters’ Influence on Quality of Data Recorded with the Emotiv EPOC Flex Brain–Computer Interface Headset during Audio Stimulation

  • Natalia Browarska,
  • Aleksandra Kawala-Sterniuk,
  • Jaroslaw Zygarlicki,
  • Michal Podpora,
  • Mariusz Pelc,
  • Radek Martinek and
  • Edward Jacek Gorzelańczyk

13 January 2021

Off-the-shelf, consumer-grade EEG equipment is nowadays becoming the first-choice equipment for many scientists when it comes to recording brain waves for research purposes. On one hand, this is perfectly understandable due to its availability and re...

  • Article
  • Open Access
4 Citations
5,895 Views
9 Pages

From Waste Plastics to Carbon Nanotube Audio Cables

  • Varun Shenoy Gangoli,
  • Tim Yick,
  • Fang Bian and
  • Alvin Orbaek White

25 January 2022

Carbon nanotubes (CNTs) have long been at the forefront of materials research, with applications ranging from composites for increased tensile strength in construction and sports equipment to transistor switches and solar cell electrodes in energy ap...

  • Article
  • Open Access
8 Citations
7,834 Views
16 Pages

29 September 2024

SecureVision is an advanced and trustworthy deepfake detection system created to tackle the growing threat of ‘deepfake’ movies that tamper with media, undermine public trust, and jeopardize cybersecurity. We present a novel approach that...

  • Article
  • Open Access
3 Citations
2,835 Views
18 Pages

17 November 2021

The break-up of the supercontinent Rodinia in the late Neoproterozoic led to the formation of the Nanhua rift basin within the South China Block. The Datangpo-type manganese deposit, which developed in the Nanhua rift basin, is one of the most import...

  • Article
  • Open Access
22 Citations
3,081 Views
31 Pages

Respiratory Condition Detection Using Audio Analysis and Convolutional Neural Networks Optimized by Modified Metaheuristics

  • Nebojsa Bacanin,
  • Luka Jovanovic,
  • Ruxandra Stoean,
  • Catalin Stoean,
  • Miodrag Zivkovic,
  • Milos Antonijevic and
  • Milos Dobrojevic

18 May 2024

Respiratory conditions have been a focal point in recent medical studies. Early detection and timely treatment are crucial factors in improving patient outcomes for any medical condition. Traditionally, doctors diagnose respiratory conditions through...
