This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

A Study of Deep Learning Models for Audio Classification of Infant Crying in a Baby Monitoring System

by Denisa Maria Herlea, Bogdan Iancu and Eugen-Richard Ardelean *
Computer Science Department, Technical University of Cluj-Napoca, 400114 Cluj-Napoca, Romania
* Author to whom correspondence should be addressed.
Informatics 2025, 12(2), 50; https://doi.org/10.3390/informatics12020050 (registering DOI)
Submission received: 12 February 2025 / Revised: 25 April 2025 / Accepted: 15 May 2025 / Published: 16 May 2025
(This article belongs to the Section Machine Learning)

Abstract

This study investigates the ability of well-known deep learning models, such as ResNet and EfficientNet, to perform audio-based infant cry detection. By comparing the performance of different models, it seeks to determine the most effective approach for detecting infant crying, with the aim of enhancing the functionality of baby monitoring systems. Accurately detecting a baby’s cries is crucial for ensuring the baby’s safety and well-being, a concern shared by new and expecting parents worldwide. Despite advancements in child health, reflected in UNICEF’s 2022 report of the lowest child mortality rate ever recorded, there is still room for technological improvement. This paper presents a comprehensive evaluation of deep learning models for infant cry detection, analyzing the performance of various architectures on spectrogram and MFCC (Mel-frequency cepstral coefficient) feature representations. A key focus is the comparison between pretrained and non-pretrained models and their ability to generalize across diverse audio environments. In extensive experiments, ResNet50 and DenseNet trained on spectrograms emerged as the most effective architectures, significantly outperforming the other models in classification accuracy. The study also investigates the impact of feature extraction techniques, dataset augmentation, and model fine-tuning, providing deeper insight into the role of representation learning in audio classification. The findings contribute to the growing field of audio-based deep learning applications by offering a detailed comparative study of model architectures, feature representations, and training strategies for infant cry detection.
Keywords: deep learning; convolutional neural network; classification; infant crying; ResNet; EfficientNet

