A Comparative Investigation of Cepstral Feature Extraction Methods for Deepfake Speech Detection

Akıncı, Nida; Özbay, Erdal

doi:10.3390/app16136707

This is an early access version; the complete PDF, HTML, and XML versions will be available soon.

Open Access Article

by Nida Akıncı and Erdal Özbay *

Department of Computer Engineering, Firat University, 23119 Elazig, Türkiye

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(13), 6707; https://doi.org/10.3390/app16136707 (registering DOI)

Submission received: 1 June 2026 / Revised: 27 June 2026 / Accepted: 2 July 2026 / Published: 4 July 2026


Abstract

The widespread adoption of voice-based authentication systems has been accompanied by an escalating threat from deep learning-based synthetic speech generation techniques. This study presents a comparative and experimental investigation of cepstral feature extraction methods for deepfake speech detection. Specifically, Mel-Frequency Cepstral Coefficients (MFCC), Linear-Frequency Cepstral Coefficients (LFCC), and Constant-Q Cepstral Coefficients (CQCC) are systematically evaluated with respect to their frequency scaling characteristics, spectral resolution properties, and capacity to capture artifacts specific to synthetic speech production. Experiments were conducted on 5571 audio samples drawn from the ASVspoof 2021 Logical Access evaluation partition, with all methods assessed under identical classification conditions using a linear Support Vector Machine. Results indicate that CQCC attains the highest numerical performance, achieving 83.59% accuracy, 89.15% ROC-AUC, and 15.83% Equal Error Rate (EER); however, the performance difference between MFCC and CQCC does not reach statistical significance (p = 0.202). Five-fold cross-validation corroborates this finding (CQCC: 87.89% ± 0.81%). McNemar’s test confirms that the performance difference between LFCC and CQCC is statistically significant (p = 0.036). A fine-grained attack-wise analysis across 13 spoofing systems reveals that no single feature representation consistently outperforms the others across all attack types; CQCC achieves the highest accuracy on 6 out of 13 systems, while MFCC remains competitive on several attack categories. The overall findings indicate that deepfake detection performance is highly sensitive not only to the classifier architecture but also to the choice of frequency scale, cepstral transformation design, and data conditions. 
The results also provide empirical motivation for multi-feature strategies that integrate complementary frequency representations and may offer more robust and generalizable detection.
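
Two of the quantities reported above, the Equal Error Rate and the McNemar p-value, can be illustrated with a minimal Python sketch. The abstract does not specify how either was computed, so everything here is an assumption: the function names `equal_error_rate` and `mcnemar_exact_p` are hypothetical, higher scores are taken to mean "more likely bona fide", the EER is approximated at the threshold where the two error rates are closest, and the exact binomial variant of McNemar's test is used. The study itself may have used an interpolated EER or a chi-square variant of the test.

```python
from math import comb


def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate EER: the operating point where FAR and FRR are closest.

    Assumed convention: a higher score means "more likely bona fide".
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, best_eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: bona fide trials scoring below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed trials scoring at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    b: samples that classifier A labels correctly and classifier B does not.
    c: samples that classifier B labels correctly and classifier A does not.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) lower tail up to k, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Toy scores for illustration only; these are not data from the study.
    print(equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1]))
    print(mcnemar_exact_p(1, 9))
```

Only the discordant pairs enter McNemar's test, which is why two feature sets with similar overall accuracy can still differ significantly if they fail on different samples.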

Keywords: deepfake speech detection; anti-spoofing countermeasure; constant-Q transform; cepstral analysis

