Article

NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals

1 Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78712, USA
2 Department of Neurology, University of Texas at Austin, Austin, TX 78712, USA
3 Department of Psychology, University of Texas at Austin, Austin, TX 78712, USA
4 MEG Lab, Dell Children’s Medical Center, Austin, TX 78723, USA
5 Electrical Engineering, University of Texas at Dallas, Richardson, TX 75080, USA
6 Communication Sciences and Disorders, University of Texas at Austin, Austin, TX 78712, USA
* Author to whom correspondence should be addressed.
Sensors 2020, 20(8), 2248; https://doi.org/10.3390/s20082248
Received: 19 March 2020 / Revised: 11 April 2020 / Accepted: 14 April 2020 / Published: 16 April 2020
(This article belongs to the Special Issue Electromagnetic Sensors for Biomedical Applications)
Neural speech decoding-driven brain-computer interface (BCI), or speech-BCI, is a novel paradigm for exploring communication restoration for locked-in (fully paralyzed but aware) patients. Speech-BCIs aim to map neural signals directly to text or speech, which has the potential for a higher communication rate than current BCIs. Although recent progress has demonstrated the potential of speech-BCIs from either invasive or non-invasive neural signals, most systems developed so far still assume that the onset and offset of the speech utterances within the continuous neural recordings are known. This lack of real-time voice/speech activity detection (VAD) is an obstacle for future applications of neural speech decoding in which BCI users can have a continuous conversation with other speakers. To address this issue, we attempted to automatically detect voice/speech activity directly from neural signals recorded using magnetoencephalography (MEG). First, we classified whole segments of pre-speech, speech, and post-speech in the neural signals using a support vector machine (SVM). Second, for continuous prediction, we used a long short-term memory-recurrent neural network (LSTM-RNN) to decode the voice activity at each time point via its sequential pattern-learning mechanism. Experimental results demonstrated the possibility of real-time VAD directly from non-invasive neural signals with about 88% accuracy.
Keywords: brain-computer interface; MEG; wavelet; LSTM-RNN; SVM; VAD; speech-BCI
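
The abstract describes decoding voice activity at each time point, which implies that speech onset and offset can be read off the per-sample predictions. The following is a minimal sketch of that post-processing step only, not the authors' implementation: it assumes a hypothetical binary label per sample (1 = speech, 0 = non-speech) and a hypothetical sampling rate, and the function names `segments_from_labels` and `frame_accuracy` are introduced here purely for illustration. How the labels are produced (for example, by an LSTM-RNN) is outside the scope of the sketch.

```python
from typing import List, Tuple


def segments_from_labels(labels: List[int], sample_rate_hz: float) -> List[Tuple[float, float]]:
    """Return (onset_s, offset_s) for every run of 1s in a per-sample label sequence.

    `labels` and `sample_rate_hz` are illustrative assumptions, not values from the article.
    """
    segments = []
    start = None  # index where the current speech run began, if any
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i  # speech onset
        elif label != 1 and start is not None:
            segments.append((start / sample_rate_hz, i / sample_rate_hz))  # speech offset
            start = None
    if start is not None:
        # The sequence ended while speech was still active.
        segments.append((start / sample_rate_hz, len(labels) / sample_rate_hz))
    return segments


def frame_accuracy(predicted: List[int], reference: List[int]) -> float:
    """Fraction of samples whose predicted label matches the reference label."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Toy example with made-up labels sampled at a hypothetical 4 Hz.
reference = [0, 0, 1, 1, 1, 1, 0, 0]
predicted = [0, 1, 1, 1, 1, 0, 0, 0]
print(segments_from_labels(predicted, 4.0))
print(frame_accuracy(predicted, reference))
```

In this toy case the predicted speech segment starts and ends one sample early relative to the reference, so two of the eight samples are mislabeled. A per-sample accuracy of this kind is one plausible way to score continuous VAD, though the abstract does not state how its reported accuracy was computed.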
MDPI and ACS Style

Dash, D.; Ferrari, P.; Dutta, S.; Wang, J. NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals. Sensors 2020, 20, 2248. https://doi.org/10.3390/s20082248

