- Article
SpeakingFaces: A Large-Scale Multimodal Dataset of Voice Commands with Visual and Thermal Video Streams
Madina Abdrakhmanova, Askat Kuzdeuov, Sheikh Jarju, Yerbolat Khassanov, Michael Lewis and Huseyin Atakan Varol
We present SpeakingFaces, a publicly available large-scale multimodal dataset developed to support machine learning research in contexts that combine thermal, visual, and audio data streams; examples include human–computer interact...