Open Access Article

Text Normalization Using Encoder–Decoder Networks Based on the Causal Feature Extractor

1 Max Planck Institute for Intelligent Systems, 72076 Tübingen, Germany
2 Department of Computer Science and Systems, University of Murcia, 30100 Murcia, Spain
* Author to whom correspondence should be addressed.
Appl. Sci. 2020, 10(13), 4551; https://doi.org/10.3390/app10134551
Received: 9 May 2020 / Revised: 26 June 2020 / Accepted: 27 June 2020 / Published: 30 June 2020
The encoder–decoder architecture is a well-established, effective and widely used approach in many tasks of natural language processing (NLP), among other domains. It consists of two closely collaborating components: an encoder that transforms the input into an intermediate form, and a decoder that produces the output. This paper proposes a new method for the encoder, named Causal Feature Extractor (CFE), based on three main ideas: causal convolutions, dilation, and bidirectionality. We apply this method to text normalization, a ubiquitous problem that appears as the first step of many text-to-speech (TTS) systems. Given a text with symbols, the problem consists of rewriting the text exactly as it should be read aloud by the TTS system. We make use of an attention-based encoder–decoder architecture with a fine-grained character-level approach rather than the usual word-level one. The proposed CFE is compared to other common encoders, such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. Experimental results show the feasibility of the CFE, which achieves better results in terms of accuracy, number of parameters, convergence time, and use of an attention mechanism based on attention matrices. The obtained accuracy ranges from 83.5% to 96.8% of correctly normalized sentences, depending on the dataset. Moreover, the proposed method is generic and can be applied to other types of input, such as text, audio, and images.
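The two ingredients the CFE combines can be illustrated with a minimal sketch: a causal dilated 1-D convolution (each output depends only on the current and past inputs, with taps spaced by the dilation factor) and bidirectionality (a second causal pass over the reversed signal). This is a hedged illustration in plain NumPy with a hypothetical filter `w`, not the paper's actual implementation:

```python
import numpy as np

def causal_dilated_conv(x, w, dilation=1):
    """1-D causal convolution: y[t] uses only x[t], x[t-d], ..., x[t-(k-1)d]."""
    k = len(w)
    pad = dilation * (k - 1)
    # left-pad with zeros so the receptive field never looks into the future
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    y = np.empty(len(x))
    for t in range(len(x)):
        # taps (oldest first) at padded indices t, t+d, ..., t+pad
        taps = xp[t : t + pad + 1 : dilation]
        y[t] = np.dot(taps, w)  # w[0] weights the oldest tap
    return y

def bidirectional_causal(x, w, dilation=1):
    """Stack a forward causal pass and a backward causal pass (2, T)."""
    fwd = causal_dilated_conv(x, w, dilation)
    bwd = causal_dilated_conv(np.asarray(x)[::-1], w, dilation)[::-1]
    return np.stack([fwd, bwd])
```

With `x = [1, 2, 3, 4]` and `w = [1, 1]`, a dilation of 1 sums each element with its predecessor (`[1, 3, 5, 7]`), while a dilation of 2 reaches two steps back (`[1, 2, 4, 6]`), showing how dilation widens the receptive field without adding parameters.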
Keywords: text normalization; natural language processing; deep neural networks; causal encoder

Javaloy, A.; García-Mateos, G. Text Normalization Using Encoder–Decoder Networks Based on the Causal Feature Extractor. Appl. Sci. 2020, 10, 4551.

