Article

Modeling the Paraphrase Detection Task over a Heterogeneous Graph Network with Data Augmentation

1 Interinstitutional Center for Computational Linguistics, University of São Paulo, São Carlos 13566-590, Brazil
2 Federal Institute of Piauí, Picos 64600-000, Brazil
* Author to whom correspondence should be addressed.
This paper is an extended version of our paper published in the 14th International Conference on the Computational Processing of Portuguese (PROPOR 2020), Evora, Portugal, 2–4 March 2020.
Information 2020, 11(9), 422; https://doi.org/10.3390/info11090422
Received: 29 July 2020 / Revised: 28 August 2020 / Accepted: 28 August 2020 / Published: 1 September 2020
(This article belongs to the Special Issue Selected Papers from PROPOR 2020)
Paraphrase detection is a natural language processing (NLP) task that aims to automatically identify whether two sentences convey the same meaning, even when they use different words. For Portuguese, most existing work treats this task as a conventional machine-learning problem: features are extracted and a classifier is trained on them. In this paper, we take a different approach: we explore a graph-based representation and model paraphrase identification over a heterogeneous network. We also adopt a back-translation strategy for data augmentation to balance the dataset we use. Our approach, although simple, outperforms the best results reported for paraphrase detection in Portuguese, suggesting that graph structures may better capture the semantic relatedness among sentences.
Keywords: semantic similarity; paraphrase identification; heterogeneous network

MDPI and ACS Style

Anchiêta, R.T.; Sousa, R.F.d.; Pardo, T.A.S. Modeling the Paraphrase Detection Task over a Heterogeneous Graph Network with Data Augmentation. Information 2020, 11, 422. https://doi.org/10.3390/info11090422

AMA Style

Anchiêta RT, Sousa RFd, Pardo TAS. Modeling the Paraphrase Detection Task over a Heterogeneous Graph Network with Data Augmentation. Information. 2020; 11(9):422. https://doi.org/10.3390/info11090422

Chicago/Turabian Style

Anchiêta, Rafael T., Rogério F.d. Sousa, and Thiago A.S. Pardo. 2020. "Modeling the Paraphrase Detection Task over a Heterogeneous Graph Network with Data Augmentation" Information 11, no. 9: 422. https://doi.org/10.3390/info11090422
