Open Access Article

Benchmarking Natural Language Inference and Semantic Textual Similarity for Portuguese

by Pedro Fialho 1,2,*, Luísa Coheur 1,3 and Paulo Quaresma 1,2
1 INESC-ID, Rua Alves Redol 9, 1000-029 Lisboa, Portugal
2 Departamento de Informática, Universidade de Évora, Rua Romão Ramalho, 59 7000-671 Évora, Portugal
3 Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1 1049-001 Lisboa, Portugal
* Author to whom correspondence should be addressed.
This paper is an extended version of our paper published in PROPOR 2020.
Information 2020, 11(10), 484; https://doi.org/10.3390/info11100484
Received: 1 August 2020 / Revised: 18 September 2020 / Accepted: 29 September 2020 / Published: 15 October 2020
(This article belongs to the Special Issue Selected Papers from PROPOR 2020)
Two sentences can be related in many different ways, and distinct tasks in natural language processing aim to identify different semantic relations between sentences. We developed several models for natural language inference and semantic textual similarity for the Portuguese language. We took advantage of pre-trained models (BERT) and, additionally, studied the role of lexical features. We tested our models on several datasets (ASSIN, SICK-BR and ASSIN2), and the best results were usually achieved with ptBERT-Large, trained on a Brazilian corpus and fine-tuned on these datasets. Besides obtaining state-of-the-art results, this is, to the best of our knowledge, the most comprehensive study of natural language inference and semantic textual similarity for the Portuguese language.
Keywords: natural language inference; semantic textual similarity; multilingual BERT; lexical features
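
The abstract mentions lexical features without listing them. As a rough illustration of what such features can look like for a sentence pair, the sketch below computes three simple overlap measures. The function names (`tokenize`, `jaccard`, `lexical_features`) and the choice of features are assumptions made for this example; they are not taken from the article and should not be read as the authors' feature set.

```python
from __future__ import annotations

import re


def tokenize(sentence: str) -> list[str]:
    """Lowercase and extract word tokens; \\w is Unicode-aware, so accents are kept."""
    return re.findall(r"\w+", sentence.lower())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def lexical_features(premise: str, hypothesis: str) -> dict[str, float]:
    """Hypothetical lexical features for a sentence pair (illustrative only)."""
    p, h = tokenize(premise), tokenize(hypothesis)
    ps, hs = set(p), set(h)
    return {
        # Share of distinct tokens the two sentences have in common.
        "jaccard": jaccard(ps, hs),
        # Fraction of hypothesis tokens that also occur in the premise.
        "hypothesis_coverage": len(ps & hs) / len(hs) if hs else 0.0,
        # Shorter length divided by longer length, in tokens.
        "length_ratio": min(len(p), len(h)) / max(len(p), len(h)) if p and h else 0.0,
    }


if __name__ == "__main__":
    print(lexical_features("Um homem está a tocar guitarra",
                           "Um homem toca guitarra"))
```

Features of this kind are typically fed to a classifier or regressor, alone or alongside a pre-trained model's output; how the article combines them is described in the full text, not here.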
MDPI and ACS Style

Fialho, P.; Coheur, L.; Quaresma, P. Benchmarking Natural Language Inference and Semantic Textual Similarity for Portuguese. Information 2020, 11, 484.
