Open Access Article

Lex-Pos Feature-Based Grammar Error Detection System for the English Language

Department of Information Security and Communication Technology, Norwegian University of Science and Technology (NTNU), 2815 Gjøvik, Norway
* Author to whom correspondence should be addressed.
Electronics 2020, 9(10), 1686; https://doi.org/10.3390/electronics9101686
Received: 6 September 2020 / Revised: 29 September 2020 / Accepted: 1 October 2020 / Published: 14 October 2020
(This article belongs to the Special Issue Human Computer Interaction for Intelligent Systems)
This work focuses on designing a grammar detection system that uses both the structural and the contextual information of sentences to validate whether English sentences are grammatically correct. Most existing systems model a grammar detector by translating sentences into sequences of either the words appearing in them or the syntactic tags that hold their grammar knowledge. In this paper, we show that both sequencing approaches have limitations. The former is overly specific, whereas the latter is overly general, and each limitation degrades the performance of the grammar classifier. The paper therefore proposes a new sequencing approach that carries both the lexical and the syntactic information of a sentence. We call this sequence a Lex-Pos sequence. The main objective of the paper is to demonstrate that the proposed Lex-Pos sequence can capture both the specific nature of the words (i.e., lexicals) and the generic structural characteristics of a sentence via Part-Of-Speech (POS) tags, and so can lead to a significant improvement in detecting grammar errors. Furthermore, the paper proposes a new vector representation technique, Word Embedding One-Hot Encoding (WEOE), to transform the Lex-Pos sequence into numerical vectors. The paper also introduces a new error induction technique to artificially generate incorrect sentences with POS-tag-specific errors for training. The classifier is trained using two corpora of incorrect sentences, one with general errors and another with POS-tag-specific errors. A Long Short-Term Memory (LSTM) neural network architecture is employed to build the grammar classifier. The study conducts nine experiments to validate the strength of the Lex-Pos sequences. The Lex-Pos-based models are observed to be superior in two ways: (1) they give more accurate predictions; and (2) they are more stable, as smaller drops in accuracy are recorded from training to testing. To further demonstrate the potential of the proposed Lex-Pos-based model, we compare it with several well-known existing studies.
Keywords: Natural Language Processing; deep learning; grammar error detection; word embedding
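The contrast the abstract draws between word sequences (overly specific) and POS-tag sequences (overly general) can be illustrated with a small sketch. The abstract does not state how lexical and POS information are combined in a Lex-Pos sequence, so the mixing rule below (keep determiners and prepositions as words, replace every other word with its tag) is purely an assumption for illustration and should not be read as the authors' method. The toy lexicon `TOY_POS`, the set `KEEP_LEXICAL`, and the function names are all hypothetical.

```python
# Illustrative sketch only: the lexicon, the tag set, and the mixing rule
# are assumptions, not the construction used in the paper.

# Hypothetical hand-written word -> POS lookup (Penn Treebank style tags).
TOY_POS = {
    "the": "DT", "a": "DT",
    "dog": "NN", "cat": "NN", "garden": "NN", "dogs": "NNS",
    "chases": "VBZ", "chase": "VBP",
    "in": "IN", "on": "IN",
}

# Assumption: words with these tags stay as lexical items in the mixed sequence.
KEEP_LEXICAL = {"DT", "IN"}


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return sentence.lower().split()


def pos_sequence(tokens):
    """Replace every token with its tag; unknown words become 'UNK'."""
    return [TOY_POS.get(tok, "UNK") for tok in tokens]


def mixed_sequence(tokens):
    """Keep the word for tags in KEEP_LEXICAL, otherwise emit the tag."""
    mixed = []
    for tok in tokens:
        tag = TOY_POS.get(tok, "UNK")
        mixed.append(tok if tag in KEEP_LEXICAL else tag)
    return mixed


if __name__ == "__main__":
    for text in ("a dogs", "the dogs", "The dogs chase a cat"):
        toks = tokenize(text)
        print(text, "|", toks, "|", pos_sequence(toks), "|", mixed_sequence(toks))
```

In this toy setting, "a dogs" (incorrect) and "the dogs" (correct) share the same POS-tag sequence, so a purely tag-based model cannot tell them apart, while their word sequences differ in every sentence that uses different nouns. The mixed sequence keeps the determiner visible and abstracts the noun, which is one way a combined representation could separate the two cases.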

MDPI and ACS Style

Agarwal, N.; Wani, M.A.; Bours, P. Lex-Pos Feature-Based Grammar Error Detection System for the English Language. Electronics 2020, 9, 1686.
