Data Mining Applied in Natural Language Processing

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 September 2024 | Viewed by 895

Special Issue Editor

School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
Interests: multimodal intelligence; deep learning; machine learning

Special Issue Information

Dear Colleagues,

The objective of this Special Issue is to invite diverse submissions, fostering a collaborative effort to understand the emerging opportunities and challenges of data mining applied in natural language processing. We aim to identify key tasks, evaluate the current state of the art, showcase inventive methodologies and ideas, introduce substantial real-world systems or applications, propose new datasets, and discuss future directions. Through this coordinated effort, we hope to advance our understanding of the intricate interplay between data mining and natural language processing, paving the way for further advances in the field.

In this Special Issue, original research articles and reviews are welcome. Research areas may include (but are not limited to) the following (in alphabetical order):

  • Computational Social Science and Social Media;
  • Dialogue and Interactive Systems;
  • Discourse and Pragmatics;
  • Information Extraction;
  • Interpretability and Analysis of Models for NLP;
  • Linguistic Theories, Cognitive Modeling, and Psycholinguistics;
  • Machine Learning for NLP;
  • Machine Translation and Multilinguality;
  • Named Entity Recognition and Text Classification;
  • Phonology, Morphology, and Word Segmentation;
  • Semantics: Lexical, Sentence-Level, Textual Inference, and Other Areas;
  • Sentiment Analysis and Opinion Mining;
  • Summarization;
  • Syntax: Tagging, Chunking, and Parsing;
  • Text Mining and Information Retrieval.

We look forward to receiving your contributions.

Technical Program Committee Member:

Dr. Yu Zhao, Southwestern University of Finance and Economics

Dr. Ruifan Li
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data mining
  • natural language processing
  • text mining
  • sentiment analysis
  • machine translation
  • deep learning
  • named entity recognition
  • text classification
  • cross-language NLP

Published Papers (2 papers)


Research

20 pages, 727 KiB  
Article
Semantic Augmentation in Chinese Adversarial Corpus for Discourse Relation Recognition Based on Internal Semantic Elements
by Zheng Hua, Ruixia Yang, Yanbin Feng and Xiaojun Yin
Electronics 2024, 13(10), 1944; https://doi.org/10.3390/electronics13101944 - 15 May 2024
Viewed by 288
Abstract
This paper proposes incorporating linguistic semantic information into discourse relation recognition and constructs a Semantic Augmented Chinese Discourse Corpus (SACA) comprising 9546 adversative complex sentences. For adversative complex sentences, we propose a quadruple (P, Q, R, Qβ) representing the internal semantic elements, where the semantic opposition between Q and Qβ forms the basis of the adversative relationship; P denotes the premise, and R represents the adversative reason. The overall annotation approach of this corpus follows the Penn Discourse Treebank (PDTB), except for the classification of senses. Combining insights from the Chinese Discourse Treebank (CDTB), we obtained eight sense categories for Chinese adversative complex sentences. Based on this corpus, we explore the relationship between sense classification and internal semantic elements within our newly proposed Chinese Adversative Discourse Relation Recognition (CADRR) task. Leveraging deep learning techniques, we constructed various classification models, including one that utilizes internal semantic element features, demonstrating their effectiveness and the applicability of our SACA corpus. By incorporating internal semantic element information, our model achieves state-of-the-art performance compared with pre-trained models. Full article
(This article belongs to the Special Issue Data Mining Applied in Natural Language Processing)
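As a rough illustration of the quadruple annotation described in the abstract above, the (P, Q, R, Qβ) structure could be modeled as a simple record; the class and field names below are my own invention, not part of the SACA release:

```python
from dataclasses import dataclass

@dataclass
class AdversativeQuadruple:
    """Hypothetical sketch of the (P, Q, R, Q-beta) internal semantic elements."""
    premise: str   # P: the premise of the complex sentence
    element: str   # Q: the expected element
    opposed: str   # Q-beta: the element semantically opposed to Q
    reason: str    # R: the adversative reason

    def has_opposition(self) -> bool:
        # The adversative relation rests on Q and Q-beta differing.
        return self.element != self.opposed

quad = AdversativeQuadruple(
    premise="He studied hard",
    element="he expected to pass",
    opposed="he failed the exam",
    reason="the test covered unseen material",
)
print(quad.has_opposition())  # → True
```

In the paper itself, such elements feed a neural classifier as features; this record merely makes the annotation shape concrete.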
20 pages, 2744 KiB  
Article
CogCol: Code Graph-Based Contrastive Learning Model for Code Summarization
by Yucen Shi, Ying Yin, Mingqian Yu and Liangyu Chu
Electronics 2024, 13(10), 1816; https://doi.org/10.3390/electronics13101816 - 8 May 2024
Viewed by 341
Abstract
Summarizing source code in natural language aims to help developers better understand existing code, making software development more efficient. Since source code is highly structured, recent research exploits code structure information such as the Abstract Syntax Tree (AST) to enhance structural understanding, rather than treating summarization as a plain translation task. However, an AST only represents the syntactic relationships within a code snippet; it cannot capture higher-level relationships such as the control and data dependencies found in the program dependency graph. Moreover, prior work treats the AST as the unique structural representation of a code snippet corresponding to one summary, so models are easily affected by simple perturbations because they lack an understanding of structurally similar code. To address these problems, we build CogCol, a Code graph-based Contrastive learning model. CogCol is a Transformer-based model that converts code graphs into unique sequences to enhance the model's structure learning. In detail, CogCol uses supervised contrastive learning, building several kinds of code graphs as positive samples to strengthen the structural representation of code snippets and improve generalizability. Experiments on a widely used open-source dataset show that CogCol significantly improves on state-of-the-art code summarization models under METEOR, BLEU, and ROUGE. Full article
(This article belongs to the Special Issue Data Mining Applied in Natural Language Processing)
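To make the supervised contrastive objective mentioned in the abstract concrete, here is a generic SupCon-style loss sketch, not the authors' implementation: embeddings of different graph views of the same code snippet share a label and act as positives, while all other snippets serve as negatives. All names and the toy data are illustrative assumptions:

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over L2-normalized embeddings."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature          # pairwise cosine similarities / tau
    n = len(labels)
    total = 0.0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        others = [j for j in range(n) if j != i]
        # log of the softmax denominator over all samples except i itself
        log_denom = np.log(np.exp(sim[i, others]).sum())
        # average negative log-likelihood of pulling each positive toward i
        total += -sum(sim[i, j] - log_denom for j in positives) / len(positives)
    return total / n

# Two "views" per snippet (e.g. an AST-derived and a dependency-graph-derived
# encoding) share one label, so they are pulled together in embedding space.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
labels = [0, 0, 1, 1]
print(supervised_contrastive_loss(emb, labels) > 0)  # → True
```

The loss is strictly positive whenever negatives exist, and shrinks as same-label views cluster; CogCol applies this idea to sequences derived from multiple code graphs.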
